Improve diagnosability and reduce flakiness in OutOfProcRarNode_Tests.RunsOutOfProcIfAllFlagsAreEnabled #12670
Problem
The `OutOfProcRarNode_Tests.RunsOutOfProcIfAllFlagsAreEnabled` test was failing intermittently on Linux, with `rar.Execute()` returning false. The root cause is a race condition: the test manually constructs an `OutOfProcRarNodeEndpoint` and starts `RunAsync`, then immediately calls `rar.Execute()`. If the endpoint server loop has not yet begun waiting for connections, the task may fail to connect and then fall back or fail altogether.

Additionally, the assertions carried minimal diagnostic information when the result was false or the expected `OutOfProcRarClient` was not registered, making these intermittent failures difficult to troubleshoot in CI.

Solution
This PR adds comprehensive diagnostic logging and introduces a startup delay to reduce the race condition:
1. Environment Diagnostics
Logs environment information at test start to help correlate failures:
- `RuntimeInformation.OSDescription`
- `RuntimeInformation.FrameworkDescription`

2. Startup Delay
Introduces a 200ms delay using `Thread.Sleep()` after starting the endpoint but before calling `rar.Execute()`. This gives the async endpoint server time to start accepting connections, which makes the race much less likely to occur. The delay value balances test speed against reliability.

3. Execution Timing
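The delay described above and the timing measurement in this step follow a simple start, wait, measure pattern. The following is a minimal sketch of that pattern, not the test's actual code: the `startEndpoint`, `execute`, and `log` delegates and the `StartupDelaySketch` class are hypothetical stand-ins for starting the endpoint's `RunAsync`, calling `rar.Execute()`, and writing to the test output.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical sketch: the delegates stand in for the real endpoint start-up
// and rar.Execute() calls, which are not reproduced here.
public static class StartupDelaySketch
{
    public static bool RunWithStartupDelay(
        Action startEndpoint,
        Func<bool> execute,
        int startupDelayMs,
        Action<string> log)
    {
        // Kick off the endpoint (assumed to return once the server loop is scheduled).
        startEndpoint();

        // Give the server loop time to start accepting connections.
        Thread.Sleep(startupDelayMs);

        // Time only the call under test so the delay itself is not counted.
        Stopwatch stopwatch = Stopwatch.StartNew();
        bool result = execute();
        stopwatch.Stop();

        log($"Execute returned {result} after {stopwatch.ElapsedMilliseconds} ms");
        return result;
    }

    public static void Main()
    {
        // Trivial stand-ins so the sketch runs on its own.
        bool result = RunWithStartupDelay(
            startEndpoint: () => { },
            execute: () => true,
            startupDelayMs: 200,
            log: Console.WriteLine);

        Console.WriteLine(result ? "passed" : "failed");
    }
}
```

A fixed delay only narrows the race window; it does not close it, which is consistent with the PR describing the change as reducing flakiness rather than eliminating it.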
Uses `Stopwatch` to measure and log the execution time of `rar.Execute()`, providing correlation data for diagnosing performance issues.

4. Enhanced Failure Diagnostics
Adds a `DumpDiagnostics()` helper method that outputs comprehensive information when the test fails.

5. Success Logging
On successful test runs, logs confirmation of `OutOfProcRarClient` registration and the resolved file path for correlation.

Example Output
Success case:
Testing
`OutOfProcRarNode_Tests` pass

Impact
Original prompt
This pull request was created as a result of the following prompt from Copilot chat.