DRIVERS-3134: pre-populate pool in flaky CSOT runCommandCursor test #1769
Closed
baileympearson wants to merge 1 commit into mongodb:master
35f8b87 to
6a1c9fb
Compare
Contributor
|
@baileympearson do you plan to pick this back up? |
6a1c9fb to
29de6e5
Compare
Contributor
Author
|
@ShaneHarvey This PR fell off my radar (Daria reminded me about it). We haven't seen this test flake for a while, although we do see other CSOT tests flake. I think I'll just close this for now, and if it starts flaking again I can re-open it. |
Contributor
|
Thanks, should we also close https://jira.mongodb.org/browse/DRIVERS-3134? |
The "Non-tailable cursor lifetime remaining timeoutMS applied to getMore if timeoutMode is unset" test flakes fairly regularly in Node with the following error:
It turns out that connection establishment is taking roughly 40ms. This seems unusually high: measuring connection establishment while running this test repeatedly usually shows establishment taking 2-8ms. I also observe that, as the tests run, we occasionally hit periods where establishment increases to 20-40ms for a few iterations of the tests and then drops back down to the usual 2-8ms.
The delay in connection establishment causes the find to time out, not the getMore. As a result, we still get a timeout error in the test, but we have too few CommandStartedEvents and the test fails.
Node has seen this in other non-spec tests we wrote while implementing CSOT. The most reliable way to resolve the flakiness is to pre-populate the pool with a connection before starting the test. For any test that relies on failpoints to cause timeouts, it makes sense to pre-populate the pool with a connection so that we can remove as much variance in timing before the connection layer as possible.
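To make the failure mode concrete, here is a rough sketch of the timing budget as a small, driver-free model. It is not the driver's implementation or the spec test itself: the simulate function, its field names, and the numbers (a 100ms timeoutMS and a 60ms failpoint block) are all assumptions chosen for illustration. An establishMS of 0 stands in for a pool that already holds a connection.

```typescript
// Illustrative model only: names and numbers are assumptions, not values from the spec test.
type Step = "checkout" | "find" | "getMore" | "none";

interface Scenario {
  timeoutMS: number;   // budget shared by the whole cursor lifetime
  establishMS: number; // connection establishment on first checkout; 0 models a pre-populated pool
  blockMS: number;     // failpoint delay applied to each command
}

interface Outcome {
  commandsStarted: string[]; // stands in for the observed CommandStartedEvents
  timedOutDuring: Step;      // the step during which the budget ran out
}

function simulate({ timeoutMS, establishMS, blockMS }: Scenario): Outcome {
  const commandsStarted: string[] = [];

  // Connection establishment is charged against the same budget as the commands.
  let remaining = timeoutMS - establishMS;
  if (remaining <= 0) return { commandsStarted, timedOutDuring: "checkout" };

  // The getMore reuses the pooled connection, so only the first checkout pays for establishment.
  for (const command of ["find", "getMore"] as const) {
    commandsStarted.push(command);
    if (blockMS >= remaining) return { commandsStarted, timedOutDuring: command };
    remaining -= blockMS;
  }
  return { commandsStarted, timedOutDuring: "none" };
}

// Slow establishment: the find itself exhausts the budget.
const coldPool = simulate({ timeoutMS: 100, establishMS: 40, blockMS: 60 });
// Pre-populated pool: the find succeeds and the getMore is the command that times out.
const warmPool = simulate({ timeoutMS: 100, establishMS: 0, blockMS: 60 });
console.log(coldPool);
console.log(warmPool);
```

With these assumed numbers, the cold pool records only a find before timing out, while the warm pool records both a find and a getMore, which is the event sequence the test expects.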
Please complete the following before merging:
clusters, and serverless).