feat: add option to rebuild gRPC connection on error #1668
base: main
2f46dcf to 83036d6
Summary of Changes
Hello @toddbaert, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a robust mechanism to automatically rebuild gRPC connections when the provider encounters an error, particularly useful for maintaining stability in environments with intermittent network issues or complex service mesh configurations. The feature is opt-in, providing administrators with a powerful tool for enhanced resilience without altering default behavior.
83036d6 to 112607a
Signed-off-by: Todd Baert <[email protected]>
112607a to b429a68
Code Review
This pull request introduces a valuable feature for rebuilding the gRPC connection on error, which can significantly aid in troubleshooting network issues. The implementation is generally well-done, with corresponding unit tests. However, I've identified a potential race condition related to thread safety during connection re-initialization, a minor logic issue in the new error handler, and some improvements for the new tests to make them more robust. My detailed comments and suggestions are provided below.
...e/contrib/providers/flagd/resolver/process/storage/connector/sync/SyncStreamQueueSource.java
Outdated
...rc/main/java/dev/openfeature/contrib/providers/flagd/resolver/process/InProcessResolver.java
Outdated
...est/java/dev/openfeature/contrib/providers/flagd/resolver/process/InProcessResolverTest.java
Outdated
Signed-off-by: Todd Baert <[email protected]>
/gemini review
Code Review
This pull request introduces a valuable feature to rebuild the gRPC connection on error, enhancing the provider's resilience in unstable network environments. The implementation is well-designed, using a GrpcComponents wrapper for atomic updates and proper synchronization to ensure thread safety. The feature is configurable and disabled by default, which is a sensible choice. The unit tests are thorough in covering the new logic. I've pointed out a couple of potential improvements in the test suite to prevent a possible NullPointerException and a resource leak. Overall, this is a strong contribution that improves the robustness of the flagd provider.
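The "wrapper for atomic updates" idea mentioned in this review can be illustrated with a minimal sketch. Everything below is hypothetical: the class name `GrpcComponentsSketch`, its fields, and the use of plain strings in place of a real gRPC channel and stub are assumptions made for illustration, not the PR's actual `GrpcComponents` code. The point is only that the channel and the objects derived from it live in one immutable bundle, so readers never see a new channel paired with an old stub.

```java
import java.util.function.Supplier;

/** Hypothetical illustration only; not the PR's actual GrpcComponents class. */
final class GrpcComponentsSketch {
    /** Immutable bundle: the channel and what is built from it are replaced as one unit. */
    static final class Components {
        final String channel; // stands in for a real gRPC channel (assumption)
        final String stub;    // stands in for a stub created from that channel (assumption)

        Components(String channel) {
            this.channel = channel;
            this.stub = "stub-of-" + channel;
        }
    }

    private final Object lock = new Object();
    private final Supplier<String> channelFactory;
    // volatile: readers always observe a fully constructed bundle
    private volatile Components current;

    GrpcComponentsSketch(Supplier<String> channelFactory) {
        this.channelFactory = channelFactory;
        this.current = new Components(channelFactory.get());
    }

    /** Lock-free read of the current bundle. */
    Components get() {
        return current;
    }

    /** Builds the new bundle first, then publishes it with a single write; returns the old one. */
    Components reinitialize() {
        synchronized (lock) { // serializes concurrent rebuild attempts
            Components old = current;
            Components fresh = new Components(channelFactory.get());
            current = fresh;
            return old; // a real implementation would shut down the old channel here
        }
    }
}
```

Callers would take one snapshot via `get()` and use its `channel` and `stub` together, rather than reading the two fields at different times.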
...e/contrib/providers/flagd/resolver/process/storage/connector/sync/SyncStreamQueueSource.java
Outdated
...ntrib/providers/flagd/resolver/process/storage/connector/sync/SyncStreamQueueSourceTest.java
Outdated
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Todd Baert <[email protected]>
...e/contrib/providers/flagd/resolver/process/storage/connector/sync/SyncStreamQueueSource.java
...e/contrib/providers/flagd/resolver/process/storage/connector/sync/SyncStreamQueueSource.java
Signed-off-by: Todd Baert <[email protected]>
    try {
        // create new channel components first
        initializeChannelComponents();
I think we need to call grpcComponents.channelConnector.initialize(); after reinitializing the GrpcComponents object, like we do in the init() method.
Very good eye... but... I think I discovered something earlier about this that I lost track of and forgot to mention:
I think the initialize method in the ChannelConnector is actually very well-hidden dead code. I noticed earlier that it basically ONLY sets up a channel monitor that emits events for channel disconnections, but we already will get such events from the stream handler itself.
I think this is why my manual testing had no issue not including the call you suggest above, and same with our e2e test suite. In fact, if I completely comment out all the functionality of the ChannelConnector.initialize method, our entire e2e suite runs fine, including the assertion that we get disconnect events and other stream events.
Unless I'm missing something which we have no tests for, I think this method can be deleted.
cc @aepfli do you know anything about this?
...e/contrib/providers/flagd/resolver/process/storage/connector/sync/SyncStreamQueueSource.java
Outdated
...e/contrib/providers/flagd/resolver/process/storage/connector/sync/SyncStreamQueueSource.java
Outdated
Signed-off-by: Todd Baert <[email protected]>
This PR adds an experimental option which rebuilds the entire gRPC connection if the provider goes into an error state (after the grace period). This can be useful for troubleshooting network issues, especially in the context of service meshes/proxies.
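As a rough illustration of the "error state after grace period" trigger, here is a small, deterministic sketch. The class `GracePeriodTracker`, its method names, and the choice to fire at most once per outage are all assumptions for illustration; the provider's real state handling is not shown in this PR description and may differ.

```java
/** Hypothetical sketch: decides when an outage has outlasted the grace period. */
final class GracePeriodTracker {
    private final long gracePeriodMillis;
    private long disconnectedSince = -1; // -1 means "currently connected"
    private boolean reinitialized;       // true once a rebuild was requested for this outage

    GracePeriodTracker(long gracePeriodMillis) {
        this.gracePeriodMillis = gracePeriodMillis;
    }

    /** Records the start of an outage; repeated errors during one outage do not reset the clock. */
    synchronized void onStreamError(long nowMillis) {
        if (disconnectedSince < 0) {
            disconnectedSince = nowMillis;
            reinitialized = false;
        }
    }

    /** Marks the connection as healthy again. */
    synchronized void onStreamReady() {
        disconnectedSince = -1;
    }

    /** Returns true at most once per outage, and only after the grace period has elapsed. */
    synchronized boolean shouldReinitialize(long nowMillis) {
        if (disconnectedSince < 0 || reinitialized) {
            return false;
        }
        if (nowMillis - disconnectedSince < gracePeriodMillis) {
            return false;
        }
        reinitialized = true;
        return true;
    }
}
```

Passing the current time in as a parameter keeps the sketch easy to test; production code would more likely use a scheduler or a monotonic clock.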
I added unit tests, but I also tested locally, and it works as expected. The feature can be enabled through config, but also with an env var: FLAGD_REINITIALIZE_ON_ERROR. It defaults to off.
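A minimal sketch of how such an opt-in flag might be resolved follows. Only the variable name `FLAGD_REINITIALIZE_ON_ERROR` and the "off by default" behavior come from the description above; the helper class `ReinitializeOnErrorOption`, the precedence of explicit config over the environment, and the use of `Boolean.parseBoolean` are assumptions, not the provider's actual option handling.

```java
import java.util.Map;

/** Hypothetical helper; the provider's real option parsing may differ. */
final class ReinitializeOnErrorOption {
    static final String ENV_VAR = "FLAGD_REINITIALIZE_ON_ERROR";

    private ReinitializeOnErrorOption() {}

    /**
     * Resolves the flag. An explicit config value wins (assumed precedence);
     * otherwise the environment is consulted; otherwise the feature stays off.
     */
    static boolean resolve(Boolean explicitConfig, Map<String, String> env) {
        if (explicitConfig != null) {
            return explicitConfig;
        }
        // Boolean.parseBoolean(null) is false, so an unset variable means "off"
        return Boolean.parseBoolean(env.get(ENV_VAR));
    }
}
```

In an application this could be called as `ReinitializeOnErrorOption.resolve(null, System.getenv())`; taking the environment as a map keeps the helper testable.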