-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Fix node bootstrap error when enable stream transport and remote cluster state #19948
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix node bootstrap error when enable stream transport and remote cluster state #19948
Conversation
|
❌ Gradle check result for 566d7f6: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
566d7f6 to
b64bdc7
Compare
…ransport Signed-off-by: bowenlan-amzn <[email protected]>
b64bdc7 to
68fb9d8
Compare
|
❌ Gradle check result for 68fb9d8: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: bowenlan-amzn <[email protected]>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #19948 +/- ##
============================================
+ Coverage 73.25% 73.26% +0.01%
- Complexity 71555 71592 +37
============================================
Files 5785 5785
Lines 326828 326831 +3
Branches 47295 47295
============================================
+ Hits 239429 239464 +35
+ Misses 68163 68133 -30
+ Partials 19236 19234 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| RemoteStoreNodeService remoteStoreNodeService, | ||
| boolean useStreamTransport |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The weird thing here is that "useStreamTransport" doesn't make any sense because there's nothing about stream transport in the LocalNodeFactory. I think you can use composition to make this more clear, if a bit more verbose. You can refactor LocalNodeFactory into two separate functions. First, remove RemoteStoreNodeService from this implementation, then create one that looks like:
private static class RemoteStoreVerifyingLocalNodeFactory extends LocalNodeFactory {
private final RemoteStoreNodeService remoteStoreNodeService;
private RemoteStoreVerifyingLocalNodeFactory(Settings settings, String persistentNodeId, RemoteStoreNodeService remoteStoreNodeService) {
super(settings, persistentNodeId);
this.remoteStoreNodeService = remoteStoreNodeService;
}
@Override
public DiscoveryNode apply(BoundTransportAddress boundTransportAddress) {
final DiscoveryNode discoveryNode = super.apply(boundTransportAddress);
if (isRemoteStoreAttributePresent(settings)) {
remoteStoreNodeService.createAndVerifyRepositories(discoveryNode);
}
return discoveryNode;
}
}In the normal case you construct a RemoteStoreVerifyingLocalNodeFactory instance, and in the stream transport case you just construct a plain LocalNodeFactory instance.
Description
This PR fixes a bootstrap failure that occurs when streaming transport is used with remote cluster state. Without this fix, nodes fail to start with the error:
Root Cause
When streaming transport is enabled, the node bootstrap process creates two separate
LocalNodeFactoryinstances:TransportServiceStreamTransportServiceDuring node startup, both transport services are started sequentially:
Both services inherit
TransportService.doStart()which calls:Each call to
LocalNodeFactory.apply()triggers:DiscoveryNodecreationremoteStoreNodeService.createAndVerifyRepositories()Since both factories attempt to register the same repositories (configured via node attributes), the second call fails with "can't overwrite as repositories are already present".
Solution
We don't do Remote store repository creation and verification when it's for stream transport
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.