feat: Raft-based High Availability using Apache Ratis #3731
Conversation
Summary of Changes (Gemini Code Assist): This pull request introduces a fundamental shift in ArcadeDB's high availability architecture by integrating Apache Ratis, a robust Raft consensus implementation. This change aims to provide stronger consistency guarantees, improved fault tolerance, and a more reliable distributed system. The new design replaces custom replication logic with a proven consensus algorithm, backed by extensive testing for various failure scenarios and network conditions.
📜 License Compliance Check: ✅ License check passed. See artifacts for full report.
Code Review
This pull request introduces a new Raft-based High Availability (HA) implementation for ArcadeDB, replacing the existing custom protocol. It includes a new ha-raft Maven module with components like RaftLogEntryCodec for serializing WAL and schema changes, ArcadeStateMachine for applying these changes, RaftHAServer for managing the Ratis cluster, and RaftReplicatedDatabase for intercepting database operations. New global configuration properties have been added for Raft-specific settings, and the server startup logic is updated to integrate the new HA plugin. Extensive unit and integration tests have been added to cover various replication, failover, and network partition scenarios, including a new /api/v1/cluster endpoint for status. The .gitignore file was updated. Feedback includes a dependency misconfiguration in ha-raft/pom.xml, a suggestion for .gitignore hygiene, a potential edge case in TransactionManager.applyChanges regarding bucket count updates, an incomplete replica lag monitoring implementation in RaftHAServer, and a recommendation for more specific error messages in AbstractServerHttpHandler for cluster token validation.
```xml
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
```
There appears to be a dependency misconfiguration in this pom.xml. The arcadedb-server artifact is included twice: once with provided scope (which is correct for a plugin) and again here with compile scope. The compile scope dependency also uses a hardcoded version (26.4.1-SNAPSHOT) instead of the ${project.parent.version} property.
This can lead to unpredictable build behavior and dependency conflicts. The compile scope dependency on arcadedb-server should be removed to avoid these issues.
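A sketch of what the single remaining declaration could look like, assuming the provided-scope entry is the one to keep; the group ID is an assumption inferred from the project, not copied from the file:

```xml
<!-- Keep one declaration only: parent version property plus provided scope.
     The duplicate compile-scope entry with the hardcoded version is removed. -->
<dependency>
  <groupId>com.arcadedb</groupId>
  <artifactId>arcadedb-server</artifactId>
  <version>${project.parent.version}</version>
  <scope>provided</scope>
</dependency>
```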
```
*.lsmvecidx
*.metadata.json

notes.txt
```
The file notes.txt appears to be a personal note file. It's generally better to add such user-specific files to your global .gitignore file (e.g., ~/.config/git/ignore) or the repository's local exclude file (.git/info/exclude) rather than the project's shared .gitignore. This helps keep the project's ignore list clean and focused on project-specific generated files and artifacts.
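For reference, one way to set up the user-global ignore file mentioned above; the path assumes git's default XDG location:

```shell
# Keep user-specific files out of the shared .gitignore by using a
# global ignore file instead of the project's .gitignore.
mkdir -p ~/.config/git
echo "notes.txt" >> ~/.config/git/ignore
git config --global core.excludesFile ~/.config/git/ignore
```

Files listed there are ignored in every repository for this user, without touching any project's tracked `.gitignore`.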
```java
if (changed) {
  for (final Map.Entry<Integer, Integer> entry : bucketRecordDelta.entrySet()) {
    final LocalBucket bucket = (LocalBucket) database.getSchema().getBucketById(entry.getKey());
```
This loop over bucketRecordDelta is now inside if (changed). This is a good optimization to avoid unnecessary work when no pages were actually modified in the transaction. However, the changed flag is only set to true inside the loop over tx.pages. If a transaction only contains bucket count changes but no page modifications (which might be a rare edge case), this change would prevent the bucket counts from being updated.
If it's guaranteed that any transaction affecting bucket counts will also have modified pages, then this is fine. Otherwise, it might be safer to update the bucket counts regardless of the changed flag, as was the case previously.
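A toy model of the safer ordering described above: bucket-count deltas are applied unconditionally, while page work stays behind the `changed` flag. The types here are simplified stand-ins, not ArcadeDB's real classes:

```java
import java.util.HashMap;
import java.util.Map;

public class BucketDeltaSketch {
  final Map<Integer, Long> bucketCounts = new HashMap<>();
  boolean pagesFlushed = false;

  void applyChanges(final boolean changed, final Map<Integer, Integer> bucketRecordDelta) {
    if (changed)
      pagesFlushed = true; // stand-in for writing the modified pages

    // Apply count deltas even when no pages changed, covering the
    // count-only edge case described in the review.
    for (final Map.Entry<Integer, Integer> e : bucketRecordDelta.entrySet())
      bucketCounts.merge(e.getKey(), (long) e.getValue(), Long::sum);
  }

  public static void main(String[] args) {
    final BucketDeltaSketch s = new BucketDeltaSketch();
    s.applyChanges(false, Map.of(7, 3)); // count-only transaction, no page changes
    assert s.bucketCounts.get(7) == 3L;
    assert !s.pagesFlushed;
  }
}
```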
```java
private void checkReplicaLag() {
  try {
    final var division = raftServer.getDivision(raftGroup.getGroupId());
    final var info = division.getInfo();
    if (!info.isLeader())
      return;
    final long commitIndex = info.getLastAppliedIndex();
    clusterMonitor.updateLeaderCommitIndex(commitIndex);
  } catch (final Exception e) {
    LogManager.instance().log(this, Level.FINE, "Error checking replica lag", e);
  }
}
```
The implementation of checkReplicaLag seems incomplete. As the comment inside the method notes, the Ratis public API may not expose per-follower match indexes, so this implementation only updates the leader's commit index in the ClusterMonitor. This means the lag monitoring feature will not function as intended because it cannot calculate or report any lag for individual replicas.
Consider either removing this monitoring logic until it can be fully implemented or adding a more prominent // TODO and logging a warning at startup that replica lag monitoring is not yet functional.
```java
sendErrorResponse(exchange, 401, "Invalid cluster token", null, null);
return null;
}
```
The response for an invalid cluster token is a generic 401 error. For improved diagnostics in a misconfigured cluster, it would be helpful to provide a more specific error message in the response body, such as {"error": "Invalid cluster token"}. This would make it easier for operators to identify the root cause of authentication failures when forwarding requests between nodes.
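A sketch of the suggested machine-readable body; the `sendErrorResponse` wiring is assumed from the PR, so only the payload construction is shown, with an illustrative class name:

```java
public class ClusterAuthError {
  // Build the JSON body suggested in the review, e.g. {"error": "Invalid cluster token"}.
  static String body(final String reason) {
    // Escape embedded quotes so the payload stays valid JSON.
    return "{\"error\": \"" + reason.replace("\"", "\\\"") + "\"}";
  }

  public static void main(String[] args) {
    System.out.println(body("Invalid cluster token"));
    // {"error": "Invalid cluster token"}
  }
}
```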
**Code Review: Raft-based HA using Apache Ratis**

This is a well-architected and thoroughly tested addition. The use of Apache Ratis is a sound choice for proven consensus semantics, and the plugin-based design with ServiceLoader is clean. Below are issues found, ordered by severity.

**Critical**

1. Duplicate At the bottom of

2. Project guidelines require updating ATTRIBUTIONS.md when adding dependencies. Apache Ratis 3.2.0 (

**High Priority**

3. Nine AI-generated implementation plan documents are being added to

4. The

5. The format changed from

6. No TLS for Raft gRPC communication. All inter-node Raft traffic is unencrypted. This should be documented clearly, and a follow-up issue should track TLS support.

**Medium Priority**

7. The build section only references

8. If file creation succeeds for some files but throws on a later one, the state machine is left in a partially-applied state. Created files are not rolled back, potentially leaving a replica in an inconsistent schema state.

9. This flag is documented as not intended for production use. Having it in

10.

**Minor / Style**

11. Typo in test class filename:

12.

13.

**Summary**

The critical

The core architecture and test coverage are excellent - the Raft consensus approach is the right foundation for long-term HA correctness.
**Code Review: Raft-based High Availability (Apache Ratis)**

This PR introduces a new

**Critical / Data-Correctness Bugs**

1. Leadership-change race in

```java
if (raftHAServer != null && raftHAServer.isLeader()) {
  // Skip – leader already committed locally
  return;
}
```

Leadership is checked at apply time, not at commit time. Ratis delivers

2. Two-phase commit is not atomic under concurrent leadership change

In

```java
final RaftClientReply reply = raftHAServer.getClient().io().send(...);
if (!reply.isSuccess()) throw ...;
if (leader)
  tx.commit2ndPhase(phase1); // <- leadership may have changed here
```

If the node loses leadership between

3. Non-atomic schema version increment

```java
schemaJson.put("schemaVersion", schemaJson.getLong("schemaVersion") + 1);
```

The version is read and incremented client-side before replication. Concurrent DDL operations can both read the same base version and both submit

4. No bounds checking in

```java
final int pageCount = buf.getInt();
tx.pages = new WALFile.WALPage[pageCount];
for (int i = 0; i < pageCount; i++) {
  final int deltaSize = page.changesTo - page.changesFrom + 1;
  final byte[] content = new byte[deltaSize];
```

A corrupted or malicious log entry with a large

**Security Concerns**

5. Cluster token stored without file permission hardening

```java
tokenFile.toPath().toFile().setReadable(false, false);
tokenFile.toPath().toFile().setReadable(true, true);
```

Or use

6. Cluster-internal auth bypasses authorisation, not just authentication

A successfully validated cluster token causes the handler to resolve the user from

7. Leader-forwarding uses plain HTTP

```java
URI.create("http://" + leaderHttpAddress + "/api/v1/command/" + databaseName)
```

The cluster token and original

**Code Quality**

8.

9. Nine markdown files under

10.

```java
final boolean isRaftPlugin = autoDiscoverRaft && "RaftHAPlugin".equals(name);
```

If the class is renamed this check silently breaks. A marker interface or an annotation would be more robust.

11. Wildcard imports in several files

12. Commented-out code in

```java
// private ServerMonitor serverMonitor;
```

Dead code like this should be removed before merging.

**Performance**

13. Synchronous Raft

```java
final RaftClientReply reply = raftHAServer.getClient().io().send(Message.valueOf(entry));
```

Every transaction commit blocks the calling thread waiting for quorum acknowledgement. For workloads with many small transactions this will be a severe throughput regression compared to the existing pipelined async HA. Consider documenting this trade-off and, if possible, exploring a pipeline or batch-commit option in Ratis (

14. Duplicate static

There is a

**Completeness**

15. Snapshot-based resync is not implemented

**Test Coverage**

16. No test for leadership-change race during commit

The two critical races described in items 1 and 2 above are not tested. The integration tests cover clean failovers, but not leadership changes that occur mid-commit.

17. No test for token file persistence across node restarts

There is no test verifying that after a restart,

**Licensing Compliance**

18. Per

Apache Ratis is Apache 2.0 and has its own

**Summary**

The architectural direction is solid. The critical correctness and security issues should be resolved before this merges to avoid data-loss scenarios in production clusters.
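The missing bounds checks called out above can be sketched as follows; `readBounded`, the 64 MB cap, and the buffer layout are illustrative assumptions, not the PR's actual codec:

```java
import java.nio.ByteBuffer;

public class BoundedDecodeSketch {
  // Illustrative cap per allocation; the PR later settles on 64 MB.
  static final int MAX_ALLOC = 64 * 1024 * 1024;

  // Reject declared sizes that are negative, absurdly large, or larger than
  // the bytes actually remaining in the buffer, before allocating anything.
  static byte[] readBounded(final ByteBuffer buf) {
    final int size = buf.getInt();
    if (size < 0 || size > MAX_ALLOC || size > buf.remaining())
      throw new IllegalStateException("Corrupt log entry: declared size " + size);
    final byte[] content = new byte[size];
    buf.get(content);
    return content;
  }

  public static void main(String[] args) {
    final ByteBuffer ok = ByteBuffer.allocate(7).putInt(3).put(new byte[] { 1, 2, 3 });
    ok.flip();
    assert readBounded(ok).length == 3;

    // A corrupt entry declaring Integer.MAX_VALUE bytes is rejected
    // instead of triggering a huge allocation.
    final ByteBuffer bad = ByteBuffer.allocate(4).putInt(Integer.MAX_VALUE);
    bad.flip();
    try {
      readBounded(bad);
      assert false : "corrupt size must be rejected";
    } catch (final IllegalStateException expected) {
      // expected
    }
  }
}
```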
Reviewed the automated bot feedback — no actionable items: github-advanced-security:
gemini-code-assist:
Thank you for your detailed response, @robfrank. I'd like to clarify a couple of points regarding the feedback provided by
**PR Review: feat: Raft-based High Availability using Apache Ratis**

This is a well-structured and ambitious PR. The TDD approach is followed, separation of concerns is clean, the

That said, here are specific concerns worth addressing before merge:

**Critical / Correctness**

1. The state machine skips applying entries on the leader because "it was already applied via

The correct design is for all nodes (including the leader) to apply every committed log entry. If the leader already applied it locally before the commit, Ratis's idempotency (via log index) and

2.

3. Commit protocol has a failure window on the leader

In

If step 3 fails (e.g., disk error), the followers have committed the entry but the leader's local database is in a half-applied state. On leader restart it will skip re-applying (per issue #1), making the leader permanently inconsistent with its own log.

**Security**

4. Cluster auth token forwarding over plain HTTP

**Missing ATTRIBUTIONS.md update**

5. Apache Ratis dependency not added to ATTRIBUTIONS.md. CLAUDE.md explicitly requires updating

**Design / Robustness**

6. Two-node MAJORITY quorum will cause writes to block whenever one node is down. NONE quorum on a multi-node cluster allows split-brain. Currently the plugin only logs warnings. For cases that are definitively unsafe (e.g., NONE quorum with ≥3 nodes), consider throwing a

7.

8. Leader failover during HTTP command forwarding has no retry

9. Schema entry decoder silently swallows

The WAL section decoder inside

**Minor / Style**

10. Typo in test utility filename

11. Six AI implementation plan files (

12. Over-broad GitHub Actions permissions in

The

13. The handler calls

**Summary**

The architectural approach (Ratis + WAL page diffs as log entries + wrapping
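Related to the forwarding concerns above, a minimal sketch of giving the forwarding client explicit connect and per-request timeouts via `java.net.http`, so a slow or failed leader cannot hang the calling thread forever. Names and durations are illustrative assumptions, not the PR's code:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class ForwardingClientSketch {
  // Bounded connect timeout at the client level (hypothetical default).
  static final HttpClient CLIENT = HttpClient.newBuilder()
      .connectTimeout(Duration.ofSeconds(5))
      .build();

  // Per-request timeout so send() throws HttpTimeoutException instead of
  // blocking indefinitely when the leader stalls mid-response.
  static HttpRequest buildForward(final String leaderHttpAddress, final String databaseName) {
    return HttpRequest.newBuilder()
        .uri(URI.create("http://" + leaderHttpAddress + "/api/v1/command/" + databaseName))
        .timeout(Duration.ofSeconds(30))
        .build();
  }

  public static void main(String[] args) {
    final HttpRequest r = buildForward("node2:2480", "mydb");
    assert r.timeout().isPresent();
    System.out.println(r.uri());
  }
}
```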
**PR Review: Raft-based High Availability using Apache Ratis**

This is an ambitious and well-structured PR. Architecture is sound and test coverage is impressive (unit + integration + e2e with Toxiproxy). Issues found, ordered by severity:

- Critical 1: Snapshot resync incomplete but log purging enabled
- Critical 2: deserializeWalTransaction duplicates WAL internal format
- Critical 3: ignoreErrors=true swallows replay failures
- Concurrency 4: databaseWrapper field is not volatile
- Concurrency 5: rewrapDatabases() is not atomic
- Concurrency 6: Static HTTP_CLIENT in RaftReplicatedDatabase
- Design 7: HA_SERVER_LIST format change breaks existing users
- Design 8: Peer identity tied to server name numeric suffix
- Design 9: maven-shade-plugin has no configuration
- Security 10: /api/v1/cluster authentication
- Security 11: Cluster token in plaintext
- Minor 12: ATTRIBUTIONS.md and NOTICE not updated for Apache Ratis 3.2.0.

The primary blockers are the incomplete snapshot resync with enabled log purging (1), WAL deserialization duplication (2), ignoreErrors=true during replay (3), and the HA_SERVER_LIST breaking change (7). Concurrency issues (4-6) and the committed plan documents (13) should also be fixed before merge.

Generated with Claude Code
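On the plaintext-token concern, a possible owner-only permission hardening for the token file, in line with the earlier reviews' suggestion to prefer `java.nio` over `File.setReadable`. This is a POSIX-only sketch with hypothetical class and method names; non-POSIX file stores throw `UnsupportedOperationException` and need a fallback:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class TokenFilePermissions {
  // Restrict the cluster token file to owner read/write (0600).
  static void restrictToOwner(final Path tokenFile) throws IOException {
    final Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rw-------");
    Files.setPosixFilePermissions(tokenFile, perms);
  }

  public static void main(String[] args) throws IOException {
    final Path tmp = Files.createTempFile("cluster-token", ".txt");
    restrictToOwner(tmp);
    // On a POSIX filesystem this prints rw-------
    System.out.println(PosixFilePermissions.toString(Files.getPosixFilePermissions(tmp)));
    Files.delete(tmp);
  }
}
```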
**Codecov Report**

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##              main    #3731      +/-  ##
==========================================
- Coverage    64.66%   63.78%   -0.88%
==========================================
  Files         1579     1579
  Lines       116503   117137     +634
  Branches     24707    24862     +155
==========================================
- Hits         75335    74720     -615
- Misses       30871    32079    +1208
- Partials     10297    10338      +41
```

☔ View full report in Codecov by Sentry.
**Code Review: Raft-Based High Availability (#3731)**

This is an ambitious and well-structured PR that replaces the custom replication protocol with Apache Ratis (Raft consensus). The architecture is sound and the code is generally clean. Below is my detailed review.

**Summary**

82 files changed, ~15,900 additions. Introduces a new

**Critical Issues**

1. Snapshot-based recovery not implemented (

The comments explicitly acknowledge that

2. Race condition in cluster token initialization (

If multiple nodes start simultaneously, they may both attempt to create/write the cluster token file without any locking or atomic file creation. Use

**High Priority Issues**

3. ThreadLocal buffer leak (

```java
private static final ThreadLocal<List<byte[]>> schemaWalBuffer = ...
private static final ThreadLocal<List<Map<Integer, Integer>>> schemaBucketDeltaBuffer = ...
```

If an exception is thrown between the buffer-clear and buffer-send operations, the buffers are left populated. Wrap buffer usage in try-finally to guarantee cleanup. In a thread-pool environment this causes state to leak into the next operation on the same thread.

4. Blocking HTTP call without timeout (

```java
final HttpResponse<String> response = HTTP_CLIENT.send(builder.build(), HttpResponse.BodyHandlers.ofString());
```

This blocking call has no timeout. If the leader is slow or unreachable, the calling thread hangs indefinitely. Configure

5. Silent error handling in forwarded command responses (

When the leader returns an error response (e.g.

**Medium Priority Issues**

6. Database name not URL-encoded in forwarded HTTP URI

```java
"http://" + leaderHttpAddress + "/api/v1/command/" + getName()
```

While ArcadeDB name validation likely prevents special characters, URL-encode the database name as defense-in-depth (e.g. using

7. Server name parsing is fragile (

Parsing the peer ID by looking for the last hyphen/separator assumes a specific naming convention. A server name like

8. Leader forwarding idempotency gap

When a replica forwards a DDL command to the leader and the leader crashes mid-execution, a retry from the replica could re-execute the same DDL on the new leader, causing unexpected side effects. Consider adding an idempotency/request ID header to forwarded commands.

**Code Style / Minor Issues**

9. Broad

10.

11. Test assertions: per project conventions, prefer

**Security Notes**

**Performance Notes**

**Test Coverage**

Good breadth of testing. Specific gaps worth addressing before production:

**Summary Assessment**

The architecture is solid and the Raft integration is well-done. The code is readable and follows project conventions well. The main concerns are:

Happy to discuss any of these points further.
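The try-finally cleanup recommended in the ThreadLocal item above can be sketched as follows; field and method names are simplified stand-ins, not ArcadeDB's actual code:

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadLocalBufferSketch {
  // Thread-local buffer mirroring the pattern flagged in the review.
  private static final ThreadLocal<List<byte[]>> schemaWalBuffer =
      ThreadLocal.withInitial(ArrayList::new);

  static int bufferedCount() {
    return schemaWalBuffer.get().size();
  }

  static void replicate(final byte[] payload, final boolean failSend) {
    final List<byte[]> buffer = schemaWalBuffer.get();
    buffer.add(payload);
    try {
      if (failSend)
        throw new RuntimeException("simulated send failure");
      // ... send buffer contents to the replication log here ...
    } finally {
      // Guaranteed cleanup: no stale state leaks into the next task
      // that happens to run on this pool thread.
      buffer.clear();
    }
  }

  public static void main(String[] args) {
    try {
      replicate(new byte[] { 1 }, true);
    } catch (final RuntimeException ignored) {
      // expected: simulated failure
    }
    assert bufferedCount() == 0 : "buffer must be empty even after a failure";
  }
}
```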
**Code Review: Raft-based HA using Apache Ratis**

Overall, this is a well-structured and well-tested implementation of a Raft-based HA stack. The use of Apache Ratis (Apache 2.0, battle-tested) is a sound choice, and the module structure correctly follows the project's

That said, there are several issues that should be addressed before merge.

**BLOCKERS**

1.

2. The project requires: When adding a dependency, you MUST update ATTRIBUTIONS.md and, if Apache-licensed with a NOTICE file, incorporate required notices into the main NOTICE file. Apache Ratis is Apache-licensed and ships a NOTICE file. Neither

3. Unresolved TODO in codec — empty bucket deltas

**IMPORTANT ISSUES**

4.

5. One test is disabled:

6. Typo in test class name

**MINOR ISSUES**

7. maven-shade-plugin — verify no classpath conflicts. The

8. Verify the token is never logged at any level in

9. Peer ID resolution relies on the server name having a numeric suffix separated by

10. E2E workflow passes. The e2e-ha.yml workflow runs

**POSITIVE HIGHLIGHTS**
**PR Review: Raft-based High Availability using Apache Ratis**

This is a substantial architectural change. The overall approach is sound and the test coverage is commendable. Here are the issues found.

**CRITICAL / BLOCKERS**

1. ATTRIBUTIONS.md and NOTICE not updated
2. Snapshot-based resync is not wired — incomplete feature
3. Inter-node command forwarding uses plain HTTP with no auth fallback
4. rewrapDatabases() has a non-atomic replace race

**SIGNIFICANT ISSUES**

5. Wildcard imports in RaftReplicatedDatabase
6. Commented-out code left in ArcadeDBServer.java
7. Typo in ClusterDatatbaseChecker.java
8. forwardCommandToLeaderViaRaft — misleading name
9. persistLastApplied uses StandardOpenOption.SYNC on every transaction
10. ClusterMonitor.getReplicaLags() creates a ConcurrentHashMap unnecessarily
11. startLagMonitor / stopLagMonitor — no synchronization on executor lifecycle
12. ArcadeStateMachine.applyTxEntry — leader may double-apply transactions

**MINOR / STYLE**

13. Plan documents committed to the repo
14. findLocalPeerId — poor error message on non-numeric suffix
15. ha-raft not included in package/distribution
16. e2e-ha workflow uses --fail-never

**POSITIVE HIGHLIGHTS**

Reviewed with Claude Code
**Code Review**

This is a substantial and ambitious PR — replacing a custom HA protocol with Apache Ratis (a proven Raft implementation) is the right architectural direction. The module structure is clean, the test coverage is broad, and the e2e container test setup with Toxiproxy is excellent for realistic failure scenarios. That said, there are several blocking issues that need to be addressed before merge.

**🔴 Critical — Blockers**

1. Data integrity: Leadership-change race in
Introduces a dedicated exception subclass of ConcurrentModificationException thrown when a WAL page version is ahead of the database page version by more than 1. This allows the HA state machine to distinguish recoverable replays from dangerous state divergence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nException MajorityCommittedAllFailedException distinguishes ALL-quorum watch failures (where MAJORITY already committed) from plain QuorumNotReachedException (where nothing committed). This allows the commit path to apply local pages rather than rolling back when MAJORITY succeeded. ReplicationException signals replication errors that require snapshot resync, such as WAL version gaps indicating state divergence. Also enhance QuorumNotReachedException to support both single-argument and two-argument constructors (with cause) by delegating to parent NeedRetryException which already provides this support. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…m watch failure When MAJORITY quorum succeeds but the ALL-quorum watch fails, throw MajorityCommittedAllFailedException instead of QuorumNotReachedException. This lets RaftReplicatedDatabase.commit() distinguish between 'committed but not fully replicated' and 'not committed at all'. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…in commit() When ALL-quorum watch fails after MAJORITY commit, apply pages locally via applyLocallyAfterMajorityCommit() to prevent permanent leader divergence. Enhance phase-2 error handling: distinguish ConcurrentModificationException (locking bug) from generic errors, include txId in logs, and force server stop if step-down fails to prevent leader-follower divergence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…emergency stop - applyTxEntry: catch WALVersionGapException separately from ConcurrentModificationException. Version gaps trigger ReplicationException for snapshot resync. Benign replays are logged as WARNING and skipped. - Persisted applied-index: write last-applied index to disk via temp+atomic rename after each apply. On reinitialize, detect snapshot gaps and trigger download with 30s watchdog. - Emergency stop: unexpected errors (NPE, OOM) in applyTransaction trigger server.stop() on a separate thread to prevent silent state divergence. - Lifecycle executor: single-threaded daemon for async tasks that cannot run on Ratis callback threads. - Hot resync detection: log when followers are catching up and when they complete. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nload on leader change - applyLocallyAfterMajorityCommit: spawn emergency stop thread if step-down fails, matching the normal phase-2 failure path. - notifyLeaderChanged: consume needsSnapshotDownload flag as the primary trigger path for snapshot downloads detected during reinitialize(). The 30s watchdog remains as fallback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move ParsedPeerList record and peer address parsing logic (parsePeerList, findLocalPeerId, findLastSeparatorIndex) into a dedicated class. RaftHAServer now delegates peer resolution instead of handling it inline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
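Illustrative only: the kind of parsing such a resolver performs for `host:raftPort:httpPort[:priority]` entries. The record and method names here are hypothetical, and real code must also handle the bracketed IPv6 form addressed later in this PR:

```java
public class PeerEntrySketch {
  record Peer(String host, int raftPort, int httpPort, int priority) {}

  // Note: a plain split(":") breaks on [IPv6]:port notation; production
  // code needs bracket-aware parsing (cf. the HostUtil fix in this PR).
  static Peer parse(final String entry) {
    final String[] parts = entry.split(":");
    if (parts.length < 3)
      throw new IllegalArgumentException("Expected host:raftPort:httpPort[:priority], got: " + entry);
    final int priority = parts.length > 3 ? Integer.parseInt(parts[3]) : 0;
    return new Peer(parts[0], Integer.parseInt(parts[1]), Integer.parseInt(parts[2]), priority);
  }

  public static void main(String[] args) {
    final Peer p = parse("node1:2424:2480:1");
    assert p.host().equals("node1") && p.raftPort() == 2424 && p.httpPort() == 2480 && p.priority() == 1;
  }
}
```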
Move Ratis RaftProperties construction into a dedicated builder class. RaftHAServer delegates property setup instead of handling it inline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move addPeer, removePeer, leaveCluster, and transferLeadership into a dedicated cluster management class. RaftHAServer delegates membership operations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract KubernetesAutoJoin from RaftHAServer. Copy ClusterTokenProvider, RaftClusterStatusExporter from apache-ratis. Remove inline initClusterToken method. Also: remove em dashes and copyright symbols, replace inline fully-qualified names with imports across all modified files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…L checks - Real-path symlink-escape check on parent directories during extraction - File-level symlink rejection at extraction targets - copyWithLimit with 10 GB cap per entry to defend against zip-bombs - SSL context handling for HTTPS snapshot downloads Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…imeout - Root-only authorization (403 for non-root users) - Database name validation rejecting path traversal and control characters - Content-Disposition header sanitization - Write-timeout watchdog that kills stalled connections - Plain-HTTP warning log (once per server start) - Extended SnapshotSwapRecoveryTest for completion marker verification Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…sites Replace direct configuration reads of HA_CLUSTER_TOKEN with injected ClusterTokenProvider. Add ClusterTokenProviderTest. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change fromId() to return null instead of throwing for unknown type codes. Unknown entries are acknowledged as applied but skip database mutation, enabling safe rolling upgrades. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Port test files covering origin-skip optimization, graph ingestion stability, comprehensive HA lifecycle, RaftHAServer management, core replication, peer address validation, and schema file replication. Adapted class names and conventions to match ha-redesign codebase: - Added validatePeerAddress() to RaftPeerAddressResolver - Added VALID_DATABASE_NAME pattern to PostVerifyDatabaseHandler - Extracted createNewFiles() from ArcadeStateMachine.applySchemaEntry() for idempotent file creation (crash-safe schema replay) - Replaced legacy HA_REPLICATION_INCOMING_HOST/PORTS config keys with HA_RAFT_PORT in test cluster setup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add ForkJoinPool isolation documentation in ArcadeStateMachine. Apply linter formatting fixes across modified files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ThreadLocal leak: move read consistency setup inside try block (DatabaseAbstractHandler) - IPv6 bracket parsing: support RFC 3986 [addr]:port notation (HostUtil) - Schema dirty generation: replace boolean with AtomicLong to prevent lost-flag races (LocalSchema) - Vector index replication: skip quantization bytes in applyReplicatedPageUpdate (LSMVectorIndex) - GAV race condition: wrap async status updates in synchronized blocks with notifyAll (GraphAnalyticalView) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After a crash/restart, a node could become the new Raft leader before all pending log entries were replayed through the state machine. The isLeader() guard in applyTxEntry/applySchemaEntry skipped those entries, causing page divergence (e.g. 1 page vs 5 in bucket comparisons). Added recoveryWatermark (persisted applied index at startup) so entries beyond the watermark are applied even on the leader. Page-version guards in TransactionManager.applyChanges() ensure idempotency for any double-applies. Also fixed RaftHAComprehensiveIT: - test07: follower writes succeed via Raft forwarding (not rejected) - test10: accept "Timeout" in quorum error message matching Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Include HTTP port in server list (host:raftPort:httpPort) so httpAddresses map is populated for testPeerReplicaAddresses - Update testWriteOnFollowerRedirects: follower writes succeed via Raft client forwarding, not rejected with ServerIsNotTheLeaderException Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace PendingEntry with CancellablePendingEntry that uses an AtomicInteger state machine (PENDING/DISPATCHED/CANCELLED) to prevent phantom commits when a client times out before the flusher dispatches the entry.
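The AtomicInteger state machine described in this commit can be sketched as follows; class and method names are illustrative, not the PR's exact code:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CancellableEntrySketch {
  static final int PENDING = 0, DISPATCHED = 1, CANCELLED = 2;

  final AtomicInteger state = new AtomicInteger(PENDING);

  // Flusher side: only a PENDING entry may be dispatched to the log.
  boolean tryDispatch() {
    return state.compareAndSet(PENDING, DISPATCHED);
  }

  // Client side: a timed-out client cancels. If the flusher already won
  // the CAS race, cancellation fails and the commit proceeds normally,
  // so an entry can never be both cancelled and committed.
  boolean tryCancel() {
    return state.compareAndSet(PENDING, CANCELLED);
  }

  public static void main(String[] args) {
    final CancellableEntrySketch e = new CancellableEntrySketch();
    assert e.tryCancel();    // client times out first
    assert !e.tryDispatch(); // flusher must now skip this entry: no phantom commit
  }
}
```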
…State() Code review fixes: restrict inner class visibility and remove dead code. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Centralized broker for all Raft entry submission with typed methods per entry type. Owns RaftGroupCommitter internally. Encoding handled via RaftLogEntryCodec so callers never touch the codec directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace direct RaftGroupCommitter.submitAndWait() + RaftLogEntryCodec calls in RaftReplicatedDatabase and RaftHAPlugin with typed broker methods. Encoding is now fully encapsulated inside the broker.
… attributions - Fix TOCTOU race in ArcadeStateMachine: replace isLeader() check with origin-tracking via startTransaction/StateMachineContext to correctly identify entries pre-applied by this node regardless of leadership changes - Add bounds checks in RaftLogEntryCodec to prevent OOM from corrupt or malicious log entries (64 MB max per allocation, 1M max collection size) - Add Apache Ratis entries to ATTRIBUTIONS.md and NOTICE (license compliance) - Harden forwardCommandToLeader: throw SecurityException on null user instead of silently escalating to root - Mask HA_CLUSTER_TOKEN and password-containing config keys in startup dump - Clarify HA_RAFT_PERSIST_STORAGE description for production use - Remove docs/plans and docs/superpowers (AI planning artifacts) - Fix FQNs in ArcadeStateMachine catch blocks - Rename ClusterDatatbaseChecker to ClusterDatabaseChecker (typo) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Redesigns the ArcadeDB HA stack using Apache Ratis (Raft consensus) for leader election, log replication, and cluster management. Replaces the custom replication protocol with proven consensus semantics.
Closes #3730
What's included
New module: `ha-raft`

- `/api/v1/cluster` endpoint for cluster status

New module: `e2e-ha`

End-to-end container tests using TestContainers + Toxiproxy for realistic cluster scenarios.

Key features

- (host:raftPort:httpPort:priority)

Test plan

- (`mvn test -pl ha-raft` and `mvn compile test-compile -pl e2e-ha`)

🤖 Generated with Claude Code