fix(raft): fence read indexes#4
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e8c3318b45
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if let Some(pending) = &self.pending_read_index { | ||
| return Err(ReadIndexError::ReadInProgress { read_index: pending.read_index }); | ||
| } |
There was a problem hiding this comment.
Retry or expire pending read-index rounds
request_read_index returns ReadInProgress whenever pending_read_index is set, but this pending state is only cleared in handle_read_index_response; there is no retry or timeout path on ticks. If the initial read-index RPC round misses quorum (e.g., dropped/ignored responses), the leader can stay permanently stuck in ReadInProgress and reject all future linearizable reads until a role change occurs, even after the network recovers.
Useful? React with 👍 / 👎.
| } | ||
|
|
||
| pending.acks.insert(from); | ||
| if core.effective_config().has_quorum(|voter| pending.acks.contains(&voter)) { |
There was a problem hiding this comment.
Fix quorum drift for in-flight read-index rounds
The round is sent only to voters from the configuration at request time, but quorum is later checked against core.effective_config() at response time. If membership changes while a round is pending (especially joint-consensus transitions), the required voter set can change to include nodes that never received this round’s ReadIndexRequest, making the round impossible to complete and leaving reads blocked behind ReadInProgress.
Useful? React with 👍 / 👎.
e8c3318 to
ef0c9f0
Compare
3a9a6bf to
b2fb2de
Compare
7d03499 to
e5045eb
Compare
e5045eb to
094b8bb
Compare
Pull Request
Summary
What: Implement a fenced read-index round instead of treating current-term commit as sufficient for linearizable reads.
Why: The dissertation requires a read to save the commit index, confirm leadership with a fresh heartbeat quorum after the read starts, then wait for the state machine to apply through that index.
Lines added: +289
Thesis Cross-Reference
Current-term commit prerequisite:
clients/clients.tex#L322-L329
Key phrase: "commit an entry from its term".
Context: a leader first needs to know its commit index is not stale for its term.
Save
readIndex:clients/clients.tex#L331-L334
Key phrase: "saves its current commit index".
Context: the read operates against at least this committed state.
Fresh heartbeat quorum:
clients/clients.tex#L336-L344
Key phrase: "new round of heartbeats".
Context: current-term commit alone is not the read fence; the quorum response must happen after the read starts.
Applied-state requirement:
clients/clients.tex#L346-L351
Key phrase: "advance at least as far".
Context: after quorum confirmation, the state machine must be applied through
readIndexbefore serving the read.Performance motivation:
clients/clients.tex#L355-L359
Key phrase: "avoids synchronous disk writes".
Context: this preserves the efficient read-index path instead of committing reads as log entries.
Test Plan
cargo test -p cloud9-raftTests Added
Notes for Reviewers
The old
can_serve_readsshape encoded only the prerequisite. This patch makes the actual read fence explicit: request, quorum-confirm, apply, then read.