Skip to content

fix(raft): fence read indexes#4

Merged
windsornguyen merged 1 commit into
mainfrom
fix/raft-read-index
May 9, 2026
Merged

fix(raft): fence read indexes#4
windsornguyen merged 1 commit into
mainfrom
fix/raft-read-index

Conversation

@windsornguyen

Copy link
Copy Markdown
Owner

Pull Request

Summary

What: Implement a fenced read-index round instead of treating current-term commit as sufficient for linearizable reads.

Why: The dissertation requires a read to save the commit index, confirm leadership with a fresh heartbeat quorum after the read starts, then wait for the state machine to apply through that index.

Lines added: +289

Thesis Cross-Reference

  • Current-term commit prerequisite:
    clients/clients.tex#L322-L329
    Key phrase: "commit an entry from its term".
    Context: a leader first needs to know its commit index is not stale for its term.

  • Save readIndex:
    clients/clients.tex#L331-L334
    Key phrase: "saves its current commit index".
    Context: the read operates against at least this committed state.

  • Fresh heartbeat quorum:
    clients/clients.tex#L336-L344
    Key phrase: "new round of heartbeats".
    Context: current-term commit alone is not the read fence; the quorum response must happen after the read starts.

  • Applied-state requirement:
    clients/clients.tex#L346-L351
    Key phrase: "advance at least as far".
    Context: after quorum confirmation, the state machine must be applied through readIndex before serving the read.

  • Performance motivation:
    clients/clients.tex#L355-L359
    Key phrase: "avoids synchronous disk writes".
    Context: this preserves the efficient read-index path instead of committing reads as log entries.

Test Plan

  • cargo test -p cloud9-raft

Tests Added

  • Read-index request fails before current-term commit.
  • Single-node read-index confirms immediately once applied.
  • Multi-node read-index waits for a matching heartbeat quorum.
  • Stale read-index responses are ignored.
  • Read readiness requires the state machine to apply through the read index.

Notes for Reviewers

The old can_serve_reads shape encoded only the prerequisite. This patch makes the actual read fence explicit: request, quorum-confirm, apply, then read.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e8c3318b45

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +131 to +133
if let Some(pending) = &self.pending_read_index {
return Err(ReadIndexError::ReadInProgress { read_index: pending.read_index });
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Retry or expire pending read-index rounds

request_read_index returns ReadInProgress whenever pending_read_index is set, but this pending state is only cleared in handle_read_index_response; there is no retry or timeout path on ticks. If the initial read-index RPC round misses quorum (e.g., dropped/ignored responses), the leader can stay permanently stuck in ReadInProgress and reject all future linearizable reads until a role change occurs, even after the network recovers.

Useful? React with 👍 / 👎.

}

pending.acks.insert(from);
if core.effective_config().has_quorum(|voter| pending.acks.contains(&voter)) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Fix quorum drift for in-flight read-index rounds

The round is sent only to voters from the configuration at request time, but quorum is later checked against core.effective_config() at response time. If membership changes while a round is pending (especially joint-consensus transitions), the required voter set can change to include nodes that never received this round’s ReadIndexRequest, making the round impossible to complete and leaving reads blocked behind ReadInProgress.

Useful? React with 👍 / 👎.

@windsornguyen windsornguyen force-pushed the fix/raft-read-index branch from e8c3318 to ef0c9f0 Compare May 9, 2026 03:44
@windsornguyen windsornguyen force-pushed the fix/rename-cloud9-raft branch from 3a9a6bf to b2fb2de Compare May 9, 2026 03:45
@windsornguyen windsornguyen force-pushed the fix/raft-read-index branch 2 times, most recently from 7d03499 to e5045eb Compare May 9, 2026 07:02
@windsornguyen windsornguyen changed the base branch from fix/rename-cloud9-raft to main May 9, 2026 07:05
@windsornguyen windsornguyen force-pushed the fix/raft-read-index branch from e5045eb to 094b8bb Compare May 9, 2026 07:29
@windsornguyen windsornguyen merged commit 450619e into main May 9, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant