cassandra-11565 Prevents commit log segments with invalid mutations from being replayed over and over #1587

Open
wants to merge 5 commits into
base: main

Conversation

chatterjeesubarnadatastax commented Feb 19, 2025

What is the issue

Fixes #11565

What does this PR fix and why was it fixed

...

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits


sbtourist left a comment

I think the patch would be cleaner and simpler if we didn't introduce the InvalidMutationRelocator abstraction here, as relocating mutations is really a CNDB concern, and as a proof of that, your default implementation here does nothing (which is confusing, since the name suggests differently).

I would rather "promote" the concept of "replayed segments handler" to its own abstraction. Practically speaking that means something along the lines of:

  1. Introduce a new (say) CommitLogSegmentHandler abstraction.
  2. Make it "injectable", similarly to how you've done already.
  3. Invoke the new abstraction from AbstractCommitLogSegmentManager#handleReplayedSegment, basically moving its current logic into the new abstraction.

This would leave the C* code functionally the same, but with a new abstraction we can override on the CNDB side.
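To make the three steps concrete, here is a minimal sketch of the shape this could take. It is not the real Cassandra code: `Segment`, `SegmentManager`, `DefaultSegmentHandler` and `setSegmentHandler` are hypothetical stand-ins, and only the names `CommitLogSegmentHandler` and `handleReplayedSegment` come from the suggestion above. The method signature is an assumption as well.

```java
import java.util.Objects;

// Hypothetical sketch; none of these types mirror the real Cassandra classes.
public class SegmentHandlerSketch {

    /** Stand-in for a commit log segment; it only carries a name here. */
    static final class Segment {
        final String name;
        Segment(String name) { this.name = name; }
    }

    /** Step 1: the new abstraction (signature is an assumption). */
    interface CommitLogSegmentHandler {
        void handleReplayedSegment(Segment segment, boolean hasInvalidMutations);
    }

    /** Default handler: placeholder for the logic currently in the manager. */
    static final class DefaultSegmentHandler implements CommitLogSegmentHandler {
        @Override
        public void handleReplayedSegment(Segment segment, boolean hasInvalidMutations) {
            System.out.println("default handling for " + segment.name);
        }
    }

    /** Stand-in for AbstractCommitLogSegmentManager. */
    static final class SegmentManager {
        private CommitLogSegmentHandler handler = new DefaultSegmentHandler();

        /** Step 2: the handler is injectable (hypothetical setter). */
        void setSegmentHandler(CommitLogSegmentHandler handler) {
            this.handler = Objects.requireNonNull(handler);
        }

        /** Step 3: the manager only delegates to the handler. */
        void handleReplayedSegment(Segment segment, boolean hasInvalidMutations) {
            handler.handleReplayedSegment(segment, hasInvalidMutations);
        }
    }

    public static void main(String[] args) {
        SegmentManager manager = new SegmentManager();
        manager.handleReplayedSegment(new Segment("CommitLog-1.log"), false);

        // A downstream project could swap in its own behaviour here.
        manager.setSegmentHandler((segment, invalid) ->
            System.out.println("custom handling for " + segment.name + ", invalid=" + invalid));
        manager.handleReplayedSegment(new Segment("CommitLog-2.log"), true);
    }
}
```

With this shape the manager keeps a single call site, and the default handler preserves today's behaviour unless a different handler is injected.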

WDYT?

chatterjeesubarnadatastax (Author) commented Mar 31, 2025

> I think the patch would be cleaner and simpler if we didn't introduce the InvalidMutationRelocator abstraction here, as relocating mutations is really a CNDB concern, and as a proof of that, your default implementation here does nothing (which is confusing, since the name suggests differently).
>
> I would rather "promote" the concept of "replayed segments handler" to its own abstraction. Practically speaking that means something along the lines of:
>
>   1. Introduce a new (say) CommitLogSegmentHandler abstraction.
>   2. Make it "injectable", similarly to how you've done already.
>   3. Invoke the new abstraction from AbstractCommitLogSegmentManager#handleReplayedSegment, basically moving its current logic into the new abstraction.
>
> This would leave the C* code functionally the same, but with a new abstraction we can override on the CNDB side.
>
> WDYT?

Yeah, I agree that InvalidMutationRelocator is a no-op on the Cassandra side. To clarify your suggestion, do you mean something like the following?

I'll introduce CommitLogSegmentHandler as a separate abstraction on the Cassandra side (overriding it on CNDB), not nested within any existing class. This abstraction will contain the methods needed for moving segments with invalid mutations (e.g. moveSegmentsWithInvalidMutationsToHostSubDirectory), which will be called from AbstractCommitLogSegmentManager#handleReplayedSegment. Lmk if this is what you meant.

If yes, I'm thinking that the logic for these methods on the Cassandra side will still stay as a no-op. Lmk if I'm missing something here.
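Roughly, the split I have in mind looks like the sketch below. Only `CommitLogSegmentHandler` and `moveSegmentsWithInvalidMutationsToHostSubDirectory` are names from this thread; the `Path` parameter, the return value, `NoOpHandler`, `RelocatingHandler` and the sub-directory name are assumptions made for illustration, and the relocating variant is just one way a downstream override could behave.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch; signatures and class names are assumptions.
public class RelocatingHandlerSketch {

    interface CommitLogSegmentHandler {
        /** Returns the location of the segment file after handling. */
        Path moveSegmentsWithInvalidMutationsToHostSubDirectory(Path segmentFile) throws IOException;
    }

    /** Cassandra-side default as described above: relocation is a no-op. */
    static final class NoOpHandler implements CommitLogSegmentHandler {
        @Override
        public Path moveSegmentsWithInvalidMutationsToHostSubDirectory(Path segmentFile) {
            return segmentFile; // the file stays where it is
        }
    }

    /** Illustrative override: moves the file into a sub-directory next to it. */
    static final class RelocatingHandler implements CommitLogSegmentHandler {
        private final String subDirectoryName;

        RelocatingHandler(String subDirectoryName) {
            this.subDirectoryName = subDirectoryName;
        }

        @Override
        public Path moveSegmentsWithInvalidMutationsToHostSubDirectory(Path segmentFile) throws IOException {
            Path targetDir = segmentFile.toAbsolutePath().getParent().resolve(subDirectoryName);
            Files.createDirectories(targetDir);
            Path target = targetDir.resolve(segmentFile.getFileName());
            return Files.move(segmentFile, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commitlog");
        Path segment = Files.createFile(dir.resolve("CommitLog-example.log"));

        Path untouched = new NoOpHandler().moveSegmentsWithInvalidMutationsToHostSubDirectory(segment);
        System.out.println("no-op handler left the segment at " + untouched);

        Path moved = new RelocatingHandler("invalid").moveSegmentsWithInvalidMutationsToHostSubDirectory(segment);
        System.out.println("relocating handler moved the segment to " + moved);
    }
}
```

The caller would not need to know which variant is installed; it only uses the returned path.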

sonarqubecloud bot commented Apr 3, 2025

Quality Gate failed

Failed conditions
71.4% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@cassci-bot

❌ Build ds-cassandra-pr-gate/PR-1587 rejected by Butler


3 new test failure(s) in 4 builds
See build details here


Found 3 new test failures

| Test | Explanation | Branch history | Upstream history |
| --- | --- | --- | --- |
| r.TestReplaceAddress.test_revive_endpoint | regression | 🔴🔴 | 🔵🔵🔵 |
| o.a.c.i.s.d.v.VectorCompressionTest.testAda002 | regression | 🔴🔴🔴🔴 | 🔵🔵🔵🔵🔵🔵🔵 |
| o.a.c.u.b.BinLogTest.testTruncationReleasesLogS... | regression | 🔴🔵🔵🔴 | 🔵🔵🔵🔵🔵🔵🔵 |

Found 8 known test failures
