Skip to content

CASSANDRA-20639 Replica filtering protection can trigger short-read protection too aggressively when the LIMIT is less than the number of results in a partition #4150

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

maedhroz
Copy link
Contributor

@maedhroz maedhroz commented May 9, 2025

No description provided.

@maedhroz maedhroz changed the title very rough, likely not efficient WIP on CASSANDRA-20639 very rough WIP on CASSANDRA-20639 May 12, 2025
@maedhroz maedhroz force-pushed the CASSANDRA-20639-trunk branch 2 times, most recently from 530fd9c to 3058c0f Compare May 13, 2025 03:23
@maedhroz maedhroz changed the title very rough WIP on CASSANDRA-20639 CASSANDRA-20639 Replica filtering protection can trigger short-read protection too aggressively when the LIMIT is less than the number of results in a partition May 13, 2025
@maedhroz maedhroz force-pushed the CASSANDRA-20639-trunk branch from 3058c0f to 5d60103 Compare May 13, 2025 03:34
@@ -24,6 +24,8 @@
public interface CloseableIterator<T> extends Iterator<T>, AutoCloseable
{
public void close();

public default void checkpoint() { }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very awkward, although it pokes the necessary hole in the contract for RFP to be able to do what it needs to do without closing the merged iterator and therefore the merge listener. Any other ideas?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of adding this to the CloseableIterator interface, why don't you make the anonymous MergeListener in ReplicaFilteringProtection a class that you can manipulate or "checkpoint" from queryProtectedPartitions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done, and if preliminary testing is any indication, it's working reasonably well.

// Example: builder.withSeed(42L);
// builder.withSeed(1210048824849624538L).withExamples(1);
// builder.withSeed(-7862021736520593557L).withExamples(1);
// builder.withSeed(602472426346856339L).withExamples(1);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seeds that failed variously during development

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will just revert these before commit...


String select = withKeyspace("SELECT * FROM %s.short_read_no_srp WHERE k = 0 AND a = 1 LIMIT 1");
Iterator<Object[]> initialRows = CLUSTER.coordinator(1).executeWithPaging(select, ConsistencyLevel.ALL, 2);
assertRows(initialRows, row(0, 2, 1));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine a partition with 10,000 rows, 100 of which are matches for an index or filtering query. With a limit of 10, the logic before this patch would have to keep issuing SRP reads to find all 100 matches, which kills performance...almost to a comical extent.

@maedhroz maedhroz force-pushed the CASSANDRA-20639-trunk branch 2 times, most recently from 50b8ae0 to 75fbd7a Compare May 13, 2025 05:33
partition.next();
}
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inlined this to RFP to have a complete view of everything happening in queryProtectedPartitions(). Open to reorganizing again once we're convinced things are correct.

@@ -270,6 +277,11 @@ public void onMergedRangeTombstoneMarkers(RangeTombstoneMarker merged, RangeTomb

@Override
public void close()
{
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should close call checkpoint also?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, as it stands, there's no situation where calling close() on the merge listener would do anything useful, given we're manually checkpointing in queryProtectedPartitions(), which used to close the iterator before this patch (hence the checkpoint logic being in close()). Looking at the other comment below about whether we even want to modify the iterator contracts, let's come back to this...

@maedhroz maedhroz force-pushed the CASSANDRA-20639-trunk branch 3 times, most recently from 170839a to 4c261cc Compare May 22, 2025 20:06
@maedhroz maedhroz requested a review from bdeggleston May 22, 2025 20:07
Copy link
Member

@bdeggleston bdeggleston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 thanks

if (partitions.isEmpty())
{
PartitionIterators.consumeNext(merged);
if (consumeEntirePartitions)
Copy link
Contributor

@jasonstack jasonstack May 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason that non-null currentRowIterator is not closed here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm, currentRowIterator is used when consumeEntirePartitions is false


Long srpRequestsBefore = CLUSTER.get(1).callOnInstance(() -> Keyspace.open(KEYSPACE).getColumnFamilyStore("no_srp_at_limit").metric.shortReadProtectionRequests.getCount());

String select = withKeyspace("SELECT * FROM %s.no_srp_at_limit WHERE k = 0 AND a = 1 LIMIT 1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is RFP triggered here? I assume yes, just in case "node1" has different a value for k=1 and c=3

Copy link
Contributor Author

@maedhroz maedhroz May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Neither node has two complete rows for c = 2 and c = 3 in k = 0.

Copy link
Contributor

@jasonstack jasonstack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@maedhroz maedhroz force-pushed the CASSANDRA-20639-trunk branch 2 times, most recently from 4511780 to fc3d8d2 Compare May 27, 2025 21:59
…t read protection reads

patch by Caleb Rackliffe; reviewed by Blake Eggleston and Zhao Yang for CASSANDRA-20639
@maedhroz maedhroz force-pushed the CASSANDRA-20639-trunk branch from fc3d8d2 to aa6d77b Compare May 28, 2025 20:21
@maedhroz
Copy link
Contributor Author

Committed as fe13726

@maedhroz maedhroz closed this May 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants