Skip to content

trunk: Ensure prepared_statement INSERT timestamp precedes eviction DELETE #4160

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

tolbertam
Copy link
Contributor

@tolbertam tolbertam commented May 14, 2025

Updates SystemKeyspace.writePreparedStatement to accept a timestamp associated with the Prepared creation time. Using this timestamp will ensure that an INSERT into system.prepared_statements will always precede the timestamp for the same Prepared in SystemKeyspace.removePreparedStatement.

This is needed because Caffeine 2.9.2 may evict an entry as soon as it is inserted if the maximum weight of the cache is exceeded causing the DELETE to be executed before the INSERT.

Additionally, any clusters currently experiencing a leaky system.prepared_statements table from this bug may struggle to bounce into a version with this fix as
SystemKeyspace.loadPreparedPreparedStatements currently does not paginate the query to system.prepared_statements, causing heap OOMs. To fix this this patch adds pagination at 5000 rows and aborts loading once the cache size is loaded. This should allow nodes to come up and delete older prepared statements that may no longer be used as the cache fills up (which should happen immediately).

This patch does not address the issue of Caffeine immediately evicting a prepared statement, however it will prevent the
system.prepared_statements table from growing unbounded. For most users this should be adequate, as the cache should only be filled when there are erroneously many unique prepared statements. In such a case we can expect that clients will constantly prepare statements regardless of whether or not the cache is evicting statements.

patch by Andy Tolbert; reviewed by Berenguer Blasi and Caleb Rackliffe for CASSANDRA-19703

@tolbertam tolbertam force-pushed the CASSANDRA-19703-trunk branch from b87d38f to 271606f Compare May 15, 2025 02:21
@tolbertam
Copy link
Contributor Author

tolbertam commented May 15, 2025

something interesting going on with trunk that I want to dig a bit more into; noticed testAsyncPstmtInvalidation takes 2-3x longer to load prepared statements than 5.0 (5-7 seconds vs. ~2.5s), and as result 175/500 test executions failed when i tried them in circle. Very likely it's unrelated to these changes, but I want to understand why thats the case.

testPstmtInvalidation takes ~500-800ms on 5.0, 1.5s on trunk. Will dig into this more.

Updates SystemKeyspace.writePreparedStatement to accept a timestamp
associated with the Prepared creation time. Using this timestamp
will ensure that an INSERT into system.prepared_statements will
always precede the timestamp for the same Prepared in
SystemKeyspace.removePreparedStatement.

This is needed because Caffeine 2.9.2 may evict an entry as soon
as it is inserted if the maximum weight of the cache is exceeded
causing the DELETE to be executed before the INSERT.

Additionally, any clusters currently experiencing a leaky
system.prepared_statements table from this bug may struggle to
bounce into a version with this fix as
SystemKeyspace.loadPreparedPreparedStatements currently does
not paginate the query to system.prepared_statements, causing heap
OOMs.  To fix this this patch adds pagination at 5000 rows and
aborts loading once the cache size is loaded. This should allow
nodes to come up and delete older prepared statements that may no
longer be used as the cache fills up (which should happen immediately).

This patch does not address the issue of Caffeine immediately evicting
a prepared statement, however it will prevent the
system.prepared_statements table from growing unbounded.  For most users
this should be adequate, as the cache should only be filled when there
are erroneously many unique prepared statements. In such a case we can
expect that clients will constantly prepare statements regardless
of whether or not the cache is evicting statements.

patch by Andy Tolbert; reviewed by Berenguer Blasi and Caleb Rackliffe for CASSANDRA-19703
@tolbertam tolbertam force-pushed the CASSANDRA-19703-trunk branch from 271606f to f7126fd Compare May 18, 2025 21:31
@tolbertam
Copy link
Contributor Author

tolbertam commented May 19, 2025

Did some profiling and the performance delta I was seeing was influenced by the use of -XX:-TieredCompilation and -XX:-BackgroundCompilation when running tests in my IDE, which was not the case when running ant testsome -Duse.jdk11=true -Dtest.name=org.apache.cassandra.cql3.PstmtPersistenceTest. As demonstrated in this wall clock profile, compilation is contributing to time spent in the test.

image

I've updated testAsyncPstmtInvalidation c85d156 only load as many statements are necessary to validate the behavior change and it now runs very quickly (~400-800ms) like the other tests do. I'd like to do a little more profiling to understand why things are slower on trunk generally, but I've done enough to rule out these changes as a contributor so I think we can move on.

@tolbertam tolbertam closed this May 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant