Skip to content

Replace timestamp-based cache validation with last_changed check. #8728

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: trunk
Choose a base branch
from

Conversation

spacedmonkey
Copy link
Member

@spacedmonkey spacedmonkey commented Apr 23, 2025

This update introduces a wp_is_valid_last_changed() function to validate cache data against the last_changed value, improving cache reliability and removing timestamp dependency. Cache keys have been simplified by removing last_changed from their structure, and the cache values now store last_changed for consistency. Additionally, functions across core query classes are updated to implement these changes.

Trac ticket: https://core.trac.wordpress.org/ticket/59592


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

This update introduces a `wp_is_valid_last_changed()` function to validate cache data against the `last_changed` value, improving cache reliability and removing timestamp dependency. Cache keys have been simplified by removing `last_changed` from their structure, and the cache values now store `last_changed` for consistency. Additionally, functions across core query classes are updated to implement these changes.
Reverses the condition in the `wp_is_valid_last_changed` check to ensure cached results are correctly validated. This prevents bypassing valid cache data and improves query performance.
Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

Replaced incorrect use of `$args` with `$qv` in multiple locations to align with the expected query variable structure. This prevents potential issues with parsing query parameters and ensures consistency in handling input values.
Replaced `wp_cache_add` with `wp_cache_set` for term query caching to ensure proper cache overwriting. This change mitigates potential issues with stale cache data and aligns with best practices for consistent cache behavior.
Updated multiple query classes to use wp_cache_set() instead of wp_cache_add() for handling cache storage. This ensures that existing cache entries are overridden as intended, improving consistency and preventing potential caching issues.
Introduce PHPUnit tests to validate the behavior of the wp_is_valid_last_changed function. Tests cover various scenarios, including valid usage, missing or mismatched timestamps, and invalid input types. These additions ensure robustness and prevent potential regressions.
@tillkruss
Copy link
Member

LGTM! I'd approve this if I'd had the permissions.

@spacedmonkey spacedmonkey marked this pull request as ready for review April 26, 2025 18:59
Copy link

github-actions bot commented Apr 26, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props spacedmonkey, flixos90, tillkruess.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Copy link
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spacedmonkey @tillkruss I really like where this is going.

I think we should consider implementing two helper functions to read and write cache keys with $last_changed awareness, so that that logic is abstracted and doesn't need to be duplicated in all these places. Doing so would make the logic updates needed in each query class minimal.

Comment on lines -454 to 455
$cache_key = "get_comments:$key:$last_changed";
$cache_key = "get_comments:$key";
$cache_value = wp_cache_get( $cache_key, 'comment-queries' );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth considering: When sites update to this version, all query caches become invalid at once, which for large sites can cause cache stampedes.

I think we should consider for a temporary period (e.g. 1 release cycle) to retain a backward compatible check for the old key - just as a lookup of course, never to create new such caches.

To keep the code simple despite the BC workaround, we could introduce a little private helper function to use throughout the query classes that considers both variants and, in case of the old key resulting in a cache hit, "polyfill" the data. For example a private function like this:

function _wp_get_query_cache_with_last_changed( $cache_key, $cache_group, $last_changed ) {
	$cache_value = wp_cache_get( $cache_key, $cache_group );
	if ( false !== $cache_value ) {
		return $cache_value;
	}

	// The below is for backward compatibility.
	$cache_value = wp_cache_get( "$cache_key:$last_changed", $cache_group );
	if ( is_array( $cache_value ) && ! wp_is_numeric_array( $cache_value ) ) {
		$cache_value['last_changed'] = $last_changed;
	}

	return $cache_value;
}

Then this code here for example could become:

$cache_value = _wp_get_query_cache_with_last_changed( $cache_key, 'comment-queries', $last_changed );

And similarly in other query classes.

$cache_value = wp_cache_get( $cache_key, 'comment-queries' );
if ( false === $cache_value ) {
if ( ! wp_is_valid_last_changed( $cache_value, $last_changed ) ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about using the word "valid". Any last changed time is a "valid" time, but it might mean the cache should be considered expired.

Maybe we should rename the function to something like wp_is_cache_value_last_changed( $cache_value, $last_changed )? That IMO clarifies the purpose better, especially in combination with its parameters.

Comment on lines -1049 to 1052
$cache_keys[ $parent_id ] = "get_comment_child_ids:$parent_id:$key:$last_changed";
$cache_keys[ $parent_id ] = "get_comment_child_ids:$parent_id:$key";
}
$cache_data = wp_cache_get_multiple( array_values( $cache_keys ), 'comment-queries' );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my above note on backward compatibility with existing cache keys. My above proposed private function wouldn't work for this, but we could easily adapt it to work with wp_cache_get_multiple calls, or have a separate version of it.

Comment on lines -1053 to +1056
$parent_child_ids = $cache_data[ $cache_keys[ $parent_id ] ];
if ( false !== $parent_child_ids ) {
$child_ids = array_merge( $child_ids, $parent_child_ids );
$cache_value = $cache_data[ $cache_keys[ $parent_id ] ];
if ( wp_is_valid_last_changed( $cache_value, $last_changed ) ) {
$child_ids = array_merge( $child_ids, $cache_value['child_ids'] );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, since here the shape of the actual cached value changes to (array with child_ids key, instead of just the child IDs list), maybe we should consider having an actual helper function, both to set and get the value while considering the $last_changed value. This might be better than the private helper function for BC, it would also avoid the need for separately calling wp_is_valid_last_changed() everywhere, and therefore it would be able to reduce complexity in all the duplicated code paths further.

Something like this:

  • wp_cache_get_with_last_changed( $cache_key, $cache_group, $last_changed ): This would do something similar to the private function I proposed above, but in addition it would return just the actual cached value (if it has the correct last_changed), with the last_changed value removed.
    • We could simply enforce a shape where the actual value will always be wrapped in an associative array with last_changed key and value key. That way it's predictable for every case and can be easily polyfilled on cache values under the old keys that may still be relevant.
    • The function would return the actual value if there was a cache hit and its last_changed matches the given $last_changed. Otherwise it would return false, just like wp_cache_get().
  • wp_cache_set_with_last_changed( $cache_key, $cache_value, $cache_group, $last_changed ): This would call wp_cache_set(), but first wrap the cache value and $last_changed according to the defined format.

I think if we go for a consistent format, that would make things really straightforward. To clarify the above with code, I mean to always write the cache values like this:

array(
	'value'        => $cache_value,
	'last_changed' => $last_changed,
);

With that format, and the two functions above, all the code would be able to remain the same. We would only need to replace wp_cache_get() and wp_cache_set() calls with the new respective functions.

Sure, if $cache_value is an associative array, you might be able to just add the last_changed key to it, and that would look nicer as a data shape, but at the end of the day, that doesn't matter much, and it would still be unreliable to assume it's okay - we don't know if the caller will iterate over the associative array or expect specific keys.

@tillkruss
Copy link
Member

I think we should consider implementing two helper functions to read and write cache keys with $last_changed awareness, so that that logic is abstracted and doesn't need to be duplicated in all these places.

Excellent idea!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants