Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rely on server-side breadcrumbs for detection and optimization #892

Merged

Conversation

westonruter
Copy link
Member

@westonruter westonruter commented Dec 4, 2023

Summary

See #869.

Eliminate client-side computation of element breadcrumbs in favor of generating them server-side while processing the output buffer. Breadcrumbs are serialized as XPath and added to image elements as a data-ilo-xpath attribute. This data attribute is used by the detection script instead of generating breadcrumbs client-side. Lastly, the detection script is now injected as part of the output buffer processing instead of a separate wp_footer code path.

Relevant technical choices

This improves the reliability of connecting elements from client-side detection with server-side optimization by exclusively computing breadcrumbs XPath on the server. This eliminates the issue encountered in #884 in which JavaScript injecting a new element. Namely it address the following todo from #884 (comment):

Improve handling of breadcrumb generation when there the document structure varies either due to admin bar being present or due to client-side logic. Currently breadcrumbs have to explicitly skip over the admin bar element to ignore it. Similarly, the client-side breadcrumb logic has to skip over .skip-link.screen-reader-text which is injected by wp_enqueue_block_template_skip_link(). This issue of client-side mutated elements is particularly problematic, which may require server-side tagging of elements with their breadcrumbs which would eliminate needing to also do that calculation on the client. But this would mean extraneous data- attributes on all images. When detection isn't needed, the tag processor could strip those attributes out before sending back the buffer. We may need to monitor this to see how much of a problem it is, such as by checking various themes to see what degree they inject elements to mess up breadcrumbs. As another alternative, we may consider relying on IDs and class names for breadcrumbs rather than element indices, although JS here too can cause problems by modifying the class names.

As part of this, XPath strings are used throughout the module rather than comparing breadcrumbs arrays.

Additionally, instead of injecting the detection script at wp_footer and then having some of the same conditions during output buffer processing for optimization, the detection script is now injected during optimization.

Checklist

  • PR has either [Focus] or Infrastructure label.
  • PR has a [Type] label.
  • PR has a milestone or the no milestone label.

@westonruter westonruter added no milestone PRs that do not have a defined milestone for release [Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature labels Dec 4, 2023
@westonruter westonruter marked this pull request as ready for review December 4, 2023 21:27
@felixarntz
Copy link
Member

@westonruter Would you like this PR to be reviewed and merged into the other PR's branch, or did you simply start work on this before and prefer #884 to be merged first?

@westonruter
Copy link
Member Author

@felixarntz I'd like #884 to be merged first so that the PRs don't get excessively large.

…ver-side-breadcrumbs

* add/image-loading-optimization-optimizing:
  Add comment explaining why for loop is used
  Clarify LCP element language in comment
Base automatically changed from add/image-loading-optimization-optimizing to feature/image-loading-optimization December 7, 2023 19:04
@westonruter
Copy link
Member Author

Note: I'm working on the tests for this in a sub-PR: #898.

In the meantime, this PR can be reviewed (and merged).

Comment on lines +242 to +243
* It would be nicer if this were like `/html[1]/body[2]` but in XPath the position() here refers to the
* index of the preceding node set. So it has to rather be written `/*[1][self::html]/*[2][self::body]`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This being said, the breadcrumbs could be updated to keep track of the index of the tag of a given type rather than the index of the tag in general. This would make breadcrumbs more in line with the natural XPath syntax, so instead /*[1][self::html]/*[2][self::body] we could do just /html[1]/body[1].

Nevertheless, what is implemented now is what the :nth-child() pseudo-selector implements in CSS, as opposed to :nth-of-type().

But doing this would take some extra bookkeeping, and I don't see a clear benefit.

I also just came across a Chromium doc that is looking at the same problem space: Identifying element consistently across reloads.

Copy link
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me! It's more reliable now that the source of truth is on the server - and while I don't like XPath, at least its usage here is simple enough (mostly just to compare against it) that I feel it doesn't make the code unapproachable. Also +1 to combining the logic for when to print the detection script to avoid duplicate logic.

Just a few minor points below.

Comment on lines -238 to +235
$breadcrumbs[] = array(
'tag' => $breadcrumb_tag_name,
'index' => $this->open_stack_indices[ $i ],
);
yield array( $breadcrumb_tag_name, $this->open_stack_indices[ $i ] );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for the data type change here? I find an associative array with named keys easier to understand than an indexed array that isn't really for a list of something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly because it makes it more compact and slightly easier to iterate over per below:

foreach ( $this->get_breadcrumbs() as list( $tag_name, $index ) ) {

Originally too I had envisioned that a breadcrumbs could have other keys, like a class name or maybe some attributes to make it more like a CSS selector. But this didn't end up being the case, as we only ever need the tag name and index. So making it a tuple seems to make sense to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize it's a few fewer lines of code, but it's harder to understand. Keys allow clarifying what something is, while now that's less apparent, particularly when looking at where the function is used and only finding list.

I'd prefer to go for clarity over compactness, but not a blocker as long as this function remains purely internal.

Comment on lines -118 to +99
if ( ! $post ) {
return $buffer;
}
$url_metrics = $post ? ilo_parse_stored_url_metrics( $post ) : array();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove the early return here? Isn't that more efficient if no URL metrics post is found?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Because now if no URL metrics are found, we need to proceed and add breadcrumbs to the elements so that we can gather the URL metrics. You can see below too that the detection script is now injected in this function as well.

),
),
),
'pattern' => '^(/\*\[\d+\]\[self::.+?\])+$', // See ILO_HTML_Tag_Processor::get_xpath() for format.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion: Make this a const on ILO_HTML_Tag_Processor since that's where the definition really comes from.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Done in 2df915d

Copy link
Member

@adamsilverstein adamsilverstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splendid

@westonruter westonruter merged commit 7c0eb74 into feature/image-loading-optimization Dec 21, 2023
8 checks passed
@westonruter westonruter deleted the add/server-side-breadcrumbs branch December 21, 2023 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no milestone PRs that do not have a defined milestone for release [Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants