Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experiment: disable per-block stylesheet to measure the impact on TTFB and LCP #48728

Closed
wants to merge 1 commit into from

Conversation

oandregal
Copy link
Member

@oandregal oandregal commented Mar 3, 2023

THIS PR IS AN EXPERIMENT. NOT TO MERGE.

What?

This PR disables per-block stylesheet, making all core block styles be enqueued as part of the block-library stylesheet instead.

Why?

I'd like to see whether it impacts front-end metrics (TTFB and LCP) in our CI environment.

How?

The implementation is a hack, it's not to be shipped.

This disables the should_load_separate_core_block_assets filter.

Testing Instructions

Using TT3, go to the front-end and verify that the block-library stylesheet is present and that there is no per-block stylesheets.

@oandregal oandregal self-assigned this Mar 3, 2023
@oandregal oandregal added [Status] In Progress Tracking issues with work in progress [Type] Experimental Experimental feature or API. labels Mar 3, 2023
@oandregal
Copy link
Member Author

According to the e2e tests, disabling this feature is making tt1 a bit slower (why?) and tt3 (the only that should be affected) faster:

image

@oandregal
Copy link
Member Author

This is comparing the PR vs trunk (at 666574b) using WordPress 6.2-beta4-55456 with WP_DEBUG and WP_SCRIPT disabled:

image

The test was loading the homepage 300 times via this script: seq 300 | xargs -Iz curl -o /dev/null -H 'Cache-Control: no-cache' -s -w "%{time_starttransfer}\n" http://localhost:10035 | xclip -selection clipboard.

I'd like to try the numbers for LCP locally.

@aristath
Copy link
Member

aristath commented Mar 7, 2023

Thank you for running these tests @oandregal, amazing results! I wonder why classic themes become faster 🤔

I just wanted to add some notes here:
TTFB is only one of the metrics that will have an impact on the performance... and admittedly, splitting the stylesheets, inlining them and so on, does require running some additional PHP, which adds a few milliseconds to TTFB.
The end result, however, is transmitting less data to end-users which is more important than TTFB when someone is not using a fast connection, in remote areas, etc. The TTFB increase is mitigated by a decrease in the time it gets to load the actual styles, and it's a huge sustainability win.
Looking at TTFB indeed it's better to load a single stylesheet instead of splitting them & inlining them. But that's just one piece of the puzzle.
What this test highlights however, is that we could probably work on optimizing the PHP that does the splitting/inlining/parsing etc ❤️

@oandregal
Copy link
Member Author

This is comparing the PR vs trunk (at f72ebec) using WordPress 6.2-beta4-55456 with WP_DEBUG and WP_SCRIPT disabled.

I've used a tool by @felixarntz and others with the following configuration: npm run research -- benchmark-web-vitals -u <homepage_to_test> -n 300 -o csv.

image

Notes:

  • By improving TTFB, LCP also improves significantly.
  • TTFB is reported to be 45% of the LCP for classic themes and up to 60% for block themes in this analysis. It's inline with what I've found in other scenarios and the reason we also track LCP-TTFB: we need the ability to track how a change affects the client performance as well.
  • LCP-TTFB: I couldn't extract real data for, so subtracted the medians of LCP and TTFB. This datapoint seems to suggest that client-side performance does degrade from this change.

@oandregal
Copy link
Member Author

oandregal commented Mar 7, 2023

Disclaimer: I won't pursue this approach, this PR is just a testing ground :)

Looking at TTFB indeed it's better to load a single stylesheet instead of splitting them & inlining them. But that's just one piece of the puzzle.

Oh, absolutely! That's why I added LCP-TTFB to our performance tests. Though, I still wanted to test our assumptions based on the data we have now.

What this test highlights however, is that we could probably work on optimizing the PHP that does the splitting/inlining/parsing etc

Yeah, given the amount of time it still takes from LCP, it'd be nice if we can spend some time looking for optimizations to TTFB. However, and specifically to per-block stylesheets, I wonder if there are wins to LCP-TTFB (the time it takes in the client) we can still do.

One thing that's in my mind is the way the per-block styles are generated and whether changes to that would improve performance. At the moment, we add them as an embedded stylesheet inline. Would performance improve if we create an external stylesheet instead? External stylesheets have the benefit that can be cached by browsers, for example. A solution for that would also be useful for the global styles of the site.

@aristath
Copy link
Member

aristath commented Mar 7, 2023

One of the things I think we should pursue is to use the style-engine...
We have the implementation but don't use it as much as we should, which is definitely a lost opportunity as it has great potential to help us improve and minimize the amount of styles we're shipping. We were working on it intensively a few months ago but then priorities shifted for a while and it's currently half-baked in the sense that it's not used where we wanted to use it (yet). It's only used in a couple of places instead of being used for all printed styles in a block theme.

Regarding the use of .css files vs inlining them, there's a very interesting video from the Google Chrome Developers channel on https://www.youtube.com/watch?v=3sMflOp5kiQ, and there was a discussion on #41793 last year where I did my best to add some explanations (though admittedly a bit over-simplified for the sake of clarity)

I still believe we can achieve a lot by inlining small assets, it's just an ongoing project that hasn't finished yet. Maybe we should restart it and try to optimize things?

@oandregal
Copy link
Member Author

Regarding the use of .css files vs inlining them, there's a very interesting video from the Google Chrome Developers channel on https://www.youtube.com/watch?v=3sMflOp5kiQ, and there was a discussion on #41793 last year where I did my best to add some explanations (though admittedly a bit over-simplified for the sake of clarity)

Thanks for sharing! I'll think about this.

Ironically, one of the "examples in the field" they mention as following the inline approach to validate what they are saying is a WordPress site (meaning they're highlighting how global styles work) 😅

I still believe we can achieve a lot by inlining small assets, it's just an ongoing project that hasn't finished yet. Maybe we should restart it and try to optimize things?

What's left here? Is there a tracking issue/conversation I can look at?

Other than optimizing how that code works, the main thing I can think of is the fact that many block styles still live in CSS and are not absorbed by the global styles algorithm, see #45198 This means they are shipped even if they'll be ignored later by the styles coming from theme.json.

@oandregal
Copy link
Member Author

oandregal commented Mar 7, 2023

@felixarntz following the conversations we had at https://core.trac.wordpress.org/ticket/57648 and GoogleChromeLabs/wpp-research#40 (comment)

I've tried 3 approaches for measuring the changes in this PR:

  • existing CI results coming from Gutenberg (16 requests) data
  • using curl to drive requests (300 requests) data
  • benchmark-web-vitals script (300 requests) data

While they show different base data (e.g. TTFB for TT1 theme: 32.7ms vs 74.79ms vs 75.40ms), they essentially report the same variance between themes or changes. One thing I've noticed is that the benchmark-web-vitals script seems to have better "stability". The results for classic themes in this experiment are not expected to change, and the variance reported by each one was:

  • benchmark-web-vitals reported <1ms for TTFB and <2ms for LCP.
  • curl reported up to 6ms for TT1 and <2ms for TT (only TTFB).
  • the CI job reported <1ms for TTFB and <4ms for LCP, though it was only 16 runs compared to 300 of the others

I've been frustrated by how the curl approach report so small numbers (15ms for classic and 30ms for block themes in some configuration of my laptopt), which makes it difficult to find smaller changes that affect performance. benchmark-web-vitals is really nice and if it adds LCP-TTFB it can become my tool of choice for this sort of testing!

@oandregal
Copy link
Member Author

Closing this PR, as I consider the experiment finished: separating assets is still relevant for performance, and our metrics reflect that.

Conversation can continue, though :)

@oandregal oandregal closed this Mar 7, 2023
@oandregal oandregal deleted the try/disable-per-block-embedded-stylesheet branch March 7, 2023 18:01
@gziolo gziolo added the [Type] Performance Related to performance efforts label Mar 8, 2023
@gziolo
Copy link
Member

gziolo commented Mar 8, 2023

Looking at the dev note shared when rolling out per-block (it applies to core blocks) stylesheet, I was wondering whether the part that inlines CSS styles has the biggest impact here:

We then start inlining these assets by going from smallest to largest, until a 20kb limit is reached.

That's the part that is probably the heaviest computation wise and it also impacts the size of the HTML document sent to users. So maybe another test would be useful here but instead of disabling granular serving of styles for core blocks, we could check different thresholds for inlining styles (applies to all blocks):

add_filter( 'styles_inline_size_limit', '__return_zero' );
add_filter( 'styles_inline_size_limit', function() {
    return 50000; // Size in bytes.
});

@oandregal oandregal removed the [Status] In Progress Tracking issues with work in progress label Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Type] Experimental Experimental feature or API. [Type] Performance Related to performance efforts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants