Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metrics: Add a new e2e/metrics folder to start writing some test to collect performance metrics #97780

Open
wants to merge 2 commits into
base: trunk
Choose a base branch
from

Conversation

youknowriad
Copy link
Contributor

The ultimate goal is to have a wp.com or calypso dashboard in https://www.codevitals.run
This needs to be done in three steps (like we did before for woo, Gutenberg and core):

1- Add a script to be able to run and connect performance metrics on a given calypso instance.
2- Add a CI job to be able to run these metrics tests (and compare two different commits on that same run)
3- Log the results of that job result in codevitals.run

This PR attempts to address the first step here. It is based on this similar PR from Woo. woocommerce/woocommerce#42005 but it also uses the exact same setup as the calypso e2e tests (using these as a basis)

The PR is not ready yet because I want to add the "performance reporter" to the playwright config of the "metrics" test runner. The performance reporter class can be copy/pasted from the Woo PR but I'm having trouble finding the right way to add it to the playwright config. Help from someone familiar with calypso-e2e is needed.

For now I added a test that loads a large post on the post editor in Calypso and collects some basic loading metrics (the same one we use in woo and Gutenberg editor tests)

Testing instructions

cd test/e2e && yarn build && yarn metrics

@youknowriad youknowriad self-assigned this Dec 25, 2024
@matticbot matticbot added [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. labels Dec 25, 2024
@youknowriad youknowriad force-pushed the add/collect-calypso-metrics branch from 7480fad to ec89f65 Compare December 25, 2024 14:47
@matticbot
Copy link
Contributor

matticbot commented Dec 25, 2024

Here is how your PR affects size of JS and CSS bundles shipped to the user's browser:

App Entrypoints (~9 bytes added 📈 [gzipped])

name                   parsed_size           gzip_size
entry-subscriptions          -32 B  (-0.0%)       +9 B  (+0.0%)
entry-stepper                -32 B  (-0.0%)       +9 B  (+0.0%)
entry-main                   -32 B  (-0.0%)       +9 B  (+0.0%)
entry-login                  -32 B  (-0.0%)       +9 B  (+0.0%)
entry-domains-landing        -32 B  (-0.0%)       +9 B  (+0.0%)

Common code that is always downloaded and parsed every time the app is loaded, no matter which route is used.

Legend

What is parsed and gzip size?

Parsed Size: Uncompressed size of the JS and CSS files. This much code needs to be parsed and stored in memory.
Gzip Size: Compressed size of the JS and CSS files. This much data needs to be downloaded over network.

Generated by performance advisor bot at iscalypsofastyet.com.

Copy link
Member

@tyxla tyxla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Excited to see us gathering WP.com perf data in codevitals 👏

There are some pretty good opportunities to reuse code from Gutenberg, is there a good reason to avoid that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we could reuse the Gutenberg one instead of declaring one here.

@@ -20,6 +20,7 @@
"clean:output": "rm -rf ./allure-results && rm -rf ./results",
"build": "yarn workspace @automattic/jest-circus-allure-reporter build && yarn workspace @automattic/calypso-e2e build",
"test": "yarn jest",
"metrics": "yarn jest --config jest.metrics.config.js",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to expose this to a script at the root level of the project? yarn run performance-metrics or something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, you know better than me, Personally I just tried to follow how e2e tests already work on calypso.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're fine, we can always add it later if necessary 👍

declare const browser: Browser;
const results: Record< string, number[] > = {};

describe( DataHelper.createSuiteTitle( 'Metrics: Editor' ), function () {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We keep reference those as just "metrics" but I think it might make sense to change them to "performance metrics" everywhere, since metrics is too generic.

/**
* Metrics class to collect performance metrics.
*/
export class Metrics {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a good reason why we're not using @wordpress/e2e-test-utils-playwright here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried quickly, it was not straightforward because of import related issues. I'm not very familiar with the e2e test setup of calypso to be able to solve this properly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a stab at it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was pretty straightforward, but we also needed to add a peer dependency and shuffle some things around, see 47f9e94

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a small risk that the class Metrics doesn't work as expected because of the iframe in some situations though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. If it happens, it might make sense to extend it rather than redefine it though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to start that way and iterate.

* Loads HTML into the editor.
* @param content Content to load.
*/
async loadHtmlContent( content: string ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we could reuse loadBlocksFromHtml() from Gutenberg. Perhaps we can move to the utils package and reuse from there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably can't reuse the one from Gutenberg, Calypso's editor is a bit more complex (iframed within iframe sometimes) so we can't reuse Gutenberg's one until the Calypso's iframe is completely removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add some TODOs here then - ideally we should be able to remove the iframe soon, and then it would be nice to reuse what we can.

@matticbot
Copy link
Contributor

This PR modifies the release build for the following Calypso Apps:

For info about this notification, see here: PCYsg-OT6-p2

  • blaze-dashboard
  • command-palette-wp-admin
  • happy-blocks
  • help-center
  • notifications
  • odyssey-stats
  • whats-new
  • wpcom-block-editor

To test WordPress.com changes, run install-plugin.sh $pluginSlug add/collect-calypso-metrics on your sandbox.

@tyxla tyxla force-pushed the add/collect-calypso-metrics branch from d374a11 to 47f9e94 Compare December 30, 2024 15:01
Copy link
Member

@tyxla tyxla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working as expected 👍

I've pushed a small change to reuse some utils from @wordpress/e2e-test-utils-playwright, let me know if you have any concerns.

A question: have you given any thought about what kind of performance metrics we are aiming to track?

}
}

console.log( results );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we planning to do with the results? Is that good enough for the first pass / first step of what you're planning? Let me know if you need any help with the CI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question in the PR description about this. I would love to use the "performance-reporter" from either Gutenberg or Woo (check the Woo PR) instead of console logging. It's just a way to dump the results for now.

I've spent some time trying to wire the performance reporter but failed, It needs to be added to the playwright config file but in calypso e2e tests, there's not playwright config file really, it's kind of buried in a series of scripts and configs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the entire setup is a bit rudimentary, and while it was migrated to Playwright, it still bears the mark of old technology used before it 😅

This is the config file: https://github.com/Automattic/wp-calypso/blob/2b4ed1bcc9a77c61f4456a70b051200957a84fd7/packages/calypso-e2e/src/jest-playwright-config/index.js - as you can see it uses Jest Playwright as a test runner and was never migrated to @playwright/test.

Now, I haven't tried using a custom reporter there, but you should be able to hook it the same way. You might need to work with a separate config to be able to inherit the original config and specify the custom reporter. Try it out and let me know how it goes.

Alternatively, migrating to @playwright/test could make sense, although I'm not sure what challenges you might bump into and how long it would take. I bet it's outside of your initial scope 😉

/**
* Metrics class to collect performance metrics.
*/
export class Metrics {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was pretty straightforward, but we also needed to add a peer dependency and shuffle some things around, see 47f9e94

@tyxla
Copy link
Member

tyxla commented Dec 30, 2024

Note that we already have some metrics for WP.com, so we might want to ensure we're not building something that exists.

For example, the Calypso RUM data provides some good insights about code vitals and is highly customizable in logstash / kibana:

f71e7d936028e25498cdbcf8d5428ebf-logstash

There are also some other Grafana and Kibana dashboards that might be useful.

cc @sgomes @josephscott @mreishus for any useful links to perf dashboards for WP.com / Calypso they can provide

@youknowriad
Copy link
Contributor Author

A question: have you given any thought about what kind of performance metrics we are aiming to track?

Not much yet but "loading the editor" is a good one for me at least. I have found that the editor is a bit slow to load in .com so at least we can keep track of that over time and see how to improve it.

But overall, I think it really comes down to areas we're working on and slow things we notice.

@tyxla
Copy link
Member

tyxla commented Dec 30, 2024

In addition to the above dashboard I sent, I just recalled we had setup a RUM data dashboard specifically for the iframed Gutenberg instance back in the day. Here it is:

  • Atomic sites: 09c736036ff08469a6fc009e389d889f-logstash
  • Simple sites: ea50f7e49fcf8234a3dc8a89cf826b50-logstash

@youknowriad
Copy link
Contributor Author

youknowriad commented Dec 30, 2024

Thanks for the links @tyxla It is possible that these dashboard are serving other needs, personally I'm having a very hard time understanding them properly. I'm not opinionated in terms of tooling though and would love to use whatever is available. That said what's important to me is:

1- A documented way for developers to add metrics (like the time it takes for the UI to react when opening the inserter, or switching a tab or something...)
2- A way to track that metric over time per commit/deploy to be able to easily find problematic changes
3- A way to compare that metic results of a PR in CI: Example here
4- A simple dashboard showing the important metrics that all developers should pay attention to when merging their PRs/deploying their commits. Something like this (same dashboard for all .com devs)

@tyxla
Copy link
Member

tyxla commented Dec 30, 2024

Thanks for the context @youknowriad, I agree that these serve different purposes - the ones I sent are generic, while you're interested in tracking the performance of specific actions. With that in mind, having WP.com as an instance under codevitals.run makes sense.

The interesting, and potentially tricky part here, is that WP.com is a convoluted piece that depends on multiple repositories, namely these are a few that are worth highlighting:

  • Gutenberg is shipped on it, as well as a few other plugins
  • Jetpack runs on it
  • WP.com, the backend, runs on it
  • Calypso is part of its UI

I can imagine changing any of those might influence the metrics we're interested in. So we'll need either multiple sub-dashboards, or one dashboard that somehow works with multiple repositories. Has this come up as you've been considering introducing WP.com codevitals.run perf metrics?

@youknowriad
Copy link
Contributor Author

I can imagine changing any of those might influence the metrics we're interested in. So we'll need either multiple sub-dashboards, or one dashboard that somehow works with multiple repositories. Has this come up as you've been considering introducing WP.com codevitals.run perf metrics?

Good remark and I agree it makes things trickier :). I think to do that properly we need two things:

1- Being able to run the same metrics tests in multiple repositories.
2- Being able to say which repository triggers the "log" call in codevitals (should be a small addition).

That said, I think for 1 (and unless I'm misunderstanding how .com deployments work), we only need to run these in two repositories: wpcom and wpcalypso. If I'm not wrong, for the two other repositories, the changes need to first be commited to either calypso or .com which would trigger the metrics logging.

@tyxla
Copy link
Member

tyxla commented Dec 30, 2024

I can imagine changing any of those might influence the metrics we're interested in. So we'll need either multiple sub-dashboards, or one dashboard that somehow works with multiple repositories. Has this come up as you've been considering introducing WP.com codevitals.run perf metrics?

Good remark and I agree it makes things trickier :). I think to do that properly we need two things:

1- Being able to run the same metrics tests in multiple repositories. 2- Being able to say which repository triggers the "log" call in codevitals (should be a small addition).

That said, I think for 1 (and unless I'm misunderstanding how .com deployments work), we only need to run these in two repositories: wpcom and wpcalypso. If I'm not wrong, for the two other repositories, the changes need to first be commited to either calypso or .com which would trigger the metrics logging.

That makes sense. Deploying Jetpack, Gutenberg, and any other plugins is done in distinct commits to wpcom, so that will be easy enough to spot. What makes it tricky for the developer is to pinpoint the exact Jetpack or Gutenberg commit that caused it, since the commit will contain all the changes from the last deploy (for Gutenberg, that's an entire GB release for example). So we might need something on top of that to help us pinpoint the differences.

@youknowriad
Copy link
Contributor Author

That makes sense. Deploying Jetpack, Gutenberg, and any other plugins is done in distinct commits to wpcom, so that will be easy enough to spot. What makes it tricky for the developer is to pinpoint the exact Jetpack or Gutenberg commit that caused it, since the commit will contain all the changes from the last deploy (for Gutenberg, that's an entire GB release for example). So we might need something on top of that to help us pinpoint the differences.

Yes, The ideal scenario is that each plugin/repository has a dashboard and run its metrics tests in isolation. In a sense we already have that in Gutenberg. We might want to add a dashboard for Jetpack just like we have for Woo.

That said, we'll still have some things that will be obscured (jetpack mu plugin for instance or things that Gutenberg perf tests might not have catched) but at least we'd reduce that number a lot.

In fact, even for Calypso if we had a way to run some kind of .com container separate from production we could do that but I know it's not the case right now. So what I suggest is to try to start in Calypso and try to see how we can reuse the test runner on .com.

blocks.forEach( ( block: any ) => {
if ( block.name === 'core/image' ) {
delete block.attributes.id;
delete block.attributes.url;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we do this in GB?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was originally copied from Gutenberg, so probably?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants