Refactor performance tests setup #49238

WunderBart · 2023-03-21T16:55:02Z

What?

Refactor the performance testing environment setup to improve developer experience both locally and in CI.

TL;DR:

- Cut setup time by 50% through the following improvements:
  - Utilize the local code for the reference environment and test source.
  - Run branch environment setups in parallel.
- Limit disk space usage by reusing the setup base folder.
- Allow rerunning tests with existing setup to save time during local development.
- Always use the current root branch as the test source to prevent issues with env misalignments.
- Use the latest stable WordPress version by default instead of the bleeding edge for consistent results.

Read on for more details.

⚡︎ Reduce disk space usage to a fixed 3GB

While debugging the script, I started running out of disk space after a couple of reruns. It turned out that the script creates a new test environment (repo clones + wp-env docker images) on each run and never clears it when finished. Currently, the initial perf script run takes ~8GB (6GB for 3 repo clones + 2GB for wp-env docker images), and each subsequent run takes another ~6GB for new clones.

Furthermore, one of the comparison branches is always the branch that we currently have checked out (true for each CI job), but for some reason it's cloned twice from the origin: once for the test environment and once for the test source. This takes up not only extra space but also time.

The refactored approach includes two changes to address the above:

- Use a fixed name for the temporary base folder (wp-gutenberg-performance-tests).
- Use the current branch for the test environment and source.

This setup requires 3GB for a typical job where we compare two branches, e.g., the current one vs. trunk.
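The disk usage figures above can be put into a small back-of-the-envelope sketch. The function names are hypothetical and the numbers are only the rough estimates quoted in this description (about 2GB per clone, about 2GB for the wp-env docker images, and a fixed 3GB for the refactored setup when comparing two branches).

```javascript
// Rough estimates taken from the description above; names are hypothetical.
const CLONE_GB = 2; // ~6GB for 3 repo clones => ~2GB per clone
const DOCKER_IMAGES_GB = 2; // wp-env docker images, downloaded once

// Previous behavior: every run creates 3 fresh clones and never removes them.
function previousDiskUsageGB( runs ) {
	if ( runs <= 0 ) return 0;
	return DOCKER_IMAGES_GB + runs * 3 * CLONE_GB;
}

// Refactored behavior: the fixed base folder is reused, so usage stays flat
// (figure quoted for a typical two-branch comparison).
function refactoredDiskUsageGB( runs ) {
	return runs > 0 ? 3 : 0;
}

console.log( previousDiskUsageGB( 1 ), refactoredDiskUsageGB( 1 ) ); // 8 3
console.log( previousDiskUsageGB( 3 ), refactoredDiskUsageGB( 3 ) ); // 20 3
```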

⚡︎ Use the current root branch code directly

Let's say your current branch is try/some-stuff and you'd like to compare its performance against trunk. To do that, you need to run the following from the gutenberg root:

 bin/plugin/cli.js perf try/some-stuff trunk

Currently, this will not use your local code. Instead, it will clone the branch to be used as the env source. It will also create another copy of that branch to be used as the source of the tests. In reality, we always use the current branch as a reference (as each workflow perf comparison job does), so using what's already available makes more sense. For the CI, it means that it will only need to build the current branch, allowing us to drop the clones and extra (uncached) npm installs. Locally, it means we can test against the currently developed code, which isn't currently possible.
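As a rough illustration of the idea (not the actual code in this PR), the refs passed to `perf` could be resolved as follows. `resolveComparisonRefs` and the `source` labels are hypothetical names; the sketch assumes the local branch is always compared first and built from the working tree, while every other ref is fetched from the origin.

```javascript
// Hypothetical helper: `localRef` is the branch checked out in the repo root,
// `passedRefs` are the positional arguments given to the `perf` command.
function resolveComparisonRefs( localRef, passedRefs ) {
	// Drop the local ref if it was passed explicitly, and remove duplicates.
	const otherRefs = [ ...new Set( passedRefs ) ].filter(
		( ref ) => ref !== localRef
	);
	// The local ref always comes first and is built from the working tree.
	return [
		{ ref: localRef, source: 'local' },
		...otherRefs.map( ( ref ) => ( { ref, source: 'origin' } ) ),
	];
}

// Both calls resolve to the same comparison: try/some-stuff (local) vs. trunk.
console.log( resolveComparisonRefs( 'try/some-stuff', [ 'trunk' ] ) );
console.log( resolveComparisonRefs( 'try/some-stuff', [ 'try/some-stuff', 'trunk' ] ) );
```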

⚡︎ Use the current root branch as a tests source

In addition to the above, the current branch should always be used as the tests source, because the tests should always be coupled with the workflow and the CLI runner script. Currently, the tests branch is either the first one passed for comparison or the one specified by the --tests-branch option. This approach can cause issues, as exemplified by the currently failing release performance tests, where the branch used as the tests source (wp/6.1) is not aligned with the workflow and CLI source (release/15.6).

Because of the above, keeping the --tests-branch option didn't make sense, so I've removed it.

⚡︎ Enable rerunning the tests using the existing environment (locally)

Using the same base folder for each run enables the reuse of the environment. Currently, each run creates the whole environment from scratch, which takes ~8mins on an M1 Mac. The refactored approach allows for the reuse of the existing env: it skips the setup if the builds are up to date and moves straight to the tests. Unfortunately, I don't think this is currently possible in the CI, but it helps a ton when working with the tests locally.
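One way an "is the build up to date?" check could work is sketched below. This is an assumption for illustration only: the marker file name `.built-from` and both helpers are hypothetical and are not taken from this PR. The idea is to record the commit a build was made from and skip the setup when it matches the commit about to be tested.

```javascript
const fs = require( 'node:fs' );
const path = require( 'node:path' );

// Hypothetical marker file that records which commit a build was made from.
const MARKER = '.built-from';

// Returns true only if the env folder was built from exactly `commitSha`.
function isBuildUpToDate( envDir, commitSha ) {
	const markerPath = path.join( envDir, MARKER );
	if ( ! fs.existsSync( markerPath ) ) return false;
	return fs.readFileSync( markerPath, 'utf8' ).trim() === commitSha;
}

// Call after a successful build so that the next run can skip the setup.
function markBuilt( envDir, commitSha ) {
	fs.mkdirSync( envDir, { recursive: true } );
	fs.writeFileSync( path.join( envDir, MARKER ), commitSha + '\n' );
}
```

A runner would call `isBuildUpToDate()` for each branch folder inside the fixed base folder and only fetch and build the ones that return `false`.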

⚡︎ Run each branch setup in parallel

Fetching/building separate branches can be done in parallel, saving some extra time. CI efficiency is too erratic to give solid numbers, but testing locally, the refactored setup takes 50% less time than the current one when building from scratch. For my machine, which is an M1 Mac, it reduced from ~8 to ~4 minutes.
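A minimal sketch of the parallel setup, assuming a hypothetical async `setUpBranch( ref )` function that fetches and builds a single ref (the real setup steps in this PR are more involved):

```javascript
// `setUpBranch` is a hypothetical async function that fetches and builds one ref.
async function setUpBranchesInParallel( refs, setUpBranch ) {
	// Start every setup immediately, then wait for all of them to finish.
	// Promise.all rejects as soon as any single setup fails.
	return Promise.all( refs.map( ( ref ) => setUpBranch( ref ) ) );
}
```

With two branches that take roughly the same time to build, the total wall-clock time is close to that of a single build rather than the sum of both, which is in line with the ~8 to ~4 minute reduction reported above.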

⚡︎ Use the latest stable WordPress version instead of bleeding edge by default

We're currently using the WP bleeding edge (core: "WordPress/WordPress" in wp-env config) for the performance tests environment unless specified otherwise via the --wp-version option. I'm not sure if that's intentional, but it makes more sense to me to run performance tests against the latest production release instead (core: null) for more consistent measurements.
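The difference between the two defaults can be sketched as below. Only the `core` values `"WordPress/WordPress"` and `null` come from this description; the helper names and the mapping of `--wp-version` to a `WordPress/WordPress#<version>` git ref are assumptions made for illustration, and the actual script may resolve versions differently.

```javascript
// Hypothetical helper that picks the wp-env `core` source.
function getCoreSource( wpVersion ) {
	// Assumption: a requested version maps to a tag in WordPress/WordPress.
	if ( wpVersion ) return `WordPress/WordPress#${ wpVersion }`;
	// `null` makes wp-env use the latest production release,
	// instead of the bleeding edge ("WordPress/WordPress").
	return null;
}

// Hypothetical helper that assembles a minimal wp-env config object.
function buildWpEnvConfig( { wpVersion, pluginPath } ) {
	return { core: getCoreSource( wpVersion ), plugins: [ pluginPath ] };
}

console.log( JSON.stringify( buildWpEnvConfig( { pluginPath: './plugin' } ) ) );
```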

Testing Instructions

Checkout this branch (refactor/perf-tests-tmp-folder) and run npm install.
The following scenarios should be working as expected:

Compare the current branch with another branch or commit SHA. There should not be a need to pass the current branch's name, e.g.:
```
bin/plugin/cli.js perf trunk
```
```
bin/plugin/cli.js perf debd225d007f4e441ceec80fbd6fa96653f94737
```

Compare the current branch with two or more other git refs, e.g.:

bin/plugin/cli.js perf trunk debd225d007f4e441ceec80fbd6fa96653f94737

Compare using a specific WP version:

bin/plugin/cli.js perf debd225d007f4e441ceec80fbd6fa96653f94737 --wp-version 6.1

Rerun the above to ensure the setup is being reused:

bin/plugin/cli.js perf debd225d007f4e441ceec80fbd6fa96653f94737 --wp-version 6.1

To verify that other specific comparisons work, e.g., the "Compare performance with current WordPress Core and previous Gutenberg versions" step from the performance.yml workflow, you'll need to do the following.

Assuming we're testing release/15.6 vs. release/15.5 and wp/6.1:
- git checkout release/15.6 as it's the one from which the workflow will be dispatched,
- git merge origin refactor/perf-test-results-path,
- nvm use && npm install,
- bin/plugin/cli.js perf release/15.5 wp/6.1 --wp-version 6.1,
- Confirm when asked to clear the environment,
- Confirm when asked if the plugin should be built from the local branch.
- Go make some coffee,
- See the tests pass, 🤞
- git reset --hard origin/release/15.6.
The script should recognize if the current root branch name is passed explicitly and still use the local code, e.g.:
```
bin/plugin/cli.js perf refactor/perf-test-results-path trunk
```
Finally, compare this branch with trunk and ensure that both show the same numbers. This should be proof that the new logic does not affect the perf measurements.
```
bin/plugin/cli.js perf trunk
```

github-actions · 2023-03-29T16:56:46Z

Size Change: 0 B

Total Size: 1.37 MB


github-actions · 2023-03-29T17:17:59Z

Flaky tests detected in 0affd72.
Some tests passed with failed attempts. The failures may not be related to this commit but are still reported for visibility. See the documentation for more information.

🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/4700855963
📝 Reported issues:

[Flaky Test] should be able to create an hierarchical post without title support #36939 in specs/editor/plugins/custom-post-types.test.js

…lder

This reverts commit 62f5d99.

youknowriad · 2023-04-14T17:01:55Z

.github/workflows/performance.yml

@@ -39,7 +39,7 @@ jobs:

            - name: Compare performance with trunk
              if: github.event_name == 'pull_request'
-              run: ./bin/plugin/cli.js perf $GITHUB_SHA trunk --tests-branch $GITHUB_SHA
+              run: ./bin/plugin/cli.js perf trunk


How do we ensure that we run the performance tests that are on the PR, for instance when we make changes to the tests themselves?

Oh, if I'm reading this right, there's now an assumption that the perf cli.js is included within a Gutenberg plugin, and that we need to compare with the current git hash automatically and also use it as the tests branch.

One of the initial goals of the CLI was to actually be "independent": while it lives in the Gutenberg plugin, it doesn't have any requirements, in the sense that the current branch could be broken or have edits to Gutenberg code and it wouldn't matter, as the tool would always perform a clone. In fact one goal was to make it independent of Gutenberg entirely and be a tool any plugin can use. It seems the current PR goes in the opposite direction.

For the general direction of this PR, see my answer here.

(…) In fact one goal was to make it independent of Gutenberg entirely and be a tool any plugin can use.

I'm struggling to visualize the full picture of how the perf tests CLI runner, as an independent tool, would be utilized by other plugins. Would it not be running from the local gutenberg repo then, but would still be comparing gutenberg branches? It would be really helpful if you could give an example of a plugin development scenario where this tool is used 🙏

Yeah, the initial goal was to be an independent tool, something like so:

    # installing the tool
    npm install -g @wordpress/plugin-cli
    # writing a config file
    touch { repoUrl: "some url", buildCommand: "npm run build", perfCommand: "npm run perf", wpPluginSlug: "someslug" } >> plugin-cli.json
    # running the perf command
    plugin-cli ---config plugin-cli.json some-branch some-other-branch

The goal was also for the CLI to include more than just the perf command. Maintaining the changelog and releasing to SVN/GitHub were also commands in the same tool; I see that they have now been removed in favor of the GitHub workflows. I'm kind of sad that they didn't remain as commands to be used by the workflows, instead of the logic being written within the workflow files like we do now.

Anyway, given that the previous direction is also to move away from the goal of being an independent tool, one more step away won't be that harmful here, so I guess feel free to make that change if you think it's a good idea.

Yeah, the initial goal was to be an independent tool, something like so:

    # installing the tool
    npm install -g @wordpress/plugin-cli
    # writing a config file
    touch { repoUrl: "some url", buildCommand: "npm run build", perfCommand: "npm run perf", wpPluginSlug: "someslug" } >> plugin-cli.json
    # running the perf command
    plugin-cli ---config plugin-cli.json some-branch some-other-branch

Oh, so by independent, you meant from Gutenberg entirely, which would run perf comparisons for any plugin?

If yes, I think we can still assume and leverage the fact that it will always be running from SOME plugin repo root and that the current branch will be a primary subject of the test. The CLI will still need to run the current plugin's dedicated perf tests, right? So it will be easier to develop/debug those tests with the CLI, and also utilize it by the target plugin's CI, where the current branch is always a primary reference.

Now that I think about it, we can replace Gutenberg with a generic plugin in this refactored script, and it would be all that's needed to make this PR not tie it any further to Gutenberg. There's no check that the current repo is actually Gutenberg.

Also, as I mentioned earlier, we can implement a --from-origin option and build everything from GitHub if we don't want to use the local code.

How does that sound?

Sounds good to me 👍

youknowriad · 2023-04-14T17:03:31Z

.github/workflows/performance.yml

                  PREVIOUS_VERSION_BASE_10=$((PLUGIN_VERSION_ARRAY[0] * 10 + PLUGIN_VERSION_ARRAY[1] - 1))
                  PREVIOUS_RELEASE_BRANCH="release/$((PREVIOUS_VERSION_BASE_10 / 10)).$((PREVIOUS_VERSION_BASE_10 % 10))"
                  WP_VERSION=$(awk -F ': ' '/^Tested up to/{print $2}' readme.txt)
                  IFS=. read -ra WP_VERSION_ARRAY <<< "$WP_VERSION"
                  WP_MAJOR="${WP_VERSION_ARRAY[0]}.${WP_VERSION_ARRAY[1]}"
-                  ./bin/plugin/cli.js perf "wp/$WP_MAJOR" "$PREVIOUS_RELEASE_BRANCH" "$CURRENT_RELEASE_BRANCH" --wp-version "$WP_MAJOR"
+                  ./bin/plugin/cli.js perf "$PREVIOUS_RELEASE_BRANCH" "wp/$WP_MAJOR" --wp-version "$WP_MAJOR"


Why did we remove $CURRENT_RELEASE_BRANCH here? I know the release perf tests are broken now but the idea is to have a table that compares the current release, the previous release and the previous WP release.

Are we certain here that $CURRENT_RELEASE_BRANCH is the same thing as $GITHUB_SHA?

$CURRENT_RELEASE_BRANCH is actually release/xx.x, which is also $GITHUB_HEAD_REF, not $GITHUB_SHA. The primary reference for the CI perf in the refactored CLI is $GITHUB_HEAD_REF, so this comparison will still be, e.g., release/15.6 vs. release/15.5 vs. wp/6.1. With the refactored CLI, we only need to list the refs that we want to compare the current one to.
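For reference, the shell arithmetic in the workflow snippet above derives the previous release branch from the current plugin version. Here is a minimal JavaScript sketch of the same calculation; `previousReleaseBranch` is a hypothetical name, and like the shell version it assumes "major.minor" versions where the minor part is a single digit.

```javascript
// Mirrors the shell arithmetic in the workflow snippet; assumes "major.minor"
// versions where the minor part is a single digit (0-9).
function previousReleaseBranch( pluginVersion ) {
	const [ major, minor ] = pluginVersion.split( '.' ).map( Number );
	// Treat the version as a two-part base-10 number and step back by one.
	const previousBase10 = major * 10 + minor - 1;
	const previousMajor = Math.floor( previousBase10 / 10 );
	const previousMinor = previousBase10 % 10;
	return `release/${ previousMajor }.${ previousMinor }`;
}

console.log( previousReleaseBranch( '15.6' ) ); // release/15.5
console.log( previousReleaseBranch( '16.0' ) ); // release/15.9
```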

youknowriad · 2023-04-14T17:09:54Z

bin/plugin/commands/performance.js

-	if ( ! runningInCI ) {
-		await askForConfirmation( 'Ready to go? ' );
+	if ( ! localRef ) {
+		throw new Error( 'Must be running from a Gutenberg repository root.' );


I guess this is my main issue with this PR. My hope for this tool was to be plugin independent; I wanted to transform it into a package that any plugin can use to run perf tests, or any other command.

OK, making it independent should not be a problem, as the logic for fetching and building from the origin is there. After checking your other comment above, I'm curious whether we should actually account for the fact that the script can run from the Gutenberg root. IMO it would make a lot of sense, as it would not break the original purpose to be an independent tool while being especially helpful in these situations:

When developing or debugging locally, as it enables working on both the CLI perf tool and perf tests at the same time.

When running from CI, where the branch we run the workflow from is always a primary reference, enabling us to speed things up by reusing it and its dependencies (restored from the cache ⚡).

youknowriad · 2023-04-18T10:03:56Z

bin/plugin/commands/performance.js

+			'--abbrev-ref': null,
+			HEAD: null,
+		} );
+	}


It is not clear to me why we're doing this within the performance command. We're creating a dependency between the performance command and the GitHub CI runner. What if we do this logic outside and pass the "localRef" as a regular branch to compare like we used to do before?

Rather than creating a CI dependency, I'd argue that it adds CI support. When run in CI, the script identifies the local reference and uses it to build the plugin. This is quicker than our previous method, as it circumvents the need to clone the repository and install node modules.

During the refactor, I was fairly confident that this script would not be utilized outside an environment that would be the subject of the test, whether it's Gutenberg or another plugin. TBH, I'm still struggling to see an alternative use-case. If our intention is to build and measure or compare the performance of a specific feature, we can easily switch to the appropriate branch and use the local reference. From your perspective, it seems we want this script to act as a standalone application capable of fetching and performance-testing any references from the origin. While it wouldn't be difficult to include support for that (old) logic in the proposed implementation, I feel I'm yet to understand a compelling use-case example.
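To make the "CI support vs. CI dependency" point concrete, here is a rough sketch of how a local reference could be resolved. It is not the PR's implementation: `resolveLocalRef` and the injected `readCurrentBranch` callback are hypothetical names, and the use of `GITHUB_HEAD_REF` in CI is taken from the earlier reply in this thread.

```javascript
// Hypothetical sketch: prefer the CI-provided ref, otherwise ask git.
// `readCurrentBranch` is injected so the sketch runs without a git checkout;
// a real reader could shell out to `git rev-parse --abbrev-ref HEAD`.
function resolveLocalRef( env, readCurrentBranch ) {
	if ( env.CI && env.GITHUB_HEAD_REF ) {
		return env.GITHUB_HEAD_REF;
	}
	const branch = readCurrentBranch();
	if ( ! branch ) {
		// Mirrors the guard shown in the diff above.
		throw new Error( 'Must be running from a Gutenberg repository root.' );
	}
	return branch;
}

// Local run: no CI variables, so the current branch is used.
console.log( resolveLocalRef( {}, () => 'try/some-stuff' ) );
// CI run: the head ref wins and git is never consulted.
console.log(
	resolveLocalRef( { CI: 'true', GITHUB_HEAD_REF: 'release/15.6' }, () => 'HEAD' )
);
```

Passing the branch reader in as a parameter is only a convenience for the sketch; it also shows how the CI-specific part could be kept outside the command if that separation is preferred.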

youknowriad · 2023-04-18T10:07:08Z

Personally, I feel this PR is doing too much, to be honest, and I'd prefer if we made each of the changes mentioned in the description separately, so we can discuss the reasoning for the existing choices more clearly.

dmsnell · 2023-04-18T14:03:32Z

I'd like to see these split as separate issues too, but I don't feel as strongly about it. I've been half-way following along and it's been hard to grasp everything going on.

Personally I think it makes sense to get rid of --tests-branch too because I don't think it ever solved our problems on breaking changes so much as delayed them to a different part of the process. I guess I could be wrong if I've missed that people created special third branches just to run tests in a way that works across the breaking change, but I don't think I've missed that. I think it's just a sad "we have to wait to compare these until we have two working versions." The times where --tests-branch solves the problem are times I think when we could do what you suggested: pick another branch as the first one.

The thing about reusing the branch for the test setup and also for a branch under test was there before but I made a mistake in not having it fully build all the packages and therefore the tests weren't building the Gutenberg plugin, injecting it into the WordPress env, and testing against the actual code. That should be able to be brought back with the appropriate fix if we want, and I think that will account for as much setup time as your changes. (In #45284 I stopped building parts of the app for the tests-branch branch because it didn't need them, but then in #45737 I started reusing the tests-branch if it was the same and already built - side note @youknowriad I now understand how this thing happened, because it looks like I had both of those PRs merge on the same day - they were a non-conflict merge conflict I didn't notice, where each was fine on its own but not together).

I've noted that in trunk the Perf Tests are still not in the hot path for CI builds to finish. I applaud this work to continue to optimize them and encourage it, but I'm also less bothered now by certain inefficiencies (particularly those whose fixes increase the complexity of the scripts), because improving them doesn't have the direct impact on the workflow that it did when these were running upwards of 1h 20m.

kevin940726 · 2023-04-27T09:06:41Z

Thank you for all the hard work! This is looking very promising to me! I agree it's a bit hard to review, but I also don't think it necessarily has to be split into smaller PRs.

While reading the description of this PR, I've been thinking of potential ways to further improve the performance. This is just me thinking out loud and not relevant to this PR, but I just want to jot it down somewhere in case I forget 👇

Note:
It seems to me that a lot of the time is spent checking out and building the old branch. Most of the time we're comparing a PR branch (current branch) and a commit on trunk. What if we could memoize the result of the commits on trunk? For releases, we can just use wp-env to download a specific version of Gutenberg via the command line; we don't actually have to check out that git commit and rebuild the plugin every time. For the latest commit on trunk, I'm thinking maybe we could leverage the GitHub releases API or artifacts API to automatically publish the latest commit on trunk somewhere. Then we just have to download the artifact and skip all the build steps. We probably still want to keep the "checking out and building" steps in the code for comparing arbitrary commits, but for most jobs on CI we could potentially save a lot of time.

WunderBart · 2023-05-12T11:15:36Z

It seems to me that a lot of the time is spent checking out and building the old branch. Most of the time we're comparing a PR branch (current branch) and a commit on trunk. What if we could memoize the result of the commits on trunk? For releases, we can just use wp-env to download a specific version of Gutenberg via the command line; we don't actually have to check out that git commit and rebuild the plugin every time. For the latest commit on trunk, I'm thinking maybe we could leverage the GitHub releases API or artifacts API to automatically publish the latest commit on trunk somewhere. Then we just have to download the artifact and skip all the build steps. We probably still want to keep the "checking out and building" steps in the code for comparing arbitrary commits, but for most jobs on CI we could potentially save a lot of time.

@kevin940726, I was thinking the same thing. 🙌 The good part is that each trunk commit already has a plugin ZIP built and uploaded. All we'd need would be the URL to that artifact (which btw I was hoping to achieve within this PR 🙈), but the artifacts API doesn't seem to support searching by ref/sha/etc (yet?). I'd be happy to explore that in a follow-up PR, as it should be a big jump in speed and should also streamline the setup a lot.

kevin940726 · 2023-05-13T07:04:49Z

The good part is that each trunk commit already has a plugin ZIP built and uploaded.

Oh, yeah! For some reason I totally forgot that it's a thing already 😅.

but the artifacts API doesn't seem to support searching by ref/sha/etc

I think this is possible with a few more extra steps today!

First get the list of the workflow runs of a commit on trunk:

curl "https://api.github.com/repos/WordPress/gutenberg/actions/runs?branch=trunk&head_sha=<COMMIT_SHA>" > runs.json

Loop through the result to find the Build Gutenberg Plugin Zip workflow run and get the artifacts_url field:

jq -r '.workflow_runs[] | select(.name=="Build Gutenberg Plugin Zip").artifacts_url' runs.json > artifact_url

Fetch the url and grab the downloadable url to download the plugin zip:

curl -s "$(cat artifact_url)" | jq -r '.artifacts[] | select(.name=="gutenberg-plugin").archive_download_url' > download_url

Download the zip! 🎉

The gotcha here is that the artifacts have retention limits and will be deleted after a certain amount of time (normally 30 days). To enable comparing older commits we might have to upload them to a persistent storage, like the release API?
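The two jq filters above could be expressed in JavaScript roughly as follows, which might be handy if the lookup ever moves into the perf script. This is only a sketch: the function names are hypothetical and the sample responses are made up, trimmed to the fields the filters use (`workflow_runs[].name`, `artifacts_url`, `artifacts[].name`, `archive_download_url`). Unlike jq, which prints every match, `find` returns only the first one.

```javascript
// Pick the artifacts URL of the "Build Gutenberg Plugin Zip" run, if any.
function findArtifactsUrl( runsResponse, workflowName = 'Build Gutenberg Plugin Zip' ) {
	const run = runsResponse.workflow_runs.find( ( { name } ) => name === workflowName );
	return run ? run.artifacts_url : null;
}

// Pick the download URL of the "gutenberg-plugin" artifact, if any.
function findDownloadUrl( artifactsResponse, artifactName = 'gutenberg-plugin' ) {
	const artifact = artifactsResponse.artifacts.find( ( { name } ) => name === artifactName );
	return artifact ? artifact.archive_download_url : null;
}

// Made-up sample payloads for illustration only.
const sampleRuns = {
	workflow_runs: [
		{ name: 'Unit Tests', artifacts_url: 'https://example.test/runs/1/artifacts' },
		{ name: 'Build Gutenberg Plugin Zip', artifacts_url: 'https://example.test/runs/2/artifacts' },
	],
};
const sampleArtifacts = {
	artifacts: [
		{ name: 'gutenberg-plugin', archive_download_url: 'https://example.test/artifacts/9/zip' },
	],
};

console.log( findArtifactsUrl( sampleRuns ) );
console.log( findDownloadUrl( sampleArtifacts ) );
```

Returning `null` when nothing matches leaves room for the caller to fall back to building from the ref, e.g. when the artifact has expired.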

WunderBart · 2023-05-15T13:13:19Z

I'm wondering if there's anything I should do at this point to wrap up this PR. I understand the suggestion for splitting this into smaller PRs, but I'm wondering if it's necessary. Maybe we'll get a better picture after summarizing the current feedback/discussion.

First of all, there are 3 functionality changes:

Switch from fetching primary ref from origin to local,
Remove the --tests-branch option and use the primary ref as the source of the tests,
Use the latest WordPress stable instead of bleeding edge.

...and 2 improvements:

Use a fixed tmp folder so we can reuse the existing plugin clones,
Parallelize ref builds for faster setup,
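
As an illustration of the parallelization item only, the idea can be sketched with `Promise.all`. This is not the PR's code: `setUpRefsInParallel` and `fakeSetUpRef` are hypothetical, and the fake setup uses a timer in place of the real clone, install, and build work.

```javascript
// Hypothetical stand-in for "clone, install, and build one ref".
function fakeSetUpRef( ref, log ) {
	log.push( `start ${ ref }` );
	return new Promise( ( resolve ) =>
		setTimeout( () => {
			log.push( `done ${ ref }` );
			resolve( { ref, built: true } );
		}, 10 )
	);
}

// Kick off every setup immediately and wait for all of them to finish.
// Results come back in the same order as `refs`, whichever finishes first.
function setUpRefsInParallel( refs, setUpRef ) {
	return Promise.all( refs.map( ( ref ) => setUpRef( ref ) ) );
}

const demoLog = [];
setUpRefsInParallel( [ 'trunk', 'release/15.5' ], ( ref ) =>
	fakeSetUpRef( ref, demoLog )
).then( ( environments ) => {
	console.log( environments.map( ( { ref } ) => ref ) );
	console.log( demoLog );
} );
```

One caveat with this shape is that `Promise.all` rejects as soon as any setup fails, while the other setups keep running in the background.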

As far as the functionality changes go, we seem to have a partial consensus on removing the --tests-branch option (see @dmsnell's comment). I'd love to hear from @youknowriad whether he agrees with the current reasoning or not.

For the part where we use the local code, I've made my argument here and, to add to that, I think it doesn't make sense to fetch the primary reference from the origin until we decide to extract the script to a dedicated package. Currently, the primary use of the script is for the CI.

Regarding the use of WP stable instead of bleeding edge, I think it makes the most sense as it increases the env consistency, which should work towards measurement stability. Is there any reason we should be testing against the bleeding edge?

The improvements part is what speeds up the setup the most, especially the parallel builds, as the fixed tmp folder is relevant only for local development. I believe it's the main thing that makes this PR a bit hard to follow.

I'm unsure if the above makes it easier for you to vote for or against splitting this PR. Anyhow, please let me know what you think, and I'll follow the vote. 😄

Note: If we decide to split, I'd suggest doing it as follows:

Use the latest stable WP instead of bleeding edge,
Remove the --tests-branch option,
Use a fixed tmp folder to enable the reuse of the plugin clones + parallelize ref builds,
Use local ref as the primary ref instead of fetching from origin.

/cc @youknowriad @dmsnell @kevin940726

WunderBart · 2023-05-15T13:23:09Z

I think this is possible with a few more extra steps today!

Nice, @kevin940726! 🙌 I'll definitely try that out.

The gotcha here is that the artifacts have retention limits and will be deleted after a certain amount of time (normally 30 days). To enable comparing older commits we might have to upload them to a persistent storage, like the release API?

I guess for this case we could just leave the part where we build the plugin from the ref. I don't think we'll often be testing against commits that are >1 month old, will we?

I can see another gotcha, where the ZIP has not been built yet - kinda the other way around. I think we'd either need a waiting mechanism, build from ref (as above), or just create the logic in a way that would fetch the last available ZIP for trunk.

youknowriad · 2023-05-15T13:49:46Z

For me this PR comes down to "smart/implicit vs explicit behavior". I personally prefer explicit behavior: pass the branches to compare, pass the branch to use for the tests... over assumptions based on the current branch. I agree that it's slower, but I believe it to be more flexible, and it also opens the door for a more generic tool in the future.

That said, I don't want to be the blocker here if you all think otherwise.

dmsnell · 2023-05-15T18:13:49Z

Currently, the primary use of the script is for the CI.

I believe that for every big release someone runs this script locally to compare against previous releases.

Regarding the use of WP stable instead of bleeding edge, I think it makes the most sense as it increases the env consistency, which should work towards measurement stability.

Having run into this many times, I think it'd be good to be explicit about the WP version we're using in the tests. Too many times the tests failed for no explicable reason, only for us to discover later that something had changed in the WP copy in the env, which is never called out in the test setup.

Use the latest stable WP instead of bleeding edge,

From me 👍

Remove the --tests-branch option,

Also from me 👍 but I don't have strong opinions. I don't think this is practically more explicit than saying "use the first given branch" and I think it never solved the problem it was introduced to solve; that problem still exists.

Use a fixed tmp folder to enable the reuse of the plugin clones + parallelize ref builds,

This ties in with eliminating --tests-branch but doesn't need to; bringing back my work which reused or cached the tests branch should cut out a few minutes and have a similar impact. That doesn't require adding more into anything; just involves doing less. The only caveat is that I added that cache simultaneously while trimming it down to only build the tooling. We need to continue to have it generate the full build when we cache it.

Use local ref as the primary ref instead of fetching from origin.

In several experiments I used an ENV with a default value. There is a challenge I experienced with this kind of work and that is the need to reference branches that may not be present when relying only on already-fetched git references.

Otherwise my thoughts on this haven't changed. I appreciate all the work being done here, but I do think it's heavy to say the least.

First of all, there are 3 functionality changes:

Switch from fetching primary ref from origin to local,
Remove the --tests-branch option and use the primary ref as the source of the tests,
Use the latest WordPress stable instead of bleeding edge.
...and 2 improvements:

Use a fixed tmp folder so we can reuse the existing plugin clones,
Parallelize ref builds for faster setup,

If one of these fails we have to revert all of them. It's hard to leave a good commit message for these too, let alone review them.

I won't stand in the way if you really believe this is best despite that.

kevin940726 · 2023-05-16T02:48:00Z

I'd vote for giving it a try! Worst case we'd have to revert, no big deal! We can monitor it for a few commits after it gets merged so that we don't mess up the metrics too much.

I don't think we'll often be testing against commits that are >1 month old, will we?

IIRC, that's the case for a step on CI? I don't remember the exact details and I'm too lazy to check 😝 . Either way I think this is the easier part of the issue, we just need a persistent storage, which is relatively cheap or even free.

can see another gotcha, where the ZIP has not been built yet

Yeah, I think we can wait on CI, but for local compares we might have to fall back to building it from scratch. Or, if the exact commit doesn't matter, we can just pick the latest built output, like you suggested!

youknowriad · 2023-05-16T15:08:41Z

Now, that I think about it, let's make sure that we don't break the "manual" performance CI job where we have the ability to choose which branches to compare manually and choose which branch to use for the tests themselves.

A job you can run from here https://github.com/WordPress/gutenberg/actions/workflows/performance.yml by clicking on "run workflow".

WunderBart · 2023-05-17T10:56:04Z

Now, that I think about it, let's make sure that we don't break the "manual" performance CI job where we have the ability to choose which branches to compare manually and choose which branch to use for the tests themselves.

A job you can run from here https://github.com/WordPress/gutenberg/actions/workflows/performance.yml by clicking on "run workflow".

edit: Custom comparison tests and runner are connected. See #49238 (comment).

~~IIUC, the manual dispatch (workflow_dispatch) doesn't work like that currently, meaning the source of the tests cannot be picked independently. Here's an example run:~~

~~The Use workflow from defines the source of the workflow file and the performance.js command script.~~
~~The branches or commits to compare requires at least 2 branches to be passed, the first of which is used as the source of the tests.~~
~~The base WP version input isn't important in this context. It does what it's supposed to do.~~

The disparity between the performance.js source and the tests' source is actually what's currently causing the release perf tests to fail. The error is that the tests are saving result artifacts under a different name than the perf script is expecting them to be, so it cannot find them. For example, we can look at the most recent release perf job. The command is:

./bin/plugin/cli.js perf "wp/$WP_MAJOR" "$PREVIOUS_RELEASE_BRANCH" "$CURRENT_RELEASE_BRANCH" --wp-version "$WP_MAJOR"

...whose values for that run are:

./bin/plugin/cli.js perf wp/6.2 release/15.7 release/15.8 --wp-version 6.2

The source of the workflow and the performance.js script is the branch that the job is dispatched from, which is actually a tag ref tags/v15.8.0-rc.1.
The source of the tests is the first branch to be compared, which is wp/6.2, hence the disparity.

Now that I think about it, the immediate fix would be adding the --tests-branch option or simply reversing the order of the comparison branches, which would establish parity between the runner and the tests:

./bin/plugin/cli.js perf wp/6.2 release/15.7 release/15.8 --tests-branch release/15.8 --wp-version 6.2

vs.

./bin/plugin/cli.js perf release/15.8 release/15.7 wp/6.2 --wp-version 6.2

I'm not 100% sure, though, because I don't know if tags/v15.8.0-rc.1 is exactly the same as release/15.8. Are we sure those are the same references, or is it possible that release/15.8 can have some extra commits on top, @youknowriad? Why aren't we just passing the triggering ref tags/v15.8.0-rc.1?

I think this is a good example of why it would be safer to use the job triggering ref as the source for both the perf runner script and the tests. I think that would actually be a good first step in arriving at what I'm proposing with this PR. I'll start drafting.

youknowriad · 2023-05-17T11:05:51Z

I'm not 100% sure, though, because I don't know if tags/v15.8.0-rc.1 is exactly the same as release/15.8. Are we sure those are the same references, or is it possible that release/15.8 can have some extra commits on top, @youknowriad? Why aren't we just passing the triggering ref tags/v15.8.0-rc.1?

There could be some differences but at the moment the release job runs, there's none (or almost none).

The workflow dispatch job uses GITHUB_SHA for the perf tests branch (use workflow from select). I think that's a good behavior for the manual test. But it doesn't mean that we want to compare with that branch too.

WunderBart · 2023-05-17T12:15:37Z

The workflow dispatch job uses GITHUB_SHA for the perf tests branch (use workflow from select).

Ah, apologies for the confusion - I was looking at the workflow from my own branch where I already removed the --tests-branch option. 🤦

I think that's a good behavior for the manual test. But it doesn't mean that we want to compare with that branch too.

That's fair, though I'm still failing to see why we wouldn't just use the branch that we're triggering from (cwd) as a ready-to-use source of the tests. By separating the runner from the tests, we're adding an extra setup step and making the whole thing more fragile.

youknowriad · 2023-05-17T12:38:51Z

Imagine we add a new metric today and we want to check whether the React 18 upgrade had an impact on that metric. We can run the test manually by using "trunk" as the branch for the tests and compare the commit that did the React 18 upgrade with the previous one.

WunderBart · 2023-05-17T13:56:39Z

Imagine we add a new metric today and we want to check whether the React 18 upgrade had an impact on that metric. We can run the test manually by using "trunk" as the branch for the tests and compare the commit that did the React 18 upgrade with the previous one.

I agree on separating tests from comparison branches; that should remain as is. I asked about separating the test runner (performance.js) from the tests branch (--tests-branch) - sorry if that wasn't clear enough. The runner source is always the triggering branch, and the tests branch is either the first branch we use for comparison or the one defined by --tests-branch. This way, it's easy for the two to become disconnected, which is why I wanted to remove the --tests-branch option and just point to the tests we're currently checked out at (triggering branch, e.g. trunk).
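
The current rule described above can be written down in a few lines, which may help when reasoning about the release job failure. `resolveTestsBranch` and the `testsBranch` option key are hypothetical names used for illustration; only the rule itself comes from this thread.

```javascript
// Hypothetical sketch of the existing behavior: the tests come from
// --tests-branch when given, otherwise from the first compared branch.
function resolveTestsBranch( branches, options = {} ) {
	if ( branches.length === 0 ) {
		throw new Error( 'At least one branch is required.' );
	}
	return options.testsBranch ?? branches[ 0 ];
}

const releaseJobBranches = [ 'wp/6.2', 'release/15.7', 'release/15.8' ];

// Without the option, the tests come from wp/6.2 while the runner comes from
// the triggering ref, hence the disparity discussed earlier.
console.log( resolveTestsBranch( releaseJobBranches ) );
// With the option, runner and tests can be aligned explicitly.
console.log( resolveTestsBranch( releaseJobBranches, { testsBranch: 'release/15.8' } ) );
```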

youknowriad · 2023-05-17T14:25:36Z

I asked about separating the test runner (performance.js) from the tests branch (--tests-branch) - sorry if that wasn't clear enough.

At the moment at least, the metrics are present in both the tests and the runner. In other words, there's a strong relationship between these two, so I wouldn't mind if we use the same thing for both.

But I think it's just bad design that they are tied. Ideally, perf tests define the metrics, and the runner is just orchestrating stuff without any relation to the tests themselves. But again, it's not the case today, so 🤷

WunderBart · 2023-05-17T14:53:27Z

But, I think it's just a bad design that they are tied, ideally perf tests define the metrics, and the runner is just orchestrating stuff without any relation to the tests themselves. But again It's not the case today, so 🤷

Yeah they're currently very much connected - the runner is responsible for computing and rendering the results that are produced by the tests. In an ideal scenario, which part of the process do you think should take responsibility for handling the results? How could we untie it from the runner? Maybe a separate script? We'd still be making that connection though, only with a different script...

youknowriad · 2023-05-17T15:26:25Z

Yeah they're currently very much connected - the runner is responsible for computing and rendering the results that are produced by the tests. In an ideal scenario, which part of the process do you think should take responsibility for handling the results? How could we untie it from the runner? Maybe a separate script? We'd still be making that connection though, only with a different script...

I think the runner doesn't need to know about the metrics, it can just use the "keys" of the JSON output by the performance tests.
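
A minimal sketch of that idea, assuming each test writes a JSON object that maps a metric key to an array of numeric samples (that shape, the metric names, and the function names below are assumptions for illustration, not the actual results format):

```javascript
// Median of a non-empty list of numbers.
function median( values ) {
	const sorted = [ ...values ].sort( ( a, b ) => a - b );
	const middle = Math.floor( sorted.length / 2 );
	return sorted.length % 2
		? sorted[ middle ]
		: ( sorted[ middle - 1 ] + sorted[ middle ] ) / 2;
}

// The runner never names a metric; it just iterates over whatever keys the
// tests produced and summarizes each one.
function summarizeResults( results ) {
	return Object.fromEntries(
		Object.entries( results ).map( ( [ key, samples ] ) => [ key, median( samples ) ] )
	);
}

// Made-up results, standing in for the JSON written by a perf test.
const sampleResults = { type: [ 30, 10, 20 ], firstPaint: [ 100, 200 ] };
console.log( summarizeResults( sampleResults ) );
```

With this shape, adding a metric in the tests would not require a matching change in the runner.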

WunderBart · 2023-07-12T14:39:01Z

Yeah they're currently very much connected - the runner is responsible for computing and rendering the results that are produced by the tests. In an ideal scenario, which part of the process do you think should take responsibility for handling the results? How could we untie it from the runner? Maybe a separate script? We'd still be making that connection though, only with a different script...

I think the runner doesn't need to know about the metrics, it can just use the "keys" of the JSON output by the performance tests.

This has been addressed within #52022

Base automatically changed from refactor/perf-test-results-path to trunk March 28, 2023 14:00

WunderBart added 3 commits March 29, 2023 15:29

Parallelize perf test branch builds

8326525

Try enabling colors in CI

0a2195b

Try using current branch as env and tests source

79ea0b0

WunderBart force-pushed the refactor/perf-tests-tmp-folder branch from d1fef2f to 79ea0b0 Compare March 29, 2023 13:31

Streamling & improve local running + clean up

f6464ea

WunderBart added 7 commits March 30, 2023 11:37

Fix refs grouping

f66c711

Fix log indents

18d7449

Undo tmp change

b9da0d3

Don't install deps twice

3a1259e

Fix wp version log

0484374

Add writeJSONFile util

072a7bb

Tweak the logic and clean up + add more useful logs

2638476

WunderBart force-pushed the refactor/perf-tests-tmp-folder branch from 1ec1960 to 887e8e8 Compare April 12, 2023 18:49

Format some logs

fd35da9

WunderBart force-pushed the refactor/perf-tests-tmp-folder branch from 887e8e8 to fd35da9 Compare April 12, 2023 21:06

WunderBart added 5 commits April 13, 2023 10:27

Update CI workflow to work with the new logic

1e1563b

Merge remote-tracking branch 'origin' into refactor/perf-tests-tmp-fo…

53cf2a1

…lder

Use existing modules if package locks match

62f5d99

A few last logging tweaks

abb3df1

Revert "Use existing modules if package locks match"

166091c

This reverts commit 62f5d99.

WunderBart self-assigned this Apr 14, 2023

WunderBart added the "[Type] Performance" (Related to performance efforts), "[Package] E2E Tests" (/packages/e2e-tests), and "GitHub Actions" (Pull requests that update GitHub Actions code) labels and removed the "[Package] E2E Tests" (/packages/e2e-tests) label Apr 14, 2023

WunderBart marked this pull request as ready for review April 14, 2023 14:12

WunderBart requested a review from ntwb as a code owner April 14, 2023 14:12

WunderBart requested review from youknowriad, kevin940726 and noahtallen April 14, 2023 14:12

Make multiline log simpler

0affd72

youknowriad reviewed Apr 14, 2023

youknowriad reviewed Apr 18, 2023

WunderBart mentioned this pull request Jul 11, 2023

Switch performance tests to Playwright #52022

Merged

WunderBart closed this Oct 17, 2023
