Refactor performance tests setup #49238

Closed
wants to merge 18 commits into from

Conversation

WunderBart
Member

@WunderBart WunderBart commented Mar 21, 2023

What?

Refactor the performance testing environment setup to improve developer experience both locally and in CI.

TL;DR:

  • Cut setup time by 50% through the following improvements:
    • a. Utilize the local code for the reference environment and test source.
    • b. Run branch environment setups in parallel.
  • Limit disk space usage by reusing the setup base folder.
  • Allow rerunning tests with existing setup to save time during local development.
  • Always use the current root branch as the test source to prevent issues with env misalignments.
  • Use the latest stable WordPress version by default instead of the bleeding edge for consistent results.

Read on for more details.

⚡︎ Reduce disk space usage to a fixed 3GB

While debugging the script, I started running out of disk space after a couple of reruns. It turned out that the script creates a new test environment (repo clones + wp-env docker images) on each run without clearing it when finished. Currently, the initial perf script run takes ~8GB (6GB for 3 repo clones + 2GB for wp-env docker images), and each subsequent run takes another ~6GB for new clones.

Furthermore, one of the comparison branches is always the branch that we're currently checked out at (true for each CI job), but for some reason, it's being cloned twice from the origin: once for the test environment and once for the test source. This costs not only extra space but also time.

The refactored approach includes two changes to address the above:

  1. Use a fixed name for the temporary base folder (wp-gutenberg-performance-tests),
  2. Use the current branch for the test environment and source.

This setup requires 3GB for a typical job where we compare two branches, e.g., the current one vs. trunk.
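
A minimal sketch of the fixed base folder idea. The folder name comes from this description; placing it under the OS temp directory and the helper name `getBaseDir` are assumptions made for illustration, not the code in this PR.

```javascript
const os = require( 'node:os' );
const path = require( 'node:path' );
const fs = require( 'node:fs' );

// Folder name taken from the PR description. Putting it under the OS temp
// directory is an assumption made for this sketch.
const BASE_DIR_NAME = 'wp-gutenberg-performance-tests';

// Hypothetical helper: returns the same path on every run and creates the
// folder only if it does not exist yet, so reruns reuse it instead of
// accumulating new clones.
function getBaseDir( root = os.tmpdir() ) {
	const baseDir = path.join( root, BASE_DIR_NAME );
	fs.mkdirSync( baseDir, { recursive: true } );
	return baseDir;
}
```

Because the path is deterministic, a second run finds whatever the first run left behind, which is what keeps disk usage bounded.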

⚡︎ Use the current root branch code directly

Let's say your current branch is try/some-stuff and you'd like to compare its performance against trunk. To do that, you need to run the following from the gutenberg root:

 bin/plugin/cli.js perf try/some-stuff trunk

Currently, this will not use your local code. Instead, it will clone the branch to be used as the env source. It will also create another copy of that branch to be used as the source of the tests. In reality, we always use the current branch as a reference (as each workflow perf comparison job does), so using what's already available makes more sense. For the CI, it means that it will only need to build the current branch, allowing us to drop the clones and extra (uncached) npm installs. Locally, it means we can test against the currently developed code, which isn't currently possible.
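
The idea above can be sketched as a small planning step. The helper name `planEnvironments` and its return shape are hypothetical; the only behavior taken from this description is that the checked-out branch uses the local code and every other ref still has to be fetched.

```javascript
// Hypothetical helper: given the checked-out branch and the refs passed on
// the command line, decide which ref uses the local code and which refs
// still need to be fetched. Names and return shape are assumptions.
function planEnvironments( currentBranch, requestedRefs ) {
	// The current branch may be passed explicitly; drop it (and duplicates)
	// so it is never cloned.
	const otherRefs = [ ...new Set( requestedRefs ) ].filter(
		( ref ) => ref !== currentBranch
	);
	return {
		local: currentBranch,
		fetched: otherRefs,
	};
}

console.log( planEnvironments( 'try/some-stuff', [ 'try/some-stuff', 'trunk' ] ) );
// prints: { local: 'try/some-stuff', fetched: [ 'trunk' ] }
```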

⚡︎ Use the current root branch as a tests source

In addition to the above, the current branch should always be used as the tests source, because the tests should always be coupled with the workflow and the CLI runner script. Currently, the tests branch is either the first one passed for comparison or the one specified by the --tests-branch option. This approach can cause issues, as exemplified by the currently failing release performance tests, where the branch used as the tests source (wp/6.1) is not aligned with the workflow and CLI source (release/15.6).

Because of the above, keeping the --tests-branch option no longer made sense, so I've removed it.

⚡︎ Enable rerunning the tests using the existing environment (locally)

Using the same base folder for each run enables the reuse of the environment. Currently, each run creates the whole environment from scratch, which takes ~8 minutes on an M1 Mac. The refactored approach allows reusing the existing environment: if the builds are up to date, it skips the setup and moves straight to the tests. Unfortunately, I don't think this is currently possible in the CI, but it helps a ton when working with the tests locally.
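
One possible way to decide whether a build is up to date is to record the commit each environment was built from. This description does not say how the check is done, so the marker file name and both helpers below are purely illustrative assumptions.

```javascript
const fs = require( 'node:fs' );
const path = require( 'node:path' );

// Hypothetical marker file recording which commit an environment was built from.
const MARKER = '.built-commit';

// Returns true only if the environment folder was previously built from
// exactly this commit.
function isBuildUpToDate( envDir, commitHash ) {
	const markerPath = path.join( envDir, MARKER );
	if ( ! fs.existsSync( markerPath ) ) {
		return false;
	}
	return fs.readFileSync( markerPath, 'utf8' ).trim() === commitHash;
}

// Call after a successful build so the next run can skip the setup.
function markBuilt( envDir, commitHash ) {
	fs.mkdirSync( envDir, { recursive: true } );
	fs.writeFileSync( path.join( envDir, MARKER ), commitHash + '\n' );
}
```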

⚡︎ Run each branch setup in parallel

Fetching/building separate branches can be done in parallel, saving some extra time. CI efficiency is too erratic to give solid numbers, but testing locally, the refactored setup takes 50% less time than the current one when building from scratch. On my machine, an M1 Mac, the setup time dropped from ~8 to ~4 minutes.
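
A minimal sketch of the parallel setup using `Promise.all`. The `setUpBranch` function is a hypothetical stand-in for the real fetch, install, and build steps, with a timer simulating the work.

```javascript
// Hypothetical stand-in for the fetch + install + build of one branch.
async function setUpBranch( ref, delayMs ) {
	await new Promise( ( resolve ) => setTimeout( resolve, delayMs ) );
	return `${ ref }: ready`;
}

// Start every setup at once and wait for all of them. The total time is
// close to the slowest setup rather than the sum of all setups.
async function setUpAll( refs, delayMs = 50 ) {
	return Promise.all( refs.map( ( ref ) => setUpBranch( ref, delayMs ) ) );
}
```

`Promise.all` rejects as soon as any setup fails, which suits this case because a comparison cannot run with a missing environment.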

⚡︎ Use the latest stable WordPress version instead of bleeding edge by default

We're currently using the WP bleeding edge (core: "WordPress/WordPress" in wp-env config) for the performance tests environment unless specified otherwise via the --wp-version option. I'm not sure if that's intentional, but it makes more sense to me to run performance tests against the latest production release instead (core: null) for more consistent measurements.
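
A sketch of how the `core` value could be chosen. The `null` and "WordPress/WordPress" values come from this description; mapping `--wp-version` to the `WordPress/WordPress#<version>` form and both function names are assumptions.

```javascript
// Sketch of choosing the wp-env `core` value. The
// "WordPress/WordPress#<version>" form for --wp-version is an assumption.
function getCoreSource( wpVersion ) {
	if ( ! wpVersion ) {
		// No version requested: use the latest production release.
		return null;
	}
	return `WordPress/WordPress#${ wpVersion }`;
}

// Hypothetical helper returning only the part of the wp-env config that
// this change affects.
function buildWpEnvConfig( wpVersion ) {
	return {
		core: getCoreSource( wpVersion ),
	};
}
```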

Testing Instructions

Check out this branch (refactor/perf-tests-tmp-folder) and run npm install.
The following scenarios should work as expected:

  1. Compare the current branch with another branch or commit SHA. There should not be a need to pass the current branch's name, e.g.:

    bin/plugin/cli.js perf trunk
    
    bin/plugin/cli.js perf debd225d007f4e441ceec80fbd6fa96653f94737
    
  2. Compare the current branch with two or more other git refs, e.g.:

    bin/plugin/cli.js perf trunk debd225d007f4e441ceec80fbd6fa96653f94737
    
  3. Compare using a specific WP version:

    bin/plugin/cli.js perf debd225d007f4e441ceec80fbd6fa96653f94737 --wp-version 6.1
    
  4. Rerun the above to ensure the setup is being reused:

    bin/plugin/cli.js perf debd225d007f4e441ceec80fbd6fa96653f94737 --wp-version 6.1
    
  5. To verify that other specific comparisons work, e.g., the "Compare performance with current WordPress Core and previous Gutenberg versions" job from the performance.yml workflow, you'll need to do the following.

    Assuming we're testing release/15.6 vs. release/15.5 and wp/6.1:

    • git checkout release/15.6 as it's the one from which the workflow will be dispatched,
    • git merge origin refactor/perf-test-results-path,
    • nvm use && npm install,
    • bin/plugin/cli.js perf release/15.5 wp/6.1 --wp-version 6.1,
    • Confirm when asked to clear the environment,
    • Confirm when asked if the plugin should be built from the local branch.
    • Go make some coffee,
    • See the tests pass, 🤞
    • git reset --hard origin/release/15.6.
  6. The script should recognize if the current root branch name is passed explicitly and still use the local code, e.g.:

    bin/plugin/cli.js perf refactor/perf-test-results-path trunk
    
  7. Finally, ensure that comparing this branch with trunk shows the same numbers for both. This should be proof that the new logic does not affect the perf measurements.

    bin/plugin/cli.js perf trunk
    
Base automatically changed from refactor/perf-test-results-path to trunk March 28, 2023 14:00
@WunderBart WunderBart force-pushed the refactor/perf-tests-tmp-folder branch from d1fef2f to 79ea0b0 Compare March 29, 2023 13:31

github-actions bot commented Mar 29, 2023

Size Change: 0 B

Total Size: 1.37 MB

ℹ️ View Unchanged
Filename Size
build/a11y/index.min.js 993 B
build/annotations/index.min.js 2.78 kB
build/api-fetch/index.min.js 2.27 kB
build/autop/index.min.js 2.15 kB
build/blob/index.min.js 483 B
build/block-directory/index.min.js 7.2 kB
build/block-directory/style-rtl.css 1.04 kB
build/block-directory/style.css 1.04 kB
build/block-editor/content-rtl.css 4.11 kB
build/block-editor/content.css 4.1 kB
build/block-editor/default-editor-styles-rtl.css 403 B
build/block-editor/default-editor-styles.css 403 B
build/block-editor/index.min.js 203 kB
build/block-editor/style-rtl.css 14.6 kB
build/block-editor/style.css 14.6 kB
build/block-library/blocks/archives/editor-rtl.css 61 B
build/block-library/blocks/archives/editor.css 60 B
build/block-library/blocks/archives/style-rtl.css 90 B
build/block-library/blocks/archives/style.css 90 B
build/block-library/blocks/audio/editor-rtl.css 150 B
build/block-library/blocks/audio/editor.css 150 B
build/block-library/blocks/audio/style-rtl.css 122 B
build/block-library/blocks/audio/style.css 122 B
build/block-library/blocks/audio/theme-rtl.css 138 B
build/block-library/blocks/audio/theme.css 138 B
build/block-library/blocks/avatar/editor-rtl.css 116 B
build/block-library/blocks/avatar/editor.css 116 B
build/block-library/blocks/avatar/style-rtl.css 91 B
build/block-library/blocks/avatar/style.css 91 B
build/block-library/blocks/block/editor-rtl.css 305 B
build/block-library/blocks/block/editor.css 305 B
build/block-library/blocks/button/editor-rtl.css 587 B
build/block-library/blocks/button/editor.css 587 B
build/block-library/blocks/button/style-rtl.css 628 B
build/block-library/blocks/button/style.css 627 B
build/block-library/blocks/buttons/editor-rtl.css 337 B
build/block-library/blocks/buttons/editor.css 337 B
build/block-library/blocks/buttons/style-rtl.css 332 B
build/block-library/blocks/buttons/style.css 332 B
build/block-library/blocks/calendar/style-rtl.css 239 B
build/block-library/blocks/calendar/style.css 239 B
build/block-library/blocks/categories/editor-rtl.css 113 B
build/block-library/blocks/categories/editor.css 112 B
build/block-library/blocks/categories/style-rtl.css 124 B
build/block-library/blocks/categories/style.css 124 B
build/block-library/blocks/code/editor-rtl.css 53 B
build/block-library/blocks/code/editor.css 53 B
build/block-library/blocks/code/style-rtl.css 121 B
build/block-library/blocks/code/style.css 121 B
build/block-library/blocks/code/theme-rtl.css 124 B
build/block-library/blocks/code/theme.css 124 B
build/block-library/blocks/columns/editor-rtl.css 108 B
build/block-library/blocks/columns/editor.css 108 B
build/block-library/blocks/columns/style-rtl.css 409 B
build/block-library/blocks/columns/style.css 409 B
build/block-library/blocks/comment-author-avatar/editor-rtl.css 125 B
build/block-library/blocks/comment-author-avatar/editor.css 125 B
build/block-library/blocks/comment-content/style-rtl.css 92 B
build/block-library/blocks/comment-content/style.css 92 B
build/block-library/blocks/comment-template/style-rtl.css 199 B
build/block-library/blocks/comment-template/style.css 198 B
build/block-library/blocks/comments-pagination-numbers/editor-rtl.css 123 B
build/block-library/blocks/comments-pagination-numbers/editor.css 121 B
build/block-library/blocks/comments-pagination/editor-rtl.css 222 B
build/block-library/blocks/comments-pagination/editor.css 209 B
build/block-library/blocks/comments-pagination/style-rtl.css 235 B
build/block-library/blocks/comments-pagination/style.css 231 B
build/block-library/blocks/comments-title/editor-rtl.css 75 B
build/block-library/blocks/comments-title/editor.css 75 B
build/block-library/blocks/comments/editor-rtl.css 840 B
build/block-library/blocks/comments/editor.css 839 B
build/block-library/blocks/comments/style-rtl.css 637 B
build/block-library/blocks/comments/style.css 636 B
build/block-library/blocks/cover/editor-rtl.css 649 B
build/block-library/blocks/cover/editor.css 651 B
build/block-library/blocks/cover/style-rtl.css 1.61 kB
build/block-library/blocks/cover/style.css 1.6 kB
build/block-library/blocks/details-summary/editor-rtl.css 65 B
build/block-library/blocks/details-summary/editor.css 65 B
build/block-library/blocks/details-summary/style-rtl.css 61 B
build/block-library/blocks/details-summary/style.css 61 B
build/block-library/blocks/details/style-rtl.css 54 B
build/block-library/blocks/details/style.css 54 B
build/block-library/blocks/embed/editor-rtl.css 293 B
build/block-library/blocks/embed/editor.css 293 B
build/block-library/blocks/embed/style-rtl.css 410 B
build/block-library/blocks/embed/style.css 410 B
build/block-library/blocks/embed/theme-rtl.css 138 B
build/block-library/blocks/embed/theme.css 138 B
build/block-library/blocks/file/editor-rtl.css 300 B
build/block-library/blocks/file/editor.css 300 B
build/block-library/blocks/file/style-rtl.css 269 B
build/block-library/blocks/file/style.css 270 B
build/block-library/blocks/file/view.min.js 353 B
build/block-library/blocks/freeform/editor-rtl.css 2.44 kB
build/block-library/blocks/freeform/editor.css 2.44 kB
build/block-library/blocks/gallery/editor-rtl.css 984 B
build/block-library/blocks/gallery/editor.css 988 B
build/block-library/blocks/gallery/style-rtl.css 1.55 kB
build/block-library/blocks/gallery/style.css 1.55 kB
build/block-library/blocks/gallery/theme-rtl.css 122 B
build/block-library/blocks/gallery/theme.css 122 B
build/block-library/blocks/group/editor-rtl.css 654 B
build/block-library/blocks/group/editor.css 654 B
build/block-library/blocks/group/style-rtl.css 57 B
build/block-library/blocks/group/style.css 57 B
build/block-library/blocks/group/theme-rtl.css 78 B
build/block-library/blocks/group/theme.css 78 B
build/block-library/blocks/heading/style-rtl.css 76 B
build/block-library/blocks/heading/style.css 76 B
build/block-library/blocks/html/editor-rtl.css 332 B
build/block-library/blocks/html/editor.css 333 B
build/block-library/blocks/image/editor-rtl.css 830 B
build/block-library/blocks/image/editor.css 829 B
build/block-library/blocks/image/style-rtl.css 652 B
build/block-library/blocks/image/style.css 652 B
build/block-library/blocks/image/theme-rtl.css 137 B
build/block-library/blocks/image/theme.css 137 B
build/block-library/blocks/latest-comments/style-rtl.css 357 B
build/block-library/blocks/latest-comments/style.css 357 B
build/block-library/blocks/latest-posts/editor-rtl.css 213 B
build/block-library/blocks/latest-posts/editor.css 212 B
build/block-library/blocks/latest-posts/style-rtl.css 478 B
build/block-library/blocks/latest-posts/style.css 478 B
build/block-library/blocks/list/style-rtl.css 88 B
build/block-library/blocks/list/style.css 88 B
build/block-library/blocks/media-text/editor-rtl.css 266 B
build/block-library/blocks/media-text/editor.css 263 B
build/block-library/blocks/media-text/style-rtl.css 507 B
build/block-library/blocks/media-text/style.css 505 B
build/block-library/blocks/more/editor-rtl.css 431 B
build/block-library/blocks/more/editor.css 431 B
build/block-library/blocks/navigation-link/editor-rtl.css 716 B
build/block-library/blocks/navigation-link/editor.css 715 B
build/block-library/blocks/navigation-link/style-rtl.css 115 B
build/block-library/blocks/navigation-link/style.css 115 B
build/block-library/blocks/navigation-submenu/editor-rtl.css 299 B
build/block-library/blocks/navigation-submenu/editor.css 299 B
build/block-library/blocks/navigation/editor-rtl.css 2.13 kB
build/block-library/blocks/navigation/editor.css 2.14 kB
build/block-library/blocks/navigation/style-rtl.css 2.22 kB
build/block-library/blocks/navigation/style.css 2.21 kB
build/block-library/blocks/navigation/view-modal.min.js 2.81 kB
build/block-library/blocks/navigation/view.min.js 447 B
build/block-library/blocks/nextpage/editor-rtl.css 395 B
build/block-library/blocks/nextpage/editor.css 395 B
build/block-library/blocks/page-list/editor-rtl.css 401 B
build/block-library/blocks/page-list/editor.css 401 B
build/block-library/blocks/page-list/style-rtl.css 175 B
build/block-library/blocks/page-list/style.css 175 B
build/block-library/blocks/paragraph/editor-rtl.css 174 B
build/block-library/blocks/paragraph/editor.css 174 B
build/block-library/blocks/paragraph/style-rtl.css 279 B
build/block-library/blocks/paragraph/style.css 281 B
build/block-library/blocks/post-author/style-rtl.css 175 B
build/block-library/blocks/post-author/style.css 176 B
build/block-library/blocks/post-comments-form/editor-rtl.css 96 B
build/block-library/blocks/post-comments-form/editor.css 96 B
build/block-library/blocks/post-comments-form/style-rtl.css 501 B
build/block-library/blocks/post-comments-form/style.css 501 B
build/block-library/blocks/post-date/style-rtl.css 61 B
build/block-library/blocks/post-date/style.css 61 B
build/block-library/blocks/post-excerpt/editor-rtl.css 71 B
build/block-library/blocks/post-excerpt/editor.css 71 B
build/block-library/blocks/post-excerpt/style-rtl.css 141 B
build/block-library/blocks/post-excerpt/style.css 141 B
build/block-library/blocks/post-featured-image/editor-rtl.css 588 B
build/block-library/blocks/post-featured-image/editor.css 586 B
build/block-library/blocks/post-featured-image/style-rtl.css 322 B
build/block-library/blocks/post-featured-image/style.css 322 B
build/block-library/blocks/post-navigation-link/style-rtl.css 153 B
build/block-library/blocks/post-navigation-link/style.css 153 B
build/block-library/blocks/post-template/editor-rtl.css 99 B
build/block-library/blocks/post-template/editor.css 98 B
build/block-library/blocks/post-template/style-rtl.css 281 B
build/block-library/blocks/post-template/style.css 281 B
build/block-library/blocks/post-terms/style-rtl.css 96 B
build/block-library/blocks/post-terms/style.css 96 B
build/block-library/blocks/post-time-to-read/style-rtl.css 69 B
build/block-library/blocks/post-time-to-read/style.css 69 B
build/block-library/blocks/post-title/style-rtl.css 100 B
build/block-library/blocks/post-title/style.css 100 B
build/block-library/blocks/preformatted/style-rtl.css 103 B
build/block-library/blocks/preformatted/style.css 103 B
build/block-library/blocks/pullquote/editor-rtl.css 135 B
build/block-library/blocks/pullquote/editor.css 135 B
build/block-library/blocks/pullquote/style-rtl.css 335 B
build/block-library/blocks/pullquote/style.css 335 B
build/block-library/blocks/pullquote/theme-rtl.css 167 B
build/block-library/blocks/pullquote/theme.css 167 B
build/block-library/blocks/query-pagination-numbers/editor-rtl.css 122 B
build/block-library/blocks/query-pagination-numbers/editor.css 121 B
build/block-library/blocks/query-pagination/editor-rtl.css 221 B
build/block-library/blocks/query-pagination/editor.css 211 B
build/block-library/blocks/query-pagination/style-rtl.css 288 B
build/block-library/blocks/query-pagination/style.css 284 B
build/block-library/blocks/query-title/style-rtl.css 63 B
build/block-library/blocks/query-title/style.css 63 B
build/block-library/blocks/query/editor-rtl.css 463 B
build/block-library/blocks/query/editor.css 463 B
build/block-library/blocks/quote/style-rtl.css 222 B
build/block-library/blocks/quote/style.css 222 B
build/block-library/blocks/quote/theme-rtl.css 223 B
build/block-library/blocks/quote/theme.css 226 B
build/block-library/blocks/read-more/style-rtl.css 132 B
build/block-library/blocks/read-more/style.css 132 B
build/block-library/blocks/rss/editor-rtl.css 149 B
build/block-library/blocks/rss/editor.css 149 B
build/block-library/blocks/rss/style-rtl.css 289 B
build/block-library/blocks/rss/style.css 288 B
build/block-library/blocks/search/editor-rtl.css 165 B
build/block-library/blocks/search/editor.css 165 B
build/block-library/blocks/search/style-rtl.css 408 B
build/block-library/blocks/search/style.css 406 B
build/block-library/blocks/search/theme-rtl.css 114 B
build/block-library/blocks/search/theme.css 114 B
build/block-library/blocks/separator/editor-rtl.css 146 B
build/block-library/blocks/separator/editor.css 146 B
build/block-library/blocks/separator/style-rtl.css 234 B
build/block-library/blocks/separator/style.css 234 B
build/block-library/blocks/separator/theme-rtl.css 194 B
build/block-library/blocks/separator/theme.css 194 B
build/block-library/blocks/shortcode/editor-rtl.css 329 B
build/block-library/blocks/shortcode/editor.css 329 B
build/block-library/blocks/site-logo/editor-rtl.css 489 B
build/block-library/blocks/site-logo/editor.css 489 B
build/block-library/blocks/site-logo/style-rtl.css 203 B
build/block-library/blocks/site-logo/style.css 203 B
build/block-library/blocks/site-tagline/editor-rtl.css 86 B
build/block-library/blocks/site-tagline/editor.css 86 B
build/block-library/blocks/site-title/editor-rtl.css 116 B
build/block-library/blocks/site-title/editor.css 116 B
build/block-library/blocks/site-title/style-rtl.css 57 B
build/block-library/blocks/site-title/style.css 57 B
build/block-library/blocks/social-link/editor-rtl.css 184 B
build/block-library/blocks/social-link/editor.css 184 B
build/block-library/blocks/social-links/editor-rtl.css 674 B
build/block-library/blocks/social-links/editor.css 673 B
build/block-library/blocks/social-links/style-rtl.css 1.4 kB
build/block-library/blocks/social-links/style.css 1.39 kB
build/block-library/blocks/spacer/editor-rtl.css 359 B
build/block-library/blocks/spacer/editor.css 359 B
build/block-library/blocks/spacer/style-rtl.css 48 B
build/block-library/blocks/spacer/style.css 48 B
build/block-library/blocks/table/editor-rtl.css 433 B
build/block-library/blocks/table/editor.css 433 B
build/block-library/blocks/table/style-rtl.css 651 B
build/block-library/blocks/table/style.css 650 B
build/block-library/blocks/table/theme-rtl.css 157 B
build/block-library/blocks/table/theme.css 157 B
build/block-library/blocks/tag-cloud/style-rtl.css 251 B
build/block-library/blocks/tag-cloud/style.css 253 B
build/block-library/blocks/template-part/editor-rtl.css 404 B
build/block-library/blocks/template-part/editor.css 404 B
build/block-library/blocks/template-part/theme-rtl.css 101 B
build/block-library/blocks/template-part/theme.css 101 B
build/block-library/blocks/text-columns/editor-rtl.css 95 B
build/block-library/blocks/text-columns/editor.css 95 B
build/block-library/blocks/text-columns/style-rtl.css 166 B
build/block-library/blocks/text-columns/style.css 166 B
build/block-library/blocks/verse/style-rtl.css 99 B
build/block-library/blocks/verse/style.css 99 B
build/block-library/blocks/video/editor-rtl.css 552 B
build/block-library/blocks/video/editor.css 555 B
build/block-library/blocks/video/style-rtl.css 179 B
build/block-library/blocks/video/style.css 179 B
build/block-library/blocks/video/theme-rtl.css 139 B
build/block-library/blocks/video/theme.css 139 B
build/block-library/classic-rtl.css 179 B
build/block-library/classic.css 179 B
build/block-library/common-rtl.css 1.12 kB
build/block-library/common.css 1.12 kB
build/block-library/editor-elements-rtl.css 75 B
build/block-library/editor-elements.css 75 B
build/block-library/editor-rtl.css 11.6 kB
build/block-library/editor.css 11.6 kB
build/block-library/elements-rtl.css 54 B
build/block-library/elements.css 54 B
build/block-library/index.min.js 204 kB
build/block-library/reset-rtl.css 478 B
build/block-library/reset.css 478 B
build/block-library/style-rtl.css 12.8 kB
build/block-library/style.css 12.8 kB
build/block-library/theme-rtl.css 698 B
build/block-library/theme.css 703 B
build/block-serialization-default-parser/index.min.js 1.13 kB
build/block-serialization-spec-parser/index.min.js 2.83 kB
build/blocks/index.min.js 51.1 kB
build/commands/index.min.js 14.8 kB
build/commands/style-rtl.css 1.1 kB
build/commands/style.css 1.09 kB
build/components/index.min.js 208 kB
build/components/style-rtl.css 11.7 kB
build/components/style.css 11.7 kB
build/compose/index.min.js 12.4 kB
build/core-data/index.min.js 16.3 kB
build/customize-widgets/index.min.js 12.2 kB
build/customize-widgets/style-rtl.css 1.41 kB
build/customize-widgets/style.css 1.41 kB
build/data-controls/index.min.js 718 B
build/data/index.min.js 8.68 kB
build/date/index.min.js 40.4 kB
build/deprecated/index.min.js 518 B
build/dom-ready/index.min.js 336 B
build/dom/index.min.js 4.72 kB
build/edit-post/classic-rtl.css 571 B
build/edit-post/classic.css 571 B
build/edit-post/index.min.js 35 kB
build/edit-post/style-rtl.css 7.59 kB
build/edit-post/style.css 7.58 kB
build/edit-site/index.min.js 64.3 kB
build/edit-site/style-rtl.css 10.1 kB
build/edit-site/style.css 10.1 kB
build/edit-widgets/index.min.js 17.3 kB
build/edit-widgets/style-rtl.css 4.56 kB
build/edit-widgets/style.css 4.56 kB
build/editor/index.min.js 45.9 kB
build/editor/style-rtl.css 3.49 kB
build/editor/style.css 3.48 kB
build/element/index.min.js 4.95 kB
build/escape-html/index.min.js 548 B
build/format-library/index.min.js 7.26 kB
build/format-library/style-rtl.css 557 B
build/format-library/style.css 556 B
build/hooks/index.min.js 1.66 kB
build/html-entities/index.min.js 454 B
build/i18n/index.min.js 3.79 kB
build/is-shallow-equal/index.min.js 535 B
build/keyboard-shortcuts/index.min.js 1.79 kB
build/keycodes/index.min.js 1.94 kB
build/list-reusable-blocks/index.min.js 2.14 kB
build/list-reusable-blocks/style-rtl.css 865 B
build/list-reusable-blocks/style.css 865 B
build/media-utils/index.min.js 2.99 kB
build/notices/index.min.js 977 B
build/plugins/index.min.js 1.94 kB
build/preferences-persistence/index.min.js 2.23 kB
build/preferences/index.min.js 1.35 kB
build/primitives/index.min.js 960 B
build/priority-queue/index.min.js 1.52 kB
build/private-apis/index.min.js 942 B
build/react-i18n/index.min.js 702 B
build/react-refresh-entry/index.min.js 8.44 kB
build/react-refresh-runtime/index.min.js 7.31 kB
build/redux-routine/index.min.js 2.75 kB
build/reusable-blocks/index.min.js 2.26 kB
build/reusable-blocks/style-rtl.css 265 B
build/reusable-blocks/style.css 265 B
build/rich-text/index.min.js 11.1 kB
build/server-side-render/index.min.js 2.09 kB
build/shortcode/index.min.js 1.52 kB
build/style-engine/index.min.js 1.55 kB
build/token-list/index.min.js 650 B
build/url/index.min.js 3.74 kB
build/vendors/inert-polyfill.min.js 2.48 kB
build/vendors/react-dom.min.js 41.8 kB
build/vendors/react.min.js 4.02 kB
build/viewport/index.min.js 1.09 kB
build/warning/index.min.js 280 B
build/widgets/index.min.js 7.3 kB
build/widgets/style-rtl.css 1.18 kB
build/widgets/style.css 1.18 kB
build/wordcount/index.min.js 1.06 kB

compressed-size-action


github-actions bot commented Mar 29, 2023

Flaky tests detected in 0affd72.
Some tests passed with failed attempts. The failures may not be related to this commit but are still reported for visibility. See the documentation for more information.

🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/4700855963
📝 Reported issues:

@WunderBart WunderBart force-pushed the refactor/perf-tests-tmp-folder branch from 1ec1960 to 887e8e8 Compare April 12, 2023 18:49
@WunderBart WunderBart force-pushed the refactor/perf-tests-tmp-folder branch from 887e8e8 to fd35da9 Compare April 12, 2023 21:06
@WunderBart WunderBart self-assigned this Apr 14, 2023
@WunderBart WunderBart added [Type] Performance Related to performance efforts [Package] E2E Tests /packages/e2e-tests GitHub Actions Pull requests that update GitHub Actions code and removed [Package] E2E Tests /packages/e2e-tests labels Apr 14, 2023
@WunderBart WunderBart marked this pull request as ready for review April 14, 2023 14:12
@WunderBart WunderBart requested a review from ntwb as a code owner April 14, 2023 14:12
@@ -39,7 +39,7 @@ jobs:

- name: Compare performance with trunk
if: github.event_name == 'pull_request'
- run: ./bin/plugin/cli.js perf $GITHUB_SHA trunk --tests-branch $GITHUB_SHA
+ run: ./bin/plugin/cli.js perf trunk
Contributor

How do we ensure that we run the performance tests that are on the PR, for instance when we make changes to the tests themselves?

Contributor

Oh, if I'm reading this right, there's now an assumption that the perf cli.js is included within a Gutenberg plugin, and that we automatically compare with the current git hash and also use it as the tests branch.

One of the initial goals of the CLI was to be "independent": while it lives in the Gutenberg plugin, it doesn't have any requirements on it. The current branch could be broken or have edits to Gutenberg code, and it wouldn't matter, because the tool would always perform a clone. In fact, one goal was to make it independent of Gutenberg entirely and be a tool any plugin can use. It seems the current PR goes in the opposite direction.

Member Author

@WunderBart WunderBart Apr 17, 2023

For the general direction of this PR, see my answer here.

(…) In fact one goal was to make it independent of Gutenberg entirely and be a tool any plugin can use.

I'm struggling to visualize the full picture of how the perf tests CLI runner, as an independent tool, would be utilized by other plugins. Would it not be running from the local gutenberg repo then, but would still be comparing gutenberg branches? It would be really helpful if you could give an example of a plugin development scenario where this tool is used 🙏

Contributor

Yeah, the initial goal was to be an independent tool, something like so:

# installing the tool
npm install -g @wordpress/plugin-cli

# writing a config file
echo '{ "repoUrl": "some url", "buildCommand": "npm run build", "perfCommand": "npm run perf", "wpPluginSlug": "someslug" }' > plugin-cli.json

# running the perf command
plugin-cli --config plugin-cli.json some-branch some-other-branch

The goal was also for the CLI to include more than just the perf command: maintaining the changelog and releasing to SVN/GitHub were also commands in the same tool. I see that they have now been removed in favor of the GitHub workflows. I'm kind of sad that they didn't remain as commands to be used by the workflows instead of being written within the workflow files like we do now.

Anyway, given that the previous direction is also to move away from the goal of being an independent tool, one more step away won't be that harmful here, so I guess feel free to make that change if you think it's a good idea.

Member Author

@WunderBart WunderBart Apr 17, 2023

Yeah, the initial goal was to be an independent tool, something like so:

# installing the tool
npm install -g @wordpress/plugin-cli

# writing a config file
echo '{ "repoUrl": "some url", "buildCommand": "npm run build", "perfCommand": "npm run perf", "wpPluginSlug": "someslug" }' > plugin-cli.json

# running the perf command
plugin-cli --config plugin-cli.json some-branch some-other-branch

Oh, so by independent, you meant from Gutenberg entirely, which would run perf comparisons for any plugin?

If yes, I think we can still assume and leverage the fact that it will always be running from SOME plugin repo root and that the current branch will be the primary subject of the test. The CLI will still need to run the current plugin's dedicated perf tests, right? So it will be easier to develop/debug those tests with the CLI, and also to use it in the target plugin's CI, where the current branch is always the primary reference.

Now that I think about it, we can replace Gutenberg with a generic plugin in this refactored script, and that would be all that's needed to keep this PR from tying it any further to Gutenberg. There's no check that the current repo is actually Gutenberg.

Also, as I mentioned earlier, we can implement a --from-origin option and build everything from GH if we don't want to use the local code.

How does that sound?

Contributor

Sounds good to me 👍

PREVIOUS_VERSION_BASE_10=$((PLUGIN_VERSION_ARRAY[0] * 10 + PLUGIN_VERSION_ARRAY[1] - 1))
PREVIOUS_RELEASE_BRANCH="release/$((PREVIOUS_VERSION_BASE_10 / 10)).$((PREVIOUS_VERSION_BASE_10 % 10))"
WP_VERSION=$(awk -F ': ' '/^Tested up to/{print $2}' readme.txt)
IFS=. read -ra WP_VERSION_ARRAY <<< "$WP_VERSION"
WP_MAJOR="${WP_VERSION_ARRAY[0]}.${WP_VERSION_ARRAY[1]}"
./bin/plugin/cli.js perf "wp/$WP_MAJOR" "$PREVIOUS_RELEASE_BRANCH" "$CURRENT_RELEASE_BRANCH" --wp-version "$WP_MAJOR"
./bin/plugin/cli.js perf "$PREVIOUS_RELEASE_BRANCH" "wp/$WP_MAJOR" --wp-version "$WP_MAJOR"

Contributor

Why did we remove $CURRENT_RELEASE_BRANCH here? I know the release perf tests are broken now but the idea is to have a table that compares the current release, the previous release and the previous WP release.

Contributor

Are we certain here that $CURRENT_RELEASE_BRANCH is the same thing as $GITHUB_SHA?

Member Author

@WunderBart WunderBart Apr 17, 2023

$CURRENT_RELEASE_BRANCH is actually release/xx.x, which is also $GITHUB_HEAD_REF, not $GITHUB_SHA. The primary reference for the CI perf in the refactored CLI is $GITHUB_HEAD_REF, so this comparison will still be, e.g.: release/15.6 vs. release/15.5 vs. wp/6.1. With the refactored CLI we only need to list the refs that we want to compare the current one to.
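As a rough illustration of how the two listed refs come about, here is a JavaScript sketch that mirrors the shell arithmetic from the workflow snippet above. The function names are hypothetical and the real workflow does this in bash; like the original, it assumes the minor version is a single digit (0-9).

```javascript
// Hypothetical helpers that mirror the bash arithmetic in the release workflow.
function previousReleaseBranch( pluginVersion ) {
	const [ major, minor ] = pluginVersion.split( '.' ).map( Number );
	// Treat "major.minor" as a base-10 number and step back by one,
	// e.g. 15.6 -> 156 - 1 = 155 -> "release/15.5". Assumes minor is 0-9.
	const previous = major * 10 + minor - 1;
	return `release/${ Math.floor( previous / 10 ) }.${ previous % 10 }`;
}

function wpMajorFromReadme( readme ) {
	// Same idea as the awk call: read the "Tested up to" line, keep major.minor.
	const match = readme.match( /^Tested up to: (\d+)\.(\d+)/m );
	if ( ! match ) {
		throw new Error( 'No "Tested up to" line found.' );
	}
	return `${ match[ 1 ] }.${ match[ 2 ] }`;
}

// Made-up inputs for illustration only.
const refsToCompare = [
	previousReleaseBranch( '15.6.0' ),
	`wp/${ wpMajorFromReadme( 'Stable tag: 15.6.0\nTested up to: 6.1\n' ) }`,
];
console.log( refsToCompare.join( ' ' ) ); // release/15.5 wp/6.1
```

The current release branch does not appear in the list because, in the refactored CLI, it is the checked-out ref and therefore the implicit primary reference.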

if ( ! runningInCI ) {
await askForConfirmation( 'Ready to go? ' );
if ( ! localRef ) {
throw new Error( 'Must be running from a Gutenberg repository root.' );

Contributor

I guess this is my main issue with this PR. My hope for this tool was to be plugin independent; I wanted to transform it into a package that any plugin can use to run perf tests, or any other command.

Member Author

@WunderBart WunderBart Apr 17, 2023

OK, making it independent should not be a problem, as the logic for fetching and building from the origin is there. After checking your other comment above, I'm curious whether we should actually account for the fact that the script can run from the Gutenberg root. IMO it would make a lot of sense, as it would not break the original purpose to be an independent tool while being especially helpful in these situations:

  1. When developing or debugging locally, as it enables working on both the CLI perf tool and perf tests at the same time.
  2. When running from CI, where the branch we run the workflow from is always a primary reference, enabling us to speed things up by reusing it and its dependencies (restored from the cache ⚡).
'--abbrev-ref': null,
HEAD: null,
} );
}

Contributor

It is not clear to me why we're doing this within the performance command. We're creating a dependency between the performance command and the GitHub CI runner. What if we do this logic outside and pass the "localRef" as a regular branch to compare like we used to do before?

Member Author

Rather than creating a CI dependency, I'd argue that it adds CI support. When run in CI, the script identifies the local reference and uses it to build the plugin. This is quicker than our previous method, as it circumvents the need to clone the repository and install node modules.

During the refactor, I was fairly confident that this script would not be utilized outside an environment that would be the subject of the test, whether it's Gutenberg or another plugin. TBH, I'm still struggling to see an alternative use-case. If our intention is to build and measure or compare the performance of a specific feature, we can easily switch to the appropriate branch and use the local reference. From your perspective, it seems we want this script to act as a standalone application capable of fetching and performance-testing any references from the origin. While it wouldn't be difficult to include support for that (old) logic in the proposed implementation, I feel I'm yet to understand a compelling use-case example.

@youknowriad
Contributor

Personally, I feel this PR is doing too much, to be honest, and I'd prefer that we make each of the changes mentioned in the description separately, so we can discuss the reasoning for the existing choices more clearly.

@dmsnell
Contributor

dmsnell commented Apr 18, 2023

I'd like to see these split into separate issues too, but I don't feel as strongly about it. I've been half-way following along and it's been hard to grasp everything going on.

Personally I think it makes sense to get rid of --tests-branch too because I don't think it ever solved our problems on breaking changes so much as delayed them to a different part of the process. I guess I could be wrong if I've missed that people created special third branches just to run tests in a way that works across the breaking change, but I don't think I've missed that. I think it's just a sad "we have to wait to compare these until we have two working versions." The times where --tests-branch solves the problem are times I think when we could do what you suggested: pick another branch as the first one.

The thing about reusing the branch for the test setup and also for a branch under test was there before but I made a mistake in not having it fully build all the packages and therefore the tests weren't building the Gutenberg plugin, injecting it into the WordPress env, and testing against the actual code. That should be able to be brought back with the appropriate fix if we want, and I think that will account for as much setup time as your changes. (In #45284 I stopped building parts of the app for the tests-branch branch because it didn't need them, but then in #45737 I started reusing the tests-branch if it was the same and already built - side note @youknowriad I now understand how this thing happened, because it looks like I had both of those PRs merge on the same day - they were a non-conflict merge conflict I didn't notice, where each was fine on its own but not together).

I've noted that in trunk the Perf Tests are still not on the critical path for CI builds to finish. I applaud this work to continue to optimize them and encourage it, but I'm also less bothered now by certain inefficiencies (particularly those increasing the complexity of the scripts) because improving them doesn't have the direct impact on the workflow that it did when these were running upwards of 1h 20m.

@kevin940726
Member

Thank you for all the hard work! This is looking very promising to me! I agree it's a bit hard to review, but I also don't think it necessarily has to be split into smaller PRs.

While reading the description of this PR, I've been thinking of potential ways to further improve the performance. This is just me thinking out loud and not relevant to this PR, but I just want to jot it down somewhere in case I forget 👇

Note:
It seems to me that a lot of time spent is during checking out and building the old branch. Most of the time we're comparing a PR branch (current branch) and a commit on trunk. What if we could memoize the result of the commits on trunk? For releases, we can just use wp-env to download a specific version of gutenberg via the command line, we don't actually have to checkout to that git commit and rebuild the plugin every time. For the latest commit on trunk, I'm thinking maybe we could leverage GitHub releases API or artifacts API to automatically publish the latest commit on trunk somewhere. Then we just have to download the artifact and ignore all the build steps. We probably still want to keep the "checking out and building" steps in the code for comparing arbitrary commits, but for most jobs on CI we could potentially save a lot of time.

@WunderBart
Member Author

It seems to me that a lot of time spent is during checking out and building the old branch. Most of the time we're comparing a PR branch (current branch) and a commit on trunk. What if we could memoize the result of the commits on trunk? For releases, we can just use wp-env to download a specific version of gutenberg via the command line, we don't actually have to checkout to that git commit and rebuild the plugin every time. For the latest commit on trunk, I'm thinking maybe we could leverage GitHub releases API or artifacts API to automatically publish the latest commit on trunk somewhere. Then we just have to download the artifact and ignore all the build steps. We probably still want to keep the "checking out and building" steps in the code for comparing arbitrary commits, but for most jobs on CI we could potentially save a lot of time.

@kevin940726, I was thinking the same thing. 🙌 The good part is that each trunk commit already has a plugin ZIP built and uploaded. All we'd need would be the URL to that artifact (which btw I was hoping to achieve within this PR 🙈), but the artifacts API doesn't seem to support searching by ref/sha/etc (yet?). I'd be happy to explore that in a follow-up PR, as it should be a big jump in speed and should also streamline the setup a lot.

@kevin940726
Member

kevin940726 commented May 13, 2023

The good part is that each trunk commit already has a plugin ZIP built and uploaded.

Oh, yeah! For some reason I totally forgot that it's a thing already 😅.

but the artifacts API doesn't seem to support searching by ref/sha/etc

I think this is possible with a few more extra steps today!

  1. First get the list of the workflow runs of a commit on trunk:
curl "https://api.github.com/repos/WordPress/gutenberg/actions/runs?branch=trunk&head_sha=<COMMIT_SHA>" > runs.json
  2. Loop through the result to find the Build Gutenberg Plugin Zip workflow run and get the artifacts_url field:
jq -r '.workflow_runs[] | select(.name=="Build Gutenberg Plugin Zip").artifacts_url' runs.json > artifact_url
  3. Fetch the url and grab the downloadable url to download the plugin zip:
curl -s "$(cat artifact_url)" | jq -r '.artifacts[] | select(.name=="gutenberg-plugin").archive_download_url' > download_url
  4. Download the zip! 🎉
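
For anyone who would rather do the lookup inside the Node-based CLI than with jq, the two selection steps could be sketched roughly as below. This is only a sketch: the function names are hypothetical, the sample payloads are made up, and only the field names (workflow_runs, artifacts_url, artifacts, archive_download_url) come from the steps above. No network request is made here.

```javascript
// Pure helpers over the two JSON payloads; fetching them is left out on purpose.
function findArtifactsUrl( runsResponse, workflowName ) {
	// Unlike the jq filter, this returns only the first matching run.
	const run = runsResponse.workflow_runs.find(
		( candidate ) => candidate.name === workflowName
	);
	return run ? run.artifacts_url : null;
}

function findDownloadUrl( artifactsResponse, artifactName ) {
	const artifact = artifactsResponse.artifacts.find(
		( candidate ) => candidate.name === artifactName
	);
	return artifact ? artifact.archive_download_url : null;
}

// Made-up sample payloads, shaped like the fields used in the steps above.
const sampleRuns = {
	workflow_runs: [
		{ name: 'Unit Tests', artifacts_url: 'https://example.test/runs/1/artifacts' },
		{ name: 'Build Gutenberg Plugin Zip', artifacts_url: 'https://example.test/runs/2/artifacts' },
	],
};
const sampleArtifacts = {
	artifacts: [
		{ name: 'gutenberg-plugin', archive_download_url: 'https://example.test/artifacts/9/zip' },
	],
};

console.log( findArtifactsUrl( sampleRuns, 'Build Gutenberg Plugin Zip' ) );
console.log( findDownloadUrl( sampleArtifacts, 'gutenberg-plugin' ) );
```

Returning null when nothing matches leaves room for a fallback, such as building the plugin from the ref.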

The gotcha here is that the artifacts have retention limits and will be deleted after a certain amount of time (normally 30 days). To enable comparing older commits we might have to upload them to a persistent storage, like the release API?

@WunderBart
Member Author

I'm wondering if there's anything I should do at this point to wrap up this PR. I understand the suggestion for splitting this into smaller PRs, but I'm wondering if it's necessary. Maybe we'll get a better picture after summarizing the current feedback/discussion.

First of all, there are 3 functionality changes:

  • Switch from fetching primary ref from origin to local,
  • Remove the --tests-branch option and use the primary ref as the source of the tests,
  • Use the latest WordPress stable instead of bleeding edge.

...and 2 improvements:

  • Use a fixed tmp folder so we can reuse the existing plugin clones,
  • Parallelize ref builds for faster setup,
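
To make the parallelization point concrete, here is a minimal sketch of the idea, not the code from this PR. `setUpRef` is a hypothetical stand-in for whatever clones or checks out, installs, and builds a single ref; the fake version below only records timing events.

```javascript
// `setUpRef` is a hypothetical async function that prepares one ref.
async function setUpRefsInParallel( refs, setUpRef ) {
	// Start every setup at once; Promise.all resolves in input order
	// and rejects as soon as any single setup fails.
	return Promise.all( refs.map( ( ref ) => setUpRef( ref ) ) );
}

// Fake setup that only records when it starts and finishes.
const events = [];
async function fakeSetUp( ref ) {
	events.push( `start ${ ref }` );
	await new Promise( ( resolve ) => setTimeout( resolve, 10 ) );
	events.push( `done ${ ref }` );
	return `${ ref }: built`;
}

const parallelDemo = setUpRefsInParallel(
	[ 'trunk', 'release/15.8' ],
	fakeSetUp
).then( ( results ) => {
	console.log( results );
	console.log( events ); // both "start" entries precede the first "done"
	return results;
} );
```

With real builds, the total setup time approaches that of the slowest ref rather than the sum of all refs, provided the machine has enough CPU and disk bandwidth to run them side by side.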

As far as the functionality changes go, we seem to have a partial consensus on removing the --tests-branch option (see @dmsnell's comment). I'd love to hear from @youknowriad whether he agrees with the current reasoning or not.

For the part where we use the local code, I've made my argument here and, to add to that, I think it doesn't make sense to fetch the primary reference from the origin until we decide to extract the script to a dedicated package. Currently, the primary use of the script is for the CI.

Regarding the use of WP stable instead of bleeding edge, I think it makes the most sense as it increases the env consistency, which should work towards measurement stability. Is there any reason we should be testing against the bleeding edge?

The improvements part is what speeds up the setup the most, especially the parallel builds, as the fixed tmp folder is relevant only for local development. I believe it's the main thing that makes this PR a bit hard to follow.


I'm unsure if the above makes it easier for you to vote for or against splitting this PR. Anyhow, please let me know what you think, and I'll follow the vote. 😄


Note: If we decide to split, I'd suggest doing it as follows:

  1. Use the latest stable WP instead of bleeding edge,
  2. Remove the --tests-branch option,
  3. Use a fixed tmp folder to enable the reuse of the plugin clones + parallelize ref builds,
  4. Use local ref as the primary ref instead of fetching from origin.

/cc @youknowriad @dmsnell @kevin940726

@WunderBart
Member Author

WunderBart commented May 15, 2023

I think this is possible with a few more extra steps today!

Nice, @kevin940726! 🙌 I'll definitely try that out.

The gotcha here is that the artifacts have retention limits and will be deleted after a certain amount of time (normally 30 days). To enable comparing older commits we might have to upload them to a persistent storage, like the release API?

I guess for this case we could just leave the part where we build the plugin from the ref. I don't think we'll often be testing against commits that are >1 month old, will we?

I can see another gotcha, where the ZIP has not been built yet - kinda the other way around. I think we'd either need a waiting mechanism, build from ref (as above), or just create the logic in a way that would fetch the last available ZIP for trunk.

@youknowriad
Contributor

For me this PR comes down to "smart/implicit vs explicit behavior". I personally prefer explicit behavior: pass the branches to compare, pass the branch to use for the tests... over assumption based on the current branch. I agree that it's slower but I believe it to be more flexible and also opens the door for a more generic tool in the future.

That said, I don't want to be the blocker here if you all think otherwise.

@dmsnell
Contributor

dmsnell commented May 15, 2023

Currently, the primary use of the script is for the CI.

I believe that for every big release someone runs this script locally to compare against previous releases.

Regarding the use of WP stable instead of bleeding edge, I think it makes the most sense as it increases the env consistency, which should work towards measurement stability.

Having run into this many times, I think it'd be good to be explicit about the WP version we're using in the tests. Too many times the tests failed for no explicable reason, and only later did we discover that something changed in the WP copy in the env, which is never called out in the test setup.

Use the latest stable WP instead of bleeding edge,

From me 👍

Remove the --tests-branch option,

Also from me 👍 but I don't have strong opinions. I don't think this is practically more explicit than saying "use the first given branch" and I think it never solved the problem it was introduced to solve; that problem still exists.

Use a fixed tmp folder to enable the reuse of the plugin clones + parallelize ref builds,

This ties in with eliminating --tests-branch but doesn't need to; bringing back my work which reused or cached the tests branch should cut out a few minutes and have a similar impact. That doesn't require adding more into anything; just involves doing less. The only caveat is that I added that cache simultaneously while trimming it down to only build the tooling. We need to continue to have it generate the full build when we cache it.

Use local ref as the primary ref instead of fetching from origin.

In several experiments I used an ENV with a default value. There is a challenge I experienced with this kind of work and that is the need to reference branches that may not be present when relying only on already-fetched git references.


Otherwise my thoughts on this haven't changed. I appreciate all the work being done here, but I do think it's heavy to say the least.

First of all, there are 3 functionality changes:

Switch from fetching primary ref from origin to local,
Remove the --tests-branch option and use the primary ref as the source of the tests,
Use the latest WordPress stable instead of bleeding edge.
...and 2 improvements:

Use a fixed tmp folder so we can reuse the existing plugin clones,
Parallelize ref builds for faster setup,

If one of these fails we have to revert all of them. It's hard to leave a good commit message for these too, let alone review them.

I won't stand in the way if you really believe this is best despite that.

@kevin940726
Member

I'd vote for giving it a try! Worst case we'd have to revert, no big deal! We can monitor it for a few commits after it gets merged so that we don't mess up the metrics too much.

I don't think we'll be often testing against commits that >1 month old, will we?

IIRC, that's the case for a step on CI? I don't remember the exact details and I'm too lazy to check 😝 . Either way I think this is the easier part of the issue, we just need a persistent storage, which is relatively cheap or even free.

can see another gotcha, where the ZIP has not been built yet

Yeah, I think we can wait on CI, but for local compares we might have to fall back to building it from scratch. Or if the exact commit doesn't matter, we can just pick the latest built output, like you suggested!

@youknowriad
Contributor

Now, that I think about it, let's make sure that we don't break the "manual" performance CI job where we have the ability to choose which branches to compare manually and choose which branch to use for the tests themselves.

A job you can run from here https://github.com/WordPress/gutenberg/actions/workflows/performance.yml by clicking on "run workflow".

@WunderBart
Member Author

WunderBart commented May 17, 2023

Now, that I think about it, let's make sure that we don't break the "manual" performance CI job where we have the ability to choose which branches to compare manually and choose which branch to use for the tests themselves.

A job you can run from here https://github.com/WordPress/gutenberg/actions/workflows/performance.yml by clicking on "run workflow".

edit: Custom comparison tests and runner are connected. See #49238 (comment).

IIUC, the manual dispatch (workflow_dispatch) doesn't work like that currently, meaning the source of the tests cannot be picked independently. Here's an example run:

[Screenshot of the "Run workflow" form, 2023-05-17]
  • The Use workflow from defines the source of the workflow file and the performance.js command script.
  • The branches or commits to compare requires at least 2 branches to be passed, the first of which is used as the source of the tests.
  • The base WP version input isn't important in this context. It does what it's supposed to do.

The disparity between the performance.js source and the tests' source is actually what's currently causing the release perf tests to fail. The error is that the tests are saving result artifacts under a different name than the perf script is expecting them to be, so it cannot find them. For example, we can look at the most recent release perf job. The command is:

./bin/plugin/cli.js perf "wp/$WP_MAJOR" "$PREVIOUS_RELEASE_BRANCH" "$CURRENT_RELEASE_BRANCH" --wp-version "$WP_MAJOR"

...whose values for that run are:

./bin/plugin/cli.js perf wp/6.2 release/15.7 release/15.8 --wp-version 6.2
  • The source of the workflow and the performance.js script is the branch that the job is dispatched from, which is actually a tag ref tags/v15.8.0-rc.1.
  • The source of the tests is the first branch to be compared, which is wp/6.2, hence the disparity.

Now that I think about it, the immediate fix would be adding the --tests-branch option or simply reversing the order of the comparison branches, which would establish parity between the runner and the tests:

./bin/plugin/cli.js perf wp/6.2 release/15.7 release/15.8 --tests-branch release/15.8 --wp-version 6.2

vs.

./bin/plugin/cli.js perf release/15.8 release/15.7 wp/6.2 --wp-version 6.2

I'm not 100% sure, though, because I don't know if tags/v15.8.0-rc.1 is exactly the same as release/15.8. Are we sure those are the same references, or is it possible that release/15.8 can have some extra commits on top, @youknowriad? Why aren't we just passing the triggering ref tags/v15.8.0-rc.1?

I think this is a good example of why it would be safer to use the job triggering ref as the source for both the perf runner script and the tests. I think that would actually be a good first step in arriving at what I'm proposing with this PR. I'll start drafting.

@youknowriad
Contributor

I'm not 100% sure, though, because I don't know if tags/v15.8.0-rc.1 is exactly the same as release/15.8. Are we sure those are the same references, or is it possible that release/15.8 can have some extra commits on top, @youknowriad? Why aren't we just passing the triggering ref tags/v15.8.0-rc.1?

There could be some differences but at the moment the release job runs, there's none (or almost none).

The workflow dispatch job uses GITHUB_SHA for the perf tests branch (use workflow from select). I think that's a good behavior for the manual test. But it doesn't mean that we want to compare with that branch too.

@WunderBart
Member Author

The workflow dispatch job uses GITHUB_SHA for the perf tests branch (use workflow from select).

Ah, apologies for the confusion - I was looking at the workflow from my own branch where I already removed the --tests-branch option. 🤦

I think that's a good behavior for the manual test. But it doesn't mean that we want to compare with that branch too.

That's fair, though I'm still failing to see why we wouldn't just use the branch that we're triggering from (cwd) as a ready-to-use source of the tests. By separating the runner from the tests, we're adding an extra setup step and making the whole thing more fragile.

@youknowriad
Contributor

Imagine we add a new metric today and we want to check whether React 18 upgrade had an impact on that metric. We can run the test manually by using "trunk" as the branch for the tests and compare the commit that did the React 18 with the previous one.

@WunderBart
Member Author

WunderBart commented May 17, 2023

Imagine we add a new metric today and we want to check whether React 18 upgrade had an impact on that metric. We can run the test manually by using "trunk" as the branch for the tests and compare the commit that did the React 18 with the previous one.

I agree on separating the tests from the comparison branches; that should remain as is. I asked about separating the test runner (performance.js) from the tests branch (--tests-branch) - sorry if that wasn't clear enough. The runner source is always the triggering branch, and the tests branch is either the first branch we use for comparison or the one defined by --tests-branch. With that setup, it's easy for the two to become disconnected, which is why I wanted to remove the --tests-branch option and just point to the tests we're currently checked out at (triggering branch, e.g. trunk).
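
To spell out that resolution rule, here is a tiny sketch. The function and the `testsBranch` option name are assumptions for illustration, not a copy of what performance.js does.

```javascript
// Hypothetical helper: which ref do the perf tests come from?
function resolveTestsBranch( branches, options = {} ) {
	if ( branches.length === 0 ) {
		throw new Error( 'At least one branch is required.' );
	}
	// An explicit --tests-branch wins; otherwise fall back to the first branch.
	return options.testsBranch ?? branches[ 0 ];
}

// The release job example: the runner comes from the triggering ref,
// but the tests come from wp/6.2, so the two can disagree.
console.log( resolveTestsBranch( [ 'wp/6.2', 'release/15.7', 'release/15.8' ] ) );

// Either fix mentioned earlier makes the tests come from release/15.8.
console.log(
	resolveTestsBranch( [ 'wp/6.2', 'release/15.7', 'release/15.8' ], {
		testsBranch: 'release/15.8',
	} )
);
console.log( resolveTestsBranch( [ 'release/15.8', 'release/15.7', 'wp/6.2' ] ) );
```

Nothing in this rule ties the result to the ref the runner itself was loaded from, which is the gap described above.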

@youknowriad
Contributor

I asked about separating the test runner (performance.js) from the tests branch (--tests-branch) - sorry if that wasn't clear enough.

At the moment at least, the metrics are present in both the tests and the runner. In other words, there's a strong relationship between these two, so I wouldn't mind if we use the same thing for both.

But I think it's just a bad design that they are tied; ideally the perf tests define the metrics, and the runner is just orchestrating stuff without any relation to the tests themselves. But again, it's not the case today, so 🤷

@WunderBart
Member Author

But I think it's just a bad design that they are tied; ideally the perf tests define the metrics, and the runner is just orchestrating stuff without any relation to the tests themselves. But again, it's not the case today, so 🤷

Yeah they're currently very much connected - the runner is responsible for computing and rendering the results that are produced by the tests. In an ideal scenario, which part of the process do you think should take responsibility for handling the results? How could we untie it from the runner? Maybe a separate script? We'd still be making that connection though, only with a different script...

@youknowriad
Contributor

Yeah they're currently very much connected - the runner is responsible for computing and rendering the results that are produced by the tests. In an ideal scenario, which part of the process do you think should take responsibility for handling the results? How could we untie it from the runner? Maybe a separate script? We'd still be making that connection though, only with a different script...

I think the runner doesn't need to know about the metrics, it can just use the "keys" of the JSON output by the performance tests.
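
A minimal sketch of that idea, under the assumption that each branch's results arrive as an object mapping metric keys to non-empty arrays of samples. The shape and the function names are hypothetical, and reducing samples with a median is just one possible choice.

```javascript
// Median of a non-empty array of numbers.
function median( values ) {
	const sorted = [ ...values ].sort( ( a, b ) => a - b );
	const mid = Math.floor( sorted.length / 2 );
	return sorted.length % 2
		? sorted[ mid ]
		: ( sorted[ mid - 1 ] + sorted[ mid ] ) / 2;
}

// Assumed input shape: { [branch]: { [metricKey]: number[] } }.
// The runner never names a metric; it just walks whatever keys it finds.
function summarize( resultsByBranch ) {
	const summary = {};
	for ( const [ branch, metrics ] of Object.entries( resultsByBranch ) ) {
		summary[ branch ] = {};
		for ( const [ key, samples ] of Object.entries( metrics ) ) {
			summary[ branch ][ key ] = median( samples );
		}
	}
	return summary;
}

// Made-up numbers for illustration only.
const demoSummary = summarize( {
	trunk: { type: [ 30, 10, 20 ], load: [ 100, 200 ] },
	'release/15.8': { type: [ 25, 15 ], load: [ 120 ] },
} );
console.table( demoSummary );
```

Adding a new metric then only requires the tests to emit a new key; the runner code stays untouched.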

@WunderBart
Member Author

Yeah they're currently very much connected - the runner is responsible for computing and rendering the results that are produced by the tests. In an ideal scenario, which part of the process do you think should take responsibility for handling the results? How could we untie it from the runner? Maybe a separate script? We'd still be making that connection though, only with a different script...

I think the runner doesn't need to know about the metrics, it can just use the "keys" of the JSON output by the performance tests.

This has been addressed within #52022

@WunderBart WunderBart closed this Oct 17, 2023
Labels
GitHub Actions Pull requests that update GitHub Actions code [Type] Performance Related to performance efforts
4 participants