Description
After spending time looking at the implementation of the PR benchmark workflow and the in-tree benchmarks in fastify/point-of-view, fastify/fastify, and fastify/workflows, I've managed to pull it all together. Hopefully the result is more usable, has less code duplication, and provides more actionable information to PR reviewers.
Things feel stable now, so I'm seeking feedback on whether this will be valuable to reviewers, on my approach, and on the changes needed in each of the three repos. I'm open to any and all feedback before I spend any more time on it.
Demo
- fastify
- point-of-view
Required changes
- fastify/workflows: mweberxyz@9cee011
- fastify/fastify: mweberxyz/fastify@053330e
- fastify/point-of-view: mweberxyz/point-of-view@07b17c8
Sample PRs
fastify/fastify: PR from fork
Merges code from a fork into my fork, to demonstrate that the "base" benchmarks are run against the target of the PR. Additionally, it shows warnings in the comment because the "head" (i.e. PR) branch does not run `parser.js` correctly and all requests return 404s.
fastify/point-of-view: PR from same repo with performance degradation
Merges code from a branch into the default branch of the same fork. It reverts a performance improvement, to demonstrate what it looks like when a PR really tanks performance.
Approach
- Everything needed to run the benchmarks, parse the results, send comments, and remove the benchmark label is contained in the re-usable workflow
  - Re-usable workflows and custom actions each come with their own pros and cons. In the end, I decided to keep the entirety of the logic in a re-usable workflow for ease of maintenance, though I admit the JS-in-YAML is a bit unwieldy.
  - Some benefits: we don't need to pass around GITHUB_TOKEN, we avoid a build step, and it fits in better with the rest of the workflows already defined in this repo
- Each file in the input `benchmarks-dir` directory is executed (except any file specified in the input `files-to-ignore`), then autocannon is run 3 times* for 5 seconds* each against each file, taking the maximum mean req/s of the three runs
- Following the conclusion of the benchmark runs, a table is posted to the PR as a comment
  - Any results that differ by 5% or more are bolded (see the comparison sketch after this list)
- Autocannon is loaded via `npx`
  - Any plugin/core repo can use the workflow regardless of whether `autocannon` is a listed dependency or is installed
  - When invoked this way, `autocannon` writes its table output to stderr, so the raw results can be seen in the action logs if needed
    - Example: https://github.com/mweberxyz/fastify/actions/runs/8040188150/job/21958002386 (expand one of the "Run Benchmark" steps)
- Autocannon's `--on-port` option is used to spawn the benchmark scripts (see the benchmark-file sketch after this list)
  - This removes the need for the logic currently in `fastify/point-of-view/benchmark.js`
- Selection of Node versions is moved to the `node-versions` input
  - The current `fastify/workflows` workflow uses different versions of Node for benchmarks than what is currently implemented in `fastify/fastify`
- Static outputs removed, results moved to GHA artifacts
  - In part due to the previous point, but also to keep a history of these runs over time
- If any benchmark needs to add autocannon arguments, they can be defined in a comment in the benchmark file itself (illustrated in the benchmark-file sketch after this list)
  - Example: `examples/benchmark/parser.js` in `fastify/fastify`
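
To make the `--on-port` flow and the comment-based autocannon arguments more concrete, here is a minimal benchmark-file sketch. The file name, route, and the `autocannon-args` comment convention are all hypothetical illustrations; the real convention lives in the POC commits linked above, not here.

```js
// benchmarks/hello-world.js -- hypothetical file name, shown only to illustrate
// the shape of a benchmark script the re-usable workflow can execute.
//
// Hypothetical convention: extra autocannon flags declared in a comment that the
// workflow reads before invoking autocannon (the exact syntax is defined in the POC).
// autocannon-args: -m GET -H accept=application/json
'use strict'

const fastify = require('fastify')()

fastify.get('/', async () => {
  return { hello: 'world' }
})

// The script only needs to start an HTTP server; autocannon's --on-port detects
// the port the child process binds to and then fires the load test against it.
fastify.listen({ port: 3000, host: '127.0.0.1' }, (err) => {
  if (err) throw err
})
```

The workflow then runs something along the lines of `npx autocannon --json -d 5 --on-port / -- node benchmarks/hello-world.js` (flags assumed) three times per Node version, keeping the best run as described above.
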
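And here is a rough comparison sketch of the logic described above (best mean req/s of three runs, bold anything that moves by 5% or more). This is illustrative only; the function name, table layout, and result shape are assumptions, not the actual JS embedded in the workflow YAML.

```js
// Illustrative only: compare "base" and "head" autocannon results for one benchmark
// and emit a markdown table row for the PR comment.
function formatRow (name, baseRuns, headRuns) {
  // Take the best (maximum) mean req/s across the runs of each side.
  // autocannon's JSON output reports mean req/s under `requests.mean`.
  const best = (runs) => Math.max(...runs.map((r) => r.requests.mean))
  const base = best(baseRuns)
  const head = best(headRuns)

  const diffPct = ((head - base) / base) * 100
  let delta = `${diffPct >= 0 ? '+' : ''}${diffPct.toFixed(2)}%`

  // Results that differ by 5% or more get bolded so reviewers can spot them quickly.
  if (Math.abs(diffPct) >= 5) delta = `**${delta}**`

  return `| ${name} | ${base.toFixed(1)} | ${head.toFixed(1)} | ${delta} |`
}

// Example usage (hypothetical): formatRow('parser.js (node 20)', baseResults, headResults)
// where baseResults and headResults are arrays of parsed autocannon JSON outputs.
```
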
Lessons
- fix(plugins-benchmark-pr): run comparison benchmark against target #120 still has potentially incorrect logic
  - When commits are added to `main` after the PR is created, `github.event.pull_request.base.sha` is not updated. That is to say, when running the `base` benchmarks, they always run against the `main` commit as of the time the PR was created.
    - Fixed in the POC by using `github.event.pull_request.base.ref` instead
Future work
- Test error and failure states more extensively
- Add input-configurable `timeout-minutes` to the benchmark steps
- Correctly handle when a PR adds or removes a benchmark
- Experiment with self-hosted runners as a strategy to reduce run-to-run variance
- Make benchmark run and benchmark duration input-configurable (see * above in Approach)
- Factor out logic used in GHA to be usable by developers locally
  - Would be nice as a developer to run `npm run benchmark` and see the same type of output