feat: add cairo-metrics benchmark harness + CI tracking by giladchase · Pull Request #9401 · starkware-libs/cairo

giladchase · 2026-01-04T16:54:09Z

Summary

- Introduce `cairo-metrics`: a small CLI that currently checks for wall-clock regressions relative to main.
- Persist results in `repo_root/results.db` (git-ignored), currently SQLite, keyed by run id (defaults to the current git SHA). This enables caching and tracking results over time locally: check out builds starting from this commit and have the tool populate the db. The tool then allows comparisons based on the db results. Future work can extend this to a hosted stateful DB (like RDS), since the DB is behind a trait.
- Add a GitHub Actions workflow that benchmarks baseline vs PR, reuses cached baseline results via artifacts (saves CI runtime for multiple PR runs over the same base branch), and posts a “Benchmark Comparison” PR comment. The comment is non-blocking; the decision is left to the reviewer.
- Seed initial benchmark suites (corelib + OpenZeppelin): corelib uses the local src, and OpenZeppelin is bundled as a vendored release (not a submodule, for simplicity).
- The walltime engine is either a home-brewed timed comparison of the compiler library, or `hyperfine`, which uses the binary. It uses hyperfine by default if available (needs to be installed with `apt`), otherwise the builtin. Hyperfine is useful locally for debugging the builtin engine itself (results are similar since hyperfine cancels out shell overhead), and because it outputs useful statistical anomaly messages. If hyperfine turns out to be a pain to maintain, it can be removed.
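
To illustrate the "DB behind a trait" point, here is a minimal sketch of what a run-id-keyed store and a comparison over it could look like. All names (`ResultsStore`, `MemoryStore`, `Measurement`, `compare`) are hypothetical and are not taken from the PR's code; the in-memory map only stands in for the SQLite backend so the sketch stays runnable with the standard library.

```rust
use std::collections::HashMap;

/// One wall-clock measurement for a benchmark, in milliseconds (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Measurement {
    pub benchmark: String,
    pub wall_ms: f64,
}

/// Hypothetical storage abstraction: results are keyed by run id,
/// which the summary says defaults to the current git SHA.
pub trait ResultsStore {
    fn save(&mut self, run_id: &str, m: Measurement);
    fn load(&self, run_id: &str) -> Vec<Measurement>;
}

/// In-memory stand-in for the SQLite backend (assumption, for illustration only).
#[derive(Default)]
pub struct MemoryStore {
    runs: HashMap<String, Vec<Measurement>>,
}

impl ResultsStore for MemoryStore {
    fn save(&mut self, run_id: &str, m: Measurement) {
        self.runs.entry(run_id.to_string()).or_default().push(m);
    }

    fn load(&self, run_id: &str) -> Vec<Measurement> {
        self.runs.get(run_id).cloned().unwrap_or_default()
    }
}

/// Relative change of `candidate` against `baseline`, per benchmark, in percent.
/// Benchmarks missing from the baseline run are skipped.
pub fn compare(store: &dyn ResultsStore, baseline: &str, candidate: &str) -> Vec<(String, f64)> {
    let base = store.load(baseline);
    store
        .load(candidate)
        .into_iter()
        .filter_map(|c| {
            let b = base.iter().find(|b| b.benchmark == c.benchmark)?;
            Some((c.benchmark, (c.wall_ms - b.wall_ms) / b.wall_ms * 100.0))
        })
        .collect()
}
```

Because `compare` only sees `&dyn ResultsStore`, swapping the backing store (SQLite today, a hosted DB later) would not touch the comparison logic, which is presumably the benefit the summary has in mind.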

Type of change

Please check one:

Bug fix (fixes incorrect behavior)
New feature
Performance improvement
Documentation change with concrete technical impact
Style, wording, formatting, or typo-only change

⚠️ Note:
To keep maintainer workload sustainable, we generally do not accept PRs that
are only minor wording, grammar, formatting, or style changes.
Such PRs may be closed without detailed review.

Why is this change needed?

Adds a benchmarking harness for detecting and tracking performance regressions.
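
As a rough sketch of what a wall-clock regression check involves: time the work several times, take a robust summary such as the median, and flag the candidate when it exceeds the baseline by more than some tolerance. The function names and the percentage threshold below are assumptions for illustration; the PR does not state which statistic or threshold the harness actually uses.

```rust
use std::time::{Duration, Instant};

/// Runs `work` `runs` times and returns the wall-clock duration of each run.
pub fn time_runs<F: FnMut()>(runs: usize, mut work: F) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .collect()
}

/// Median of the samples; `None` when there are no samples.
pub fn median(samples: &[Duration]) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        // Even count: average the two middle samples.
        (sorted[mid - 1] + sorted[mid]) / 2
    } else {
        sorted[mid]
    })
}

/// True when `candidate` is slower than `baseline` by more than `threshold_pct` percent.
/// The threshold is a hypothetical knob, not a documented setting of the tool.
pub fn is_regression(baseline: Duration, candidate: Duration, threshold_pct: f64) -> bool {
    candidate.as_secs_f64() > baseline.as_secs_f64() * (1.0 + threshold_pct / 100.0)
}
```

The median is used here because a single slow run (for example, a cold cache) shifts it far less than it shifts the mean.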

What was the behavior or documentation before?

What is the behavior or documentation after?

Related issue or discussion (if any)

Additional context

reviewable-StarkWare · 2026-01-04T16:54:21Z

This change is Reviewable

giladchase · 2026-01-04T16:54:34Z

feat(metrics): add compilation phases #9403
feat(metrics): add incremental mode logic #9402
feat: add cairo-metrics benchmark harness + CI tracking #9401 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

giladchase

@giladchase made 1 comment.
Reviewable status: 0 of 24 files reviewed, all discussions resolved (waiting on @orizi and @TomerStarkware).

crates/bin/cairo-metrics/src/engine.rs line 63 at r1 (raw file):

            // TODO(gilad): Incremental compilation requires cairo compiler support.
            // For now, skip incremental scenarios entirely.
            if is_incremental {

Next level in the stack adds the logic itself, but the short circuit will stay until we support incremental in the compiler.
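
For readers without the diff open, the short circuit described here amounts to filtering out incremental scenarios before they reach the engine. The sketch below is not the code in `engine.rs`; `Mode`, `Scenario`, and `split_runnable` are hypothetical names used only to show the shape of such a skip.

```rust
/// Hypothetical scenario kinds; the real harness may model this differently.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    Clean,
    Incremental,
}

#[derive(Debug)]
pub struct Scenario {
    pub name: &'static str,
    pub mode: Mode,
}

/// Splits scenarios into (runnable, skipped). Incremental scenarios are
/// skipped until the compiler supports incremental compilation.
pub fn split_runnable(scenarios: &[Scenario]) -> (Vec<&Scenario>, Vec<&Scenario>) {
    scenarios.iter().partition(|s| s.mode != Mode::Incremental)
}
```

Returning the skipped set alongside the runnable one makes it easy to report what was not measured instead of silently dropping it.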

giladchase force-pushed the gilad/01-04-feat_add_cairo-metrics_benchmark_harness_ci_tracking branch from 33466e0 to 80885ee Compare January 4, 2026 17:03

giladchase force-pushed the gilad/01-04-feat_add_cairo-metrics_benchmark_harness_ci_tracking branch from 80885ee to 50a1f2e Compare January 4, 2026 17:08

giladchase mentioned this pull request Jan 4, 2026

feat(metrics): add incremental mode logic #9402

Draft

5 tasks

giladchase requested review from TomerStarkware and orizi January 4, 2026 19:00

giladchase mentioned this pull request Jan 4, 2026

feat(metrics): add compilation phases #9403

Draft

5 tasks

giladchase wants to merge 1 commit into main from gilad/01-04-feat_add_cairo-metrics_benchmark_harness_ci_tracking


2 participants
