Iai-Callgrind is a benchmarking framework/harness which uses Valgrind's Callgrind and other Valgrind tools such as DHAT and Massif to provide extremely accurate and consistent measurements of Rust code, making it perfectly suited to run in CI environments. Iai-Callgrind is integrated in Bencher.
Iai-Callgrind is:
- Precise: High-precision measurements of instruction counts and many other metrics allow you to reliably detect very small optimizations and regressions of your code.
- Consistent: Iai-Callgrind can take accurate measurements even in virtualized CI environments and make them comparable between different systems, completely negating the noise of the environment.
- Fast: Each benchmark is only run once, which is usually much faster than benchmarks which measure execution and wall-clock time. Benchmarks measuring the wall-clock time have to be run many times to increase their accuracy, detect outliers, filter out noise, etc.
- Visualizable: Iai-Callgrind generates a Callgrind (DHAT, ...) profile of the benchmarked code and can be configured to create flamegraph-like charts from Callgrind metrics. In general, all Valgrind-compatible tools like callgrind_annotate, kcachegrind or dh_view.html and others to analyze the results in detail are fully supported.
- Easy: The API for setting up benchmarks is easy to use and allows you to quickly create concise and clear benchmarks. Focus more on profiling and your code than on the framework.
See the Guide and the API documentation at docs.rs for all the details.
Iai-Callgrind benchmarks are designed to be runnable with cargo bench. The benchmark files are expanded to a benchmarking harness which replaces the native benchmark harness of Rust. Iai-Callgrind is a profiling framework that can quickly and reliably detect performance regressions and optimizations even in noisy environments, with a precision that is impossible to achieve with wall-clock time based benchmarks. At the same time, we want to abstract the complicated parts and repetitive tasks away and provide an easy-to-use and intuitive API. Concentrate more on profiling and your code than on the framework!
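To illustrate why deterministic counts make small regressions and optimizations visible, here is a minimal sketch that compares an instruction count against a baseline from an earlier run. The function `percent_change` and the numbers in `main` are hypothetical and chosen only for illustration; this is not Iai-Callgrind's internal implementation or API.

```rust
/// Relative change of `new` against the `old` baseline, in percent.
///
/// Returns `None` when there is no baseline or the baseline is zero,
/// because no meaningful ratio exists in either case.
/// (Hypothetical helper, not part of the Iai-Callgrind API.)
fn percent_change(new: u64, old: Option<u64>) -> Option<f64> {
    match old {
        Some(old) if old > 0 => Some((new as f64 - old as f64) * 100.0 / old as f64),
        _ => None,
    }
}

fn main() {
    // Hypothetical instruction counts from two runs of the same benchmark.
    let baseline = 1_750_u64;
    let current = 1_734_u64;

    match percent_change(current, Some(baseline)) {
        Some(change) => println!("Instructions: {current}|{baseline} ({change:+.3}%)"),
        None => println!("Instructions: {current}|N/A"),
    }
}
```

Because instruction counts do not fluctuate the way wall-clock timings do, even a difference of a few instructions between two runs can be treated as a real change rather than as noise.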
Although Iai-Callgrind is useful in many projects, there are cases where Iai-Callgrind is not a good fit.
- If you need wall-clock times, Iai-Callgrind cannot help you much. The estimation of CPU cycles merely correlates with wall-clock times but is not a replacement for them. The cycles estimation is primarily designed to be a relative metric to be used for comparison.
- Iai-Callgrind cannot be run on Windows or on other platforms not supported by Valgrind.
You're missing the old README? To get started, read the Guide.
The guide only covers versions 0.12.3 and upwards. For older versions, check out the README of this repo at a specific tagged version, for example https://github.com/iai-callgrind/iai-callgrind/tree/v0.12.2, or use the GitHub UI.
Here's just a small introductory example, assuming you have everything installed and a benchmark with the following content in benches/library_benchmark.rs ready:
use iai_callgrind::{main, library_benchmark_group, library_benchmark};
use std::hint::black_box;

fn fibonacci(n: u64) -> u64 {
    match n {
        0 => 1,
        1 => 1,
        n => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

#[library_benchmark]
#[bench::short(10)]
#[bench::long(30)]
fn bench_fibonacci(value: u64) -> u64 {
    black_box(fibonacci(value))
}

library_benchmark_group!(name = bench_fibonacci_group; benchmarks = bench_fibonacci);
main!(library_benchmark_groups = bench_fibonacci_group);
Now run
cargo bench
library_benchmark::bench_fibonacci_group::bench_fibonacci short:10
  Instructions:             1734|N/A (*********)
  L1 Hits:                  2359|N/A (*********)
  L2 Hits:                     0|N/A (*********)
  RAM Hits:                    3|N/A (*********)
  Total read+write:         2362|N/A (*********)
  Estimated Cycles:         2464|N/A (*********)
library_benchmark::bench_fibonacci_group::bench_fibonacci long:30
  Instructions:         26214734|N/A (*********)
  L1 Hits:              35638616|N/A (*********)
  L2 Hits:                     2|N/A (*********)
  RAM Hits:                    4|N/A (*********)
  Total read+write:     35638622|N/A (*********)
  Estimated Cycles:     35638766|N/A (*********)
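The derived metrics in this output can be reproduced from the hit counts. The sketch below assumes that Total read+write is the sum of the three hit counts and that Estimated Cycles weights L1, L2 and RAM hits with 1, 5 and 35. These weights are an assumption here; they do reproduce the numbers printed above. The `CacheSummary` type is hypothetical and not part of the Iai-Callgrind API.

```rust
/// Cache and memory hit counts as printed in the benchmark output.
/// (Hypothetical type for illustration, not part of the Iai-Callgrind API.)
#[derive(Debug, Clone, Copy)]
struct CacheSummary {
    l1_hits: u64,
    l2_hits: u64,
    ram_hits: u64,
}

impl CacheSummary {
    /// Sum of all reads and writes, regardless of where they were served from.
    fn total_read_write(&self) -> u64 {
        self.l1_hits + self.l2_hits + self.ram_hits
    }

    /// Weighted sum of the hits. The weights 1, 5 and 35 are assumed;
    /// they match the "Estimated Cycles" values in the output above.
    fn estimated_cycles(&self) -> u64 {
        self.l1_hits + 5 * self.l2_hits + 35 * self.ram_hits
    }
}

fn main() {
    // Hit counts copied from the `short:10` and `long:30` runs above.
    let short = CacheSummary { l1_hits: 2359, l2_hits: 0, ram_hits: 3 };
    let long = CacheSummary { l1_hits: 35_638_616, l2_hits: 2, ram_hits: 4 };

    for (name, summary) in [("short:10", short), ("long:30", long)] {
        println!(
            "{name}: total read+write = {}, estimated cycles = {}",
            summary.total_read_write(),
            summary.estimated_cycles()
        );
    }
}
```

Since the estimate is a fixed weighting of counts rather than a measurement of time, it is best read as a relative number for comparing runs, not as a prediction of wall-clock duration.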
Thanks for helping to improve this project! A guideline about contributing to Iai-Callgrind can be found in the CONTRIBUTING.md file.
You have an idea for a new feature, are missing some functionality or have found a bug?
Please don't hesitate to open an issue.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you shall be dual licensed as in License, without any additional terms or conditions.
- Iai-Callgrind is mentioned in a talk at RustNation UK about Towards Impeccable Rust by Jon Gjengset
- Iai-Callgrind is supported by Bencher
- Iai: The repository from which Iai-Callgrind is forked. Iai uses Cachegrind instead of Callgrind under the hood.
- Criterion-rs: A Statistics-driven benchmarking library for Rust. Wall-clock time based benchmarks.
- hyperfine: A command-line benchmarking tool. Wall-clock time based benchmarks.
- divan: Statistically-comfy benchmarking library. Wall-clock time based benchmarks.
- dhat-rs: Provides heap profiling and ad hoc profiling capabilities to Rust programs, similar to those provided by DHAT.
- cargo-valgrind: A cargo subcommand, that runs valgrind and collects its output in a helpful manner.
- crabgrind: Valgrind Client Request interface for Rust programs. A small library that enables Rust programs to tap into Valgrind's tools and virtualized environment.
Iai-Callgrind is forked from https://github.com/bheisler/iai and was originally written by Brook Heisler (@bheisler).
Iai-Callgrind wouldn't be possible without Valgrind.
Iai-Callgrind is, like Iai, dual licensed under the Apache 2.0 license and the MIT license at your option.
According to Valgrind's documentation:
The Valgrind headers, unlike most of the rest of the code, are under a BSD-style license so you may include them without worrying about license incompatibility.
We have included the original license where we made use of the original header files.