Add benchmark and validate script #17

bmesuere · 2026-02-02T12:02:12Z

This PR prepares for a set of potential performance improvements by adding a validation and benchmark script. Both scripts run on the 3 files in the examples directory.

Performance baseline on my MacBook:

Benchmark: Short reads (NC_000913-454.fna)
Benchmark 1: /Users/bart/Code/FragGeneScanRs/target/release/FragGeneScanRs -s /Users/bart/Code/FragGeneScanRs/example/NC_000913-454.fna -t 454_10 -w 0 -o /var/folders/j3/38fskpy159v07np8syk3p_2m0000gn/T/tmp.nzJs6Qr9D1/NC_000913-454
  Time (mean ± σ):     657.4 ms ±   5.8 ms    [User: 640.4 ms, System: 11.7 ms]
  Range (min … max):   652.1 ms … 672.0 ms    20 runs


Benchmark: Complete genome (NC_000913.fna)
Benchmark 1: /Users/bart/Code/FragGeneScanRs/target/release/FragGeneScanRs -s /Users/bart/Code/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /var/folders/j3/38fskpy159v07np8syk3p_2m0000gn/T/tmp.nzJs6Qr9D1/NC_000913
  Time (mean ± σ):     971.7 ms ±  12.8 ms    [User: 847.1 ms, System: 114.7 ms]
  Range (min … max):   958.8 ms … 1018.3 ms    20 runs


Benchmark: Long reads (contigs.fna)
Benchmark 1: /Users/bart/Code/FragGeneScanRs/target/release/FragGeneScanRs -s /Users/bart/Code/FragGeneScanRs/example/contigs.fna -t complete -w 1 -o /var/folders/j3/38fskpy159v07np8syk3p_2m0000gn/T/tmp.nzJs6Qr9D1/contigs
  Time (mean ± σ):      6.619 s ±  0.023 s    [User: 6.530 s, System: 0.059 s]
  Range (min … max):    6.598 s …  6.672 s    10 runs

| PR | Short reads | Complete genome | Long reads |
| --- | --- | --- | --- |
| Baseline | 657.4 ± 5.8 ms (1.000×) | 971.7 ± 12.8 ms (1.000×) | 6.619 ± 0.023 s (1.000×) |
| #18 bitwise operations | 532.2 ± 4.8 ms (1.235×) | 695.9 ± 8.3 ms (1.396×) | 4.488 ± 0.015 s (1.475×) |
| #19 vector initialization | 450.7 ± 7.4 ms (1.459×) | 626.8 ± 3.9 ms (1.550×) | 3.937 ± 0.009 s (1.681×) |
| #20 string formatting | 440.7 ± 6.7 ms (1.492×) | 642.2 ± 24.9 ms (1.513×) | 3.931 ± 0.009 s (1.684×) |
| #21 precompute penalties | 393.5 ± 3.8 ms (1.671×) | 620.2 ± 6.8 ms (1.567×) | 3.861 ± 0.021 s (1.714×) |
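
The speedup factors in the table are the baseline mean divided by the mean of the PR in question. A minimal sketch of that arithmetic, assuming `awk` is available; the helper name `speedup` is hypothetical and not part of the scripts in this PR:

```shell
#!/bin/bash
# speedup BASELINE TIME: print BASELINE / TIME with three decimals.
# Both arguments must be in the same unit (both ms or both s).
speedup() {
    awk -v base="$1" -v cur="$2" 'BEGIN { printf "%.3f\n", base / cur }'
}

speedup 657.4 532.2   # short reads: baseline vs #18
speedup 6.619 3.861   # long reads: baseline vs #21
```

Because the factor is a ratio of means, the ± values in the table are not propagated into it.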

Between #19 and #20 there was a small regression on the complete genome (626.8 ms to 642.2 ms). This is not caused by #20 itself, but by tweaks to #18 and #19 after which I didn't run the benchmark again.

Other lessons learned

Optimization effectiveness can differ a lot depending on the machine and CPU architecture. Replacing manual vector initialization (#19) had a big effect on my MacBook, but @ninewise didn't see any improvement. A SIMD change gave a big improvement on an x86 VM, but doubled the execution time on my MacBook.
The vectors in #19 are recreated each time. Reusing them seems like it should be faster because of the reduced memory pressure. However, on my MacBook this was noticeably slower, possibly because the values need to be reset and because initialisation is very fast on Apple Silicon.
Using get_unchecked would shave off 10%, but I didn't create a PR for this.

ninewise · 2026-02-02T18:24:02Z

scripts/validate.sh

+# Parse arguments
+MODE="check"
+if [[ $# -gt 0 ]]; then
+    case "$1" in
+        --baseline)
+            MODE="baseline"
+            ;;
+        --check)
+            MODE="check"
+            ;;
+        *)
+            usage
+            ;;
+    esac
+fi


It feels odd to combine the check and the baseline generation in one script to deduplicate the 3 example calls, but not do the same for the benchmark. I'd merge all three.

ninewise · 2026-02-02T18:25:10Z

scripts/validate.sh

@@ -0,0 +1,131 @@
+#!/bin/bash


ninewise · 2026-02-02T18:31:35Z

scripts/validate.sh

+        IFS=':' read -r input train whole name <<< "$example"
+        echo "  Processing $name..."
+        run_example "$input" "$train" "$whole" "$BASELINE_DIR/$name"


Rather than putting the examples in a string array, splitting and naming the fields, and then naming them again in the run function, I'd write three functions

NC454() { "$BINARY" -s example/NC_000913-454.fna -t 454_10 -w 0 -o NC_000913-454; } ... examples=(NC454 ...)

and loop through the functions to call them directly.
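
Spelled out, that suggestion could look roughly like the sketch below. The three invocations are taken from the benchmark output in the PR description. The function names `NC` and `contigs`, the `run_examples` helper, the output-directory argument, and the `BINARY` default are assumptions for illustration only.

```shell
#!/bin/bash
# Assumed default; the real scripts set BINARY themselves.
BINARY="${BINARY:-target/release/FragGeneScanRs}"

# One function per example. "$1" is the directory that receives the output
# (an assumption; the review comment writes to the current directory).
NC454()   { "$BINARY" -s example/NC_000913-454.fna -t 454_10 -w 0 -o "$1/NC_000913-454"; }
NC()      { "$BINARY" -s example/NC_000913.fna -t complete -w 1 -o "$1/NC_000913"; }
contigs() { "$BINARY" -s example/contigs.fna -t complete -w 1 -o "$1/contigs"; }

# The array holds function names, so there is nothing to split or rename.
examples=(NC454 NC contigs)

# Run every example in turn and stop at the first failure.
run_examples() {
    local outdir="$1" example
    for example in "${examples[@]}"; do
        echo "  Processing $example..."
        "$example" "$outdir" || return 1
    done
}
```

A caller would then use something like `run_examples "$BASELINE_DIR"` for the baseline and a temporary directory for the check.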

ninewise · 2026-02-02T18:33:45Z

scripts/benchmark.sh

+BINARY="$PROJECT_ROOT/target/release/FragGeneScanRs"
+
+# Check for hyperfine
+if ! command -v hyperfine &> /dev/null; then


Just use ! hyperfine -V, no need for command.
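
For comparison, the two styles of availability check side by side, as a hedged sketch. The helper names are hypothetical, and the second variant assumes the tool accepts `-V` as a version flag, which the comment above relies on for hyperfine but which is not true of every tool.

```shell
#!/bin/bash
# Variant from the diff: ask the shell whether the name resolves to a command.
has_tool_lookup() {
    command -v "$1" > /dev/null 2>&1
}

# Variant from the review: run the tool with its version flag and
# discard the output. Assumes "$1" understands -V.
has_tool_run() {
    "$1" -V > /dev/null 2>&1
}

if ! has_tool_lookup hyperfine; then
    echo "hyperfine not found in PATH" >&2
fi
```

The lookup variant never executes the tool. The run variant also fails when the tool is on the PATH but cannot actually start.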

ninewise · 2026-02-02T18:36:00Z

scripts/validate.sh

+                echo "    Warning: Baseline file $baseline_file not found"
+                continue


I'd rather be defensive and have this fail if there is no baseline to be found.
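
A minimal sketch of the defensive variant, assuming outputs are compared byte for byte with `cmp`. The function name `compare_with_baseline` and the exit code 2 are hypothetical, and the actual comparison in validate.sh may differ.

```shell
#!/bin/bash
# compare_with_baseline BASELINE_FILE OUTPUT_FILE
# Returns 0 when the files are identical and non-zero otherwise.
# A missing baseline is an error, not a skipped check.
compare_with_baseline() {
    local baseline_file="$1" output_file="$2"
    if [ ! -f "$baseline_file" ]; then
        echo "    Error: Baseline file $baseline_file not found" >&2
        return 2
    fi
    # cmp -s prints nothing and only reports the result via its exit status.
    cmp -s "$baseline_file" "$output_file"
}
```

With this shape the calling loop can count any non-zero status as a failure, so a missing baseline can no longer make a run look green.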

bmesuere · 2026-02-03T16:13:34Z

I'll stop putting any effort into these branches. I'm sure there are large performance gains possible, but they are too dependent on CPU architecture to benchmark reliably. Every optimization I made on Apple Silicon had the opposite effect on @ninewise's machine.

On x86, the biggest gains are in restructuring memory and operations to make use of additional vectorisation and AVX-512 instructions, but these are not available on Apple Silicon. In addition, memory pressure can be reduced by reusing alpha and path instead of allocating memory each time. On Apple Silicon, however, this is slower because it is extremely fast at allocating zeroed memory by use of "zero pages".

add benchmark and validate script

67ae292

bmesuere mentioned this pull request Feb 2, 2026

replace manual vector initialization #19

Open

ninewise reviewed Feb 2, 2026

This was referenced Feb 3, 2026

Replace string formatting and appending with write! #20

Open

Precompute deletion penalties and log values #21

Open
