@bmesuere bmesuere commented Feb 2, 2026

This PR optimizes the Nuc enum to use bitwise operations and replaces the pattern-match lookup in the trinucleotide function with a direct calculation.

Benchmark on my laptop:

Benchmark: Short reads (NC_000913-454.fna)
Benchmark 1: /Users/bart/Code/FragGeneScanRs/target/release/FragGeneScanRs -s /Users/bart/Code/FragGeneScanRs/example/NC_000913-454.fna -t 454_10 -w 0 -o /var/folders/j3/38fskpy159v07np8syk3p_2m0000gn/T/tmp.oLaKKc9LR3/NC_000913-454
  Time (mean ± σ):     532.2 ms ±   4.8 ms    [User: 520.0 ms, System: 9.4 ms]
  Range (min … max):   525.9 ms … 543.7 ms    20 runs


Benchmark: Complete genome (NC_000913.fna)
Benchmark 1: /Users/bart/Code/FragGeneScanRs/target/release/FragGeneScanRs -s /Users/bart/Code/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /var/folders/j3/38fskpy159v07np8syk3p_2m0000gn/T/tmp.oLaKKc9LR3/NC_000913
  Time (mean ± σ):     695.9 ms ±   8.3 ms    [User: 576.6 ms, System: 111.8 ms]
  Range (min … max):   685.2 ms … 716.3 ms    20 runs


Benchmark: Long reads (contigs.fna)
Benchmark 1: /Users/bart/Code/FragGeneScanRs/target/release/FragGeneScanRs -s /Users/bart/Code/FragGeneScanRs/example/contigs.fna -t complete -w 1 -o /var/folders/j3/38fskpy159v07np8syk3p_2m0000gn/T/tmp.oLaKKc9LR3/contigs
  Time (mean ± σ):      4.488 s ±  0.015 s    [User: 4.438 s, System: 0.042 s]
  Range (min … max):    4.464 s …  4.511 s    10 runs

short reads: 532.2 ms ± 4.8 ms
complete genome: 695.9 ms ± 8.3 ms
long reads: 4.488 s ± 0.015 s

Copilot AI review requested due to automatic review settings February 2, 2026 12:24
Copilot AI left a comment

Pull request overview

This PR optimizes the nucleotide (Nuc) representation to exploit bitwise operations and replaces the large pattern-matching trinucleotide lookup with a direct arithmetic calculation, yielding measurable performance improvements in the benchmarks provided.

Changes:

  • Reworked Nuc to be a #[repr(u8)] enum with a documented bit layout (bit 0 as insertion flag, bits 1–2 as base encoding).
  • Implemented bitwise-based helpers on Nuc (to_int, to_lower, is_insertion, rc) to replace large pattern matches.
  • Replaced the exhaustive trinucleotide pattern-match table with a direct computed codon index based on base indices.
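For readers following along, the changes listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `N`/`Ni` variants and their values (8 and 9) are assumptions added so that `to_int` has a case that returns `None`, and the free function `trinucleotide` is a hypothetical stand-in for the real one.

```rust
/// Sketch of a bit-packed nucleotide enum.
/// Bit 0 is the insertion flag, bits 1-2 encode the base.
/// ASSUMPTION: the `N`/`Ni` variants and their values are illustrative only.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Nuc {
    A = 0,
    Ai = 1,
    C = 2,
    Ci = 3,
    G = 4,
    Gi = 5,
    T = 6,
    Ti = 7,
    N = 8,
    Ni = 9,
}

impl Nuc {
    /// Base index 0..=3 for A, C, G, T (with or without the insertion flag);
    /// `None` for any other variant.
    #[inline]
    pub fn to_int(self) -> Option<usize> {
        let v = self as u8;
        if v < 8 {
            // Dropping bit 0 (the insertion flag) leaves the base encoding.
            Some((v >> 1) as usize)
        } else {
            None
        }
    }

    /// True when bit 0 (the insertion flag) is set.
    #[inline]
    pub fn is_insertion(self) -> bool {
        (self as u8) & 1 == 1
    }
}

/// Codon index in 0..64 computed directly from three base indices,
/// instead of matching on all 64 combinations.
pub fn trinucleotide(s: &[Nuc]) -> Option<usize> {
    if s.len() < 3 {
        return None;
    }
    let n0 = s[0].to_int()?;
    let n1 = s[1].to_int()?;
    let n2 = s[2].to_int()?;
    Some(n0 * 16 + n1 * 4 + n2)
}
```

With this layout, `trinucleotide(&[Nuc::A, Nuc::C, Nuc::G])` evaluates to `0 * 16 + 1 * 4 + 2`, and any codon containing a non-ACGT variant yields `None`.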

/// Nucleotide representation optimized for bit operations.
///
/// Layout: bit 0 = insertion flag, bits 1-2 = base encoding
/// - A=0, Ai=1, C=2, Ci=3, G=4, Gi=5, T=6, Ti=7

I wouldn't enumerate these here; they are right below in the definition.


/// Convert to lowercase (insertion) variant
#[inline]
pub fn to_lower(&self) -> Nuc {

On its own, this change makes execution about 1.11x slower for me. I'm only running on NC_000913.fna, though:

Benchmark 1: ./secondline -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
  Time (mean ± σ):      1.557 s ±  0.102 s    [User: 0.990 s, System: 0.560 s]
  Range (min … max):    1.473 s …  1.754 s    10 runs
 
Benchmark 2: ./table -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
  Time (mean ± σ):      1.725 s ±  0.024 s    [User: 1.134 s, System: 0.581 s]
  Range (min … max):    1.695 s …  1.767 s    10 runs
 
Benchmark 3: ./transmute -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
  Time (mean ± σ):      1.736 s ±  0.026 s    [User: 1.124 s, System: 0.603 s]
  Range (min … max):    1.697 s …  1.786 s    10 runs
 
Summary
  ./secondline -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913 ran
    1.11 ± 0.07 times faster than ./table -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
    1.11 ± 0.07 times faster than ./transmute -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913

(secondline is this version without this change, table is my attempt at a constant table lookup, and transmute is this version.)
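To make the three variants concrete, here is a rough sketch of what each `to_lower` strategy might look like. None of this is copied from the benchmarked binaries: the function names are hypothetical, and the enum is reduced to the eight variants from the documented layout so that every value 0..=7 is a valid discriminant.

```rust
/// ASSUMPTION: reduced to the eight documented variants for illustration.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Nuc {
    A = 0,
    Ai = 1,
    C = 2,
    Ci = 3,
    G = 4,
    Gi = 5,
    T = 6,
    Ti = 7,
}

/// All variants, in discriminant order (handy for exhaustive checks).
pub const ALL: [Nuc; 8] = [
    Nuc::A, Nuc::Ai, Nuc::C, Nuc::Ci, Nuc::G, Nuc::Gi, Nuc::T, Nuc::Ti,
];

/// Variant 1: plain pattern match.
#[inline]
pub fn to_lower_match(n: Nuc) -> Nuc {
    match n {
        Nuc::A | Nuc::Ai => Nuc::Ai,
        Nuc::C | Nuc::Ci => Nuc::Ci,
        Nuc::G | Nuc::Gi => Nuc::Gi,
        Nuc::T | Nuc::Ti => Nuc::Ti,
    }
}

/// Variant 2: constant table indexed by discriminant.
const LOWER: [Nuc; 8] = [
    Nuc::Ai, Nuc::Ai, Nuc::Ci, Nuc::Ci, Nuc::Gi, Nuc::Gi, Nuc::Ti, Nuc::Ti,
];

#[inline]
pub fn to_lower_table(n: Nuc) -> Nuc {
    LOWER[n as usize]
}

/// Variant 3: set bit 0 and transmute back to the enum.
#[inline]
pub fn to_lower_transmute(n: Nuc) -> Nuc {
    let bits = (n as u8) | 1;
    // SAFETY: the input is in 0..=7 and setting bit 0 keeps it in 0..=7,
    // and every value in that range is a valid `Nuc` discriminant here.
    // With extra variants in the enum this argument would need rechecking.
    unsafe { std::mem::transmute::<u8, Nuc>(bits) }
}
```

All three return the same result for every variant, so the choice between them comes down to measured speed and whether the `unsafe` block is acceptable.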

let n0 = s[0].to_int()?;
let n1 = s[1].to_int()?;
let n2 = s[2].to_int()?;
Some(n0 * 16 + n1 * 4 + n2)

Bitshifts here give a minor improvement for me.
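A sketch of the shift-based form, for comparison with the multiplication above. The function names are hypothetical; the two forms agree only under the assumption that each base index is in 0..4, which is what `to_int` is expected to return.

```rust
/// Codon index via multiplication, as in the snippet under review.
#[inline]
pub fn codon_index_mul(n0: usize, n1: usize, n2: usize) -> usize {
    n0 * 16 + n1 * 4 + n2
}

/// Codon index via shifts and ORs.
/// ASSUMPTION: each argument is in 0..4, so the three 2-bit fields
/// never overlap and OR behaves like addition.
#[inline]
pub fn codon_index_shift(n0: usize, n1: usize, n2: usize) -> usize {
    (n0 << 4) | (n1 << 2) | n2
}
```

Optimizing compilers often lower a multiplication by a power of two to a shift on their own, so any gain is worth confirming with a benchmark on the target machine.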
