-
Notifications
You must be signed in to change notification settings - Fork 1
Optimize the Nuc enum with bitwise operations and remove the lookup in trinucleotide #18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: benchmark/0
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR optimizes the nucleotide (Nuc) representation to exploit bitwise operations and replaces the large pattern-matching trinucleotide lookup with a direct arithmetic calculation, yielding measurable performance improvements in the benchmarks provided.
Changes:
- Reworked
Nucto be a#[repr(u8)]enum with a documented bit layout (bit 0 as insertion flag, bits 1–2 as base encoding). - Implemented bitwise-based helpers on
Nuc(to_int,to_lower,is_insertion,rc) to replace large pattern matches. - Replaced the exhaustive
trinucleotidepattern-match table with a direct computed codon index based on base indices.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Nucleotide representation optimized for bit operations. | ||
| /// | ||
| /// Layout: bit 0 = insertion flag, bits 1-2 = base encoding | ||
| /// - A=0, Ai=1, C=2, Ci=3, G=4, Gi=5, T=6, Ti=7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd not enumerate these here, they are right below in the definition.
|
|
||
| /// Convert to lowercase (insertion) variant | ||
| #[inline] | ||
| pub fn to_lower(&self) -> Nuc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On its own, this change gives me a 1.11x on execution time (slower). I'm running only on NC_000913.fna though:
Benchmark 1: ./secondline -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
Time (mean ± σ): 1.557 s ± 0.102 s [User: 0.990 s, System: 0.560 s]
Range (min … max): 1.473 s … 1.754 s 10 runs
Benchmark 2: ./table -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
Time (mean ± σ): 1.725 s ± 0.024 s [User: 1.134 s, System: 0.581 s]
Range (min … max): 1.695 s … 1.767 s 10 runs
Benchmark 3: ./transmute -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
Time (mean ± σ): 1.736 s ± 0.026 s [User: 1.124 s, System: 0.603 s]
Range (min … max): 1.697 s … 1.786 s 10 runs
Summary
./secondline -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913 ran
1.11 ± 0.07 times faster than ./table -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
1.11 ± 0.07 times faster than ./transmute -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
(secondline is this version without this change, table is me trying out a constant table lookup, transmute is this version)
| let n0 = s[0].to_int()?; | ||
| let n1 = s[1].to_int()?; | ||
| let n2 = s[2].to_int()?; | ||
| Some(n0 * 16 + n1 * 4 + n2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bitshifts here give a minor improvement for me.
This PR optimizes the Nuc enum to use bitwise operations and replaces the trinucleotide function by direct calculation.
Benchmark on my laptop:
short reads: 532.2 ms ± 4.8 ms
complete genome: 695.9 ms ± 8.3 ms
long reads: 4.488 s ± 0.015 s