@penberg (Collaborator) commented Oct 26, 2025

The default Rust hash map is slow for integer keys. Switch to FxHash instead to reduce the number of executed instructions in, for example, the throughput benchmark.
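To illustrate why an Fx-style hasher is cheap for integer keys, here is a minimal sketch that uses only the standard library. `FxLikeHasher` and `FxLikeHashMap` are hypothetical names made up for this example, and the multiplier is the one commonly associated with 64-bit FxHash; this is not the code from the PR, which presumably relies on an existing FxHash crate.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// Multiplier commonly associated with 64-bit FxHash (assumption, illustrative only).
const SEED: u64 = 0x51_7c_c1_b7_27_22_0a_95;

/// Hypothetical Fx-style hasher: one rotate, one xor, one multiply per word.
#[derive(Default, Clone, Copy)]
struct FxLikeHasher {
    hash: u64,
}

impl FxLikeHasher {
    #[inline]
    fn add_word(&mut self, word: u64) {
        self.hash = (self.hash.rotate_left(5) ^ word).wrapping_mul(SEED);
    }
}

impl Hasher for FxLikeHasher {
    fn write(&mut self, bytes: &[u8]) {
        // Generic fallback: fold the input in zero-padded 8-byte chunks.
        for chunk in bytes.chunks(8) {
            let mut buf = [0u8; 8];
            buf[..chunk.len()].copy_from_slice(chunk);
            self.add_word(u64::from_le_bytes(buf));
        }
    }

    // Integer keys take the fast path: a single word, no byte loop.
    fn write_u64(&mut self, n: u64) {
        self.add_word(n);
    }

    fn write_usize(&mut self, n: usize) {
        self.add_word(n as u64);
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

/// Drop-in alias: same API as `HashMap`, different hasher.
type FxLikeHashMap<K, V> = HashMap<K, V, BuildHasherDefault<FxLikeHasher>>;

fn main() {
    let mut pages: FxLikeHashMap<u64, &str> = FxLikeHashMap::default();
    pages.insert(1, "header");
    pages.insert(2, "btree root");
    assert_eq!(pages.get(&2), Some(&"btree root"));
    println!("{} pages cached", pages.len());
}
```

The trade-off is that such a hasher is not resistant to hash-flooding attacks, unlike the standard library's default, so it fits keys that are not attacker-controlled, such as page numbers.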

Before:

```
penberg@turing:~/src/tursodatabase/turso/perf/throughput/turso$ perf stat ../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000
Turso,1,100,0,106875.21

 Performance counter stats for '../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000':

          2,908.02 msec task-clock                       #    0.310 CPUs utilized
            30,508      context-switches                 #   10.491 K/sec
               261      cpu-migrations                   #   89.752 /sec
               813      page-faults                      #  279.572 /sec
    20,655,313,128      instructions                     #    1.73  insn per cycle
                                                  #    0.14  stalled cycles per insn
    11,930,088,949      cycles                           #    4.102 GHz
     2,845,040,381      stalled-cycles-frontend          #   23.85% frontend cycles idle
     3,814,652,892      branches                         #    1.312 G/sec
        54,760,600      branch-misses                    #    1.44% of all branches

       9.372979876 seconds time elapsed

       2.276835000 seconds user
       0.530135000 seconds sys
```

After:

```
penberg@turing:~/src/tursodatabase/turso/perf/throughput/turso$ perf stat ../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000
Turso,1,100,0,108663.84

 Performance counter stats for '../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000':

          2,838.65 msec task-clock                       #    0.308 CPUs utilized
            30,629      context-switches                 #   10.790 K/sec
               351      cpu-migrations                   #  123.650 /sec
               818      page-faults                      #  288.165 /sec
    19,887,102,451      instructions                     #    1.72  insn per cycle
                                                  #    0.14  stalled cycles per insn
    11,593,166,024      cycles                           #    4.084 GHz
     2,830,298,617      stalled-cycles-frontend          #   24.41% frontend cycles idle
     3,764,334,333      branches                         #    1.326 G/sec
        53,157,766      branch-misses                    #    1.41% of all branches

       9.218225731 seconds time elapsed

       2.231889000 seconds user
       0.508785000 seconds sys
```
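
To put the two runs side by side, the relative change per counter can be computed directly from the numbers above. The sketch below assumes the last CSV field of the benchmark output is the throughput figure; `pct_change` is a helper name made up for this example. By this arithmetic the instruction count drops by roughly 3.7% and the assumed throughput figure rises by roughly 1.7%, from a single run each.

```rust
/// Percentage change from `before` to `after` (negative means a reduction).
fn pct_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Values copied from the two perf stat runs above.
    let metrics = [
        ("throughput", 106_875.21, 108_663.84), // assumed: last CSV field
        ("instructions", 20_655_313_128.0, 19_887_102_451.0),
        ("cycles", 11_930_088_949.0, 11_593_166_024.0),
        ("branch-misses", 54_760_600.0, 53_157_766.0),
    ];
    for (name, before, after) in metrics {
        println!("{name:>14}: {:+.2}%", pct_change(before, after));
    }
}
```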

@nyrkio (bot) commented Oct 26, 2025

Nyrkiö Report for Commit: dfab8c4

No performance changes detected.
Remember that Nyrkiö results become more precise when more commits are merged.
So please check again in a few days.

Nyrkiö 0 changes / 130 tests & 302 metrics.

@penberg force-pushed the fxhash branch 2 times, most recently from 0ba498a to 1ab253d on October 26, 2025 14:40

The default Rust hash map is slow for integer keys. Switch to FxHash
instead to reduce the number of executed instructions in, for example,
the throughput benchmark.

Note that dirty page tracking now uses a BTreeMap so that the hash
function change does not affect the WAL frame order, which SQLite
guarantees to be ordered by page number.
