
@fsimonis
Member

@fsimonis fsimonis commented Nov 17, 2025

This PR changes the merge command to build the database in memory first and write it to disk only once it is complete.
This leads to a 7-10% runtime improvement and doesn't leave behind incomplete databases when the merge script errors.
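
A minimal sketch of the build-in-memory-then-write pattern, assuming the database is SQLite and is accessed through Python's standard `sqlite3` module. The function name `build_then_write` and the `events` schema are hypothetical and only serve the illustration; they are not taken from this PR.

```python
import sqlite3


def build_then_write(target_path, rows):
    """Build the database in memory, then copy it to target_path in one step."""
    mem = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, for illustration only.
        mem.execute("CREATE TABLE events (rank INTEGER, name TEXT, duration REAL)")
        mem.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        mem.commit()

        # Only reached if everything above succeeded, so an error while
        # merging never leaves a partially written file on disk.
        disk = sqlite3.connect(target_path)
        try:
            mem.backup(disk)  # copies the whole in-memory database
        finally:
            disk.close()
    finally:
        mem.close()
```

Because the on-disk file is only opened after all inserts have succeeded, an exception during the merge leaves no file behind at `target_path`.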

Example using a run of 4592 files:

Creating directly on-disk (previous)

Time (mean ± σ):     10.178 s ±  0.123 s    [User: 9.315 s, System: 0.848 s]
Range (min … max):   10.030 s … 10.402 s    10 runs

Creating in-memory first (this PR)

Time (mean ± σ):      9.260 s ±  0.197 s    [User: 8.863 s, System: 0.390 s]
Range (min … max):    9.021 s …  9.537 s    10 runs

Example using a run of 15k files:

Creating directly on-disk (previous)

Time (mean ± σ):     10.858 s ±  0.104 s    [User: 9.712 s, System: 1.131 s]
Range (min … max):   10.656 s … 11.022 s    10 runs

Creating in-memory first (this PR)

Time (mean ± σ):     10.046 s ±  0.160 s    [User: 9.425 s, System: 0.611 s]
Range (min … max):    9.739 s … 10.346 s    10 runs
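
As a quick check of the stated 7-10% figure, the relative improvement can be computed from the mean times reported above. The helper name `relative_improvement` is chosen for this sketch only.

```python
def relative_improvement(before_s, after_s):
    """Return the reduction in runtime as a percentage of the baseline."""
    return (before_s - after_s) / before_s * 100.0


# Mean wall-clock times in seconds, copied from the benchmark output above.
runs = {
    "4592 files": (10.178, 9.260),
    "15k files": (10.858, 10.046),
}

for label, (before, after) in runs.items():
    print(f"{label}: {relative_improvement(before, after):.1f}% lower runtime")
# 4592 files: 9.0% lower runtime
# 15k files: 7.5% lower runtime
```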

@fsimonis fsimonis marked this pull request as ready for review November 17, 2025 16:16
@fsimonis fsimonis self-assigned this Nov 17, 2025
@fsimonis fsimonis added the enhancement New feature or request label Nov 17, 2025
@fsimonis fsimonis merged commit 69deadc into main Nov 17, 2025
10 checks passed
@fsimonis fsimonis deleted the create-db-in-memory branch November 17, 2025 16:18