Description
Background
I’m new to QLever and recently started using libqlever to run an embedded database inside a Google Cloud Run instance with GCS Fuse as the persistence layer.
After experimenting for about a week, I’m getting very promising results so far. Thanks for the fantastic work on this project!
Feature request
We use QLever in a pipeline that needs to “saturate” datasets by running a series of update queries. To explore the performance characteristics, I tested how queries perform when executed over the delta triples (i.e., the entries tracked in <index_name>.update-triples).
Unfortunately, performance drops significantly once update triples are involved. I was wondering if you would consider adding a function that merges the delta triples into the main index, effectively compacting the dataset again.
This would help maintain performance while still enabling incremental updates between rebuilds.
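To make the request concrete, here is a minimal conceptual sketch of what such a compaction step would do. It is not QLever's implementation or API: the function name `compact`, the representation of the index as a sorted list of tuples, and the rule that a delete wins over an insert of the same triple are all assumptions made for illustration.

```python
from heapq import merge


def compact(main_index, inserted, deleted):
    """Fold delta triples into a sorted main index (hypothetical sketch).

    main_index: sorted list of (subject, predicate, object) tuples
    inserted:   triples added by update queries (any order)
    deleted:    triples removed by update queries (any order)
    """
    deleted = set(deleted)
    compacted = []
    # Merge the two sorted sequences in one linear pass.
    for triple in merge(main_index, sorted(inserted)):
        if triple in deleted:
            continue  # assumption: a delete wins over an insert
        if compacted and compacted[-1] == triple:
            continue  # skip duplicates of an already present triple
        compacted.append(triple)
    return compacted


main_index = [("a", "p", "b"), ("c", "p", "d")]
inserted = [("b", "p", "c")]
deleted = [("c", "p", "d")]
compacted = compact(main_index, inserted, deleted)
```

After compaction the delta sets would be empty again, so later queries would read only the main index until the next update.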
Test Results
- Baseline count query:
  Dataset size: 9,003,298 triples
  Execution time: 0.386 s
- After update query:
  245k triples added to <index_name>.update-triples
- Count query after update:
  Dataset size: 9,248,305 triples
  Execution time: 4.885 s (≈13× slower)
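As a quick sanity check, the figures above can be put in proportion with a few lines of arithmetic. The variable names are only illustrative; the numbers are the ones reported above. The dataset grows by roughly 2.7%, while the count query takes roughly 12.7 times as long.

```python
# Measurements copied from the test results above.
baseline_triples = 9_003_298
updated_triples = 9_248_305
baseline_seconds = 0.386
updated_seconds = 4.885

added_triples = updated_triples - baseline_triples   # triples held in the delta
growth = added_triples / baseline_triples            # relative dataset growth
slowdown = updated_seconds / baseline_seconds        # relative query slowdown

print(f"added triples: {added_triples:,}")
print(f"dataset growth: {growth:.1%}")
print(f"slowdown: {slowdown:.1f}x")
```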
Additional context
To enable update queries through libqlever, I had to add a small helper function.
It’s quite minimal (and admittedly vibe-coded since I don’t write C++), but it works well for quick experiments.
I’d be happy to open a PR if you think this could be useful to other users or serve as a starting point for a more robust interface.