-
Notifications
You must be signed in to change notification settings - Fork 21
CNDB-15527: Add config to use FusedPQ #2042
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Checklist before you submit for review
|
e63f53d to
5e02971
Compare
5e02971 to
7c3c804
Compare
|
|
||
| VectorCompression.CompressionType compressionType = VectorCompression.CompressionType.values()[reader.readByte()]; | ||
| if (features.contains(FeatureId.FUSED_ADC)) | ||
| if (features.contains(FeatureId.FUSED_PQ)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marianotepper - I just noticed that the features map already has logic that loads the ProductQuantization, meaning this branch currently keeps two identical maps in memory. I think it'd make sense to possibly expose the features map in the OnDiskGraphIndex so we can remove the duplicate cost. Any reason we can't do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we actually already use the one from the header, I just didn't catch it. I think we'll be able to get rid of it with a little extra work in CC.
built from the branch in this PR: datastax/jvector#588
This is a hack because it increases memory utilization and may impact other threads in the CC host. However, this will help unblock certain test scenarios.
Improve hashCode of the Key to prevent collisions Implement Comparable<Key> in case the ConcurrentHashMap falls back in to Tree mode (cherry picked from commit 091b636)
|
I have cherry-pick this into this feature branch: #2156 |
|
❌ Build ds-cassandra-pr-gate/PR-2042 rejected by Butler2 regressions found Found 2 new test failures
Found 2 known test failures |



What is the issue
https://github.com/riptano/cndb/issues/15527
What does this PR fix and why was it fixed
Adds ability to build vector indexes using Fused PQ. Enabled by default. Can be disabled via
cassandra.sai.vector.enable_fusedconfig param. When enabled, fused graphs store the product quantization in the graph on disk, removing the need to store it in memory, and unlocking new scale for vector indexes.