Add support for LeanVec-OOD #16
Pull Request Overview
This PR adds support for LeanVec-OOD (Out-of-Distribution) functionality to the SVS benchmark tool. LeanVec-OOD allows for improved vector quantization by training matrices on query data that may be out-of-distribution relative to the base vectors.
- Introduces a new module `generate_leanvec_matrices.py` for computing and saving LeanVec transformation matrices
- Extends the build functionality to support training query files and pre-computed matrices
- Updates the loader interface to accept optional data and query matrices for LeanVec operations
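To make the idea of separate data and query matrices concrete, here is a minimal sketch of how a pair of projection matrices could be derived from base vectors and training queries. It is not the algorithm used by SVS or by this PR: it substitutes a simple eigendecomposition of a blended second-moment matrix as a stand-in. The function name `sketch_projection_matrices`, the `beta` weight, and the array shapes are all assumptions made for illustration.

```python
import numpy as np


def sketch_projection_matrices(data, queries, leanvec_dims, beta=0.5):
    """Hypothetical helper, NOT the SVS implementation.

    Returns (data_matrix, query_matrix), each of shape (D, leanvec_dims),
    built from the top eigenvectors of a blend of the data and query
    second-moment matrices.
    """
    # Second-moment matrices of the base vectors and the training queries.
    k_data = data.T @ data / data.shape[0]
    k_query = queries.T @ queries / queries.shape[0]
    # Assumed blending weight: beta=0 ignores queries, beta=1 ignores data.
    combined = (1.0 - beta) * k_data + beta * k_query
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(combined)
    top = eigvecs[:, ::-1][:, :leanvec_dims]
    # In this simplified stand-in both sides share the same projection.
    return top.astype(np.float32), top.astype(np.float32)


rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 64)).astype(np.float32)
# Shift the queries so their distribution differs from the base vectors.
queries = rng.standard_normal((200, 64)).astype(np.float32) + 0.5

data_matrix, query_matrix = sketch_projection_matrices(data, queries, 16)
# Approximate inner products computed in the reduced 16-dimensional space.
approx = (queries @ query_matrix) @ (data @ data_matrix).T
```

The point of the sketch is only the interface shape: two matrices, one applied to base vectors and one to queries, that can be computed once and saved for later index builds.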
Reviewed Changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/svsbench/generate_leanvec_matrices.py | New module implementing LeanVec matrix generation and saving functionality |
| src/svsbench/build.py | Extended to support LeanVec-OOD training with query files and matrix parameters |
| src/svsbench/loader.py | Updated loader creation to accept optional data and query matrices |
| src/svsbench/merge.py | Added max_vectors parameter to read_vecs function for limiting vectors read |
| src/svsbench/consts.py | Added default constants for LeanVec dimensions and training vector limits |
| tests/test_build.py | Added comprehensive tests for new LeanVec-OOD functionality |
| tests/test_generate_leanvec_matrices.py | New test file for matrix generation functionality |
| pyproject.toml | Added typer-slim dependency for CLI functionality |
| README.md | Added documentation for building LeanVec-OOD indexes |
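As an illustration of what a `max_vectors` limit on a vector-file reader might look like, here is a minimal sketch for the common `.fvecs` layout (each vector stored as a little-endian int32 dimension followed by that many float32 values). The function name `read_fvecs`, its signature, and its return type are assumptions for this example; the actual `read_vecs` in `src/svsbench/merge.py` may differ.

```python
import io
import struct


def read_fvecs(stream, max_vectors=None):
    """Hypothetical reader: return up to max_vectors vectors from an fvecs stream."""
    vectors = []
    # Stop early once the requested number of vectors has been read.
    while max_vectors is None or len(vectors) < max_vectors:
        header = stream.read(4)
        if not header:
            break  # clean end of file
        if len(header) < 4:
            raise ValueError("truncated dimension header")
        (dim,) = struct.unpack("<i", header)
        payload = stream.read(4 * dim)
        if len(payload) != 4 * dim:
            raise ValueError("truncated vector payload")
        vectors.append(list(struct.unpack(f"<{dim}f", payload)))
    return vectors


def encode_fvec(values):
    """Serialize one vector in fvecs layout (used here only to build sample input)."""
    return struct.pack("<i", len(values)) + struct.pack(f"<{len(values)}f", *values)


# Five two-dimensional vectors held in memory instead of a file on disk.
sample = io.BytesIO(b"".join(encode_fvec([float(i), float(i) + 0.5]) for i in range(5)))
first_three = read_fvecs(sample, max_vectors=3)
```

Reading only a prefix like this keeps memory bounded when a large base file is used purely as a training sample.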
I added a suggestion to change the README sentence slightly. Let me know what you think. Also, did you get a chance to review the Copilot suggestions?
Yes. I resolved the Copilot conversations; I don't think changes were necessary. (A previous suggestion did uncover a bug that I fixed.)
Looks good to me.