Conversation

Member

@mihaic mihaic commented Jul 24, 2025

No description provided.

@mihaic mihaic requested a review from Copilot July 24, 2025 21:35
@mihaic mihaic enabled auto-merge (squash) July 24, 2025 21:35

This comment was marked as outdated.

@mihaic mihaic requested a review from Copilot July 24, 2025 21:49

Copilot AI left a comment

Pull Request Overview

This PR adds LeanVec-OOD (out-of-distribution) support to the SVS benchmark tool. LeanVec-OOD trains the LeanVec transformation matrices using query data in addition to the base vectors, which helps when the queries are out-of-distribution relative to the base vectors.

  • Introduces a new module generate_leanvec_matrices.py for computing and saving LeanVec transformation matrices
  • Extends the build functionality to support training query files and pre-computed matrices
  • Updates the loader interface to accept optional data and query matrices for LeanVec operations

Reviewed Changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/svsbench/generate_leanvec_matrices.py | New module implementing LeanVec matrix generation and saving functionality |
| src/svsbench/build.py | Extended to support LeanVec-OOD training with query files and matrix parameters |
| src/svsbench/loader.py | Updated loader creation to accept optional data and query matrices |
| src/svsbench/merge.py | Added max_vectors parameter to read_vecs function for limiting vectors read |
| src/svsbench/consts.py | Added default constants for LeanVec dimensions and training vector limits |
| tests/test_build.py | Added comprehensive tests for new LeanVec-OOD functionality |
| tests/test_generate_leanvec_matrices.py | New test file for matrix generation functionality |
| pyproject.toml | Added typer-slim dependency for CLI functionality |
| README.md | Added documentation for building LeanVec-OOD indexes |
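
For readers unfamiliar with what a `max_vectors` limit on a vector-file reader looks like, the following is a minimal sketch of the idea. It is not the `read_vecs` implementation from `src/svsbench/merge.py`: the function names `read_fvecs_limited` and `write_fvecs` are hypothetical, and the code assumes the common little-endian fvecs layout (a 4-byte integer dimension followed by that many 4-byte floats per vector), which may differ from the formats svsbench actually handles.

```python
import struct


def read_fvecs_limited(path, max_vectors=None):
    """Read up to max_vectors float32 vectors from an fvecs-style file.

    Hypothetical helper for illustration only; not svsbench's read_vecs.
    Assumes each record is <int32 dim><dim x float32>, little-endian.
    """
    vectors = []
    with open(path, "rb") as f:
        # Stop early once the requested number of vectors has been read.
        while max_vectors is None or len(vectors) < max_vectors:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            payload = f.read(4 * dim)
            if len(payload) < 4 * dim:
                raise ValueError("truncated vector record")
            vectors.append(list(struct.unpack(f"<{dim}f", payload)))
    return vectors


def write_fvecs(path, vectors):
    """Write vectors in the same assumed layout (used here to make test data)."""
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(struct.pack("<i", len(vec)))
            f.write(struct.pack(f"<{len(vec)}f", *vec))
```

Because the loop checks the limit before each read, a small `max_vectors` avoids loading a large training file in full, which is presumably the motivation for capping the number of training vectors.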

@mihaic mihaic requested a review from aguerreb July 24, 2025 21:52

@aguerreb aguerreb left a comment

I added a suggestion to change the README sentence slightly. Let me know what you think. Also, did you get a chance to review the Copilot suggestions?

Member Author

mihaic commented Jul 25, 2025

> Also, did you get a chance to review the Copilot suggestions?

Yes. I resolved the Copilot conversations; I don't think changes were necessary. (A previous suggestion did uncover a bug that I fixed.)

@aguerreb aguerreb self-requested a review July 25, 2025 16:32

@aguerreb aguerreb left a comment

Looks good to me.

@mihaic mihaic merged commit e23d182 into IntelLabs:main Jul 25, 2025
2 checks passed
@mihaic mihaic deleted the ood branch July 25, 2025 16:39