[DiskBBQ] Write the raw centroid on the posting list file instead of the centroids file #131421

Merged: iverase merged 4 commits into elastic:main from ivf_centroidRaw on Jul 17, 2025

Conversation

@iverase (Contributor) commented Jul 17, 2025

We currently store the raw centroid in the centroids file, but it is only needed to quantize the query vector when visiting the posting lists. I think this data can be written at the beginning of each posting list instead, which keeps the centroids file small.
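To illustrate the layout idea, here is a minimal sketch that writes the raw centroid as the first bytes of a posting list and reads it back before the document ids. It is not the code from this PR: the class `PostingListLayoutSketch`, its method names, and the use of plain `java.io` streams in place of Lucene's index outputs are all assumptions made for illustration, and the real posting list stores quantized vectors that are omitted here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical illustration only; not the DiskBBQ writer from this PR.
class PostingListLayoutSketch {

    /** Serializes one posting list: raw centroid first, then doc count and doc ids. */
    static byte[] writePostingList(float[] rawCentroid, int[] docIds) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            // The raw centroid sits at the head of the posting list, so the
            // separate centroids file no longer needs to carry it.
            for (float component : rawCentroid) {
                out.writeFloat(component);
            }
            out.writeInt(docIds.length);
            for (int docId : docIds) {
                out.writeInt(docId);
            }
        }
        return bytes.toByteArray();
    }

    /** Reads the raw centroid; dims is assumed to be known from field metadata. */
    static float[] readRawCentroid(DataInputStream in, int dims) throws IOException {
        float[] centroid = new float[dims];
        for (int i = 0; i < dims; i++) {
            centroid[i] = in.readFloat();
        }
        return centroid;
    }

    /** Reads the doc ids that follow the centroid. */
    static int[] readDocIds(DataInputStream in) throws IOException {
        int count = in.readInt();
        int[] docIds = new int[count];
        for (int i = 0; i < count; i++) {
            docIds[i] = in.readInt();
        }
        return docIds;
    }

    public static void main(String[] args) throws IOException {
        byte[] postingList = writePostingList(new float[] {0.5f, -1.0f}, new int[] {3, 7, 9});
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(postingList));
        // The centroid is available as soon as the posting list is opened,
        // which is the point at which the query vector needs to be quantized.
        float[] centroid = readRawCentroid(in, 2);
        int[] docIds = readDocIds(in);
        System.out.println("dims=" + centroid.length + " docs=" + docIds.length);
    }
}
```

With this layout a reader that only scans centroids never touches the raw floats, while a reader that visits a posting list gets them in the same sequential read as the list itself.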

@elasticsearchmachine added the Team:Search Relevance label Jul 17, 2025
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-search-relevance (Team:Search Relevance)

# Conflicts:
#	server/src/main/java/org/elasticsearch/index/codec/vectors/DefaultIVFVectorsWriter.java

@benwtrent (Member) left a comment

This makes things much simpler!

@john-wagster (Contributor) left a comment

lgtm

@iverase iverase merged commit 628828f into elastic:main Jul 17, 2025
3 of 5 checks passed
@iverase iverase deleted the ivf_centroidRaw branch July 17, 2025 14:50
Labels: >non-issue, :Search Relevance/Vectors, Team:Search Relevance, v9.2.0
4 participants