Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Distance metrics M1] support for L2 distance #156

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

tjgreen42
Copy link
Contributor

@tjgreen42 tjgreen42 commented Nov 12, 2024

Description

This PR adds support for distance metrics besides cosine, beginning with L2. The primitives for this metric were already present (earlier versions of pgvectorscale used L2 rather than cosine), so this PR just adds the operator class-related plumbing. As discussed in the design doc (see references) we follow pgvector syntax.

Testing

  • Refactored some of the basic scaffolds to take a DistanceType parameter and added L2 versions of some of the callers. We could carry this out systemically and insist that all tests have variants covering all metrics, but this seems like overkill.
  • Confirmed manually via EXPLAIN PLAN that the correct index is still chosen.
  • Confirmed manually that test_upgrade succeeds (with generated sql schema, not included in this PR)

References

@tjgreen42 tjgreen42 requested a review from a team as a code owner November 12, 2024 19:05
@tjgreen42 tjgreen42 changed the title [Distance metrics M1] support for L2 (aka Euclidean) distance [Distance metrics M1] support for L2 distance Nov 12, 2024
pgvectorscale/Cargo.toml Outdated Show resolved Hide resolved
Copy link
Collaborator

@cevian cevian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good but some comments

pgvectorscale/Cargo.toml Outdated Show resolved Hide resolved
pgvectorscale/sql/vectorscale--0.4.0--0.5.0.sql Outdated Show resolved Hide resolved
pgvectorscale/src/access_method/mod.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@cevian cevian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good except for the update script which I still have concerns about.

pgvectorscale/src/access_method/mod.rs Outdated Show resolved Hide resolved
tjgreen42 and others added 3 commits November 13, 2024 16:46
pgrx sets CARGO_TARGET_DIR so we were previously sharing a target dir
between multiple versions, which caused havoc.

Now set explicitly.
pgvectorscale/src/access_method/mod.rs Outdated Show resolved Hide resolved
pgvectorscale/src/access_method/mod.rs Outdated Show resolved Hide resolved
tjgreen42 and others added 8 commits November 14, 2024 12:02
Co-authored-by: Matvey Arye <[email protected]>
Signed-off-by: tjgreen42 <[email protected]>
Co-authored-by: Matvey Arye <[email protected]>
Signed-off-by: tjgreen42 <[email protected]>
The cargo environment variable is needed for correct
sql generation with versioned so's.

Small fixes to migration scripts.

Related to work adding l2.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants