Embeddings column type and vector similarity search #1163

@nileshtrivedi

I would like to be able to perform similarity search over vector embeddings generated by language models to find records in a table.

Describe the solution you'd like
PostgreSQL has an extension, pgvector, which allows easy storage and querying of embedding vectors.

It supports:
- exact and approximate nearest neighbor search
- single-precision, half-precision, binary, and sparse vectors
- L2 distance, inner product, cosine distance, L1 distance, Hamming distance, and Jaccard distance
- any language with a Postgres client
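To make the two bit-vector metrics in the list above concrete, here is a minimal Python sketch of Hamming and Jaccard distance over bit strings. The function names are hypothetical and this is not pgvector's implementation. The value returned for two all-zero vectors is an assumption made for this sketch only.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a: str, b: str) -> float:
    """Return 1 - |A and B| / |A or B|, treating each '1' as set membership."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    both = sum(x == "1" and y == "1" for x, y in zip(a, b))
    either = sum(x == "1" or y == "1" for x, y in zip(a, b))
    if either == 0:
        # Assumption for this sketch: two empty sets are treated as identical.
        return 0.0
    return 1.0 - both / either


hamming = hamming_distance("101", "111")
jaccard = jaccard_distance("101", "111")
```

Here the two bit strings differ in one position and share two of the three set bits.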

Sample usage in SQL:

CREATE EXTENSION vector;
-- Create a vector column with 3 dimensions
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));

-- Insert vectors
INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');

-- Get the nearest neighbors by L2 distance
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;
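As a rough illustration of what the nearest-neighbor query above computes, the following Python sketch ranks the same two sample vectors by L2 (Euclidean) distance to the query vector. It is an exact, brute-force version under stated assumptions: the `items` dictionary stands in for the table, and `l2_distance` and `nearest` are hypothetical helper names, not part of pgvector.

```python
import math

# Stand-in for the `items` table: id -> embedding (values from the SQL sample).
items = {1: [1.0, 2.0, 3.0], 2: [4.0, 5.0, 6.0]}


def l2_distance(a, b):
    """Euclidean distance, the quantity the <-> operator orders by."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(rows, query, limit=5):
    """Return up to `limit` (id, distance) pairs, closest first."""
    ranked = sorted(rows.items(), key=lambda kv: l2_distance(kv[1], query))
    return [(item_id, l2_distance(vec, query)) for item_id, vec in ranked[:limit]]


result = nearest(items, [3.0, 1.0, 2.0])
```

With these values, item 1 is at distance sqrt(6) and item 2 at distance sqrt(33), so item 1 comes first.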

pgvector also supports inner product (<#>, which returns the negative inner product so that smaller values rank first), cosine distance (<=>), and L1 distance (<+>, added in 0.7.0).
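For reference, the three additional metrics can be sketched in plain Python as follows. The function names are hypothetical and the code only mirrors the mathematical definitions; it is not pgvector's implementation. Cosine distance is undefined for a zero-length vector, and raising an error in that case is a choice made for this sketch.

```python
import math


def negative_inner_product(a, b):
    """Negated dot product; a larger dot product gives a smaller value."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus the cosine similarity of the two vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Choice for this sketch: reject zero vectors rather than return NaN.
        raise ValueError("cosine distance is undefined for a zero vector")
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm_a * norm_b)


def l1_distance(a, b):
    """Sum of absolute coordinate differences (taxicab distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


a, q = [1.0, 2.0, 3.0], [3.0, 1.0, 2.0]
neg_ip = negative_inner_product(a, q)
cos_d = cosine_distance(a, q)
l1 = l1_distance(a, q)
```

For these two vectors the dot product is 11 and both norms are sqrt(14), so the cosine distance works out to 3/14.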

These operations are independent of the model that was used to generate the embedding vectors.
