Unable to create pgvector HNSW indices with proper operator class and parameters in HCL #3338

Open
mpereira opened this issue Jan 22, 2025 · 2 comments


mpereira commented Jan 22, 2025

When creating HNSW indices for pgvector columns in Atlas HCL, the generated SQL is missing the required operator class and the index parameters.

Given this simplified schema:

schema "public" {}

extension "vector" {
  schema  = schema.public
  version = "0.8.0"
}

table "items" {
  schema = schema.public

  column "id" {
    type = text
    null = false
  }

  column "embedding" {
    type = sql("vector(384)")
    null = true
  }

  primary_key {
    columns = [column.id]
  }

  index "hnsw_embedding_idx" {
    type = "HNSW"
    columns = [column.embedding]
  }
}

Atlas generates:

CREATE INDEX "hnsw_embedding_idx" ON "public"."items" USING HNSW ("embedding");

Which fails with:

pq: data type vector has no default operator class for access method "hnsw"

The correct SQL should be:

CREATE INDEX "hnsw_embedding_idx" ON "public"."items"
USING hnsw (embedding vector_l2_ops)
WITH (m = 16, ef_construction = 64);

Currently there's no way to specify:

  1. The operator class (vector_l2_ops, vector_ip_ops, or vector_cosine_ops)
  2. The HNSW index parameters (m, ef_construction)
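
For context, each pgvector operator class pairs with one distance operator, so the choice depends on how the column will be queried. A minimal sketch, assuming the `items` table above; the index names here are hypothetical, and normally only the one matching the query operator would be created:

```sql
-- L2 (Euclidean) distance: serves queries ordered by embedding <-> $1
CREATE INDEX items_embedding_l2_idx ON "public"."items"
  USING hnsw (embedding vector_l2_ops);

-- Inner product: serves queries ordered by embedding <#> $1
CREATE INDEX items_embedding_ip_idx ON "public"."items"
  USING hnsw (embedding vector_ip_ops);

-- Cosine distance: serves queries ordered by embedding <=> $1
CREATE INDEX items_embedding_cosine_idx ON "public"."items"
  USING hnsw (embedding vector_cosine_ops);
```

An index built with one operator class is not expected to be used for queries that order by a different distance operator.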

I tried various approaches including:

index "hnsw_embedding_idx" {
  type = sql("hnsw")
  columns = [column.embedding]
  options = sql("WITH (m=16, ef_construction=64)")
}

But none worked properly with Atlas's HCL syntax.

This is related to #3222.

@a8m
Member

a8m commented Jan 22, 2025

You can define the operator class of index parts like this:

  index "hnsw_embedding_idx" {
    type = "hnsw"
    on {
      column = column.embedding
      ops    = "vector_l2_ops"
    }
  }

However, the other storage parameters are not supported. I'll check what's the status of it and I'll update you.
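
Until storage parameters can be expressed in HCL, it may help to check what the created index actually uses. A sketch using the standard PostgreSQL catalogs, assuming the index name from the report. When no `WITH` clause is given, pgvector documents defaults of m = 16 and ef_construction = 64, and `reloptions` is NULL in that case:

```sql
-- Full index definition, including the access method and operator class.
SELECT indexdef
FROM pg_indexes
WHERE schemaname = 'public'
  AND indexname = 'hnsw_embedding_idx';

-- Storage parameters set explicitly via WITH (...); NULL if none were given.
-- Note: relname is not schema-qualified, so this assumes the name is unique.
SELECT reloptions
FROM pg_class
WHERE relname = 'hnsw_embedding_idx'
  AND relkind = 'i';
```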

@mpereira
Author

That works, thanks @a8m!

> However, the other storage parameters are not supported. I'll check what's the status of it and I'll update you.

Sounds good, thanks.
