Implement Inference Models and Namespaces APIs (#107)
## Problem
Inference Models and Namespaces are new API resources available in
`2025-04`. They need to be implemented in the Go client.
Additionally, there are a few client bugs which need to be fixed:
- The `Embed` method and its return type `EmbedResponse` need to be
refactored to support both sparse and dense embedding responses, rather
than just dense. Currently, embedding with a model that returns sparse
values will cause errors.
- The `IndexConnection` struct is not safe to reuse for performing
operations across namespaces while reusing the existing gRPC connection
for the index. This is because `IndexConnection.Namespace` is publicly
exposed, and could be updated at any point.
## Solution
Implement new namespaces and models API operations:
- Namespace operations have been implemented on `IndexConnection`.
They're exposed as methods which you can call via
`IndexConnection.ListNamespaces`, `IndexConnection.DescribeNamespace`,
or `IndexConnection.DeleteNamespace`.
- Hosted model operations can be performed using the `Client.Inference`
namespace (`InferenceService` struct) by calling
`client.Inference.DescribeModel` or `client.Inference.ListModels`.
The `InferenceService.Embed` method now returns a different `Embedding`
inside of `EmbedResponse`. `Embedding` has been refactored into a tagged
union type: a struct that wraps a pointer to either a `SparseEmbedding`
or a `DenseEmbedding`. This seemed to be the best way to model this in
Go given the lack of explicit union types. I'm open to suggestions if
anyone has something more ergonomic.
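As a rough illustration of the tagged-union shape described above, the sketch below models an `Embedding` that holds a pointer to exactly one variant. The field names (`Dense`, `Sparse`, `Values`, `SparseIndices`, `SparseValues`) and the `Kind` helper are assumptions made for this example and may not match the names used in the client.

```go
package main

import (
	"errors"
	"fmt"
)

// DenseEmbedding is a simplified stand-in; the field name is an assumption.
type DenseEmbedding struct {
	Values []float32
}

// SparseEmbedding is a simplified stand-in; the field names are assumptions.
type SparseEmbedding struct {
	SparseIndices []int64
	SparseValues  []float32
}

// Embedding wraps exactly one variant; the unused pointer stays nil.
type Embedding struct {
	Dense  *DenseEmbedding
	Sparse *SparseEmbedding
}

// Kind is a hypothetical helper that reports which variant is populated.
func (e Embedding) Kind() (string, error) {
	switch {
	case e.Dense != nil && e.Sparse == nil:
		return "dense", nil
	case e.Sparse != nil && e.Dense == nil:
		return "sparse", nil
	default:
		return "", errors.New("embedding must hold exactly one of dense or sparse")
	}
}

func main() {
	embeddings := []Embedding{
		{Dense: &DenseEmbedding{Values: []float32{0.1, 0.2}}},
		{Sparse: &SparseEmbedding{SparseIndices: []int64{3, 7}, SparseValues: []float32{0.5, 0.9}}},
	}
	for _, e := range embeddings {
		kind, err := e.Kind()
		if err != nil {
			fmt.Println("invalid embedding:", err)
			continue
		}
		fmt.Println("embedding kind:", kind)
	}
}
```

Callers of a type shaped like this would check which pointer is non-nil before reading values, which is the main ergonomic cost compared to a language with native union types.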
`IndexConnection` has been refactored to no longer expose
`IndexConnection.Namespace` directly. Instead, `Namespace` is now a
method which returns the currently targeted namespace. Users can now
call `IndexConnection.WithNamespace`, which returns a copy of the
`IndexConnection` targeting the new namespace while sharing the
underlying gRPC connection. Again, I think this is a reasonable way to
safely target multiple namespaces within an index, but I'm open to
feedback.
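The copy-on-retarget idea can be sketched as follows. This is a simplified stand-in rather than the client's actual struct: the `conn` type and the unexported field names are hypothetical, and only the `Namespace` and `WithNamespace` method names come from the description above.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for the shared gRPC connection.
type conn struct {
	target string
}

// IndexConnection is a simplified model of the refactored struct.
type IndexConnection struct {
	namespace string // unexported, so callers cannot change it in place
	grpcConn  *conn  // shared by every copy made through WithNamespace
}

// Namespace returns the namespace this connection currently targets.
func (idx *IndexConnection) Namespace() string {
	return idx.namespace
}

// WithNamespace returns a new IndexConnection that targets the given
// namespace and reuses the same underlying connection pointer.
func (idx *IndexConnection) WithNamespace(namespace string) *IndexConnection {
	return &IndexConnection{namespace: namespace, grpcConn: idx.grpcConn}
}

func main() {
	base := &IndexConnection{namespace: "__default__", grpcConn: &conn{target: "example-host"}}
	other := base.WithNamespace("my-namespace-1")

	// The original is untouched and both values share one connection.
	fmt.Println(base.Namespace(), other.Namespace(), base.grpcConn == other.grpcConn)
	// prints "__default__ my-namespace-1 true"
}
```

Because the receiver is never mutated, a value handed to one goroutine keeps its namespace even if another goroutine derives a differently targeted copy.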
## Type of Change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [X] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- Breaking change for `EmbedResponse` on `InferenceService.Embed`
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)
## Test Plan
CI - unit tests & integration tests
If you'd like to test the new APIs out yourself you can check the
integration tests or README for more detailed examples:
Models:
```go
ctx := context.Background()

pc, err := pinecone.NewClient(pinecone.NewClientParams{
	ApiKey: "YOUR_API_KEY",
})
if err != nil {
	log.Fatalf("Failed to create Client: %v", err)
}

embed := "embed"
rerank := "rerank"

// list embedding models
embedModels, err := pc.Inference.ListModels(ctx, &pinecone.ListModelsParams{
	Type: &embed,
})
if err != nil {
	log.Fatalf("Failed to list embedding models: %v", err)
}

// list reranking models
rerankModels, err := pc.Inference.ListModels(ctx, &pinecone.ListModelsParams{
	Type: &rerank,
})
if err != nil {
	log.Fatalf("Failed to list reranking models: %v", err)
}

// describe a single model by name
multilingualModel, err := pc.Inference.DescribeModel(ctx, "multilingual-e5-large")
if err != nil {
	log.Fatalf("Failed to describe model: %v", err)
}
```
Namespaces:
```go
ctx := context.Background()

pc, err := pinecone.NewClient(pinecone.NewClientParams{
	ApiKey: "YOUR_API_KEY",
})
if err != nil {
	log.Fatalf("Failed to create Client: %v", err)
}

idx, err := pc.DescribeIndex(ctx, "example-index")
if err != nil {
	// idx may be nil on failure, so log the requested name instead of idx.Name
	log.Fatalf("Failed to describe index \"%v\": %v", "example-index", err)
}

idxConnection, err := pc.Index(pinecone.NewIndexConnParams{Host: idx.Host})
if err != nil {
	log.Fatalf("Failed to create IndexConnection for Host: %v: %v", idx.Host, err)
}

// list namespaces
limit := uint32(10)
namespaces, err := idxConnection.ListNamespaces(ctx, &pinecone.ListNamespacesParams{
	Limit: &limit,
})
if err != nil {
	log.Fatalf("Failed to list namespaces for Host: %v: %v", idx.Host, err)
}

// describe a namespace
namespace1, err := idxConnection.DescribeNamespace(ctx, "my-namespace-1")
if err != nil {
	log.Fatalf("Failed to describe namespace: %v: %v", "my-namespace-1", err)
}

// delete a namespace (err is already declared, so assign with = rather than :=)
err = idxConnection.DeleteNamespace(ctx, "my-namespace-1")
if err != nil {
	log.Fatalf("Failed to delete namespace: %v: %v", "my-namespace-1", err)
}
```
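The example above fetches a single page of up to 10 namespaces. To walk every page, a loop along the following lines could be used. This is only a sketch of the general token-following pattern: `page`, `listFunc`, `collectAll`, and `fakeLister` are hypothetical names invented for illustration, and the in-memory lister stands in for the real `ListNamespaces` call, whose pagination fields are not shown here.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// page is a hypothetical response shape: one batch of names plus a
// token for the next page (empty when there are no more pages).
type page struct {
	Names []string
	Next  string
}

// listFunc is a hypothetical stand-in for a paginated list call.
type listFunc func(limit int, token string) (page, error)

// collectAll follows pagination tokens until no further page is reported.
func collectAll(list listFunc, limit int) ([]string, error) {
	if limit <= 0 {
		return nil, errors.New("limit must be positive")
	}
	var all []string
	token := ""
	for {
		p, err := list(limit, token)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Names...)
		if p.Next == "" {
			return all, nil
		}
		token = p.Next
	}
}

// fakeLister serves names from memory, using the start offset as the token.
func fakeLister(names []string) listFunc {
	return func(limit int, token string) (page, error) {
		start := 0
		if token != "" {
			n, err := strconv.Atoi(token)
			if err != nil || n < 0 || n > len(names) {
				return page{}, fmt.Errorf("invalid token %q", token)
			}
			start = n
		}
		end := start + limit
		if end > len(names) {
			end = len(names)
		}
		p := page{Names: names[start:end]}
		if end < len(names) {
			p.Next = strconv.Itoa(end)
		}
		return p, nil
	}
}

func main() {
	list := fakeLister([]string{"ns-1", "ns-2", "ns-3", "ns-4", "ns-5"})
	all, err := collectAll(list, 2)
	if err != nil {
		fmt.Println("listing failed:", err)
		return
	}
	fmt.Println(len(all), all) // prints "5 [ns-1 ns-2 ns-3 ns-4 ns-5]"
}
```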
---
- To see the specific tasks where the Asana app for GitHub is being
used, see below:
- https://app.asana.com/0/0/1210238631289243
- https://app.asana.com/0/0/1209828518477630
README.md: 110 additions & 8 deletions
````diff
@@ -588,8 +588,10 @@ Pinecone indexes support working with vector data using operations such as upser
 
 ### Targeting an index
 
-To perform data operations on an index, you target it using the `Index` method on a `Client` object. You will
-need your index's `Host` value, which you can retrieve via `DescribeIndex` or `ListIndexes`.
+To perform data operations on an index, you target it using the `Index` method on a `Client` object which returns a pointer to an `IndexConnection`. Calling `Index` will create and dial the index via a new gRPC connection. You can target a specific `Namespace` when calling `Index`, but if you want to reuse the connection with different namespaces, you can call `IndexConnection.WithNamespace`. If no `Namespace` is provided when establishing a new
+`IndexConnection`, the default of `"__default__"` will be used.
+
+You will need your index's `Host` value, which you can retrieve via `DescribeIndex` or `ListIndexes`.
 
 ```go
 package main
````
````diff
@@ -628,9 +630,57 @@ func main() {
 }
 ```
 
+### Working with namespaces
+
+Within an index, records are partitioned into namespaces, and all upserts, queries, and other data operations always target one namespace. You can read more about [namespaces here](https://docs.pinecone.io/guides/index-data/indexing-overview#namespaces).
+
+You can list all namespaces in an index in a paginated format, describe a specific namespace, or delete a namespace. NOTE: Deleting a namespace will delete all record information partitioned in that namespace.
+log.Fatalf("Failed to delete namespace: %v: %v", "my-namespace-1", err)
+}
+```
+
 ### Upsert vectors
 
-The following example upserts dense vectors and metadata to `example-index`.
+The following example upserts dense vectors and metadata to `example-index` in the namespace `my-namespace`. Upserting to a specific `Namespace` will implicitly create the namespace if it does not exist already.
 
 ```go
 package main
````
````diff
@@ -663,7 +713,7 @@ func main() {
 log.Fatalf("Failed to describe index \"%v\": %v", idx.Name, err)
 log.Fatalf("Failed to create IndexConnection for Host: %v: %v", idx.Host, err)
 }
````
````diff
@@ -1582,10 +1632,8 @@ func main() {
 
 ## Inference
 
-The `Client` object has an `Inference` namespace which allows interacting with
-Pinecone's [Inference API](https://docs.pinecone.io/guides/inference/generate-embeddings). The Inference
-API is a service that gives you access to embedding models hosted on Pinecone's infrastructure. Read more
-at [Understanding Pinecone Inference](https://docs.pinecone.io/guides/inference/understanding-inference).
+The `Client` object has an `Inference` namespace which exposes an `InferenceService` pointer which allows interacting with Pinecone's [Inference API](https://docs.pinecone.io/guides/inference/generate-embeddings).
+The Inference API is a service that gives you access to embedding models hosted on Pinecone's infrastructure. Read more at [Understanding Pinecone Inference](https://docs.pinecone.io/guides/inference/understanding-inference).
````
To see available models hosted by Pinecone, you can use the `DescribeModel` and `ListModels` methods on the `InferenceService` struct. This allows you to retrieve detailed information about specific models.
+
+You can list all available models, with the options of filtering by model `Type` (`"embed"`, `"rerank"`), and `VectorType` (`"sparse"`, `"dense"`) for models with `Type` `"embed"`.
When using an index with integrated inference, embedding and reranking operations are tied to index operations and do not require extra steps. This allows working with an index that accepts source text and converts it to vectors automatically using an embedding model hosted by Pinecone.