Commit d4fd3fc

Merge pull request #2993 from milvus-io/v2.5.4-docs-anthony
update docs
2 parents 41e7b05 + 98f1d3f commit d4fd3fc

4 files changed

+284
-2
lines changed

assets/iterative-filtering.png

216 KB

assets/partition-key-isolation.png

318 KB

site/en/userGuide/search-query-get/filtered-search.md

Lines changed: 193 additions & 2 deletions
@@ -11,6 +11,10 @@ An ANN search finds vector embeddings most similar to specified vector embedding
1111

1212
## Overview
1313

14+
In Milvus, filtered searches are categorized into two types — **standard filtering** and **iterative filtering** — depending on the stage at which the filtering is applied.
15+
16+
## Standard Filtering
17+
1418
If a collection contains both vector embeddings and their metadata, you can filter metadata before ANN search to improve the relevancy of the search result. Once Milvus receives a search request carrying a filtering condition, it restricts the search scope within the entities matching the specified filtering condition.​
1519

1620
![Filtered search](../../../../assets/filtered-search.png)
@@ -23,6 +27,18 @@ As shown in the above diagram, the search request carries `chunk like % red %` a
2327

2428
- Returns top-K entities.​
2529

30+
## Iterative Filtering
31+
32+
The standard filtering process effectively narrows the search scope to a small range. However, overly complex filtering expressions may result in very high search latency. In such cases, iterative filtering can serve as an alternative, helping to reduce the workload of scalar filtering.
33+
34+
![Iterative filtering](../../../../assets/iterative-filtering.png)
35+
36+
As illustrated in the diagram above, a search with iterative filtering performs the vector search in iterations. Each entity returned by the iterator undergoes scalar filtering, and the process continues until the specified number of results (topK) has been collected.
37+
38+
This method significantly reduces the number of entities subjected to scalar filtering, making it especially beneficial for handling highly complex filtering expressions.
39+
40+
However, the iterator processes entities one at a time. This sequential approach can lead to longer processing times, especially when a large number of entities must pass through scalar filtering before enough matches are found.
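
The trade-off between the two modes can be illustrated with a toy model. The sketch below is not Milvus's implementation: the function names are hypothetical, and a fully sorted list stands in for the vector-index iterator, which a real index would not materialize. It only counts how many entities each approach passes through the scalar filter.

```python
import random


def l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def standard_filter_search(entities, query, predicate, top_k):
    """Filter every entity first, then rank the survivors by distance."""
    evaluated = 0
    survivors = []
    for e in entities:
        evaluated += 1
        if predicate(e):
            survivors.append(e)
    survivors.sort(key=lambda e: l2(e["vector"], query))
    return survivors[:top_k], evaluated


def iterative_filter_search(entities, query, predicate, top_k):
    """Visit candidates nearest-first and filter each one until top_k pass."""
    evaluated = 0
    hits = []
    # Assumption: sorting everything is a stand-in for an index iterator.
    for e in sorted(entities, key=lambda e: l2(e["vector"], query)):
        evaluated += 1
        if predicate(e):
            hits.append(e)
            if len(hits) == top_k:
                break
    return hits, evaluated


rng = random.Random(0)
entities = [
    {
        "id": i,
        "vector": [rng.uniform(-1, 1) for _ in range(5)],
        "color": rng.choice(["red", "blue", "green"]) + f"_{i}",
        "likes": rng.randint(0, 200),
    }
    for i in range(1000)
]
query = [0.36, -0.60, 0.18, -0.26, 0.90]


def predicate(e):
    # Mirrors the filter: color like "red%" and likes > 50
    return e["color"].startswith("red") and e["likes"] > 50


std_hits, std_evaluated = standard_filter_search(entities, query, predicate, 5)
it_hits, it_evaluated = iterative_filter_search(entities, query, predicate, 5)
print("standard filter evaluations:", std_evaluated)
print("iterative filter evaluations:", it_evaluated)
```

In this model both functions return the same entities, but the iterative variant stops evaluating the predicate as soon as enough matches are found. When matches are rare, it has to walk through many more candidates one by one, which is the downside described above.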
41+
2642
## Examples
2743

2844
This section demonstrates how to conduct a filtered search. Code snippets in this section assume you already have the following entities in your collection. Each entity has four fields, namely **id**, **vector**, **color**, and **likes**.​
@@ -43,7 +59,9 @@ This section demonstrates how to conduct a filtered search. Code snippets in thi
4359

4460
```
4561

46-
The search request in the following code snippet carries a filtering condition and several output fields.​
62+
### Search with Standard Filtering
63+
64+
The following code snippets demonstrate a search with standard filtering. The request carries a filtering condition and several output fields.
4765

4866
<div class="multipleCode">
4967
<a href="#python">Python </a>
@@ -235,4 +253,177 @@ The filtering condition carried in the search request reads `color like "red%" a
235253

236254
```
237255

238-
For more information on the operators that you can use in metadata filtering, refer to [​Metadata Filtering](boolean.md).​
256+
For more information on the operators that you can use in metadata filtering, refer to [Metadata Filtering](boolean.md).
257+
### Search with Iterative Filtering
258+
259+
To run a filtered search with iterative filtering, set the `hints` parameter to `iterative_filter`, as shown in the following examples:
260+
261+
<div class="multipleCode">
262+
<a href="#python">Python </a>
263+
<a href="#java">Java</a>
<a href="#go">Go</a>
264+
<a href="#javascript">Node.js</a>
265+
<a href="#curl">cURL</a>
266+
</div>

```python
from pymilvus import MilvusClient

client = MilvusClient(
    uri="http://localhost:19530",
    token="root:Milvus"
)

query_vector = [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]

res = client.search(
    collection_name="my_collection",
    data=[query_vector],
    limit=5,
    # highlight-start
    filter='color like "red%" and likes > 50',
    output_fields=["color", "likes"],
    search_params={
        "hints": "iterative_filter"
    }
    # highlight-end
)

for hits in res:
    print("TopK results:")
    for hit in hits:
        print(hit)

```

```java
import io.milvus.v2.client.ConnectConfig;
import io.milvus.v2.client.MilvusClientV2;
import io.milvus.v2.service.vector.request.SearchReq;
import io.milvus.v2.service.vector.request.data.FloatVec;
import io.milvus.v2.service.vector.response.SearchResp;

import java.util.*;

MilvusClientV2 client = new MilvusClientV2(ConnectConfig.builder()
        .uri("http://localhost:19530")
        .token("root:Milvus")
        .build());

FloatVec queryVector = new FloatVec(new float[]{0.3580376395471989f, -0.6023495712049978f, 0.18414012509913835f, -0.26286205330961354f, 0.9029438446296592f});

// HashMap has no (key, value) constructor, so populate the map explicitly
Map<String, Object> searchParams = new HashMap<>();
searchParams.put("hints", "iterative_filter");

SearchReq searchReq = SearchReq.builder()
        .collectionName("filtered_search_collection")
        .data(Collections.singletonList(queryVector))
        .topK(5)
        .filter("color like \"red%\" and likes > 50")
        .outputFields(Arrays.asList("color", "likes"))
        .searchParams(searchParams)
        .build();

SearchResp searchResp = client.search(searchReq);

List<List<SearchResp.SearchResult>> searchResults = searchResp.getSearchResults();
for (List<SearchResp.SearchResult> results : searchResults) {
    System.out.println("TopK results:");
    for (SearchResp.SearchResult result : results) {
        System.out.println(result);
    }
}

// Output
// TopK results:
// SearchResp.SearchResult(entity={color=red_4794, likes=122}, score=0.5975797, id=4)
// SearchResp.SearchResult(entity={color=red_9392, likes=58}, score=-0.24996188, id=6)

```

```go
import (
	"context"
	"log"

	"github.com/milvus-io/milvus/client/v2"
	"github.com/milvus-io/milvus/client/v2/entity"
)

func ExampleClient_Search_filter() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	milvusAddr := "127.0.0.1:19530"
	token := "root:Milvus"

	cli, err := client.New(ctx, &client.ClientConfig{
		Address: milvusAddr,
		APIKey:  token,
	})
	if err != nil {
		log.Fatal("failed to connect to milvus server: ", err.Error())
	}

	defer cli.Close(ctx)

	queryVector := []float32{0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592}

	resultSets, err := cli.Search(ctx, client.NewSearchOption(
		"filtered_search_collection", // collectionName
		3,                            // limit
		[]entity.Vector{entity.FloatVector(queryVector)},
	).WithFilter(`color like "red%" and likes > 50`).WithHints("iterative_filter").WithOutputFields("color", "likes"))
	if err != nil {
		log.Fatal("failed to perform filtered search: ", err.Error())
	}

	for _, resultSet := range resultSets {
		log.Println("IDs: ", resultSet.IDs)
		log.Println("Scores: ", resultSet.Scores)
	}
	// Output:
	// IDs:
	// Scores:
}

```

```javascript
import { MilvusClient, DataType } from "@zilliz/milvus2-sdk-node";

const address = "http://localhost:19530";
const token = "root:Milvus";
const client = new MilvusClient({address, token});

const query_vector = [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]

const res = await client.search({
    collection_name: "filtered_search_collection",
    data: [query_vector],
    limit: 5,
    // highlight-start
    filters: 'color like "red%" and likes > 50',
    hints: "iterative_filter",
    output_fields: ["color", "likes"]
    // highlight-end
})

```

```curl
export CLUSTER_ENDPOINT="http://localhost:19530"
export TOKEN="root:Milvus"

curl --request POST \
--url "${CLUSTER_ENDPOINT}/v2/vectordb/entities/search" \
--header "Authorization: Bearer ${TOKEN}" \
--header "Content-Type: application/json" \
-d '{
    "collectionName": "quick_setup",
    "data": [
        [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592]
    ],
    "annsField": "vector",
    "filter": "color like \"red%\" and likes > 50",
    "searchParams": {"hints": "iterative_filter"},
    "limit": 3,
    "outputFields": ["color", "likes"]
}'
# {"code":0,"cost":0,"data":[]}

```
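
In the Python example above, the only difference between standard and iterative filtering is the `search_params` hint. A small helper can make that switch explicit. This is a sketch: `build_search_kwargs` is a hypothetical name, not part of pymilvus, and it only assembles the keyword arguments that the Python example passes to `client.search()`.

```python
def build_search_kwargs(collection_name, query_vector, filter_expr,
                        limit=5, output_fields=None, iterative=False):
    """Assemble keyword arguments for client.search() (hypothetical helper)."""
    kwargs = {
        "collection_name": collection_name,
        "data": [query_vector],
        "limit": limit,
        "filter": filter_expr,
        "output_fields": list(output_fields or []),
    }
    if iterative:
        # Same hint as in the Python example above
        kwargs["search_params"] = {"hints": "iterative_filter"}
    return kwargs


kwargs = build_search_kwargs(
    "my_collection",
    [0.3580376395471989, -0.6023495712049978, 0.18414012509913835,
     -0.26286205330961354, 0.9029438446296592],
    'color like "red%" and likes > 50',
    output_fields=["color", "likes"],
    iterative=True,
)
print(kwargs["search_params"])
```

With a connected client, the result would be passed as `client.search(**kwargs)`. Leaving `iterative=False` omits the hint, so the request falls back to standard filtering.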

site/en/userGuide/search-query-get/use-partition-key.md

Lines changed: 91 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -259,3 +259,94 @@ export filter='partition_key == "x" && <other conditions>'​
259259
export filter='partition_key in ["x", "y", "z"] && <other conditions>'​
260260
261261
```
262+
263+
<div class="alert note">
264+
265+
You have to replace `partition_key` with the name of the field that is designated as the partition key.
266+
267+
</div>
268+
269+
## Use Partition Key Isolation
270+
271+
In the multi-tenancy scenario, you can designate the scalar field related to tenant identities as the partition key and create a filter based on a specific value in this scalar field. To further improve search performance in similar scenarios, Milvus introduces the Partition Key Isolation feature.
272+
273+
![Partition Key Isolation](../../../../assets/partition-key-isolation.png)
274+
275+
As shown in the above figure, Milvus groups entities based on the Partition Key value and creates a separate index for each of these groups. Upon receiving a search request, Milvus locates the index based on the Partition Key value specified in the filtering condition and restricts the search scope to the entities in that index. This avoids scanning irrelevant entities during the search and can greatly improve search performance.
276+
Once you have enabled Partition Key Isolation, include only one specific value in the Partition Key-based filter so that Milvus can restrict the search scope to the entities in the matching index.
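
The idea of one index per Partition Key value can be sketched with a toy model. The class below is hypothetical and is not how Milvus stores data: each "index" is a plain list searched by brute force, which is enough to show that a search for one key value never touches entities that belong to other values.

```python
from collections import defaultdict


def l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class IsolatedIndex:
    """Toy model: one brute-force 'index' per partition key value."""

    def __init__(self):
        self.groups = defaultdict(list)

    def insert(self, key, entity_id, vector):
        self.groups[key].append((entity_id, vector))

    def search(self, key, query, top_k):
        # Only the group that matches the key is scanned.
        candidates = self.groups.get(key, [])
        ranked = sorted(candidates, key=lambda item: l2(item[1], query))
        return [entity_id for entity_id, _ in ranked[:top_k]], len(candidates)


index = IsolatedIndex()
index.insert("tenant_a", 1, [0.0, 0.0])
index.insert("tenant_a", 2, [1.0, 1.0])
index.insert("tenant_b", 3, [0.1, 0.1])

ids, scanned = index.search("tenant_a", [0.1, 0.1], top_k=1)
print(ids, scanned)  # [1] 2
```

Entity 3 is the closest vector overall, but it belongs to `tenant_b`, so the search for `tenant_a` scans only two entities and returns entity 1.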
277+
278+
<div class="alert note">
279+
280+
Currently, the Partition Key Isolation feature applies only to searches with the index type set to HNSW.
281+
282+
</div>
283+
284+
### Enable Partition Key Isolation
285+
286+
The following code examples demonstrate how to enable Partition Key Isolation.
287+
288+
<div class="multipleCode">
289+
<a href="#python">Python </a>
290+
<a href="#java">Java</a>
291+
<a href="#javascript">Node.js</a>
292+
<a href="#go">Go</a>
293+
<a href="#curl">cURL</a>
294+
</div>

```python
client.create_collection(
    collection_name="my_collection",
    schema=schema,
    # highlight-next-line
    properties={"partitionkey.isolation": True}
)

```

```java
import io.milvus.v2.service.collection.request.CreateCollectionReq;

import java.util.HashMap;
import java.util.Map;

Map<String, String> properties = new HashMap<>();
properties.put("partitionkey.isolation", "true");

CreateCollectionReq createCollectionReq = CreateCollectionReq.builder()
        .collectionName("my_collection")
        .collectionSchema(schema)
        .numPartitions(1024)
        .properties(properties)
        .build();
client.createCollection(createCollectionReq);

```

```javascript
res = await client.alterCollection({
    collection_name: "my_collection",
    properties: {
        "partitionkey.isolation": true
    }
})

```

```curl
export params='{
    "partitionKeyIsolation": true
}'

export CLUSTER_ENDPOINT="http://localhost:19530"
export TOKEN="root:Milvus"

curl --request POST \
--url "${CLUSTER_ENDPOINT}/v2/vectordb/collections/create" \
--header "Authorization: Bearer ${TOKEN}" \
--header "Content-Type: application/json" \
-d "{
    \"collectionName\": \"my_collection\",
    \"schema\": $schema,
    \"params\": $params
}"

```
351+
352+
Once you have enabled Partition Key Isolation, you can still set the Partition Key and number of partitions as described in [Set Partition Numbers](#Set-Partition-Numbers). Note that the Partition Key-based filter should include only one specific Partition Key value.
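
One way to keep filters compatible with this rule is to build them through a helper that accepts exactly one Partition Key value. The function below is a hypothetical sketch, not a Milvus API. It assumes the Partition Key field holds strings or integers and produces an expression in the `partition_key == "x" && <other conditions>` form shown earlier.

```python
def build_isolated_filter(key_field, key_value, other_conditions=None):
    """Build a filter that pins the partition key to one value (hypothetical helper)."""
    # Reject lists, booleans, and other types: exactly one str or int value is allowed.
    if isinstance(key_value, bool) or not isinstance(key_value, (str, int)):
        raise TypeError("key_value must be a single str or int")
    if isinstance(key_value, str):
        escaped = key_value.replace("\\", "\\\\").replace('"', '\\"')
        literal = f'"{escaped}"'
    else:
        literal = str(key_value)
    expr = f"{key_field} == {literal}"
    if other_conditions:
        expr = f"{expr} && ({other_conditions})"
    return expr


# "tenant_id" is an assumed field name used for illustration only
print(build_isolated_filter("tenant_id", "x", "likes > 50"))
# tenant_id == "x" && (likes > 50)
```

Passing a list such as `["x", "y"]` raises a `TypeError`, which catches `in [...]`-style filters before they reach a collection with Partition Key Isolation enabled.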
