Release new docs to preview
Milvus-doc-bot authored and Milvus-doc-bot committed Nov 26, 2024
1 parent f00e8ab commit 38c549c
Showing 9 changed files with 36 additions and 36 deletions.
6 changes: 3 additions & 3 deletions v2.5.x/site/en/release_notes.md
@@ -20,9 +20,9 @@ Milvus 2.5.0-beta brings significant advancements to enhance usability, scalabil

### Key Features

#### Doc In Doc Out/BM25
#### Full Text Search

Keyword-based full-text search is an important complement to Milvus's strong semantic search capabilities, especially in scenarios involving rare words or technical terms. In previous versions, Milvus supported sparse vectors to assist with keyword search scenarios. These sparse vectors were generated outside of Milvus by neural models like SPLADEv2/BGE-M3 or statistical models such as the BM25 algorithm.
Milvus2.5 supports full text search implemented with Sparse-BM25! This feature is an important complement to Milvus's strong semantic search capabilities, especially in scenarios involving rare words or technical terms. In previous versions, Milvus supported sparse vectors to assist with keyword search scenarios. These sparse vectors were generated outside of Milvus by neural models like SPLADEv2/BGE-M3 or statistical models such as the BM25 algorithm.

In Milvus 2.5, tokenization and sparse vector extraction are now built-in, truly realizing "Doc-In-Doc-Out" instead of the previous "Vec-in-vec-out" approach. BM25 statistical information is updated in real time as data is inserted, enhancing usability and accuracy. Additionally, sparse vectors based on approximate nearest neighbor (ANN) algorithms offer more powerful performance than standard keyword search systems.

@@ -48,7 +48,7 @@ Bitmap indexes have traditionally been effective for low-cardinality columns, wh

For details, refer to [Bitmap Index](bitmap.md).

#### Default Value & Null
#### Nullable & Default Value

Milvus now supports setting nullable properties and default values for scalar fields other than the primary key field. For fields marked as `nullable=True`, users can omit the field when inserting data; the system will treat it as a null value or default value (if set) without throwing an error.

4 changes: 2 additions & 2 deletions v2.5.x/site/en/userGuide/collections/load-and-release.md
@@ -9,7 +9,7 @@ Loading a collection is the prerequisite to conducting similarity searches and q

## Load Collection​

When you load a collection, Zilliz Cloud loads the index files and the raw data of all fields into memory for rapid response to searches and queries. Entities inserted after a collection load are automatically indexed and loaded.​
When you load a collection, Milvus loads the index files and the raw data of all fields into memory for rapid response to searches and queries. Entities inserted after a collection load are automatically indexed and loaded.​

The following code snippets demonstrate how to load a collection.​

@@ -179,7 +179,7 @@ curl --request POST \​

## Load Specific Fields​

Zilliz Cloud can load only the fields involved in searches and queries, reducing memory usage and improving search performance.​
Milvus can load only the fields involved in searches and queries, reducing memory usage and improving search performance.​

The following code snippet assumes that you have created a collection named **customized_setup_2**, and there are two fields named **my_id** and **my_vector** in the collection.​

12 changes: 6 additions & 6 deletions v2.5.x/site/en/userGuide/collections/manage-collections.md
@@ -53,15 +53,15 @@ For more information, refer to [​Schema Explained](schema.md).​

## Load and Release​

Loading a collection is the prerequisite to conducting similarity searches and queries in collections. When you load a collection, Zilliz Cloud loads all index files and the raw data in each field into memory for fast response to searches and queries.​
Loading a collection is the prerequisite to conducting similarity searches and queries in collections. When you load a collection, Milvus loads all index files and the raw data in each field into memory for fast response to searches and queries.​

Searches and queries are memory-intensive operations. To save the cost, you are advised to release the collections that are currently not in use.​

For more details, refer to [​Load & Release](load-and-release.md).​

## Search and Query​

Once you create indexes and load the collection, you can start a similarity search by feeding one or several query vectors. For example, when receiving the vector representation of your query carried in a search request, Zilliz Cloud uses the specified metric type to measure the similarity between the query vector and those in the target collection before returning those that are semantically similar to the query.​
Once you create indexes and load the collection, you can start a similarity search by feeding one or several query vectors. For example, when receiving the vector representation of your query carried in a search request, Milvus uses the specified metric type to measure the similarity between the query vector and those in the target collection before returning those that are semantically similar to the query.​

You can also include metadata filtering within searches and queries to improve the relevancy of the results. Note that, metadata filtering conditions are mandatory in queries but optional in searches.​

@@ -87,7 +87,7 @@ For more information about searches and queries, refer to the articles in the [

- [Keyword Match](keyword-match.md)

In addition, Zilliz Cloud also provides enhancements to improve search performance and efficiency. They are disabled by default, and you can enable and use them according to your service requirements. They are​
In addition, Milvus also provides enhancements to improve search performance and efficiency. They are disabled by default, and you can enable and use them according to your service requirements. They are​

- [​Use Partition Key](use-partition-key.md)

@@ -99,7 +99,7 @@ In addition, Zilliz Cloud also provides enhancements to improve search performan

Partitions are subsets of a collection, which share the same field set with its parent collection, each containing a subset of entities.​

By allocating entities into different partitions, you can create entity groups. You can conduct searches and queries in specific partitions to have Zilliz Cloud ignore entities in other partitions, and improve search efficiency.​
By allocating entities into different partitions, you can create entity groups. You can conduct searches and queries in specific partitions to have Milvus ignore entities in other partitions, and improve search efficiency.​

For details, refer to [​Manage Partitions](manage-partitions.md).​

@@ -111,13 +111,13 @@ For details on how to set the shard number, refer to [​Create Collection](crea

## Alias​

You can create aliases for your collections. A collection can have several aliases, but collections cannot share an alias. Upon receiving a request against a collection, Zilliz Cloud locates the collection based on the provided name. If the collection by the provided name does not exist, Zilliz Cloud continues locating the provided name as an alias. You can use collection aliases to adapt your code to different scenarios.​
You can create aliases for your collections. A collection can have several aliases, but collections cannot share an alias. Upon receiving a request against a collection, Milvus locates the collection based on the provided name. If the collection by the provided name does not exist, Milvus continues locating the provided name as an alias. You can use collection aliases to adapt your code to different scenarios.​

For more details, refer to [​Manage Aliases](manage-aliases.md).​
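
The lookup order described above (collection name first, then alias) can be sketched in plain Python. This is only an illustration of the described behavior, not Milvus's actual implementation; the function `resolve_collection` and the sample names are hypothetical.

```python
from typing import Dict, Optional, Set


def resolve_collection(
    name: str, collections: Set[str], aliases: Dict[str, str]
) -> Optional[str]:
    """Return the collection that `name` refers to, or None if it is unknown."""
    # Step 1: treat the provided name as a collection name.
    if name in collections:
        return name
    # Step 2: otherwise, treat it as an alias. A dict maps each alias to
    # exactly one collection, so two collections cannot share an alias,
    # while one collection may appear as the value of several aliases.
    return aliases.get(name)


# Hypothetical sample data.
collections = {"customized_setup_1", "customized_setup_2"}
aliases = {"prod": "customized_setup_2", "canary": "customized_setup_2"}

print(resolve_collection("customized_setup_1", collections, aliases))
print(resolve_collection("prod", collections, aliases))
print(resolve_collection("missing", collections, aliases))
```

Because client code only ever passes a name, repointing an alias to another collection changes which collection is used without changing the code.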

## Function​

You can set functions for Zilliz Cloud to derive fields upon collection creation. For example, the full-text search function uses the user-defined function to derive a sparse vector field from a specific varchar field. For more information on full-text search, refer to [​Full-Text Search](full-text-search.md).​
You can set functions for Milvus to derive fields upon collection creation. For example, the full-text search function uses the user-defined function to derive a sparse vector field from a specific varchar field. For more information on full-text search, refer to [​Full-Text Search](full-text-search.md).​

## Consistency Level​

6 changes: 3 additions & 3 deletions v2.5.x/site/en/userGuide/collections/manage-partitions.md
@@ -9,23 +9,23 @@ A partition is a subset of a collection. Each partition shares the same data str

## Overview​

When creating a collection, Zilliz Cloud also creates a partition named **_default** in the collection. If you are not going to add any other partitions, all entities inserted into the collection go into the default partition, and all searches and queries are also carried out within the default partition.​
When creating a collection, Milvus also creates a partition named **_default** in the collection. If you are not going to add any other partitions, all entities inserted into the collection go into the default partition, and all searches and queries are also carried out within the default partition.​

You can add more partitions and insert entities into them based on certain criteria. Then you can restrict your searches and queries within certain partitions, improving search performance.​

A collection can have a maximum of 1,024 partitions.​

<div class="alert note">

The **Partition Key** feature is a search optimization based on partitions and allows Zilliz Cloud to distribute entities into different partitions based on the values in a specific scalar field. This feature helps implement partition-oriented multi-tenancy and improves search performance.​
The **Partition Key** feature is a search optimization based on partitions and allows Milvus to distribute entities into different partitions based on the values in a specific scalar field. This feature helps implement partition-oriented multi-tenancy and improves search performance.​

This feature will not be discussed on this page. To find more, refer to [​Use Partition Key](use-partition-key.md).​

</div>

## List Partitions​

When creating a collection, Zilliz Cloud also creates a partition named **_default** in the collection. You can list the partitions in a collection as follows.​
When creating a collection, Milvus also creates a partition named **_default** in the collection. You can list the partitions in a collection as follows.​

<div class="multipleCode">
<a href="#python">Python </a>
2 changes: 1 addition & 1 deletion v2.5.x/site/en/userGuide/collections/modify-collection.md
@@ -122,7 +122,7 @@ curl --request POST \​

## Set Collection TTL​

If a collection needs to be dropped for a specific period, consider setting its Time-To-Live (TTL) in seconds. Once the TTL times out, Zilliz Cloud deletes entities in the collection and drops the collection. The deletion is asynchronous, indicating that searches and queries are still possible before the deletion is complete.​
If a collection needs to be dropped for a specific period, consider setting its Time-To-Live (TTL) in seconds. Once the TTL times out, Milvus deletes entities in the collection and drops the collection. The deletion is asynchronous, indicating that searches and queries are still possible before the deletion is complete.​

The following code snippet demonstrates how to change the TTL of a collection.​

10 changes: 5 additions & 5 deletions v2.5.x/site/en/userGuide/schema/schema.md
@@ -10,7 +10,7 @@ A schema defines the data structure of a collection. Before creating a collectio

## Overview​

On Zilliz Cloud, a collection schema assembles a table in a relational database, which defines how Zilliz Cloud organizes data in the collection. ​
In Milvus, a collection schema assembles a table in a relational database, which defines how Milvus organizes data in the collection. ​

A well-designed schema is essential as it abstracts the data model and decides if you can achieve the business objectives through a search. Furthermore, since every row of data inserted into the collection must follow the schema, it helps maintain data consistency and long-term quality. From a technical perspective, a well-defined schema leads to well-organized column data storage and a cleaner index structure, boosting search performance.​

@@ -129,13 +129,13 @@ export schema='{​

When adding a field, you can explicitly clarify the field as the primary field by setting its `is_primary` property to `True`. A primary field accepts **Int64** values by default. In this case, the primary field value should be integers similar to `12345`. If you choose to use **VarChar** values in the primary field, the value should be strings similar to `my_entity_1234`.​

You can also set the `autoId` properties to `True` to make Zilliz Cloud automatically allocate primary field values upon data insertions.​
You can also set the `autoId` properties to `True` to make Milvus automatically allocate primary field values upon data insertions.​

For details, refer to [​Primary Field & AutoID](primary-field.md).​

## Add Vector Fields​

Vector fields accept various sparse and dense vector embeddings. On Zilliz Cloud, you can add four vector fields to a collection. The following code snippets demonstrate how to add a vector field.​
Vector fields accept various sparse and dense vector embeddings. In Milvus, you can add four vector fields to a collection. The following code snippets demonstrate how to add a vector field.​

<div class="multipleCode">
<a href="#python">Python </a>
@@ -193,7 +193,7 @@ export schema="{​
```

The `dim` paramter in the above code snippets indicates the dimensionality of the vector embeddings to be held in the vector field. The `FLOAT_VECTOR` value indicates that the vector field holds a list of 32-bit floating numbers, which are usually used to represent antilogarithms.In addition to that, Zilliz Cloud also supports the following types of vector embeddings:​
The `dim` paramter in the above code snippets indicates the dimensionality of the vector embeddings to be held in the vector field. The `FLOAT_VECTOR` value indicates that the vector field holds a list of 32-bit floating numbers, which are usually used to represent antilogarithms.In addition to that, Milvus also supports the following types of vector embeddings:​

- `FLOAT16_VECTOR`

@@ -213,7 +213,7 @@ The `dim` paramter in the above code snippets indicates the dimensionality of th

## Add Scalar Fields​

In common cases, you can use scalar fields to store the metadata of the vector embeddings stored in Milvus, and conduct ANN searches with metadata filtering to improve the correctness of the search results. Zilliz Cloud supports multiple scalar field types, including **VarChar**, **Boolean**, **Int**, Float, **Double**, **Array**, and JSON.​
In common cases, you can use scalar fields to store the metadata of the vector embeddings stored in Milvus, and conduct ANN searches with metadata filtering to improve the correctness of the search results. Milvus supports multiple scalar field types, including **VarChar**, **Boolean**, **Int**, Float, **Double**, **Array**, and JSON.​

### Add String Fields​

10 changes: 5 additions & 5 deletions v2.5.x/site/en/userGuide/search-query-get/filtered-search.md
@@ -2,20 +2,20 @@
id: filtered-search.md
title: Filtered Search​
related_key: ann search, filtered search
summary: An ANN search finds vector embeddings most similar to specified vector embeddings. However, the search results may not always be correct. You can include filtering conditions in a search request so that Zilliz Cloud conducts metadata filtering before conducting ANN searches, reducing the search scope from the whole collection to only the entities matching the specified filtering conditions.​
summary: An ANN search finds vector embeddings most similar to specified vector embeddings. However, the search results may not always be correct. You can include filtering conditions in a search request so that Milvus conducts metadata filtering before conducting ANN searches, reducing the search scope from the whole collection to only the entities matching the specified filtering conditions.​
---

# Filtered Search​

An ANN search finds vector embeddings most similar to specified vector embeddings. However, the search results may not always be correct. You can include filtering conditions in a search request so that Zilliz Cloud conducts metadata filtering before conducting ANN searches, reducing the search scope from the whole collection to only the entities matching the specified filtering conditions.​
An ANN search finds vector embeddings most similar to specified vector embeddings. However, the search results may not always be correct. You can include filtering conditions in a search request so that Milvus conducts metadata filtering before conducting ANN searches, reducing the search scope from the whole collection to only the entities matching the specified filtering conditions.​

## Overview

If a collection contains both vector embeddings and their metadata, you can filter metadata before ANN search to improve the relevancy of the search result. Once Zilliz Cloud receives a search request carrying a filtering condition, it restricts the search scope within the entities matching the specified filtering condition.​
If a collection contains both vector embeddings and their metadata, you can filter metadata before ANN search to improve the relevancy of the search result. Once Milvus receives a search request carrying a filtering condition, it restricts the search scope within the entities matching the specified filtering condition.​
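
As a rough illustration of the filter-then-search order described above, the following plain-Python sketch first narrows the candidates with a metadata predicate and only then ranks the survivors by distance. It uses a brute-force scan in place of a real ANN index, and every name in it (`filtered_search`, the sample entities, the predicate) is hypothetical rather than part of the Milvus API.

```python
from typing import Callable, Dict, List


def squared_l2(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(
    entities: List[Dict],
    query: List[float],
    predicate: Callable[[Dict], bool],
    top_k: int,
) -> List[Dict]:
    # Step 1: metadata filtering shrinks the search scope.
    candidates = [e for e in entities if predicate(e)]
    # Step 2: rank only the remaining entities by distance to the query
    # (a brute-force stand-in for the ANN search).
    candidates.sort(key=lambda e: squared_l2(e["vector"], query))
    # Fewer than top_k entities may survive the filter.
    return candidates[:top_k]


# Hypothetical sample data.
entities = [
    {"id": 1, "vector": [0.1, 0.9], "color": "red_7025", "likes": 120},
    {"id": 2, "vector": [0.8, 0.2], "color": "blue_3301", "likes": 300},
    {"id": 3, "vector": [0.5, 0.5], "color": "red_4794", "likes": 10},
    {"id": 4, "vector": [0.9, 0.1], "color": "red_9392", "likes": 58},
]

# Plain-Python equivalent of a filter such as: color like "red%" and likes > 50
hits = filtered_search(
    entities,
    query=[1.0, 0.0],
    predicate=lambda e: e["color"].startswith("red") and e["likes"] > 50,
    top_k=3,
)
print([h["id"] for h in hits])
```

Only two entities pass the filter here, so the search returns two results even though `top_k` is 3.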

![Filtered search](../../../../assets/filtered-search.png)

As shown in the above diagram, the search request carries `chunk like % red %` as the filtering condition, indicating that Zilliz Cloud should conduct the ANN search within all the entities that have the word `red` in the `chunk` field. Specifically, Zilliz Cloud does the following:​
As shown in the above diagram, the search request carries `chunk like % red %` as the filtering condition, indicating that Milvus should conduct the ANN search within all the entities that have the word `red` in the `chunk` field. Specifically, Milvus does the following:​

- Filter entities that match the filtering conditions carried in the search request.​

@@ -209,7 +209,7 @@ curl --request POST \​
```

The filtering condition carried in the search request reads `color like "red%" and likes > 50`. It uses the and operator to include two conditions: the first one asks for entities that have a value starting with `red` in the `color` field, and the other asks for entities with a value greater than `50` in the `likes` field. There are only two entities meeting these requirements. With the top-K set to `3`, Zilliz Cloud will calculate the distance between these two entities to the query vector and return them as the search results.​
The filtering condition carried in the search request reads `color like "red%" and likes > 50`. It uses the and operator to include two conditions: the first one asks for entities that have a value starting with `red` in the `color` field, and the other asks for entities with a value greater than `50` in the `likes` field. There are only two entities meeting these requirements. With the top-K set to `3`, Milvus will calculate the distance between these two entities to the query vector and return them as the search results.​

```JSON
[
4 changes: 2 additions & 2 deletions v2.5.x/site/en/userGuide/search-query-get/metric.md
@@ -8,7 +8,7 @@ title: Metric Types

Similarity metrics are used to measure similarities among vectors. Choosing an appropriate distance metric helps improve classification and clustering performance significantly.​

Currently, Zilliz Cloud supports these types of similarity Metrics: Euclidean distance (`L2`), Inner Product (`IP`), Cosine Similarity (`COSINE`), `JACCARD`, `HAMMING`, and `BM25` (specifically designed for full text search on sparse vectors).​
Currently, Milvus supports these types of similarity Metrics: Euclidean distance (`L2`), Inner Product (`IP`), Cosine Similarity (`COSINE`), `JACCARD`, `HAMMING`, and `BM25` (specifically designed for full text search on sparse vectors).​

The table below summarizes the mapping between different field types and their corresponding metric types.​

@@ -136,7 +136,7 @@ It's the most commonly used distance metric and is very useful when the data are

<div class="alert note">

Zilliz Cloud only calculates the value before applying the square root when Euclidean distance is chosen as the distance metric.​
Milvus only calculates the value before applying the square root when Euclidean distance is chosen as the distance metric.​

</div>
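
Skipping the square root does not change which vectors are nearest, because the square root is monotonically increasing on non-negative values. The short sketch below checks this on a few made-up vectors; the helper names and the data are illustrative assumptions, not Milvus code.

```python
import math
from typing import List


def squared_l2(a: List[float], b: List[float]) -> float:
    """Sum of squared coordinate differences (no square root applied)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l2(a: List[float], b: List[float]) -> float:
    """True Euclidean distance."""
    return math.sqrt(squared_l2(a, b))


# Hypothetical sample data.
query = [0.0, 0.0]
vectors = {"a": [3.0, 4.0], "b": [1.0, 1.0], "c": [6.0, 8.0]}

# Rank the vectors by each measure, nearest first.
by_squared = sorted(vectors, key=lambda k: squared_l2(vectors[k], query))
by_true = sorted(vectors, key=lambda k: l2(vectors[k], query))

print(by_squared == by_true)  # the two orderings agree
print(squared_l2(vectors["a"], query), l2(vectors["a"], query))
```

The practical consequence is that a reported `L2` distance should be read as a squared value: take its square root if the true Euclidean distance is needed.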
