Merge pull request #2987 from liyun95/v2.5.x
update docs for 2.5.4 release
liyun95 authored Jan 24, 2025
2 parents 0eeeb91 + a78600c commit 186644b
Showing 3 changed files with 122 additions and 48 deletions.
8 changes: 4 additions & 4 deletions site/en/Variables.json
@@ -1,12 +1,12 @@
{
"milvus_release_version": "2.5.3",
"milvus_release_tag": "2.5.3",
"milvus_release_version": "2.5.4",
"milvus_release_tag": "2.5.4",
"milvus_deb_name": "milvus_2.2.0-1_amd64",
"milvus_rpm_name": "milvus-2.2.0-1.el7.x86_64",
"milvus_python_sdk_version": "2.4.x",
"milvus_python_sdk_real_version": "2.5.3",
"milvus_python_sdk_real_version": "2.5.4",
"milvus_node_sdk_version": "2.4.x",
"milvus_node_sdk_real_version": "v2.5.3",
"milvus_node_sdk_real_version": "v2.5.4",
"milvus_go_sdk_version": "2.3.x",
"milvus_go_sdk_real_version": "2.4.0",
"milvus_java_sdk_version": "2.4.x",
54 changes: 54 additions & 0 deletions site/en/release_notes.md
@@ -8,6 +8,60 @@ title: Release Notes

Find out what’s new in Milvus! This page summarizes new features, improvements, known issues, and bug fixes in each release. You can find the release notes for each released version after v2.5.0 in this section. We suggest that you regularly visit this page to learn about updates.

## v2.5.4

Release date: January 23, 2025

| Milvus version | Python SDK version | Node.js SDK version | Java SDK version |
|----------------|--------------------|---------------------|------------------|
| 2.5.4 | 2.5.4 | 2.5.4 | 2.5.4 |

We’re excited to announce the release of Milvus 2.5.4, which introduces key performance optimizations and new features such as PartitionKey isolation, Sparse Index with DAAT MaxScore, and enhanced locking mechanisms. This version also addresses multiple bugs that improve overall stability and reliability. We encourage you to upgrade or try out this latest release, and we look forward to your feedback in helping us continually refine Milvus!

### Features

- Supports PartitionKey isolation to improve performance with multiple partition keys ([#39245](https://github.com/milvus-io/milvus/pull/39245)). For more information, refer to [Use Partition Key](use-partition-key.md).
- Sparse Index now supports DAAT MaxScore ([knowhere/#1015](https://github.com/milvus-io/knowhere/pull/1015)). For more information, refer to [Sparse Vector](sparse_vector.md).
- Adds support for `is_null` in expression ([#38931](https://github.com/milvus-io/milvus/pull/38931))
- Root privileges can be customized ([#39324](https://github.com/milvus-io/milvus/pull/39324))

### Improvements

- Cached segments’ delta information to accelerate the Query Coordinator ([#39349](https://github.com/milvus-io/milvus/pull/39349))
- Read metadata concurrently at the collection level to speed up failure recovery ([#38900](https://github.com/milvus-io/milvus/pull/38900))
- Refined lock granularity in QueryNode ([#39282](https://github.com/milvus-io/milvus/pull/39282), [#38907](https://github.com/milvus-io/milvus/pull/38907))
- Unified style by using CStatus to handle NewCollection CGO calls ([#39303](https://github.com/milvus-io/milvus/pull/39303))
- Skipped generating the partition limiter if no partition is set ([#38911](https://github.com/milvus-io/milvus/pull/38911))
- Added more RESTful API support ([#38875](https://github.com/milvus-io/milvus/pull/38875), [#39425](https://github.com/milvus-io/milvus/pull/39425))
- Removed unnecessary Bloom Filters in QueryNode and DataNode to reduce memory usage ([#38913](https://github.com/milvus-io/milvus/pull/38913))
- Sped up data loading by accelerating task generation, scheduling, and execution in QueryCoord ([#38905](https://github.com/milvus-io/milvus/pull/38905))
- Reduced locking in DataCoord to speed up load and insert operations ([#38904](https://github.com/milvus-io/milvus/pull/38904))
- Added primary field names in `SearchResult` and `QueryResults` ([#39222](https://github.com/milvus-io/milvus/pull/39222))
- Used both binlog size and index size as the disk quota throttling standard ([#38844](https://github.com/milvus-io/milvus/pull/38844))
- Optimized memory usage for full-text search (knowhere/#1011)
- Added version control for scalar indexes ([#39236](https://github.com/milvus-io/milvus/pull/39236))
- Improved the speed of fetching collection information from RootCoord by avoiding unnecessary copies ([#38902](https://github.com/milvus-io/milvus/pull/38902))

### Bug fixes

- Fixed slow query issues caused by coarse lock granularity during multi-column loading ([#39255](https://github.com/milvus-io/milvus/pull/39255))
- Fixed an issue where using aliases could cause an iterator to traverse the wrong database ([#39248](https://github.com/milvus-io/milvus/pull/39248))
- Fixed search failures for primary keys with indexes ([#39390](https://github.com/milvus-io/milvus/pull/39390))
- Fixed a potential data loss issue caused by restarting MixCoord and flushing concurrently ([#39422](https://github.com/milvus-io/milvus/pull/39422))
- Fixed a resource group update failure when altering the database ([#39356](https://github.com/milvus-io/milvus/pull/39356))
- Fixed a sporadic issue where the tantivy index could not delete index files during release ([#39434](https://github.com/milvus-io/milvus/pull/39434))
- Fixed a delete failure triggered by improper concurrency between stats tasks and L0 compaction after MixCoord restarts ([#39460](https://github.com/milvus-io/milvus/pull/39460))
- Fixed slow indexing caused by having too many threads ([#39341](https://github.com/milvus-io/milvus/pull/39341))
- Fixed an issue preventing disk quota checks from being skipped during bulk import ([#39319](https://github.com/milvus-io/milvus/pull/39319))
- Resolved freeze issues caused by too many message queue consumers by limiting concurrency ([#38915](https://github.com/milvus-io/milvus/pull/38915))
- Fixed query timeouts caused by MixCoord restarts during large-scale compactions ([#38926](https://github.com/milvus-io/milvus/pull/38926))
- Fixed scalar inverted index incompatibility when upgrading from 2.4 to 2.5 ([#39272](https://github.com/milvus-io/milvus/pull/39272))
- Fixed channel imbalance issues caused by node downtime ([#39200](https://github.com/milvus-io/milvus/pull/39200))
- Fixed an issue that could cause channel balance to become stuck ([#39160](https://github.com/milvus-io/milvus/pull/39160))
- Fixed an issue where RBAC custom group privilege level checks became ineffective ([#39224](https://github.com/milvus-io/milvus/pull/39224))
- Fixed a failure to retrieve the number of rows in empty indexes ([#39210](https://github.com/milvus-io/milvus/pull/39210))
- Fixed incorrect memory estimation for small segments ([#38909](https://github.com/milvus-io/milvus/pull/38909))

## v2.5.3

Release date: January 13, 2025
108 changes: 64 additions & 44 deletions site/en/userGuide/schema/sparse_vector.md
@@ -225,68 +225,88 @@ The process of creating an index for sparse vectors is similar to that for [dens
</div>

```python
index_params = client.prepare_index_params()
index_params.add_index(
field_name="sparse_vector",
index_name="sparse_inverted_index",
index_type="SPARSE_INVERTED_INDEX",
metric_type="IP",
params={"drop_ratio_build": 0.2},
)
index_params = client.prepare_index_params()

index_params.add_index(
    field_name="sparse_vector",
    index_name="sparse_inverted_index",
    index_type="SPARSE_INVERTED_INDEX",
    metric_type="IP",
    params={"inverted_index_algo": "DAAT_MAXSCORE"},
)

```

```java
import io.milvus.v2.common.IndexParam;
import java.util.*;
List<IndexParam> indexes = new ArrayList<>();
Map<String,Object> extraParams = new HashMap<>();
extraParams.put("drop_ratio_build", 0.2);
indexes.add(IndexParam.builder()
.fieldName("sparse_vector")
.indexName("sparse_inverted_index")
.indexType(IndexParam.IndexType.SPARSE_INVERTED_INDEX)
.metricType(IndexParam.MetricType.IP)
.extraParams(extraParams)
.build());
import io.milvus.v2.common.IndexParam;
import java.util.*;

List<IndexParam> indexes = new ArrayList<>();
Map<String,Object> extraParams = new HashMap<>();
// Map.put takes the key and value as two separate arguments.
extraParams.put("inverted_index_algo", "DAAT_MAXSCORE");
indexes.add(IndexParam.builder()
        .fieldName("sparse_vector")
        .indexName("sparse_inverted_index")
        .indexType(IndexParam.IndexType.SPARSE_INVERTED_INDEX)
        .metricType(IndexParam.MetricType.IP)
        .extraParams(extraParams)
        .build());

```

```javascript
const indexParams = await client.createIndex({
index_name: 'sparse_inverted_index',
field_name: 'sparse_vector',
metric_type: MetricType.IP,
index_type: IndexType.SPARSE_WAND,
params: {
drop_ratio_build: 0.2,
},
});
const indexParams = await client.createIndex({
    index_name: 'sparse_inverted_index',
    field_name: 'sparse_vector',
    metric_type: MetricType.IP,
    index_type: IndexType.SPARSE_INVERTED_INDEX,
    params: {
      inverted_index_algo: 'DAAT_MAXSCORE',
    },
});

```

```curl
export indexParams='[
{
"fieldName": "sparse_vector",
"metricType": "IP",
"indexName": "sparse_inverted_index",
"indexType": "SPARSE_INVERTED_INDEX",
"params":{"drop_ratio_build": 0.2}
}
]'
export indexParams='[
        {
            "fieldName": "sparse_vector",
            "metricType": "IP",
            "indexName": "sparse_inverted_index",
            "indexType": "SPARSE_INVERTED_INDEX",
            "params": {"inverted_index_algo": "DAAT_MAXSCORE"}
        }
    ]'

```

In the example above:
In the example above:

- `index_type`: The type of index to create for the sparse vector field. Valid values:

- `SPARSE_INVERTED_INDEX`: A general-purpose inverted index for sparse vectors.
- `SPARSE_WAND`: A specialized index type supported in Milvus v2.5.3 and earlier.

<div class="alert note">

From Milvus 2.5.4 onward, `SPARSE_WAND` is being deprecated. Use `"inverted_index_algo": "DAAT_WAND"` instead, which provides equivalent behavior while maintaining compatibility.

</div>

- `metric_type`: The metric used to calculate similarity between sparse vectors. Valid values:

- `IP` (Inner Product): Measures similarity using dot product.
- `BM25`: Typically used for full-text search, focusing on textual similarity.

For further details, refer to [Metric Types](metric.md) and [Full Text Search](full-text-search.md).

- `params.inverted_index_algo`: The algorithm used for building and querying the index. Valid values:

- An index of type `SPARSE_INVERTED_INDEX` is created for the sparse vector. For sparse vectors, you can specify `SPARSE_INVERTED_INDEX` or `SPARSE_WAND`. For details, refer to [Sparse Vector Indexes](https://milvus.io/docs/index.md?tab=sparse).
- `"DAAT_MAXSCORE"` (default): Optimized Document-at-a-Time (DAAT) query processing using the MaxScore algorithm. MaxScore provides better performance for high k values or queries with many terms by skipping terms and documents likely to have minimal impact. It achieves this by partitioning terms into essential and non-essential groups based on their maximum impact scores, focusing on terms that can contribute to the top-k results.

- For sparse vectors, `metric_type` only supports `IP` (Inner Product), used to measure the similarity between two sparse vectors. For more information on similarity, refer to [Metric Types](metric.md).
- `"DAAT_WAND"`: Optimized DAAT query processing using the WAND algorithm. WAND evaluates fewer hit documents by leveraging maximum impact scores to skip non-competitive documents, but it has a higher per-hit overhead. This makes WAND more efficient for queries with small k values or short queries, where skipping is more feasible.

- `drop_ratio_build` is an optional index parameter specifically for sparse vectors. It controls the proportion of small vector values excluded during index building. For example, with `{"drop_ratio_build": 0.2}`, the smallest 20% of vector values will be excluded during index creation, reducing computational effort during searches.
- `"TAAT_NAIVE"`: Basic Term-at-a-Time (TAAT) query processing. While it is slower compared to `DAAT_MAXSCORE` and `DAAT_WAND`, `TAAT_NAIVE` offers a unique advantage. Unlike DAAT algorithms, which use cached maximum impact scores that remain static regardless of changes to the global collection parameter (avgdl), `TAAT_NAIVE` dynamically adapts to such changes.
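
The essential/non-essential split described for `DAAT_MAXSCORE` can be illustrated with a small, self-contained Python sketch. This is not the Knowhere implementation; the function name `partition_terms`, its arguments, and the sample scores are hypothetical and chosen only for illustration. It assumes each query term has a known non-negative upper bound on its score contribution, and that a document must strictly exceed the current k-th best score (the threshold) to enter the top-k results.

```python
from itertools import accumulate


def partition_terms(max_scores, threshold):
    """Split query terms into (non_essential, essential) lists.

    max_scores: dict mapping term -> upper bound on that term's score
                contribution (hypothetical input, assumed non-negative).
    threshold:  current k-th best score among the results found so far.
    """
    # Order terms from the smallest to the largest maximum impact score.
    ordered = sorted(max_scores, key=max_scores.get)
    # Running sum of the upper bounds in that order.
    prefix = list(accumulate(max_scores[t] for t in ordered))
    # The non-essential terms are the longest prefix whose combined upper
    # bound cannot beat the threshold: a document matching only these
    # terms can never enter the top-k.
    split = sum(1 for p in prefix if p <= threshold)
    return ordered[:split], ordered[split:]


# Illustrative values only.
non_essential, essential = partition_terms(
    {"the": 1.0, "vector": 2.0, "milvus": 4.0}, threshold=3.5
)
print(non_essential, essential)
```

Under these assumptions, only the posting lists of essential terms need to drive candidate generation; non-essential terms are consulted only to finish scoring documents that already matched an essential term. As the threshold rises during a search, more terms become non-essential, which is where the skipping benefit comes from.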

### Create collection
