[Bug]: [streaming] When upgrading from 2.5 to 2.6, count(*) results are incorrect and the search fails: current load ratio is 0.851950 #42884

@ThreadDao

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.5-20250619-30b2a66f-amd64 -> master-20250620-c3c51681-amd64
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):  pulsar  
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

server

  • pulsar mq
  • config
    common:
      enabledGrowingSegmentJSONKeyStats: true
      enabledJsonKeyStats: true
      enabledOptimizeExpr: false
    dataCoord:
      enableActiveStandby: true
      enabledJSONKeyStatsInSort: false
    indexCoord:
      enableActiveStandby: true
    log:
      level: debug
    queryCoord:
      enableActiveStandby: true
    rootCoord:
      enableActiveStandby: true

client

  1. create a collection -> index -> insert 20m entities -> flush -> index again -> load
  2. concurrent requests: upsert + flush + query + search
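
For anyone trying to reproduce the count check outside fouram, here is a minimal sketch of a count(*) probe. It is not the fouram implementation: `count_entities` and `count_drift` are hypothetical helper names, and the code only assumes a client object whose `query()` accepts `collection_name`, `filter`, and `output_fields` the way pymilvus `MilvusClient.query` does. Since upsert replaces entities by primary key, the expected count stays at the number of entities originally inserted.

```python
def count_entities(client, collection_name):
    """Return count(*) for a collection via a MilvusClient-style query() call."""
    rows = client.query(
        collection_name=collection_name,
        filter="",                    # empty filter: count every entity
        output_fields=["count(*)"],   # ask the server for the aggregate only
    )
    # The result is expected to be a one-row list such as [{"count(*)": N}].
    return int(rows[0]["count(*)"])


def count_drift(client, collection_name, expected):
    """Return (actual, actual - expected); a non-zero drift is a failed check."""
    actual = count_entities(client, collection_name)
    return actual, actual - expected
```

With pymilvus, `client` would typically be a `MilvusClient` instance; the URI and collection name depend on the deployment and are not taken from this report.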

upgrade the image during the concurrent test

  • 2.5-20250619-30b2a66f-amd64 -> master-20250620-c3c51681-amd64
    The entire upgrade took about 1 hour and 12 minutes. Search and query requests failed during the upgrade and for some time after it completed.
  • query
  File "/src/fouram/client/check/func_check.py", line 338, in check_query_output_count
    assert int(query_count) == expected_query_count, f'{query_count} == {expected_query_count}'
AssertionError: 19998794 == 20000000

    assert int(query_count) == expected_query_count, f'{query_count} == {expected_query_count}'
AssertionError: 20002259 == 20000000
  • search
[2025-06-20 04:38:15,373 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=901, message=fail to search on QueryNode 9: stack trace: /workspace/source/pkg/tracer/stack_trace.go:51 github.com/milvus-io/milvus/pkg/v2/tracer.StackTrace
/workspace/source/internal/util/grpcclient/client.go:575 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call
/workspace/source/internal/util/grpcclient/client.go:589 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall
/workspace/source/internal/distributed/querynode/client/client.go:106 github.com/milvus-io/milvus/internal/distributed/querynode/client.wrapGrpcCall[...]
/workspace/source/internal/distributed/querynode/client/client.go:224 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).SearchSegments
/workspace/source/internal/querynodev2/cluster/worker.go:195 github.com/milvus-io/milvus/internal/querynodev2/cluster.(*remoteWorker).SearchSegments
/workspace/source/internal/querynodev2/delegator/delegator.go:354 github.com/milvus-io/milvus/internal/querynodev2/delegator.(*shardDelegator).search.func3
/workspace/source/internal/querynodev2/delegator/delegator.go:823 github.com/milvus-io/milvus/internal/querynodev2/delegator.executeSubTasks[...].func1
/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78 golang.org/x/sync/errgroup.(*Group).Go.func1
/usr/local/go/src/runtime/asm_amd64.s:1700 runtime.goexit: node not found)>, [requestId: e6340d5352a64d50bb7a75fdd8add84d] (api_request.py:58)

[2025-06-20 05:45:12,718 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=503, message=fail to search on QueryNode 9: channel distribution is not serviceable, required load ratio is 1.000000, current load ratio is 0.851950: channel not available[channel=zong-roll-upsert-4-rootcoord-dml_0_458853861888623318v0])>, [requestId: bb86a2ed859f42b1a2e6304120df3f19] (api_request.py:58)
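
To put numbers on the failures above, the following sketch computes how far each observed count(*) is from the expected 20,000,000 and extracts the required and current load ratio from the code=503 message. The regular expression is written against the message text quoted in this report only; `parse_load_ratio` is a hypothetical helper, and the message format may differ in other builds.

```python
import re

EXPECTED_COUNT = 20_000_000
OBSERVED_COUNTS = [19_998_794, 20_002_259]  # values from the AssertionErrors above

# Signed difference between each observed count and the expected count.
drifts = [observed - EXPECTED_COUNT for observed in OBSERVED_COUNTS]
# drifts == [-1206, 2259]: one query under-counts, the other over-counts.

LOAD_RATIO_RE = re.compile(
    r"required load ratio is (?P<required>\d+\.\d+), "
    r"current load ratio is (?P<current>\d+\.\d+)"
)


def parse_load_ratio(message):
    """Return (required, current) load ratios, or None if the text is absent."""
    match = LOAD_RATIO_RE.search(message)
    if match is None:
        return None
    return float(match["required"]), float(match["current"])


message = (
    "fail to search on QueryNode 9: channel distribution is not serviceable, "
    "required load ratio is 1.000000, current load ratio is 0.851950: "
    "channel not available"
)
ratios = parse_load_ratio(message)
```

The counts deviate in both directions (1,206 entities missing in one query, 2,259 extra in the other), so the problem is not simply a matter of data that has not been loaded yet.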

Expected Behavior

No response

Steps To Reproduce

https://argo-workflows.zilliz.cc/archived-workflows/qa/c9a7cb7a-9d2a-40e0-824e-ca1881b8bf3e?nodeId=zong-roll-upsert-4

Milvus Log

pods:

zong-roll-upsert-4-milvus-datanode-65c99cd5f6-2r9j9               1/1     Running       0                172m    10.104.19.239   4am-node28   <none>           <none>
zong-roll-upsert-4-milvus-datanode-65c99cd5f6-8psjt               1/1     Running       0                173m    10.104.34.65    4am-node37   <none>           <none>
zong-roll-upsert-4-milvus-mixcoord-66cd768b44-qm82l               1/1     Running       0                3h56m   10.104.18.223   4am-node25   <none>           <none>
zong-roll-upsert-4-milvus-proxy-5b499984bd-kfqdb                  1/1     Running       0                171m    10.104.34.68    4am-node37   <none>           <none>
zong-roll-upsert-4-milvus-querynode-1-c67459bcf-8f2bq             1/1     Running       0                3h24m   10.104.23.84    4am-node27   <none>           <none>
zong-roll-upsert-4-milvus-querynode-1-c67459bcf-h7q5b             1/1     Running       0                3h55m   10.104.19.220   4am-node28   <none>           <none>
zong-roll-upsert-4-milvus-streamingnode-5c6f6b58b8-7jcmb          1/1     Running       0                3h57m   10.104.18.221   4am-node25   <none>           <none>
zong-roll-upsert-4-minio-0                                        1/1     Running       0                5h3m    10.104.18.139   4am-node25   <none>           <none>
zong-roll-upsert-4-minio-1                                        1/1     Running       0                5h3m    10.104.16.215   4am-node21   <none>           <none>
zong-roll-upsert-4-minio-2                                        1/1     Running       0                5h3m    10.104.30.211   4am-node38   <none>           <none>
zong-roll-upsert-4-minio-3                                        1/1     Running       0                5h3m    10.104.19.185   4am-node28   <none>           <none>

Anything else?

No response

Metadata

Labels

  • kind/bug: Issues or changes related a bug
  • severity/critical: Critical, lead to crash, data missing, wrong result, function totally doesn't work.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.
