Open
Labels
kind/bug (issues or changes related to a bug), triage/accepted (indicates an issue or PR is ready to be actively worked on)
Description
Is there an existing issue for this?
- I have searched the existing issues
Environment
- Milvus version: master-20250611-a72463c6-amd64
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
config
dependencies:
  pulsar:
    inCluster:
      values:
        broker:
          configData:
            backlogQuotaDefaultLimitGB: "-1"
common:
  enabledJSONKeyStats: true
dataCoord:
  compaction:
    clustering:
      autoEnable: true
  enableActiveStandby: true
indexCoord:
  enableActiveStandby: true
log:
  level: debug
mixCoord:
  enableActiveStandby: true
queryCoord:
  enableActiveStandby: true
queryNode:
  enableSegmentPrune: true
rootCoord:
  enableActiveStandby: true
client test
- create a collection fouram_GCT4y7ke (with clustering key)
{'auto_id': False,
'description': '',
'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float_vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 128}},
{'name': 'int32_1', 'description': '', 'type': <DataType.INT32: 4>}, {'name': 'float32_1', 'description': '', 'type': <DataType.FLOAT: 10>, 'is_clustering_key': True},
{'name': 'varchar_1', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 100}},
{'name': 'varchar_2', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 100, 'enable_match': True, 'enable_analyzer': True}},
{'name': 'array_varchar_1', 'description': '', 'type': <DataType.ARRAY: 22>, 'params': {'max_length': 100, 'max_capacity': 10}, 'element_type': <DataType.VARCHAR: 21>},
{'name': 'json_1', 'description': '', 'type': <DataType.JSON: 23>}, {'name': 'binary_vector', 'description': '', 'type': <DataType.BINARY_VECTOR: 100>, 'params': {'dim': 128}},
{'name': 'float16_vector', 'description': '', 'type': <DataType.FLOAT16_VECTOR: 102>, 'params': {'dim': 128}}, {'name': 'bfloat16_vector', 'description': '', 'type': <DataType.BFLOAT16_VECTOR: 103>, 'params': {'dim': 128}}],
'enable_dynamic_field': False} (base.py:329)
- create all kinds of index
'scalars_index': {'float32_1': {'index_type': 'STL_SORT'},
'int32_1': {'index_type': 'BITMAP'},
'varchar_1': {'index_type': 'TRIE'},
'array_varchar_1': {'index_type': 'INVERTED'}},
'vectors_index': {'binary_vector': {'metric_type': 'JACCARD',
'index_type': 'BIN_IVF_FLAT',
'index_param': {'nlist': 128}},
'float16_vector': {'metric_type': 'COSINE',
'index_type': 'IVF_SQ8',
'index_param': {'nlist': 128}},
'bfloat16_vector': {'metric_type': 'IP',
'index_type': 'IVF_FLAT',
'index_param': {'nlist': 128}}},
'index_params': {'index_type': 'HNSW', 'index_param': {'M': 16, 'efConstruction': 200}},
- insert 5m -> flush -> index again -> load
- concurrent requests: insert + delete + query + search + hybrid_search
After the test, the number of segments loaded by qn and the memory usage are as follows:
- bulk_import 10 laion1B_nolang parquet files and create index. The schema of collection import_1749718797_4573 is:
fields = [
FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False),
FieldSchema(name="pk_5b", dtype=DataType.INT64, is_clustering_key=True),
FieldSchema(name="caption", dtype=DataType.VARCHAR, max_length=8192, enable_analyzer=True,
enable_match=True),
FieldSchema(name="NSFW", dtype=DataType.VARCHAR, max_length=8192),
FieldSchema(name="similarity", dtype=DataType.DOUBLE),
FieldSchema(name="width", dtype=DataType.INT64, is_partition_key=True),
FieldSchema(name="height", dtype=DataType.INT64),
FieldSchema(name="original_width", dtype=DataType.INT64),
FieldSchema(name="original_height", dtype=DataType.INT64),
FieldSchema(name="md5", dtype=DataType.VARCHAR, max_length=8192),
FieldSchema(name="float32_vector", dtype=DataType.FLOAT_VECTOR, dim=VECTOR_DIM),
]
- load collection import_1749718797_4573, and 3 queryNodes hit OOM
zong-sn-base-op-53-4126-milvus-datanode-7b9bcd8864-c5fhk 1/1 Running 0 18h 10.104.14.175 4am-node18 <none> <none>
zong-sn-base-op-53-4126-milvus-datanode-7b9bcd8864-ddjhz 1/1 Running 0 38h 10.104.27.216 4am-node31 <none> <none>
zong-sn-base-op-53-4126-milvus-mixcoord-594686fcdc-jpz7f 1/1 Running 0 38h 10.104.19.143 4am-node28 <none> <none>
zong-sn-base-op-53-4126-milvus-proxy-66c4bd8f8-z27jx 1/1 Running 0 38h 10.104.34.25 4am-node37 <none> <none>
zong-sn-base-op-53-4126-milvus-querynode-0-56d54bcd5f-7xjdd 1/1 Running 1 (15m ago) 38h 10.104.9.195 4am-node14 <none> <none>
zong-sn-base-op-53-4126-milvus-querynode-0-56d54bcd5f-bjwr7 1/1 Running 0 38h 10.104.23.125 4am-node27 <none> <none>
zong-sn-base-op-53-4126-milvus-querynode-0-56d54bcd5f-dh2rg 1/1 Running 1 (15m ago) 18h 10.104.25.77 4am-node30 <none> <none>
zong-sn-base-op-53-4126-milvus-querynode-0-56d54bcd5f-gjbkd 1/1 Running 0 38h 10.104.14.52 4am-node18 <none> <none>
zong-sn-base-op-53-4126-milvus-querynode-0-56d54bcd5f-plt6j 1/1 Running 1 (15m ago) 38h 10.104.19.144 4am-node28 <none> <none>
zong-sn-base-op-53-4126-milvus-streamingnode-645bb4bbdd-jfls4 1/1 Running 0 38h 10.104.24.98 4am-node29 <none> <none>
zong-sn-base-op-53-4126-milvus-streamingnode-645bb4bbdd-msr49 1/1 Running 0 38h 10.104.20.30 4am-node22 <none> <none>
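To pick out the restarted pods from a listing like the one above, a small helper can filter on the RESTARTS column. This is only a sketch: the function name `restarted_pods` is hypothetical, and it assumes the default `kubectl get pods -o wide` column order (NAME, READY, STATUS, RESTARTS, ...). The sample rows are copied from the listing above.

```python
# Hypothetical helper: not part of Milvus or kubectl.
# Assumes the 4th whitespace-separated column is the restart count.
SAMPLE = """\
zong-sn-base-op-53-4126-milvus-datanode-7b9bcd8864-c5fhk 1/1 Running 0 18h 10.104.14.175 4am-node18 <none> <none>
zong-sn-base-op-53-4126-milvus-querynode-0-56d54bcd5f-7xjdd 1/1 Running 1 (15m ago) 38h 10.104.9.195 4am-node14 <none> <none>
zong-sn-base-op-53-4126-milvus-querynode-0-56d54bcd5f-bjwr7 1/1 Running 0 38h 10.104.23.125 4am-node27 <none> <none>
"""


def restarted_pods(kubectl_output: str) -> dict[str, int]:
    """Return {pod_name: restart_count} for pods with at least one restart."""
    restarted = {}
    for line in kubectl_output.splitlines():
        cols = line.split()
        # Skip blank lines and the header row (its 4th column is not a number).
        if len(cols) < 4 or not cols[3].isdigit():
            continue
        restarts = int(cols[3])
        if restarts > 0:
            restarted[cols[0]] = restarts
    return restarted


print(restarted_pods(SAMPLE))
```

A restart count alone does not show the cause. The pod's last terminated state (for example via `kubectl describe pod`) would need to report `OOMKilled` to confirm an OOM.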
Actually, the load was successful, but insufficient memory during the load caused an OOM.
connections.connect(host="10.104.xx.xx")
utility.list_collections()
['fouram_GCT4y7ke', 'import_1749718797_4573']
c = Collection(name='import_1749718797_4573')
c.load() # qn oom
c.query('', output_fields=["count(*)"])
data: ["{'count(*)': 9683719}"]
c = Collection(name='fouram_GCT4y7ke')
c.query('', output_fields=["count(*)"])
data: ["{'count(*)': 10171780}"]
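The row-count check used in the session above can be wrapped in a small function so both collections are checked the same way. This is a sketch under assumptions: `count_entities` is a hypothetical name, and it assumes the query result is a list-like of dicts keyed by `count(*)`, as the printed output suggests. `FakeCollection` is a stand-in so the snippet runs without a cluster.

```python
def count_entities(collection) -> int:
    """Run a count(*) query with an empty expression and return the row count."""
    rows = collection.query("", output_fields=["count(*)"])
    # Assumption: the first row is a dict holding the count under "count(*)".
    return int(rows[0]["count(*)"])


class FakeCollection:
    """Stand-in for pymilvus.Collection, used only to make the sketch runnable."""

    def __init__(self, num_rows: int):
        self._num_rows = num_rows

    def query(self, expr, output_fields=None):
        return [{"count(*)": self._num_rows}]


# Row counts taken from the session above.
for name, rows in [("import_1749718797_4573", 9683719), ("fouram_GCT4y7ke", 10171780)]:
    print(name, count_entities(FakeCollection(rows)))
```

Against a real deployment, the same function would be called with `Collection(name=...)` after `connections.connect(...)`, as in the session above.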
Metrics for loading the L2 segments of the import collection
Expected Behavior
No response
Steps To Reproduce
1. concurrent requests: https://argo-workflows.zilliz.cc/archived-workflows/qa/48e7e3d7-83a7-4a21-8065-c6c4ef19d79a?nodeId=zong-sn-labor-12-clustering-2657410768
2. bulk_import laion1B_nolang: https://argo-workflows.zilliz.cc/archived-workflows/qa/cf1871fb-2adb-4b88-963b-a63c57a7b2b2?nodeId=zong-sn-import-base-1
Milvus Log
No response
Anything else?
No response