-
Notifications
You must be signed in to change notification settings - Fork 3.3k
Description
Is there an existing issue for this?
- I have searched the existing issues
Is your feature request related to a problem? Please describe.
JSON Field Pattern Matching in Milvus: json_match
Function Proposal
1. Overview
Milvus currently supports querying JSON fields using expressions like json_field["key"] == "value"
and functions such as json_contains
. However, it lacks the capability to perform pattern-based searches on both keys and values within JSON fields. To address this limitation, we propose introducing a new function, json_match
, which allows users to filter documents based on patterns matching both keys and their corresponding values.
2. Function Syntax
json_match(json_field, key_pattern="pattern1", value_pattern="pattern2")
json_field
: The name of the JSON field to search within.key_pattern
: A pattern to match against keys in the JSON object. Supports wildcard characters like%
and_
.value_pattern
: A pattern to match against values corresponding to the matched keys. Also supports wildcard characters.
Note: At least one of key_pattern
or value_pattern
must be provided.
3. Examples
Example 1: Match Both Key and Value Patterns
filter = 'json_match(metadata, key_pattern="%abc%", value_pattern="%xyz%")'
This filter retrieves documents where the metadata
JSON field contains at least one key matching %abc%
and its corresponding value matches %xyz%
.
Example 2: Match Only Key Pattern
filter = 'json_match(metadata, key_pattern="user_%")'
This filter retrieves documents where the metadata
JSON field contains keys starting with user_
, regardless of their values.
Example 3: Match Only Value Pattern
filter = 'json_match(metadata, value_pattern="%admin%")'
This filter retrieves documents where the metadata
JSON field contains any key with a value matching %admin%
.
4. Implementation Considerations
- Pattern Matching: Implement pattern matching using SQL-like wildcards (
%
for multiple characters,_
for a single character). - Performance: Pattern matching, especially with leading wildcards (e.g.,
%abc
), can be performance-intensive. Consider providing optimization guidelines for users. - Indexing: Currently, how to support this with key index and path index
- Nested JSON: The function should clearly define behavior with nested JSON structures, potentially treating them as plain strings initially.
Describe the solution you'd like.
No response
Describe an alternate solution.
No response
Anything else? (Additional Context)
No response