Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSQL filtering to support exist/not-exist checks for complex data (e.g. Target attributes/metadata) #2176

Open
vasilchev opened this issue Jan 8, 2025 · 3 comments

Comments

@vasilchev
Copy link
Contributor

vasilchev commented Jan 8, 2025

Currently all possible RSQL filters compare complex data (attributes, metadata) attribute.keys to some values - e.g.

- attribute.key == value (direct comparison/match where attribute.key exists and value is matched)
- attribute.key != value (direct comparison/match where attribute.key exists and value does not match OR attribute.key does not exist at all)
- attribute.key =in= (value1,value2) (comparison/match where attribute.key exists and value is matching any of the values in provided collection)
- attribute.key =out= (value1,value2) (comparison/match where attribute.key exists and value is not matching any of the values provided collection)

In above examples some filter cases cannot be established - e.g.:

  • find all Targets with attribute.key that value is out of collection of values, OR attribute.key does not exist at all - e.g.:
    !has(attribute.key) OR attribute.key=out=(value1,value2)

  • find all Targets that have given attribute.key regardless of value - e.g.:
    has(attribute.key)
    !! this actually can be achieved even now, but is not that intuitive (e.g. using wildcards attribute.key==*) and combining it in long rsql filter could be confusing

Applicable in all use-cases where given Targets are marked with different category - e.g.

  • some Targets could have attribute.category1_version -> e.g. targets that are of kind category1 and have some version
  • some Targets could have attribute.category2_version -> e.g. targets that are of kind category2 and have some version
  • some Targets could have both

If both has(attribute.key) / !has(attribute.key) could be supported by the RSQL filterng -> creating RSQL filters for above use-case could achieve easily (and intuitive) following searches (where currently not possible, or non-intuitive):
-- filter all devices that support category1 (meaning have categor1, regardless of values)
-- filter all devices that support category2 ( meaning have category2, regardless of values)
-- filter all devices that support both category1 and category2 (meaning have category1 and category2, regardless of values)
-- filter only category1 (meaning have attribute.category1 regardless of value and does not have attribute.category2 at all)
-- filter only category2 (meaning have attribute.category2 regardless of value and does not have attribute.category1 at all)

@avgustinmm
Copy link
Contributor

avgustinmm commented Jan 9, 2025

So, shortly written (pseudo code, null means not present, not "null" string):

  • == -> x == y
  • != -> x != y or x == null
  • =in= -> x == y1 or y2
  • =out= -> (x != y1 and x != y2) and x != null

and there is no support for:

  • x == null or (x != y1 and x != y2)

I see two problems:

  • != and ==out== have different behavior regarding null - it shall be the same
  • nevertheless what semantic is chosen (consider has or has not attribute) - the other filtering is not supported.

As far as I see there are two options:

  • (A) add 'has' operator:
    • attribute =haskey= x
    • attribute =hasnokey= x
    • =in= and =out= depending on semantics (about null) will support both cases with or/and + has/hasnokey?
  • (B) treat certain string value e.g. spring value config #{null} as not supported as string but treated as null
    • attribute.x == #{null}
    • attribute.x != #{null}
    • attribute.x =in= (1, 2, #{null})
    • attribute.x =out= (1, 2, #{null})
  • (C) introduce special analogs treating null not as a string but as null (something like workaround)
    • ===
    • !==
    • =in==
    • =out==

@vasilchev
Copy link
Contributor Author

After reading your comment noticed i've made a mistake regarding current behavior of complex data (attributes) with !=, =out=

Behavior is same - it does not include null values , meaning they are evaluated only on attributes with key that have any value.(exists)

Here is a detailed output of !=, ==, =in=, =out= for complex data:

attribute.key!=value

SELECT t1.id, t1.tenant, t1.address, t1.controller_id, t1.created_at, t1.created_by, t1.description, t1.install_date, t1.last_modified_at, t1.last_modified_by, t1.last_target_query, t1.name, t1.optlock_revision, t1.request_controller_attributes, t1.sec_token, t1.update_status, t1.assigned_distribution_set, t1.installed_distribution_set, t1.target_type 
FROM {oj sp_target t1 LEFT OUTER JOIN sp_target_attributes t0 ON (t0.target_id = t1.id)} 
WHERE (((t0.attribute_key = ?) AND ((t0.attribute_value IS NULL) OR (t0.attribute_value <> ?))) AND (t1.tenant = ?))
attribute.key==value

SELECT t1.id, t1.tenant, t1.address, t1.controller_id, t1.created_at, t1.created_by, t1.description, t1.install_date, t1.last_modified_at, t1.last_modified_by, t1.last_target_query, t1.name, t1.optlock_revision, t1.request_controller_attributes, t1.sec_token, t1.update_status, t1.assigned_distribution_set, t1.installed_distribution_set, t1.target_type 
FROM {oj sp_target t1 LEFT OUTER JOIN sp_target_attributes t0 ON (t0.target_id = t1.id)} 
WHERE (((t0.attribute_key = ?) AND (t0.attribute_value = ?)) AND (t1.tenant = ?))
attribute.key=in=value

SELECT t1.id, t1.tenant, t1.address, t1.controller_id, t1.created_at, t1.created_by, t1.description, t1.install_date, t1.last_modified_at, t1.last_modified_by, t1.last_target_query, t1.name, t1.optlock_revision, t1.request_controller_attributes, t1.sec_token, t1.update_status, t1.assigned_distribution_set, t1.installed_distribution_set, t1.target_type 
FROM {oj sp_target t1 LEFT OUTER JOIN sp_target_attributes t0 ON (t0.target_id = t1.id)} 
WHERE (((t0.attribute_key = ?) AND (t0.attribute_value IN (?))) AND (t1.tenant = ?))
attribute.key=out=value

SELECT t1.id, t1.tenant, t1.address, t1.controller_id, t1.created_at, t1.created_by, t1.description, t1.install_date, t1.last_modified_at, t1.last_modified_by, t1.last_target_query, t1.name, t1.optlock_revision, t1.request_controller_attributes, t1.sec_token, t1.update_status, t1.assigned_distribution_set, t1.installed_distribution_set, t1.target_type 
FROM {oj sp_target t1 LEFT OUTER JOIN sp_target_attributes t0 ON (t0.target_id = t1.id)} 
WHERE (((t0.attribute_key = ?) AND ((t0.attribute_value IS NULL) OR NOT ((t0.attribute_value IN (?))))) AND (t1.tenant = ?))

@avgustinmm
Copy link
Contributor

avgustinmm commented Jan 10, 2025

Then the current behavior seems fine regarding the consistency. All operators work on targets HAVING the attribute (with the key).
What we don't have is just "get targets without attribute" (targets having the attribute(s) could be got, nevertheless not so trivial, with attribute.key == *).
So, at the end, we need something like:

  • (A) attribute.key == #{null}
  • (B) attribute.key =is= null

I'd vote for the B option since it is clearer and more intuitive, no "special values", and most of all, it is SQL compatible.
Regarding targets HAVING the key the equivalent would be "attribute.key =isnot= null" or to reuse =is= - "attribute.key =is= notnull" (or "attribute.key =is= not_null")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants