Description
Is your feature request related to a problem?
We have a table partitioned on one column and ZORDER-ed by a target column. AFAICS, delta-rs #4323f9e5 is wired up correctly with DataFusion version 51.
With an equality expression, results are excellent:
> explain analyze select id,type,source from r where scope_id = 973;
...
DeltaScan, metrics=[files_pruned=694, files_scanned=1]
...
1 row(s) fetched.
Elapsed 0.561 seconds.
With some source of dynamic filters, no files get pruned, even though the plan shows the exact range of the dynamic predicate:
...
DeltaScan, metrics=[files_pruned=0, files_scanned=695]
...
DataSourceExec ... projection=[scope_id, source, type, id], file_type=parquet, predicate=true AND DynamicFilter [ scope_id@0 >= 973 AND scope_id@0 <= 973 ], pruning_predicate=scope_id_null_count@1 != row_count@2 AND scope_id_max@0 >= 973 AND scope_id_null_count@1 != row_count@2 AND scope_id_min@3 <= 973
...
1 row(s) fetched.
Elapsed 151.741 seconds.
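To illustrate what I would expect, here is a minimal sketch of file-level min/max pruning for the range in the dynamic filter above (scope_id >= 973 AND scope_id <= 973). It mirrors the shape of the pruning_predicate in the plan, but it is not delta-rs or DataFusion code: FileStats, may_contain, prune, and the file names and statistics are all hypothetical and only stand in for the per-file stats in the Delta log.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FileStats:
    """Hypothetical per-file statistics for the scope_id column."""
    path: str
    row_count: int
    null_count: int
    min_value: Optional[int]
    max_value: Optional[int]


def may_contain(stats: FileStats, lower: int, upper: int) -> bool:
    """Return True if the file might hold rows with lower <= scope_id <= upper."""
    # Same idea as `scope_id_null_count != row_count`: an all-null file cannot match.
    if stats.null_count == stats.row_count:
        return False
    # Without min/max statistics the file cannot be ruled out.
    if stats.min_value is None or stats.max_value is None:
        return True
    # Same idea as `scope_id_max >= 973 AND scope_id_min <= 973`.
    return stats.max_value >= lower and stats.min_value <= upper


def prune(files: List[FileStats], lower: int, upper: int) -> Tuple[List[FileStats], int]:
    """Split files into those that must be scanned and a count of pruned ones."""
    kept = [f for f in files if may_contain(f, lower, upper)]
    return kept, len(files) - len(kept)


# Made-up statistics; a ZORDER-ed table would tend to have narrow,
# mostly non-overlapping ranges like these.
files = [
    FileStats("part-0.parquet", 1000, 0, 0, 499),
    FileStats("part-1.parquet", 1000, 0, 500, 999),
    FileStats("part-2.parquet", 1000, 0, 1000, 1499),
    FileStats("part-3.parquet", 1000, 1000, None, None),
]

kept, pruned = prune(files, 973, 973)
print([f.path for f in kept], pruned)
```

With these made-up stats only part-1.parquet survives. The point is that once the dynamic filter's bounds are known, the same statistics that serve the equality case should be enough to skip files.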
Describe the solution you'd like
I'm not sure if it's fixable with datafusion-only changes. Let me know, so I can also file or search for the underlying issue.
Eventually, I would like to see Delta's ZORDER utilized by dynamic filters.
Describe alternatives you've considered
Given the trend towards more dynamic filtering capabilities, I only see workarounds instead of true alternatives.
Amazing library btw thx!
Priority
Medium - Would be helpful
Additional context
Attached is the raw explain analyze output.
delta_rs.issue.delt.scan_dynamic_file_pruning.txt
Contribution
- I'm willing to submit a pull request for this feature
- I can help with testing this feature
- I can help with documentation for this feature