Description
Is your feature request related to a problem or challenge?
Recent changes to the implementation of CaseExpr have replaced uses of PhysicalExpr::evaluate_selection with custom filtering and selective evaluation logic. This was done to avoid the overhead of the generic approach evaluate_selection takes: it first filters the incoming record batch using the selection vector, then calls plain evaluate on the filtered record batch, and finally expands the result back to the length of the original record batch.
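For readers unfamiliar with the pattern, here is a minimal sketch of the three steps described above (filter, evaluate, expand). It is not DataFusion's implementation: plain slices and `Vec`s stand in for Arrow record batches and arrays, and the `evaluate` and `evaluate_selection` functions below are hypothetical free functions written only for illustration.

```rust
/// Hypothetical stand-in for an expression's plain `evaluate`:
/// here it simply multiplies every input value by 10.
fn evaluate(input: &[i64]) -> Vec<i64> {
    input.iter().map(|v| v * 10).collect()
}

/// Illustrative version of the generic filter / evaluate / expand approach.
/// Rows that are not selected come back as `None` (standing in for nulls).
fn evaluate_selection(input: &[i64], selection: &[bool]) -> Vec<Option<i64>> {
    assert_eq!(input.len(), selection.len());

    // Step 1: filter the input down to the selected rows (copies data).
    let mut filtered = Vec::new();
    for (value, keep) in input.iter().zip(selection) {
        if *keep {
            filtered.push(*value);
        }
    }

    // Step 2: run the plain evaluation on the filtered rows only.
    let evaluated = evaluate(&filtered);

    // Step 3: expand the result back to the original length (copies again),
    // consuming one evaluated value per selected row.
    let mut results = evaluated.into_iter();
    selection
        .iter()
        .map(|&keep| if keep { results.next() } else { None })
        .collect()
}
```

Even in this toy form, the data is copied once when filtering and once more when expanding, on top of the evaluation itself. That extra work is the kind of overhead the custom logic in CaseExpr was written to avoid.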
As a result of these changes, PhysicalExpr::evaluate_selection is no longer used anywhere in DataFusion itself. Since it is unused and can introduce performance overhead, it might be better to deprecate (or even remove) it.
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
evaluate_selection was first introduced in #2068
As the discussion at apache/arrow-rs#3620 shows, actually implementing selective evaluation more efficiently is not trivial. Does it make sense to have this as an overridable trait function?