Consider deprecation (or removal) of PhysicalExpr::evaluate_selection #18499

@pepijnve

Is your feature request related to a problem or challenge?

Recent changes to the CaseExpr implementation replaced its uses of PhysicalExpr::evaluate_selection with custom filtering and selective evaluation logic. This was done to avoid the overhead of the generic approach evaluate_selection takes: it first filters the incoming record batch using the selection vector, then calls plain evaluate on the filtered record batch, and finally expands the result back to the length of the original record batch.
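To make the three steps of the generic approach concrete, here is a minimal sketch that models them on a single nullable integer column using only the standard library. It is not DataFusion's actual code: the `Column` alias and the `filter`, `scatter`, and `evaluate_selection_model` functions are hypothetical stand-ins, and filling unselected rows with nulls is an assumption about how the result is expanded.

```rust
/// Hypothetical stand-in for a record batch: one nullable integer column.
type Column = Vec<Option<i64>>;

/// Step 1: keep only the rows whose selection flag is true.
fn filter(input: &[Option<i64>], selection: &[bool]) -> Column {
    input
        .iter()
        .zip(selection)
        .filter(|(_, keep)| **keep)
        .map(|(value, _)| *value)
        .collect()
}

/// Step 3: expand the filtered result back to the original length.
/// Unselected rows become null (an assumption of this sketch).
fn scatter(selection: &[bool], filtered: &[Option<i64>]) -> Column {
    let mut values = filtered.iter();
    selection
        .iter()
        .map(|&keep| if keep { values.next().copied().flatten() } else { None })
        .collect()
}

/// Model of the generic approach: filter, evaluate, then scatter.
/// `evaluate` plays the role of the plain, non-selective evaluation.
fn evaluate_selection_model(
    input: &[Option<i64>],
    selection: &[bool],
    evaluate: impl Fn(&[Option<i64>]) -> Column,
) -> Column {
    let filtered = filter(input, selection); // extra copy of the selected rows
    let result = evaluate(&filtered); // step 2: plain evaluation
    scatter(selection, &result) // extra copy to restore the original length
}
```

The two copies around the call to `evaluate` are the kind of overhead that purpose-built selective evaluation logic can avoid.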

As a result of these changes, PhysicalExpr::evaluate_selection is no longer used anywhere in DataFusion itself. Since it is unused and can introduce performance overhead when it is called, it might be better to deprecate (or even remove) it.
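For reference, deprecating a trait method that has a default implementation could look roughly like the sketch below. The `Expr` trait and `Double` type are hypothetical simplifications, not the real `PhysicalExpr` definition, and the deprecation note text is only an example. The default body is abbreviated to filter plus evaluate, with the expansion back to the original length omitted.

```rust
/// Hypothetical, simplified stand-in for `PhysicalExpr`.
trait Expr {
    /// Plain evaluation over all input rows.
    fn evaluate(&self, input: &[i64]) -> Vec<i64>;

    /// Selective evaluation, marked as deprecated. Existing callers and
    /// overriding implementations still compile, but callers get a warning.
    #[deprecated(note = "filter the input and call `evaluate` directly")]
    fn evaluate_selection(&self, input: &[i64], selection: &[bool]) -> Vec<i64> {
        // Abbreviated default: evaluate only the selected rows.
        // (Expanding back to the original length is omitted in this sketch.)
        let filtered: Vec<i64> = input
            .iter()
            .zip(selection)
            .filter(|(_, keep)| **keep)
            .map(|(value, _)| *value)
            .collect();
        self.evaluate(&filtered)
    }
}

/// Example expression that doubles every value.
struct Double;

impl Expr for Double {
    fn evaluate(&self, input: &[i64]) -> Vec<i64> {
        input.iter().map(|v| v * 2).collect()
    }
}
```

Deprecating first, rather than removing outright, would give downstream implementers of the trait a release cycle to migrate.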

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

evaluate_selection was first introduced in #2068

As can be seen from the discussion at apache/arrow-rs#3620, actually implementing selective evaluation more efficiently is not trivial. Does it make sense to have this as an overridable trait function?

Labels: enhancement (New feature or request)
