@joyhaldar
Summary

This PR adds a file-pruning optimization for NOT IN and != predicates when a file contains a single distinct value for the column (i.e., when min == max).

Problem

Currently, InclusiveMetricsEvaluator cannot prune files for NOT IN and != predicates, even when the file provably contains no matching rows.

Solution

When min == max and the file has no nulls for the column, every row holds the same value, so we can safely prune the file if:

  • For NOT IN: the single value is in the exclusion list
  • For !=: the single value equals the literal
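
The two conditions above can be sketched as a standalone check. This is only an illustration of the rule, not the code in this PR: `SingleValuePruning`, its method names, and the plain `min`/`max`/`nullCount` parameters are hypothetical and do not mirror the InclusiveMetricsEvaluator API. It also assumes the statistics are present; missing bounds are treated as "cannot prune".

```java
import java.util.Set;

// Hypothetical helper for illustration only; not part of Iceberg's API.
final class SingleValuePruning {
    private SingleValuePruning() {}

    // True when the stats prove every row holds the same non-null value.
    // Missing bounds (null min or max) never qualify.
    static <T> boolean hasSingleValue(T min, T max, long nullCount) {
        return min != null && max != null && nullCount == 0 && min.equals(max);
    }

    // col != literal matches no rows if the only value in the file is the literal.
    static <T> boolean canPruneNotEq(T min, T max, long nullCount, T literal) {
        return hasSingleValue(min, max, nullCount) && min.equals(literal);
    }

    // col NOT IN (excluded) matches no rows if the only value is in the exclusion list.
    static <T> boolean canPruneNotIn(T min, T max, long nullCount, Set<T> excluded) {
        return hasSingleValue(min, max, nullCount) && excluded.contains(min);
    }
}
```

For example, under these assumptions a file with min = max = 5 and no nulls would be pruned for `col != 5` and for `col NOT IN (3, 5, 7)`, but not once the null count is non-zero, since null rows may still need to be scanned.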

Testing

  • Added unit tests for both notIn and notEq optimizations
  • Verified correct behavior with nulls (must scan) and without nulls (can prune)

Fixes #14592

@github-actions github-actions bot added the API label Nov 15, 2025
Successfully merging this pull request may close these issues.

NOT IN and != predicates do not prune files when min == max