Skip to content

Push down join key filters for LEFT/RIGHT/ANTI joins #19917

@nuno-faria

Description

@nuno-faria

Describe the bug

If we have a filter on a join key, we can push it down to both relations. Currently, it is only pushed for inner joins, although it can be applied to other join types.

To Reproduce

In this example, k = 1 can be pushed to both tables, since t1.k = t2.k. Right now, it is only pushed to t1:

create table t1 (k int, v int);
create table t2 (k int, v int);

explain select * from t1 left join t2 on t1.k = t2.k where t1.k = 1;
+---------------+------------------------------------------------------------+
| plan_type     | plan                                                       |
+---------------+------------------------------------------------------------+
| physical_plan | ┌───────────────────────────┐                              |
|               | │        HashJoinExec       │                              |
|               | │    --------------------   │                              |
|               | │      join_type: Left      ├──────────────┐               |
|               | │        on: (k = k)        │              │               |
|               | └─────────────┬─────────────┘              │               |
|               | ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ |
|               | │      RepartitionExec      ││      RepartitionExec      │ |
|               | │    --------------------   ││    --------------------   │ |
|               | │ partition_count(in->out): ││ partition_count(in->out): │ |
|               | │          1 -> 12          ││          1 -> 12          │ |
|               | │                           ││                           │ |
|               | │    partitioning_scheme:   ││    partitioning_scheme:   │ |
|               | │      Hash([k@0], 12)      ││      Hash([k@0], 12)      │ |
|               | └─────────────┬─────────────┘└─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ |
|               | │         FilterExec        ││       DataSourceExec      │ |
|               | │    --------------------   ││    --------------------   │ |
|               | │      predicate: k = 1     ││          bytes: 0         │ |
|               | │                           ││       format: memory      │ |
|               | │                           ││          rows: 0          │ |
|               | └─────────────┬─────────────┘└───────────────────────────┘ |
|               | ┌─────────────┴─────────────┐                              |
|               | │       DataSourceExec      │                              |
|               | │    --------------------   │                              |
|               | │          bytes: 0         │                              |
|               | │       format: memory      │                              |
|               | │          rows: 0          │                              |
|               | └───────────────────────────┘                              |
|               |                                                            |
+---------------+------------------------------------------------------------+

Expected behavior

Filter pushed down to both tables (when possible).

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions