feat(datafusion): implement the partitioning node for DataFusion to define the partitioning #1620
Conversation
/// If no suitable hash columns are found (e.g., unpartitioned, non-bucketed table),
/// falls back to round-robin batch partitioning for even load distribution.
fn determine_partitioning_strategy(
This is interesting to see. At first I thought of just two cases:
- If it's a partitioned table, we should just hash partition.
- If it's not partitioned, we should just use round-robin partitioning.
However, this reminds me of another case: range-only partitioning, e.g. where the only partitions are on columns like date or time. I think in this case we should also use round-robin partitioning, since most data is concentrated in just a few partitions.
Also, I don't think we should take write.distribution-mode into account for now. The examples you use are for Spark, but they are not applicable to DataFusion.
However, this reminds me of another case: range-only partitioning, e.g. where the only partitions are on columns like date or time. I think in this case we should also use round-robin partitioning, since most data is concentrated in just a few partitions.
Hmm, you are right. Range partitions concentrate data in the most recent partitions, making hash partitioning counterproductive (consider a date column with a temporal partition).
Since DataFusion doesn't provide a range partitioning variant, the fallback is round-robin rather than hashing.
Briefly:
- Hash partition: Only on bucket columns (partition spec + sort order)
- Round-robin: Everything else (unpartitioned, range, identity, temporal transforms)
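To make the bucket-only rule concrete, here is a minimal sketch of that decision. The helper name pick_partitioning and the column_index lookup closure are hypothetical, it only inspects the partition spec (not the sort order), and it keys the hash columns off the partition-field names rather than resolving source_id against the table schema, so it merely mirrors the role of determine_partitioning_strategy:

```rust
use std::sync::Arc;

use datafusion::physical_expr::expressions::Column;
use datafusion::physical_expr::PhysicalExpr;
use datafusion::physical_plan::Partitioning;
use iceberg::spec::{PartitionSpec, Transform};

/// Sketch: hash-partition only on bucket transforms; everything else
/// (unpartitioned, identity, range/temporal transforms such as day/hour)
/// falls back to round-robin batches.
fn pick_partitioning(
    spec: &PartitionSpec,
    // Hypothetical lookup from a column name to its index in the input schema.
    column_index: impl Fn(&str) -> Option<usize>,
    target_partitions: usize,
) -> Partitioning {
    let bucket_exprs: Vec<Arc<dyn PhysicalExpr>> = spec
        .fields()
        .iter()
        .filter(|f| matches!(f.transform, Transform::Bucket(_)))
        .filter_map(|f| {
            column_index(f.name.as_str())
                .map(|idx| Arc::new(Column::new(&f.name, idx)) as Arc<dyn PhysicalExpr>)
        })
        .collect();

    if bucket_exprs.is_empty() {
        // No bucket columns: spread batches evenly instead of hashing.
        Partitioning::RoundRobinBatch(target_partitions)
    } else {
        Partitioning::Hash(bucket_exprs, target_partitions)
    }
}
```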
Also, I don't think we should take write.distribution-mode into account for now. The examples you use are for Spark, but they are not applicable to DataFusion.
Oh, good point, I misunderstood this. I thought it was an iceberg-rust table property.
let hash_column_names: Vec<&str> = partition_spec
Why do we need this part? The physical plan is generated by IcebergTableProvider, so we can be sure that the _partition column will be inserted.
Since the physical plan is generated by IcebergTableProvider, which always calls project_with_partition() for partitioned tables, we can guarantee that the _partition column will be present, so this part is unnecessary if we stick to the DataFusion plan creation. Let me rework this: I will instead require the _partition column to be available.
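As a minimal sketch of that reworked direction (the helper name is hypothetical; it assumes the reserved _partition column name and inspects only the input Arrow schema), the node could simply fail planning when the column added by project_with_partition() is missing:

```rust
use datafusion::arrow::datatypes::SchemaRef;
use datafusion::error::{DataFusionError, Result};

/// Sketch: require the `_partition` column produced upstream by
/// `project_with_partition()` instead of re-deriving partition values here.
fn partition_column_index(input_schema: &SchemaRef) -> Result<usize> {
    input_schema.index_of("_partition").map_err(|e| {
        DataFusionError::Plan(format!(
            "expected a `_partition` column in the repartition input schema: {e}"
        ))
    })
}
```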
/// Returns whether repartitioning is actually needed by comparing input and desired partitioning
fn needs_repartitioning(input: &Arc<dyn ExecutionPlan>, desired: &Partitioning) -> bool {
I'm confused about why we need this part. For a partitioned table, the input is expected to be a ProjectionExec, and we should always add the partitioning. For an unpartitioned table, we should simply return RoundRobinBatch.
The benefit of this check is to avoid redundant repartitioning when the input already has the correct partitioning, when the node is built multiple times, or during optimization passes.
Since DataFusion normally builds the plan once (no manual calls), it's okay to remove it.
Tell me what you think; I'm okay with removing this part while integrating the other changes.
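For reference, a sketch of what the discussed check could look like, assuming a DataFusion version where output_partitioning() is provided by the ExecutionPlanProperties trait. It only compares variants and partition counts; a complete check would also compare the hash expressions:

```rust
use std::sync::Arc;

use datafusion::physical_plan::{ExecutionPlan, ExecutionPlanProperties, Partitioning};

/// Sketch: skip inserting a repartition node when the input already exposes
/// the desired partitioning shape.
fn needs_repartitioning(input: &Arc<dyn ExecutionPlan>, desired: &Partitioning) -> bool {
    match (input.output_partitioning(), desired) {
        // Same round-robin fan-out: nothing to do.
        (Partitioning::RoundRobinBatch(a), Partitioning::RoundRobinBatch(b)) => a != b,
        // Hash vs. hash: only the partition counts are compared in this sketch.
        (Partitioning::Hash(_, a), Partitioning::Hash(_, b)) => a != b,
        // Any other combination: repartition to be safe.
        _ => true,
    }
}
```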
…the best partition strategy for Iceberg for writing
- Implement hash partitioning for partitioned/bucketed tables
- Use round-robin partitioning for unpartitioned tables
- Support range distribution mode approximation via sort columns
…robin for range partitions Signed-off-by: Florian Valeye <[email protected]>
…rtitioning strategy Signed-off-by: Florian Valeye <[email protected]>
…e unused parameters
Signed-off-by: Florian Valeye <[email protected]>
…itioning strategy logic Signed-off-by: Florian Valeye <[email protected]>
…taFusion plan and requiring the _partition column Signed-off-by: Florian Valeye <[email protected]>
Which issue does this PR close?
What changes are included in this PR?
Implement a physical execution repartition node that determines the relevant DataFusion partitioning strategy based on the Iceberg table schema and metadata.
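As an illustration of how the chosen strategy could be wired into the write path (a sketch under stated assumptions, not necessarily how repartition.rs implements it), the selected Partitioning can be applied by wrapping the projected input in DataFusion's RepartitionExec:

```rust
use std::sync::Arc;

use datafusion::error::Result;
use datafusion::physical_plan::repartition::RepartitionExec;
use datafusion::physical_plan::{ExecutionPlan, Partitioning};

/// Sketch: wrap the projected input with the partitioning selected from the
/// Iceberg table metadata before handing the plan to the writer.
fn repartition_for_write(
    input: Arc<dyn ExecutionPlan>,
    partitioning: Partitioning,
) -> Result<Arc<dyn ExecutionPlan>> {
    Ok(Arc::new(RepartitionExec::try_new(input, partitioning)?))
}
```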
Minor change: I created a new schema_ref() helper method.
Are these changes tested?
Yes, with unit tests.