@yyy1000 yyy1000 commented Jul 29, 2024

What changes are proposed in this pull request, and why are they necessary?

Introduction: This PR introduces incremental plan generation for Coral-Incremental. Given a RelNode (a logical plan), the feature generates multiple incremental plans that differ in how many sub-queries are computed incrementally. The number of generated plans matches the number of sub-queries in the SQL query. Each logical plan within a generated plan updates the materialized view with its results.

For example, in a multi-table join, we can choose between batch and incremental execution stages. Consider the following three-table join:

               LogicalProject#8
                    |
               LogicalJoin#7
               /        \
       LogicalProject#4   TableScan#5
               |
        LogicalJoin#3
             /   \
   TableScan#0  TableScan#1

We could generate plans as follows:

Incremental: Both joins are executed incrementally.
Part-Batch, Part-Incremental: The first join is executed incrementally, and the second join is executed in batch mode.
Batch: Both joins are executed in batch mode.

Each generated plan is represented as a List<RelNode>, where each RelNode indicates whether its corresponding sub-query is executed incrementally or in batch mode.
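As a rough illustration of the three plans listed above, the sketch below enumerates execution modes for a left-deep chain of joins. It is one possible reading of the example, not the PR's algorithm: `PlanModeSketch`, `Mode`, and `enumeratePlans` are hypothetical names, and it assumes that a plan is determined by how many of the bottom-most joins run incrementally. It deliberately uses no Calcite classes.

```java
import java.util.ArrayList;
import java.util.List;

class PlanModeSketch {
    // Hypothetical execution modes; not part of Coral or Calcite.
    enum Mode { INCREMENTAL, BATCH }

    // Assumption: joins are ordered bottom-up (index 0 is the innermost join),
    // and each plan runs the first k joins incrementally and the rest in batch.
    static List<List<Mode>> enumeratePlans(int joinCount) {
        List<List<Mode>> plans = new ArrayList<>();
        for (int k = joinCount; k >= 0; k--) {
            List<Mode> plan = new ArrayList<>();
            for (int j = 0; j < joinCount; j++) {
                plan.add(j < k ? Mode.INCREMENTAL : Mode.BATCH);
            }
            plans.add(plan);
        }
        return plans;
    }

    public static void main(String[] args) {
        // Two joins, as in the three-table example above.
        for (List<Mode> plan : enumeratePlans(2)) {
            System.out.println(plan);
        }
        // Prints:
        // [INCREMENTAL, INCREMENTAL]
        // [INCREMENTAL, BATCH]
        // [BATCH, BATCH]
    }
}
```

For the two-join example this yields the Incremental, Part-Batch/Part-Incremental, and Batch plans in the order listed above.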

Some important changes:

  1. New RelNodeGenerationTransformer, which rewrites incremental-format RelNodes by materializing the join sub-queries and combines them to generate the different complete plans.
  2. Added two helper classes within RelNodeGenerationTransformer:
    findJoinNeedsProject: Converts all RelNodes into a uniform format for later incremental plan generation.
    uniformFormat: Ensures consistent formatting of RelNodes.
  3. New convertRelPrev method, which converts the tables of a RelNode into '_prev' format to distinguish them from the original tables.
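To make the '_prev' conversion in item 3 concrete, here is a minimal sketch on a toy plan tree. It is not the PR's convertRelPrev and does not use Calcite: `PrevRewriteSketch`, `Node`, `scan`, `op`, `toPrev`, and `render` are hypothetical names, the table names `t0` and `t1` are made up for illustration, and treating '_prev' as a suffix on the table name is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class PrevRewriteSketch {
    // Hypothetical stand-in for a plan node; NOT Calcite's RelNode.
    static final class Node {
        final String kind;        // e.g. "Join", "Project", "TableScan"
        final String table;       // table name for scans, null for other nodes
        final List<Node> inputs;

        Node(String kind, String table, List<Node> inputs) {
            this.kind = kind;
            this.table = table;
            this.inputs = inputs;
        }
    }

    static Node scan(String table) {
        return new Node("TableScan", table, List.of());
    }

    static Node op(String kind, Node... inputs) {
        return new Node(kind, null, List.of(inputs));
    }

    // Returns a copy of the tree in which every scanned table is renamed to
    // <name>_prev (assumed suffix form); the original tree is left untouched.
    static Node toPrev(Node node) {
        if (node.table != null) {
            return scan(node.table + "_prev");
        }
        List<Node> rewritten = new ArrayList<>();
        for (Node input : node.inputs) {
            rewritten.add(toPrev(input));
        }
        return new Node(node.kind, null, rewritten);
    }

    // Renders the tree as a single line for inspection.
    static String render(Node node) {
        if (node.table != null) {
            return node.kind + "(" + node.table + ")";
        }
        List<String> parts = new ArrayList<>();
        for (Node input : node.inputs) {
            parts.add(render(input));
        }
        return node.kind + "(" + String.join(", ", parts) + ")";
    }

    public static void main(String[] args) {
        Node plan = op("Join", scan("t0"), scan("t1"));
        System.out.println(render(plan));          // Join(TableScan(t0), TableScan(t1))
        System.out.println(render(toPrev(plan)));  // Join(TableScan(t0_prev), TableScan(t1_prev))
    }
}
```

Returning a new tree rather than mutating in place keeps the original plan available alongside its '_prev' counterpart, which is the distinction item 3 describes.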

How was this patch tested?

Unit test

@yyy1000 yyy1000 marked this pull request as draft July 29, 2024 21:11
@yyy1000 yyy1000 changed the title from "Plan generation" to "[Coral-Incremental] Incremental plan generation for RelNode incremental rewrite" Jul 30, 2024
@yyy1000 yyy1000 marked this pull request as ready for review July 30, 2024 21:09
Comment on lines 313 to 314
* - For other RelNodes: recursively processing their inputs to ensure uniformity.
* <p>
Contributor

Not clear what this achieves.

Author

Does the example below show what it's doing?

Contributor

I am referring to "to ensure uniformity". The idea of the whole method is to ensure uniformity, so making it exclusive to "other RelNodes" does not make sense.

@wmoustafa
Contributor

I think that, overall, the PR could benefit from more extensive documentation. The current documentation on the different methods is a good start. Each method's documentation should be expanded with a concrete running example that uses actual table names in the plan, input, and output. Each method's responsibility throughout the transformation process could be illustrated with this running example. Adding the table names to your current example will make the documentation much clearer and more concrete by showing the input and output of each method.
