-
Notifications
You must be signed in to change notification settings - Fork 13.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-37403] Convert interval join to regular join for updating source #26230
base: master
Are you sure you want to change the base?
Conversation
- Previously, if inputs to interval join contains updates the planner throws an exception. - This commit introduces an enhancement to convert interval join to regular join if any of the input to interval join contains updates
ModifyKind.UPDATE) || inputModifyKindSet.contains(ModifyKind.DELETE) | ||
if (containsUpdatesOrDeletes) { | ||
// Convert to regular join if the input contains Updates | ||
val isInnerJoin = intervalJoin.joinSpec.getJoinType == FlinkJoinType.INNER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why only inner and not all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as the RegularJoin condition
RegularJoin has conditions for INNER and SEMI. However, IntervalJoin doesn't support SEMI
Ref: https://github.com/apache/flink/pull/26230/files#diff-ddf95eb3949ab889e8e3bccdacb9f57553aac06657574fd6316ca17dd48321a1R262
@@ -35,7 +35,13 @@ public IntervalJoinRestoreTest() { | |||
public List<TableTestProgram> programs() { | |||
return Arrays.asList( | |||
IntervalJoinTestPrograms.INTERVAL_JOIN_EVENT_TIME, | |||
IntervalJoinTestPrograms.INTERVAL_JOIN_EVENT_TIME_UPDATING_SOURCE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need a restore test. Restoring doesn't need to be tested.
I added a SemanticTestBase
recently to test an execution only:
Take this as an example: #26236
intervalJoin.getCondition, | ||
intervalJoin.getJoinType, | ||
intervalJoin.getHints) | ||
createNewNode(regularJoin, children, providedTrait, requiredTrait, requester) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be that this approach has one major issue. What is the data type of the timestamp column? Is it still marked as a time attribute or is it a regular column? If it is still marked as a time attribute, windows or subsequent interval joins would still be allowed by the planner. Of course the output would be garbage in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a commit to materialize time attributes while processing IntervalJoin if the source has updates.
This makes the conversion in FlinkChangelogModeInferenceProgram
redundant.
- Materialize time attributes in IntervalJoin if any of the source contains updates - This will convert IntervalJoin to RegularJoin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a comment on the Jira asking a question; could you address that question please, in case it proves to be a better approach? The question is can we incorporate updates by amending the interval join processing? Changing the interval join to a regular join, seems strange as we are not honouring the requested join type - so could be confusing to the user, would a meaningful error message also be an option. WDYT?
What is the purpose of the change
Brief change log
Verifying this change
This change added tests and can be verified as follows:
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (no)Documentation