[BugFix] fix slot id conflicts in array_map (backport #52692) #52759

mergify · 2024-11-08T13:33:44Z

Why I'm doing:

What I'm doing:

Before we rewrite the lambda expression, we need to find the max slot id currently used, and the slot id of new rewritten ColumnRef needs to be allocated from then on.

In the previous implementation, max_used_slot_id only counts the slot ids used in the lambda expression.
If multiple children of the same expression contain lambda expression, such as coalesce(element_at(array_map(x->x+any_match(array_map(x->x<10,arr_1)), arr_1), 1),element_at(array_map(x->x+any_match(array_map(x->x<10,arr_2)), arr_2),1)), then there is a possibility of duplicate allocation of slot ids.

In this PR, I made some changes to avoid this problem:

When allocating slot ids, allocate from the maximum slot id used by the entire expression tree to avoid conflicts in slot ids within the expression tree. But this is not global after all. If there are multiple expression trees doing the same thing, there may still be problems with duplicate allocations.
Considering that the result of outer common expression will only be used locally by the lambda function, as long as we avoid polluting the input chunk, it doesn’t matter even if the local slot id is repeated, so we choose to use a temporary chunk to execute the lambda expression.

Note: This is just a temporary solution and a bit ugly. When we support expression reuse strategy of scan predicates, the slot id allocation of common expressions will be handed over to FE, which has a global perspective and will definitely ensure that there will be no duplicate allocations. This is the most elegant solution. I will support it later.

What type of PR is this:

Does this PR entail a change in behavior?

Yes, this PR will result in a change in behavior.
No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

Interface/UI changes: syntax, type conversion, expression evaluation, display information
Parameter changes: default values, similar parameters but with different default values
Policy changes: use new policy to replace old one, functionality automatically enabled
Feature removed
Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

I have added test cases for my bug fix or my new feature
This pr needs user documentation (for new or modified features or behaviors)
- I have added documentation for my new feature or new function
This is a backport pr

Bugfix cherry-pick branch check:

This is an automatic backport of pull request #52692 done by [Mergify](https://mergify.com). ## Why I'm doing:

What I'm doing:

Before we rewrite the lambda expression, we need to find the max slot id currently used, and the slot id of new rewritten ColumnRef needs to be allocated from then on.

In the previous implementation, max_used_slot_id only counts the slot ids used in the lambda expression.
If multiple children of the same expression contain lambda expression, such as coalesce(element_at(array_map(x->x+any_match(array_map(x->x<10,arr_1)), arr_1), 1),element_at(array_map(x->x+any_match(array_map(x->x<10,arr_2)), arr_2),1)), then there is a possibility of duplicate allocation of slot ids.

In this PR, I made some changes to avoid this problem:

When allocating slot ids, allocate from the maximum slot id used by the entire expression tree to avoid conflicts in slot ids within the expression tree. But this is not global after all. If there are multiple expression trees doing the same thing, there may still be problems with duplicate allocations.
Considering that the result of outer common expression will only be used locally by the lambda function, as long as we avoid polluting the input chunk, it doesn’t matter even if the local slot id is repeated, so we choose to use a temporary chunk to execute the lambda expression.

Note: This is just a temporary solution and a bit ugly. When we support expression reuse strategy of scan predicates, the slot id allocation of common expressions will be handed over to FE, which has a global perspective and will definitely ensure that there will be no duplicate allocations. This is the most elegant solution. I will support it later.

What type of PR is this:

Does this PR entail a change in behavior?

Yes, this PR will result in a change in behavior.
No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

Interface/UI changes: syntax, type conversion, expression evaluation, display information
Parameter changes: default values, similar parameters but with different default values
Policy changes: use new policy to replace old one, functionality automatically enabled
Feature removed
Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

I have added test cases for my bug fix or my new feature
This pr needs user documentation (for new or modified features or behaviors)
- I have added documentation for my new feature or new function
This is a backport pr

Signed-off-by: silverbullet233 <[email protected]> (cherry picked from commit 59dc2e1) # Conflicts: # be/src/exprs/array_map_expr.cpp # be/src/exprs/expr.cpp

mergify · 2024-11-08T13:33:45Z

Cherry-pick of 59dc2e1 has failed:

On branch mergify/bp/branch-3.2/pr-52692
Your branch is up to date with 'origin/branch-3.2'.

You are currently cherry-picking commit 59dc2e14df.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   be/src/exprs/expr.h
	modified:   be/src/exprs/lambda_function.cpp
	modified:   be/src/exprs/lambda_function.h
	modified:   test/sql/test_array_fn/R/test_array_map_2
	modified:   test/sql/test_array_fn/T/test_array_map_2

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   be/src/exprs/array_map_expr.cpp
	both modified:   be/src/exprs/expr.cpp

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

mergify · 2024-11-08T13:34:20Z

@mergify[bot]: Backport conflict, please reslove the conflict and resubmit the pr

silverbullet233 · 2024-11-15T08:20:42Z

@mergify rebase

mergify · 2024-11-15T08:20:46Z

rebase

☑️ Nothing to do

-conflict [📌 rebase requirement]
-closed [📌 rebase requirement]
queue-position = -1 [📌 rebase requirement]
any of:
- #commits-behind > 0 [📌 rebase requirement]
- #commits > 1 [📌 rebase requirement]
- -linear-history [📌 rebase requirement]

[BugFix] fix slot id conflicts in array_map (#52692)

b665839

Signed-off-by: silverbullet233 <[email protected]> (cherry picked from commit 59dc2e1) # Conflicts: # be/src/exprs/array_map_expr.cpp # be/src/exprs/expr.cpp

mergify bot added the conflicts label Nov 8, 2024

mergify bot mentioned this pull request Nov 8, 2024

[BugFix] fix slot id conflicts in array_map #52692

Merged

24 tasks

github-actions bot assigned silverbullet233 Nov 8, 2024

github-actions bot added the automerge label Nov 8, 2024

mergify bot closed this Nov 8, 2024

silverbullet233 reopened this Nov 15, 2024

silverbullet233 closed this Nov 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[BugFix] fix slot id conflicts in array_map (backport #52692) #52759

[BugFix] fix slot id conflicts in array_map (backport #52692) #52759

Uh oh!

mergify bot commented Nov 8, 2024 •

edited by wanpengfei-git

Loading

Uh oh!

mergify bot commented Nov 8, 2024

Uh oh!

mergify bot commented Nov 8, 2024

Uh oh!

silverbullet233 commented Nov 15, 2024

Uh oh!

mergify bot commented Nov 15, 2024

Uh oh!

Uh oh!

[BugFix] fix slot id conflicts in array_map (backport #52692) #52759

[BugFix] fix slot id conflicts in array_map (backport #52692) #52759

Uh oh!

Conversation

mergify bot commented Nov 8, 2024 • edited by wanpengfei-git Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why I'm doing:

What I'm doing:

What type of PR is this:

Checklist:

Bugfix cherry-pick branch check:

What I'm doing:

What type of PR is this:

Checklist:

Uh oh!

mergify bot commented Nov 8, 2024

Uh oh!

mergify bot commented Nov 8, 2024

Uh oh!

silverbullet233 commented Nov 15, 2024

Uh oh!

mergify bot commented Nov 15, 2024

☑️ Nothing to do

Uh oh!

Uh oh!

mergify bot commented Nov 8, 2024 •

edited by wanpengfei-git

Loading