[SPARK-52007] [SQL] Expression IDs shouldn't be present in grouping expressions when using grouping sets #50791

Closed

mihailoale-db
Contributor

What changes were proposed in this pull request?

In this PR I propose changing .toString to toPrettySQL when constructing grouping expressions in the ResolveGroupingAnalytics rule.

Why are the changes needed?

Right now the following query passes (#x and #y are expression IDs, which are generated anew on every cluster start):

select * from values(1,2) group by grouping sets (col1,col2,col1+col2) order by (col1#x + col2#y)

After the next cluster restart, however, the expression IDs are regenerated and the same query fails. We need to fix this so that such nondeterministic behavior is no longer possible.
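The effect can be illustrated with a small toy model in Python. This is only a sketch and not Spark's code: the classes Attribute and Add, the methods debug_string and pretty_sql, the helper grouping_alias, and the starting ID values 10 and 42 are all hypothetical stand-ins. debug_string plays the role of .toString (name plus expression ID), and pretty_sql plays the role of toPrettySQL (name only).

```python
import itertools


class Attribute:
    """Hypothetical stand-in for a resolved column reference."""

    def __init__(self, name, expr_id):
        self.name = name
        self.expr_id = expr_id

    def debug_string(self):
        # In the spirit of .toString: the name carries the expression ID.
        return f"{self.name}#{self.expr_id}"

    def pretty_sql(self):
        # In the spirit of toPrettySQL: the expression ID is left out.
        return self.name


class Add:
    """Hypothetical stand-in for the expression col1 + col2."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def debug_string(self):
        return f"({self.left.debug_string()} + {self.right.debug_string()})"

    def pretty_sql(self):
        return f"({self.left.pretty_sql()} + {self.right.pretty_sql()})"


def grouping_alias(first_expr_id, render):
    """Build col1 + col2 with fresh IDs and return the name given to it.

    first_expr_id models the state of the ID counter, which is assumed
    to differ from one cluster start to the next.
    """
    ids = itertools.count(first_expr_id)
    col1 = Attribute("col1", next(ids))
    col2 = Attribute("col2", next(ids))
    return render(Add(col1, col2))


# Before the change: the name depends on the IDs, so it differs per start.
before_first_start = grouping_alias(10, lambda e: e.debug_string())
before_second_start = grouping_alias(42, lambda e: e.debug_string())

# After the change: the name is the same no matter which IDs were assigned.
after_first_start = grouping_alias(10, lambda e: e.pretty_sql())
after_second_start = grouping_alias(42, lambda e: e.pretty_sql())

print(before_first_start, before_second_start)
print(after_first_start, after_second_start)
```

In this model, a query that sorts by the ID-bearing name from the first start cannot resolve it after the second start, while the ID-free name stays stable across both.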

Does this PR introduce any user-facing change?

Some queries (and DataFrame programs) that refer to expression IDs in grouping expression names will now fail, but they would already have failed after any cluster restart (as explained above).

How was this patch tested?

Added tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label May 5, 2025
@mihailoale-db mihailoale-db force-pushed the fixgroupingsetsschema branch from 5181779 to 5a26cdb Compare May 5, 2025 17:41
@mihailoale-db
Contributor Author

@cloud-fan could you PTAL when you have time (the Docker test doesn't seem related)? Thanks

@cloud-fan
Contributor

yea the docker test is unrelated, thanks, merging to master!

@cloud-fan cloud-fan closed this in 0eab1c0 May 6, 2025

2 participants