[Coral-Hive] [Coral-Trino] Make named_struct a Coral IR operator and Migrate GenericProject Function #431
yiqiangin
left a comment
LGTM
 * then the following operation is syntactically and semantically correct in Trino: CAST(colA as trinoTypeStringA)
 */
- class RelDataTypeToTrinoTypeStringConverter {
+ public class RelDataTypeToTrinoTypeStringConverter {
May I know why these three classes, including TrinoMapTransformValuesFunction and TrinoStructCastRowFunction, were made public? I don't see any usage of these classes in this PR.
These classes are used in GenericProjectTransformer for example here. Previously GenericProjectTransformer was in the same package as RelDataTypeToTrinoTypeStringConverter but now it's moved to another package.
What changes are proposed in this pull request, and why are they necessary?
This PR covers two migrations: (1) named_struct() and (2) generic_project().
[1]
This PR uses code changes from open PR #412 and adds minor modifications on top of it to be compatible with the new API.
Summary from PR #412:
This patch removes the transformation from HiveConvertletTable that converts named_struct to CAST (ROW() AS ROW()). Instead, it makes named_struct a Coral IR operator. Engine translations on the RHS are also adapted to accommodate this change. This also eliminates the need to rewrite from CAST (ROW() AS ROW()) to named_struct on the Spark side, because named_struct is now maintained all along. CastToNamedStructTransformer on the Spark side will be removed in a future PR.
This PR also introduces a Trino transformer, NamedStructToCastTransformer, which converts the Coral IR operator named_struct to its equivalent Trino-compatible operator.
This PR should address #357 and also unblocks migration of the CONCAT operator (#378).
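To make the two shapes discussed above concrete, here is a minimal, string-level sketch of how a named_struct call and its CAST (ROW() AS ROW()) counterpart relate. It is not the code in NamedStructToCastTransformer, which operates on SqlCall nodes; the class NamedStructToCastSketch, the Field holder, and both method names are hypothetical and exist only for illustration, and the field types are supplied by hand rather than derived from a RelDataType.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: NOT Coral's NamedStructToCastTransformer.
public class NamedStructToCastSketch {

  // Hypothetical holder for one struct field:
  // field name, SQL value expression, and Trino type string.
  public static final class Field {
    final String name;
    final String valueSql;
    final String trinoType;

    public Field(String name, String valueSql, String trinoType) {
      this.name = name;
      this.valueSql = valueSql;
      this.trinoType = trinoType;
    }
  }

  // Renders the Hive/Spark-style form, with names and values interleaved.
  public static String toNamedStruct(List<Field> fields) {
    List<String> args = new ArrayList<>();
    for (Field f : fields) {
      args.add("'" + f.name + "'");
      args.add(f.valueSql);
    }
    return "named_struct(" + String.join(", ", args) + ")";
  }

  // Renders the Trino-style form: values go into ROW(...),
  // names and types go into the target ROW type of the CAST.
  public static String toTrinoCast(List<Field> fields) {
    List<String> values = new ArrayList<>();
    List<String> types = new ArrayList<>();
    for (Field f : fields) {
      values.add(f.valueSql);
      types.add(f.name + " " + f.trinoType);
    }
    return "CAST(ROW(" + String.join(", ", values) + ") AS ROW("
        + String.join(", ", types) + "))";
  }

  public static void main(String[] args) {
    List<Field> fields = new ArrayList<>();
    fields.add(new Field("a", "1", "integer"));
    fields.add(new Field("b", "'x'", "varchar"));
    // prints: named_struct('a', 1, 'b', 'x')
    System.out.println(toNamedStruct(fields));
    // prints: CAST(ROW(1, 'x') AS ROW(a integer, b varchar))
    System.out.println(toTrinoCast(fields));
  }
}
```

The sketch shows why keeping named_struct in the IR is convenient: the Spark side can emit the first form directly, and only the Trino side needs the rewrite into the second form.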
[2]
This PR also migrates the Rel transformer GenericProjectToTrinoConverter to a SqlCall transformer, GenericProjectTransformer.
How was this patch tested?
./gradlew build
updated & added UTs
tested with production views for spark, avro, trino