-
Notifications
You must be signed in to change notification settings - Fork 200
Coral-Spark: Migrate 'CAST(ROW: RECORD_TYPE)' and 'extract_union' transformations from RelNode to SqlNode layer #354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Coral-Spark: Migrate 'CAST(ROW: RECORD_TYPE)' and 'extract_union' transformations from RelNode to SqlNode layer #354
Conversation
…s from RelNode to SqlNode layer
aastha25
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @ljfgem for the PR!
Left minor comments around documentation.
...-spark/src/main/java/com/linkedin/coral/spark/transformers/CastToNamedStructTransformer.java
Show resolved
Hide resolved
|
|
||
|
|
||
| /** | ||
| * This transformer transforms `CAST(ROW: RECORD_TYPE)` to `named_struct` in Spark. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: transforms Coral IR function ..... to Spark compatible operator named_struct....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it's suitable to call it Coral IR function, I think it's a function in CoralSqlNode converted from Coral IR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the path:
source SQL -> sourceSqlNode1 -> CoralSqlNode1 -> CoralRelNode -> CoralSqlNode2 -> TargetLangSqlNode -> target SQL
if an operator is present in CoralSqlNode1, CoralRelNode, CoralSqlNode2 and we transform this operator when we go to LanguageSqlNode, such operators can be called Coral IR functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as per the current status of our code base -
hive SQL -> HiveSqlNode (same as CoralSqlNode1) -> CoralRelNode -> CoralSqlNode2 -> SparkSqlNode -> Spark SQL
I think this operator qualifies as Coral Operator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarification, do you mean an operator is a Coral IR function if it appears in CoralSqlNode1 / CoralRelNode / CoralSqlNode2 or CoralSqlNode1 & CoralRelNode & CoralSqlNode2?
Not sure because it doesn't appear in CoralSqlNode1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for a corresponding input hive SQL, what is the operator we use in the SqlNode representation? how do we represent the named_struct on LHS?
but generally yes, an operator present in CoralSqlNode1 would be present in CoralRelNode and then generated back in CoralSqlNode2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still represented as named_struct in CoralSqlNode1, and Calcite converts it to CAST(ROW: RECORD_TYPE) in RelNode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. In that case, could you investigate if it's possible to add a lightweight override in CoralRelNodeToCoralSqlNode to such that the coralSqlNode2 generated uses the same operator named_struct as coralSqlNode1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #357 for tracking.
...ark/src/main/java/com/linkedin/coral/spark/transformers/ExtractUnionFunctionTransformer.java
Show resolved
Hide resolved
|
@aastha25 Thanks for your review! Modified these 2 comments based on your suggestions. PTAL. |
Similar to #351, this PR migrates
CAST(ROW: RECORD_TYPE)andextract_uniontransformations fromRelNodelayer inIRRelToSparkRelTransformertoSqlNodelayer.In order to avoid creating a PR that is too big, I will open several more PRs to finish migrating all the transformations in
IRRelToSparkRelTransformertoSqlNodelayer and removeIRRelToSparkRelTransformer.Tests: