Skip to content

Conversation

@findinpath
Copy link
Contributor

@findinpath findinpath commented Nov 15, 2023

Consider the following scenario to translate via Coral:

CREATE TABLE test.table_join_view_scenario_simple_table (key string);
CREATE TABLE test.table_join_view_scenario_table_nested_columns (
      studentid struct<idtype:string,id:string>, 
      details struct<lastupdated:timestamp,name:string, misc:struct<enrollmentdate:date,major:string>>);

CREATE VIEW test.view_table_join_view_scenario_table_nested_columns AS 
SELECT studentid.id student_id, details.name as student_name, details.misc.major major, details.lastUpdated details_last_updated 
FROM test.table_join_view_scenario_table_nested_columns

CREATE VIEW test.view_table_join_with_view_table_with_nested_columns AS 
SELECT v.student_id, v.student_name, v.details_last_updated
FROM test.view_table_join_view_scenario_table_nested_columns v 
INNER JOIN test.table_join_view_scenario_simple_table t 
     ON concat('U', v.student_id) = t.key;

When we have an function call in the JOIN expression like in the example provided:

concat('U', v.student_id) = t.key

we are then having a new LogicalProject (different from the leaf LogicalProject corresponding to the view view_table_join_view_scenario_table_nested_columns see org.apache.calcite.sql2rel.SqlToRelConverter#leaves) which includes the expression CAST(concat('U', $0.id)):VARCHAR(2147483647) RexCall .
This project is not considered a leaf.

In the flattening algorithm from org.apache.calcite.sql2rel.SqlToRelConverter.Blackboard#flatten therefore the LogicalProject with the expressions

0 = {RexFieldAccess@8734} "$0.id"
1 = {RexFieldAccess@8735} "$1.name"
2 = {RexFieldAccess@8736} "$1.misc.major"
3 = {RexFieldAccess@8737} "$1.lastupdated"
4 = {RexCall@8738} "CAST(concat('U', $0.id)):VARCHAR(2147483647)"

will be flattened therefore again (because it is not a leaf) and we obtain therefore the LogicalTableScan corresponding to the table table_join_view_scenario_table_nested_columns which contains only two fields:

"studentid" -> {RelRecordType@8755} "RecordType:peek_no_expand(VARCHAR(2147483647) idtype, VARCHAR(2147483647) id)"
"details" -> {RelRecordType@8757} "RecordType:peek_no_expand(TIMESTAMP lastupdated, VARCHAR(2147483647) name, RecordType:peek_no_expand(DATE enrollmentdate, VARCHAR(2147483647) major) misc)"

Therefore the type corresponding to the LogicalTableScan having only two fields is being returned for usage in SqlToRelConverter.convertIdentifier.

However, details_last_updated has the index 3 in the LogicalProject

0 = {RexFieldAccess@8734} "$0.id"
1 = {RexFieldAccess@8735} "$1.name"
2 = {RexFieldAccess@8736} "$1.misc.major"
3 = {RexFieldAccess@8737} "$1.lastupdated"

This is why we stumble in an AssertionError in RexBuilder.makeFieldAccess :

java.lang.AssertionError: Field ordinal 3 is invalid for  type 

org.apache.calcite.sql2rel.SqlToRelConverter.Blackboard#flatten source code:

    public void flatten(
        List<RelNode> rels,
        int systemFieldCount,
        int[] start,
        List<Pair<RelNode, Integer>> relOffsetList) {
      for (RelNode rel : rels) {
        if (leaves.contains(rel) || rel instanceof LogicalMatch) {
          relOffsetList.add(
              Pair.of(rel, start[0]));
          start[0] += rel.getRowType().getFieldCount();
        } else {
          if (rel instanceof LogicalJoin
              || rel instanceof LogicalAggregate) {
            start[0] += systemFieldCount;
          }
          flatten(
              rel.getInputs(),
              systemFieldCount,
              start,
              relOffsetList);
        }
      }
    }

This PR serves only for showcasing a potential unidentified bug in calcite-core

Stacktrace:

java.lang.AssertionError: Field ordinal 3 is invalid for  type 'RecordType:peek_no_expand(RecordType:peek_no_expand(VARCHAR(2147483647) idtype, VARCHAR(2147483647) id) studentid, RecordType:peek_no_expand(TIMESTAMP lastupdated, VARCHAR(2147483647) name, RecordType:peek_no_expand(DATE enrollmentdate, VARCHAR(2147483647) major) misc) details)'
	at org.apache.calcite.rex.RexBuilder.makeFieldAccess(RexBuilder.java:197)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3725)
	at org.apache.calcite.sql2rel.SqlToRelConverter.access$2200(SqlToRelConverter.java:217)
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4796)
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4092)
	at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:317)
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4656)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3939)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:670)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:627)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3181)
	at com.linkedin.coral.hive.hive2rel.HiveSqlToRelConverter.convertQuery(HiveSqlToRelConverter.java:63)
	at com.linkedin.coral.common.ToRelConverter.toRel(ToRelConverter.java:157)
	at com.linkedin.coral.common.ToRelConverter.convertView(ToRelConverter.java:124)
	at com.linkedin.coral.trino.rel2trino.HiveToTrinoConverterTest.testViews(HiveToTrinoConverterTest.java:44)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:701)
	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:893)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1218)
	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
	at org.testng.TestRunner.privateRun(TestRunner.java:758)
	at org.testng.TestRunner.run(TestRunner.java:613)
	at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329)
	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291)
	at org.testng.SuiteRunner.run(SuiteRunner.java:240)

What changes are proposed in this pull request, and why are they necessary?

How was this patch tested?

@DataProvider(name = "viewTestCases")
public Object[][] viewTestCasesProvider() {
return new Object[][] {
return new Object[][] { { "test", "view_table_join_with_view_table_with_nested_columns", "..." },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AssertionError happens in the translation of the view currently without reaching the point where the view translation gets compared with ....

Stacktrace:

java.lang.AssertionError: Field ordinal 3 is invalid for  type 'RecordType:peek_no_expand(RecordType:peek_no_expand(VARCHAR(2147483647) idtype, VARCHAR(2147483647) id) studentid, RecordType:peek_no_expand(TIMESTAMP lastupdated, VARCHAR(2147483647) name, RecordType:peek_no_expand(DATE enrollmentdate, VARCHAR(2147483647) major) misc) details)'
	at org.apache.calcite.rex.RexBuilder.makeFieldAccess(RexBuilder.java:197)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3725)
	at org.apache.calcite.sql2rel.SqlToRelConverter.access$2200(SqlToRelConverter.java:217)
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4796)
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4092)
	at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:317)
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4656)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3939)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:670)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:627)
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3181)
	at com.linkedin.coral.hive.hive2rel.HiveSqlToRelConverter.convertQuery(HiveSqlToRelConverter.java:63)
	at com.linkedin.coral.common.ToRelConverter.toRel(ToRelConverter.java:157)
	at com.linkedin.coral.common.ToRelConverter.convertView(ToRelConverter.java:124)
	at com.linkedin.coral.trino.rel2trino.HiveToTrinoConverterTest.testViews(HiveToTrinoConverterTest.java:44)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:701)
	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:893)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1218)
	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
	at org.testng.TestRunner.privateRun(TestRunner.java:758)
	at org.testng.TestRunner.run(TestRunner.java:613)
	at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329)
	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291)
	at org.testng.SuiteRunner.run(SuiteRunner.java:240)

@wmoustafa
Copy link
Contributor

Thanks for the PR. Could you simplify the table/view names in the test case? Also update the description to highlight the key reason why the test case fails (you could use the simplified names from the test). Right now, it is a bit hard to reverse engineer the problem from the test code alone.

@findinpath
Copy link
Contributor Author

@wmoustafa AC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants