Refactor SQL parsing components to use CoralTable instead of Hive Table #575

@aastha25

This refactoring impacts multiple components that are currently coupled to Hive's Table class:

(1) ParseTreeBuilder (coral-hive) - Tightly coupled to Hive's org.apache.hadoop.hive.metastore.api.Table class, which prevents it from working with other table formats (e.g., Iceberg). It needs to be refactored to use the unified CoralTable interface to enable multi-format support.

(2) HiveFunctionResolver (coral-hive) - Resolves Hive and Dali function names to Calcite operators

  • Method: tryResolveAsDaliFunction(String functionName, @Nonnull Table table, int numOfOperands)
  • Uses: Calls HiveCalciteTableAdapter.getDaliFunctionParams() and getDaliUdfDependencies() for UDF resolution

(3) HiveCalciteTableAdapter (coral-common) - Calcite adapter for Hive tables

  • Provides: getDaliFunctionParams() - parses "functions" table property (format: "f:c1 g:c2")
  • Provides: getDaliUdfDependencies() - parses "dependencies" table property (format: "o1:m1:v1 o2:m2:v2?transitive=false")
  • Used by: HiveFunctionResolver for Dali UDF resolution
  • (previously called HiveTable)
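
To make the two property formats above concrete, here is a minimal sketch of how such values could be split. It is not the actual HiveCalciteTableAdapter implementation: the class name DaliPropertySketch and both method names are hypothetical, and the sketch assumes that entries are separated by whitespace and that each "functions" entry is a name:class pair, which is only what the format strings above suggest.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Coral: illustrates the property formats only.
class DaliPropertySketch {

  // Splits a "functions" value such as "f:c1 g:c2" into {f -> c1, g -> c2}.
  // Assumption: entries are whitespace-separated and the first ':' divides name from class.
  static Map<String, String> parseFunctions(String value) {
    Map<String, String> result = new LinkedHashMap<>();
    if (value == null || value.trim().isEmpty()) {
      return result;
    }
    for (String entry : value.trim().split("\\s+")) {
      int idx = entry.indexOf(':');
      if (idx <= 0 || idx == entry.length() - 1) {
        throw new IllegalArgumentException("Malformed function entry: " + entry);
      }
      result.put(entry.substring(0, idx), entry.substring(idx + 1));
    }
    return result;
  }

  // Splits a "dependencies" value such as "o1:m1:v1 o2:m2:v2?transitive=false"
  // into its whitespace-separated entries, leaving each entry untouched.
  static List<String> parseDependencies(String value) {
    List<String> result = new ArrayList<>();
    if (value == null || value.trim().isEmpty()) {
      return result;
    }
    for (String entry : value.trim().split("\\s+")) {
      result.add(entry);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parseFunctions("f:c1 g:c2"));
    System.out.println(parseDependencies("o1:m1:v1 o2:m2:v2?transitive=false"));
  }
}
```

Because both helpers take plain strings, the same logic could be fed from any table abstraction that exposes its properties as a map.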

(4) HiveCalciteViewAdapter (coral-common) - Calcite adapter for Hive views (extends HiveCalciteTableAdapter)

  • Inherits: Dali UDF metadata extraction methods
  • Used by: View expansion in Calcite's relational algebra conversion
  • (previously called HiveView)

(5) IcebergCalciteTableAdapter (coral-common) - Calcite adapter for Iceberg tables

(6) IcebergHiveTableConverter (coral-common) - Temporary bridge (to be removed)

Refactoring Scope:

  • Replace org.apache.hadoop.hive.metastore.api.Table parameters with CoralTable in all affected APIs
  • Move Dali UDF metadata extraction logic from HiveCalciteTableAdapter to work with CoralTable.properties()
  • Update HiveFunctionResolver to accept CoralTable instead of Hive Table
  • Remove IcebergHiveTableConverter once ParseTreeBuilder accepts CoralTable
  • Ensure both HiveCoralTable and IcebergCoralTable can provide UDF metadata through the unified interface
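
The scope above could look roughly like the following once UDF lookup depends only on table properties. This is a sketch under assumptions, not the proposed Coral API: TableProps is a hypothetical stand-in for CoralTable (only properties() is named in this issue), resolveDaliFunctionClass is an invented name, and exact-match lookup on the function name is an assumption.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: UDF lookup that depends only on a properties map,
// so it does not care whether the table is backed by Hive or Iceberg.
class UnifiedUdfLookupSketch {

  // Stand-in for CoralTable; the real interface may expose more methods.
  interface TableProps {
    Map<String, String> properties();
  }

  // Returns the class registered for functionName in the "functions" property, if any.
  // Assumption: names are matched exactly (case-sensitive).
  static Optional<String> resolveDaliFunctionClass(String functionName, TableProps table) {
    String functions = table.properties().get("functions");
    if (functions == null) {
      return Optional.empty();
    }
    for (String entry : functions.trim().split("\\s+")) {
      String[] parts = entry.split(":", 2);
      if (parts.length == 2 && parts[0].equals(functionName)) {
        return Optional.of(parts[1]);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Two tables of different formats, represented here only by their properties.
    TableProps hiveLike = () -> Collections.singletonMap("functions", "f:c1 g:c2");
    TableProps icebergLike = () -> Collections.singletonMap("functions", "h:c3");

    System.out.println(resolveDaliFunctionClass("g", hiveLike));
    System.out.println(resolveDaliFunctionClass("h", icebergLike));
    System.out.println(resolveDaliFunctionClass("g", icebergLike));
  }
}
```

The design point is that the resolver never sees a format-specific type, which is what would let HiveFunctionResolver drop its dependency on the Hive Table class.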

This will enable Coral to parse and translate SQL from both Hive and Iceberg tables using a unified API.

Background:
Coral is introducing CoralTable as a unified abstraction for representing tables across different formats (Hive, Iceberg, etc.) as part of #556. However, ParseTreeBuilder, a core component that converts view SQL to a Hive AST, is still deeply integrated with Hive's metastore Table class throughout its API surface and internal implementation. This conversion has no schema dependency; it needs only the view metadata (definition and properties).

Minimal Example:

Before (Current - Hive-specific):

// Client code
import org.apache.hadoop.hive.metastore.api.Table;

Table hiveTable = hiveMetastore.getTable("mydb", "mytable");
ParseTreeBuilder builder = new ParseTreeBuilder(functionResolver);
SqlNode sqlNode = builder.process("SELECT * FROM mytable", hiveTable);

After (Desired - Format-agnostic):

// Client code - works with both Hive and Iceberg
import com.linkedin.coral.common.catalog.CoralTable;
import com.linkedin.coral.common.catalog.HiveCoralTable;

// For Hive tables
org.apache.hadoop.hive.metastore.api.Table hiveTable = hiveMetastore.getTable("mydb", "mytable");
CoralTable coralTable = new HiveCoralTable(hiveTable);

// OR for Iceberg tables (when available)
// CoralTable coralTable = new IcebergCoralTable(icebergTable);

ParseTreeBuilder builder = new ParseTreeBuilder(functionResolver);
SqlNode sqlNode = builder.process("SELECT * FROM mytable", coralTable);
