Description
This refactoring impacts multiple components that are currently coupled to Hive's Table class:
(1) ParseTreeBuilder (coral-hive) is tightly coupled to Hive's org.apache.hadoop.hive.metastore.api.Table class, preventing it from working with other table formats (e.g., Iceberg). This needs to be refactored to use the unified CoralTable interface to enable multi-format support.
(2) HiveFunctionResolver (coral-hive) - Resolves Hive and Dali function names to Calcite operators
- Method: tryResolveAsDaliFunction(String functionName, @Nonnull Table table, int numOfOperands)
- Uses: Calls HiveCalciteTableAdapter.getDaliFunctionParams() and getDaliUdfDependencies() for UDF resolution
(3) HiveCalciteTableAdapter (coral-common) - Calcite adapter for Hive tables
- Provides: getDaliFunctionParams() - parses the "functions" table property (format: "f:c1 g:c2")
- Provides: getDaliUdfDependencies() - parses the "dependencies" table property (format: "o1:m1:v1 o2:m2:v2?transitive=false")
- Used by: HiveFunctionResolver for Dali UDF resolution
- (previously called HiveTable)
(4) HiveCalciteViewAdapter (coral-common) - Calcite adapter for Hive views (extends HiveCalciteTableAdapter)
- Inherits: Dali UDF metadata extraction methods
- Used by: View expansion in Calcite's relational algebra conversion
- (previously called HiveView)
(5) IcebergCalciteTableAdapter (coral-common) - Calcite adapter for Iceberg tables
- Current workaround: Uses IcebergHiveTableConverter to convert Iceberg tables to Hive Table objects for UDF resolution
- Blocks: Direct Iceberg table support in ParseTreeBuilder
- (Introduced as a new class in PR #556, "Add native Apache Iceberg table support with CoralCatalog abstraction", since (3) & (4) exist)
(6) IcebergHiveTableConverter (coral-common) - Temporary bridge (to be removed)
- Converts: IcebergCoralTable → Hive Table for backward compatibility
- Why it exists: ParseTreeBuilder and HiveFunctionResolver can't accept CoralTable yet
- Will be removed: Once this issue is resolved
- (Introduced as a new class in PR #556, "Add native Apache Iceberg table support with CoralCatalog abstraction", since (1) through (5) exist)
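The two table-property formats listed under (3) can be illustrated with a small parsing sketch. This is only an illustration of the formats quoted above, not the actual implementation of getDaliFunctionParams() or getDaliUdfDependencies(); the class DaliUdfProperties and its method names are hypothetical, and the real return types in Coral may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Coral): shows how the two property
// formats quoted in this issue could be read from a plain properties map.
final class DaliUdfProperties {
  private DaliUdfProperties() {
  }

  // Parses the "functions" property, e.g. "f:c1 g:c2" -> {f=c1, g=c2}.
  static Map<String, String> parseFunctions(Map<String, String> tableProperties) {
    Map<String, String> result = new LinkedHashMap<>();
    String raw = tableProperties.get("functions");
    if (raw == null || raw.trim().isEmpty()) {
      return result;
    }
    for (String entry : raw.trim().split("\\s+")) {
      int sep = entry.indexOf(':');
      if (sep <= 0 || sep == entry.length() - 1) {
        throw new IllegalArgumentException("Malformed function entry: " + entry);
      }
      // Function name before the first ':', class name after it.
      result.put(entry.substring(0, sep), entry.substring(sep + 1));
    }
    return result;
  }

  // Splits the "dependencies" property on whitespace, e.g.
  // "o1:m1:v1 o2:m2:v2?transitive=false" -> [o1:m1:v1, o2:m2:v2?transitive=false].
  // Each coordinate is kept verbatim, including any "?key=value" suffix.
  static List<String> parseDependencies(Map<String, String> tableProperties) {
    String raw = tableProperties.get("dependencies");
    if (raw == null || raw.trim().isEmpty()) {
      return new ArrayList<>();
    }
    return new ArrayList<>(Arrays.asList(raw.trim().split("\\s+")));
  }
}
```

Because the helper takes a plain Map<String, String> rather than a Hive Table, the same logic would apply to any table format that exposes its properties as a map.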
Refactoring Scope:
- Replace org.apache.hadoop.hive.metastore.api.Table parameters with CoralTable in all affected APIs
- Move Dali UDF metadata extraction logic from HiveCalciteTableAdapter to work with CoralTable.properties()
- Update HiveFunctionResolver to accept CoralTable instead of Hive Table
- Remove IcebergHiveTableConverter once ParseTreeBuilder accepts CoralTable
- Ensure both HiveCoralTable and IcebergCoralTable can provide UDF metadata through the unified interface
This will enable Coral to parse and translate SQL from both Hive and Iceberg tables using a unified API.
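A rough sketch of what "UDF metadata through the unified interface" could look like is given here. Every type in it is a stand-in invented for illustration: CoralTableSketch mirrors only the properties() accessor mentioned in the scope list, the two implementations stand in for HiveCoralTable and IcebergCoralTable, and the lookup is a simplified placeholder rather than the real logic of tryResolveAsDaliFunction.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Stand-in for CoralTable; only properties() is taken from the issue text.
interface CoralTableSketch {
  Map<String, String> properties();
}

// Stand-in for HiveCoralTable. Assumption: real code would delegate to the
// wrapped Hive metastore Table's parameters instead of holding a copy.
final class HiveTableSketch implements CoralTableSketch {
  private final Map<String, String> parameters;

  HiveTableSketch(Map<String, String> parameters) {
    this.parameters = new HashMap<>(parameters);
  }

  @Override
  public Map<String, String> properties() {
    return Collections.unmodifiableMap(parameters);
  }
}

// Stand-in for IcebergCoralTable. Assumption: real code would delegate to
// the wrapped Iceberg table's properties.
final class IcebergTableSketch implements CoralTableSketch {
  private final Map<String, String> tableProperties;

  IcebergTableSketch(Map<String, String> tableProperties) {
    this.tableProperties = new HashMap<>(tableProperties);
  }

  @Override
  public Map<String, String> properties() {
    return Collections.unmodifiableMap(tableProperties);
  }
}

// Hypothetical resolver fragment: depends only on the interface, so no
// Iceberg-to-Hive conversion step is needed before the lookup.
final class FunctionResolverSketch {
  private FunctionResolverSketch() {
  }

  // Returns the class registered for functionName in the "functions"
  // property ("f:c1 g:c2"), or null if the function is not listed.
  static String resolveDaliFunctionClass(String functionName, CoralTableSketch table) {
    String raw = table.properties().getOrDefault("functions", "").trim();
    String prefix = functionName + ":";
    for (String entry : raw.split("\\s+")) {
      if (entry.startsWith(prefix)) {
        return entry.substring(prefix.length());
      }
    }
    return null;
  }
}
```

With this shape, the resolver code path is identical for both formats, which is the property that would let the temporary IcebergHiveTableConverter bridge be dropped.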
Background:
Coral is introducing CoralTable as a unified abstraction for representing tables across different formats (Hive, Iceberg, etc.) as part of #556. However, ParseTreeBuilder, a core component that converts view SQL to a Hive AST (it has no schema dependency and needs only view metadata: the definition and properties), is still deeply integrated with Hive's metastore Table class throughout its API surface and internal implementation.
Minimal Example:
Before (Current - Hive-specific):
// Client code
import org.apache.hadoop.hive.metastore.api.Table;
Table hiveTable = hiveMetastore.getTable("mydb", "mytable");
ParseTreeBuilder builder = new ParseTreeBuilder(functionResolver);
SqlNode sqlNode = builder.process("SELECT * FROM mytable", hiveTable);
After (Desired - Format-agnostic):
// Client code - works with both Hive and Iceberg
import com.linkedin.coral.common.catalog.CoralTable;
import com.linkedin.coral.common.catalog.HiveCoralTable;
// For Hive tables
org.apache.hadoop.hive.metastore.api.Table hiveTable = hiveMetastore.getTable("mydb", "mytable");
CoralTable coralTable = new HiveCoralTable(hiveTable);
// OR for Iceberg tables (when available)
// CoralTable coralTable = new IcebergCoralTable(icebergTable);
ParseTreeBuilder builder = new ParseTreeBuilder(functionResolver);
SqlNode sqlNode = builder.process("SELECT * FROM mytable", coralTable);