-
Notifications
You must be signed in to change notification settings - Fork 205
Add native Apache Iceberg table support with CoralCatalog abstraction #556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
aastha25
wants to merge
16
commits into
linkedin:master
Choose a base branch
from
aastha25:icebergtype
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
bbb4332
initial commit for iceberg schema integration
aastha25 fd6b6c0
preserve iceberg timestamp precision
aastha25 f1cdbf2
fix the wiring for icebergDataset
aastha25 5d84162
backward compatibility of iceberg table for parsing only
aastha25 f54221f
refactr IcebergTypeConverter
aastha25 9b072ea
decouple coralCatalog & hievmetastoreclient
aastha25 27061e2
minor version bump & mark old interfaces deprecated
aastha25 9908631
glue logic for CoralCatalog in coral-trino
aastha25 f38d897
rebase changes
aastha25 ee4d9de
include IcebergToCoral type system & related changes
aastha25 ec45fd0
enable viz tests
aastha25 5ee910c
rename classes & the integ
aastha25 c9762dc
link issue 575 in all relevant classes
aastha25 980cb1b
Rename user facing classes to HiveTable & IcebergTable
aastha25 fa8fb33
resolve some more comments
aastha25 06d926a
refactor code
aastha25 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
174 changes: 174 additions & 0 deletions
174
coral-common/src/main/java/com/linkedin/coral/common/CoralDatabaseSchema.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,174 @@ | ||
| /** | ||
| * Copyright 2017-2026 LinkedIn Corporation. All rights reserved. | ||
| * Licensed under the BSD-2 Clause license. | ||
| * See LICENSE in the project root for license information. | ||
| */ | ||
| package com.linkedin.coral.common; | ||
|
|
||
| import java.util.Collection; | ||
| import java.util.Set; | ||
|
|
||
| import javax.annotation.Nonnull; | ||
|
|
||
| import com.google.common.collect.ImmutableList; | ||
| import com.google.common.collect.ImmutableSet; | ||
|
|
||
| import org.apache.calcite.linq4j.tree.Expression; | ||
| import org.apache.calcite.rel.type.RelProtoDataType; | ||
| import org.apache.calcite.schema.*; | ||
|
|
||
| import com.linkedin.coral.common.catalog.CoralCatalog; | ||
| import com.linkedin.coral.common.catalog.CoralTable; | ||
| import com.linkedin.coral.common.catalog.HiveTable; | ||
| import com.linkedin.coral.common.catalog.IcebergTable; | ||
|
|
||
| import static com.google.common.base.Preconditions.checkNotNull; | ||
| import static com.linkedin.coral.common.catalog.TableType.VIEW; | ||
|
|
||
|
|
||
| /** | ||
| * Coral's database-level Calcite adapter for {@link CoralCatalog} integration. | ||
| * | ||
| * <p>This is the modern iteration of Coral's database-level schema adapter that bridges | ||
| * {@link CoralCatalog} to Calcite's {@link Schema} interface. It represents a specific database/namespace | ||
| * and dispatches table lookups to the appropriate format-specific implementations. | ||
| * | ||
| * <p><b>Multi-format Dispatch:</b> This class automatically dispatches to the correct table implementation | ||
| * based on the underlying table format: | ||
| * <ul> | ||
| * <li>{@link IcebergCalciteTableAdapter} for Iceberg tables ({@link IcebergTable})</li> | ||
| * <li>{@link HiveCalciteTableAdapter} for Hive tables ({@link HiveTable})</li> | ||
| * <li>{@link HiveCalciteViewAdapter} for Hive views ({@link HiveTable} with VIEW type)</li> | ||
| * </ul> | ||
| * | ||
| * <p><b>Relationship to HiveDbSchema:</b> | ||
| * <ul> | ||
| * <li>{@link CoralDatabaseSchema} (this class) - Modern CoralCatalog-based integration (new code)</li> | ||
| * <li>{@link HiveDbSchema} - Legacy dual-mode adapter supporting both CoralCatalog and HiveMetastoreClient (backward compatibility)</li> | ||
| * </ul> | ||
| * | ||
| * <p><b>Usage:</b> This class is instantiated by {@link CoralRootSchema} when a database subschema is requested. | ||
| * | ||
| * @see CoralRootSchema Root-level adapter (contains databases) | ||
| * @see HiveDbSchema Legacy adapter with HiveMetastoreClient support | ||
| * @see CoralCatalog Unified catalog interface | ||
| */ | ||
| public class CoralDatabaseSchema implements Schema { | ||
|
|
||
| /** | ||
| * Default database name. | ||
| */ | ||
| public static final String DEFAULT_DB = "default"; | ||
|
|
||
| private final CoralCatalog coralCatalog; | ||
| private final String dbName; | ||
|
|
||
| /** | ||
| * Creates a CoralDatabaseSchema for a specific database using CoralCatalog. | ||
| * | ||
| * @param coralCatalog Coral catalog providing unified access to tables | ||
| * @param dbName Database name (must not be null) | ||
| */ | ||
| CoralDatabaseSchema(@Nonnull CoralCatalog coralCatalog, @Nonnull String dbName) { | ||
| this.coralCatalog = checkNotNull(coralCatalog, "coralCatalog cannot be null"); | ||
| this.dbName = checkNotNull(dbName, "dbName cannot be null"); | ||
| } | ||
|
|
||
| /** | ||
| * Returns a Calcite Table for the specified table name. | ||
| * | ||
| * <p>This method performs format-aware dispatch: | ||
| * <ul> | ||
| * <li>Iceberg tables → {@link IcebergCalciteTableAdapter}</li> | ||
| * <li>Hive views → {@link HiveCalciteViewAdapter}</li> | ||
| * <li>Hive tables → {@link HiveCalciteTableAdapter}</li> | ||
| * </ul> | ||
| * | ||
| * @param name Table name | ||
| * @return Calcite Table implementation, or null if table doesn't exist | ||
| */ | ||
| @Override | ||
| public Table getTable(String name) { | ||
| CoralTable coralTable = coralCatalog.getTable(dbName, name); | ||
| if (coralTable == null) { | ||
| return null; | ||
| } | ||
|
|
||
| // Dispatch based on CoralTable implementation type | ||
| if (coralTable instanceof IcebergTable) { | ||
| return new IcebergCalciteTableAdapter((IcebergTable) coralTable); | ||
| } else if (coralTable instanceof HiveTable) { | ||
| HiveTable hiveTable = (HiveTable) coralTable; | ||
| // Check if it's a view | ||
| if (hiveTable.tableType() == VIEW) { | ||
| return new HiveCalciteViewAdapter(hiveTable, ImmutableList.of(CoralRootSchema.ROOT_SCHEMA, dbName)); | ||
| } else { | ||
| return new HiveCalciteTableAdapter(hiveTable); | ||
| } | ||
| } | ||
|
|
||
| return null; | ||
| } | ||
|
|
||
| /** | ||
| * Returns all table names in this database. | ||
| * | ||
| * @return Set of table names | ||
| */ | ||
| @Override | ||
| public Set<String> getTableNames() { | ||
| return ImmutableSet.copyOf(coralCatalog.getAllTables(dbName)); | ||
| } | ||
|
|
||
| @Override | ||
| public RelProtoDataType getType(String s) { | ||
| return null; | ||
| } | ||
|
|
||
| @Override | ||
| public Set<String> getTypeNames() { | ||
| return null; | ||
| } | ||
|
|
||
| @Override | ||
| public Collection<Function> getFunctions(String name) { | ||
| return ImmutableList.of(); | ||
| } | ||
|
|
||
| @Override | ||
| public Set<String> getFunctionNames() { | ||
| return ImmutableSet.of(); | ||
| } | ||
|
|
||
| /** | ||
| * A database does not have subschemas. | ||
| * | ||
| * @param name Subschema name | ||
| * @return null (databases don't have subschemas) | ||
| */ | ||
| @Override | ||
| public Schema getSubSchema(String name) { | ||
| return null; | ||
| } | ||
|
|
||
| @Override | ||
| public Set<String> getSubSchemaNames() { | ||
| return ImmutableSet.of(); | ||
| } | ||
|
|
||
| @Override | ||
| public Expression getExpression(SchemaPlus parentSchema, String name) { | ||
| return null; | ||
| } | ||
|
|
||
| @Override | ||
| public boolean isMutable() { | ||
| return true; | ||
| } | ||
|
|
||
| // TODO: This needs to be snapshot of current state of catalog | ||
| @Override | ||
| public Schema snapshot(SchemaVersion schemaVersion) { | ||
| return this; | ||
| } | ||
| } |
151 changes: 151 additions & 0 deletions
151
coral-common/src/main/java/com/linkedin/coral/common/CoralRootSchema.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,151 @@ | ||
| /** | ||
| * Copyright 2017-2026 LinkedIn Corporation. All rights reserved. | ||
| * Licensed under the BSD-2 Clause license. | ||
| * See LICENSE in the project root for license information. | ||
| */ | ||
| package com.linkedin.coral.common; | ||
|
|
||
| import java.util.Collection; | ||
| import java.util.Set; | ||
|
|
||
| import javax.annotation.Nonnull; | ||
|
|
||
| import com.google.common.collect.ImmutableList; | ||
| import com.google.common.collect.ImmutableSet; | ||
|
|
||
| import org.apache.calcite.linq4j.tree.Expression; | ||
| import org.apache.calcite.rel.type.RelProtoDataType; | ||
| import org.apache.calcite.schema.*; | ||
|
|
||
| import com.linkedin.coral.common.catalog.CoralCatalog; | ||
|
|
||
| import static com.google.common.base.Preconditions.checkNotNull; | ||
|
|
||
|
|
||
| /** | ||
| * Coral's root-level Calcite adapter for {@link CoralCatalog} integration. | ||
| * | ||
| * <p>This is the modern iteration of Coral's Calcite schema adapter that bridges | ||
| * {@link CoralCatalog} to Calcite's {@link Schema} interface. It represents the "root" schema | ||
| * that holds all databases as subschemas and contains no tables directly. | ||
| * | ||
| * <p><b>Multi-format Support:</b> This class supports multiple table formats (Hive, Iceberg, etc.) | ||
| * through the unified {@link CoralCatalog} interface, enabling format-agnostic SQL query processing. | ||
| * | ||
| * <p><b>Schema Name:</b> The schema name remains "hive" for backward compatibility with existing | ||
| * SQL queries and qualified table names, even though this class supports multiple formats beyond Hive. | ||
| * | ||
| * <p><b>Relationship to HiveSchema:</b> | ||
| * <ul> | ||
| * <li>{@link CoralRootSchema} (this class) - Modern CoralCatalog-based integration (new code)</li> | ||
| * <li>{@link HiveSchema} - Legacy dual-mode adapter supporting both CoralCatalog and HiveMetastoreClient (backward compatibility)</li> | ||
| * </ul> | ||
| * | ||
| * <p><b>Usage:</b> This class is instantiated by {@link ToRelConverter} when constructed with a | ||
| * {@link CoralCatalog} parameter. | ||
| * | ||
| * @see CoralDatabaseSchema Database-level adapter (contains tables) | ||
| * @see HiveSchema Legacy adapter with HiveMetastoreClient support | ||
| * @see CoralCatalog Unified catalog interface | ||
| */ | ||
| public class CoralRootSchema implements Schema { | ||
|
|
||
| /** | ||
| * Schema name used in Calcite (e.g., "hive"."database"."table"). | ||
| * Remains "hive" for backward compatibility despite multi-format support. | ||
| */ | ||
| public static final String ROOT_SCHEMA = "hive"; | ||
|
|
||
| /** | ||
| * Default database name. | ||
| */ | ||
| public static final String DEFAULT_DB = "default"; | ||
|
|
||
| private final CoralCatalog coralCatalog; | ||
|
|
||
| /** | ||
| * Creates a CoralRootSchema using CoralCatalog for unified multi-format table access. | ||
| * | ||
| * @param coralCatalog Coral catalog providing unified access to Hive, Iceberg, and other table formats | ||
| */ | ||
| public CoralRootSchema(@Nonnull CoralCatalog coralCatalog) { | ||
| this.coralCatalog = checkNotNull(coralCatalog, "coralCatalog cannot be null"); | ||
| } | ||
|
|
||
| /** | ||
| * This always returns null as root schema does not have tables. | ||
| * Tables are contained in database-level subschemas. | ||
| * | ||
| * @param name Table name | ||
| * @return null (root schema has no tables) | ||
| */ | ||
| @Override | ||
| public Table getTable(String name) { | ||
| return null; | ||
| } | ||
|
|
||
| @Override | ||
| public Set<String> getTableNames() { | ||
| return ImmutableSet.of(); | ||
| } | ||
|
|
||
| @Override | ||
| public RelProtoDataType getType(String s) { | ||
| return null; | ||
| } | ||
|
|
||
| @Override | ||
| public Set<String> getTypeNames() { | ||
| return null; | ||
| } | ||
|
|
||
| @Override | ||
| public Collection<Function> getFunctions(String name) { | ||
| return ImmutableList.of(); | ||
| } | ||
|
|
||
| @Override | ||
| public Set<String> getFunctionNames() { | ||
| return ImmutableSet.of(); | ||
| } | ||
|
|
||
| /** | ||
| * Returns a database-level subschema if the database exists. | ||
| * | ||
| * @param name Database/namespace name | ||
| * @return CoralDatabaseSchema for the database, or null if database doesn't exist | ||
| */ | ||
| @Override | ||
| public Schema getSubSchema(String name) { | ||
| if (!coralCatalog.namespaceExists(name)) { | ||
| return null; | ||
| } | ||
| return new CoralDatabaseSchema(coralCatalog, name); | ||
| } | ||
|
|
||
| /** | ||
| * Returns all database/namespace names in the catalog. | ||
| * | ||
| * @return Set of database names | ||
| */ | ||
| @Override | ||
| public Set<String> getSubSchemaNames() { | ||
| return ImmutableSet.copyOf(coralCatalog.getAllNamespaces()); | ||
| } | ||
|
|
||
| @Override | ||
| public Expression getExpression(SchemaPlus parentSchema, String name) { | ||
| return null; | ||
| } | ||
|
|
||
| @Override | ||
| public boolean isMutable() { | ||
| return true; | ||
| } | ||
|
|
||
| // TODO: This needs to be snapshot of current state of catalog | ||
| @Override | ||
| public Schema snapshot(SchemaVersion schemaVersion) { | ||
| return this; | ||
| } | ||
| } |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another reason why this needs to be a plugin. This should integrate with OSS Iceberg too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
specifically I want to bring in the custom shaded distribution of li-iceberg-hive-metastore here. But in general, I feel ok but Coral depending on li-iceberg.