Skip to content

Conversation

@wmoustafa
Copy link
Contributor

Summary

Introduces a planner-agnostic Coral type system integrated with Calcite, supporting all major SQL types, nested and parametric types, and precise TIMESTAMP(p) semantics.
Includes a full unit test suite validating nullability, nesting, and precision consistency across primitives and complex types.

Key Aspects

Core Type System

Path: coral-common/src/main/java/com/linkedin/coral/common/types/

Defines:

  • CoralDataType (base interface) and CoralTypeKind
  • CoralPrimitiveType
  • CoralDecimalType (precision, scale)
  • CoralCharType (length)
  • CoralVarcharType (length)
  • CoralArrayType (element type)
  • CoralMapType (key/value types)
  • CoralStructType
  • CoralTimestampType (fractional seconds precision 0–9)

Establishes Coral’s independent type layer.

Calcite Integration

File: CoralTypeToRelDataTypeConverter.java

  • Converts all Coral types to Calcite RelDataType
  • Preserves nullability and nested ARRAY/MAP/STRUCT types
  • Maps STRINGVARCHAR (Hive-style precision)
  • Maps TIMESTAMP(p) correctly (0s, 3ms, 6µs)

Testing

File: CoralTypeSystemTest.java

Covers:

  • All primitives with nullability
  • DECIMAL (precision/scale)
  • CHAR / VARCHAR (length)
  • ARRAY / MAP (element nullability)
  • STRUCT (simple + nested)
  • TIMESTAMP precision mapping (0, 3, 6)

Copy link
Contributor

@aastha25 aastha25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @wmoustafa ! would you also be adding hive type -> coral type converter?

Comment on lines +14 to +15
// Primitive numeric types
BOOLEAN,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for backward compatibility (& a default mapping for unkown types), maybe we should also create NULL / OTHER as it exists today in TypeConverter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know if it is used in practice. Also coral conversion heavily relies on deterministic type inference. We can always add it in the future if the use case arises?

@wmoustafa
Copy link
Contributor Author

Thanks for the PR @wmoustafa ! would you also be adding hive type -> coral type converter?

Yes. Probably in a separate PR, but I think this PR is all what we need for #556?

@aastha25
Copy link
Contributor

aastha25 commented Nov 5, 2025

Yes. Probably in a separate PR, but I think this PR is all what we need for #556?

to make #556 functional, we would need the converter between hive/iceberg type system -> CoralTypeSystem since coralTable (hiveCoralTable, IcbergCoralTable) would now expose API - CoralDataType getSchema()

@wmoustafa
Copy link
Contributor Author

Yes. Probably in a separate PR, but I think this PR is all what we need for #556?

to make #556 functional, we would need the converter between hive/iceberg type system -> CoralTypeSystem since coralTable (hiveCoralTable, IcbergCoralTable) would now expose API - CoralDataType getSchema()

Oh you meant in that way. I thought that would be part of #556. Let us coordinate how to go about that.

@aastha25
Copy link
Contributor

Code changes look good to me. A couple of minor notes that I discovered through i-testing and will be fixing as part of a small follow up PR:

(1) timestamp in hive type system maps to timestamp in Calcite with -1 precision. For backward compatibility, especially with trino engine, we need to keep it that way.
(2) This PR expands uniontype' into a struct with 'n' fields. However, previously uniontype` was being expanded into 'n+1' fields, where the extra field is a "tag" field first (INTEGER), then "field0", "field1", etc.
(3) Following is a widely used pattern in view definitions:

SELECT ..., NULL AS dummyField, ... FROM t

and this means dummyField is a 'void' hive type which needs to be represented as Coral type : NULL and subsequently Calcite type : NULL (backward compatible behavior). I will add NULL as a new coralTypeKind and add the glue logic around it.

Rest, LGTM

Copy link
Contributor

@aastha25 aastha25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for pushing this through, major milestone for Coral!

return CharType.of(charType.getLength(), nullable);
case VOID:
case UNKNOWN:
return PrimitiveType.of(CoralTypeKind.STRING, true); // Map to nullable string as a fallback
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit to use nullable field here instead of hardcoding 'true', but not a blocker.

@wmoustafa wmoustafa merged commit ec23113 into linkedin:master Nov 17, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants