Conversation

@sanujbasu sanujbasu commented Jan 22, 2026

🥞 Stacked PR

Use this link to review incremental changes.


What changes are proposed in this pull request?

  1. Adds the ability to explicitly enable table features via table properties
    using the delta.feature.<featureName> = supported syntax, matching the
    Java Kernel's behavior. Only features in ALLOWED_DELTA_FEATURES can be set
    during create; enabled features are added to the protocol's feature lists.

  2. Allows the min reader/writer versions to be set in the protocol using
    signal flags. Only protocol versions (3, 7) are supported.

Key Changes:

  • Add SET_TABLE_FEATURE_SUPPORTED_PREFIX and SET_TABLE_FEATURE_SUPPORTED_VALUE
    constants to the table_features module; move ALLOWED_DELTA_PROPERTIES to the
    table property configuration module
  • Add TableFeature::from_name() to parse feature names from strings
  • Add TableFeature::is_reader_writer() to check feature type
  • Add TableCreationConfig struct to encapsulate parsing and validation of
    user-provided table properties during CREATE TABLE operations.
  • Extract delta.feature.* signal flags into reader/writer feature lists
  • Extract delta.minReaderVersion/minWriterVersion into protocol hints
  • Strip signal flags from properties, pass remaining to metadata
  • Reject unknown features and invalid feature flag values
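
The two new TableFeature helpers can be pictured with a stripped-down sketch (hypothetical simplified signatures; the real enum in table_features carries many more variants; deletionVectors is a reader-writer feature in the Delta protocol, appendOnly is writer-only):

```rust
/// Hypothetical, stripped-down sketch of the new helpers; not the kernel's
/// actual enum, which has many more variants.
#[derive(Debug, PartialEq)]
enum TableFeature {
    DeletionVectors, // a reader-writer feature
    AppendOnly,      // a writer-only feature
}

impl TableFeature {
    /// Parse a feature name as it appears after the delta.feature. prefix.
    fn from_name(name: &str) -> Option<Self> {
        match name {
            "deletionVectors" => Some(Self::DeletionVectors),
            "appendOnly" => Some(Self::AppendOnly),
            _ => None, // unknown features are rejected by the caller
        }
    }

    /// Reader-writer features must be listed in both the reader and writer
    /// feature lists of the protocol; writer-only features only in the latter.
    fn is_reader_writer(&self) -> bool {
        matches!(self, Self::DeletionVectors)
    }
}
```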
Usage:
create_table("/path/to/table", schema, "MyApp/1.0")
    .with_table_properties([
        ("delta.minReaderVersion", "3"),
        ("delta.minWriterVersion", "7"),
    ])
    .build(&engine, Box::new(FileSystemCommitter::new()))?
    .commit(&engine)?;

The delta.feature.* properties are consumed during build() and not stored
in the final Metadata configuration, matching Java Kernel behavior.

How was this change tested?

Rust module unit tests

@codecov

codecov bot commented Jan 22, 2026

Codecov Report

❌ Patch coverage is 97.91045% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.75%. Comparing base (a58cbb1) to head (d09e389).

Files with missing lines Patch % Lines
kernel/src/table_property_protocol_config.rs 98.16% 3 Missing and 1 partial ⚠️
kernel/src/table_features/mod.rs 96.66% 0 Missing and 1 partial ⚠️
kernel/src/transaction/create_table.rs 98.66% 0 Missing and 1 partial ⚠️
kernel/src/utils.rs 91.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1655      +/-   ##
==========================================
+ Coverage   84.64%   84.75%   +0.11%     
==========================================
  Files         125      126       +1     
  Lines       34721    35005     +284     
  Branches    34721    35005     +284     
==========================================
+ Hits        29388    29668     +280     
- Misses       3983     3986       +3     
- Partials     1350     1351       +1     

let table_path = temp_dir.path().to_str().expect("Invalid path").to_string();

let engine =
create_default_engine(&url::Url::from_directory_path(&table_path).expect("Invalid URL"))
Collaborator

can we use ? to remove the .expect() calls and make this test more succinct?

Collaborator Author

We have to weigh the tradeoffs here. `?` is useful when we are testing error
propagation, but it strips some context, like the line number of the failure.

I personally prefer expect() and am kinda surprised that we don't do it in every test. I prefer test failures to be loud, with all the debug context.
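
The tradeoff in miniature, as a standalone sketch (toy parsing code, not the actual test under review):

```rust
use std::num::ParseIntError;

// With `?`, a failure is propagated as a value: the test harness reports the
// Err, but not which line produced it.
fn parse_with_question_mark(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.parse()?;
    Ok(n)
}

// With `.expect()`, a failure panics immediately, and the panic message
// includes both the custom message and the file/line of the failing call.
fn parse_with_expect(s: &str) -> i32 {
    s.parse().expect("value should be a valid integer")
}
```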

Collaborator

@scottsand-db scottsand-db Jan 23, 2026

Totally fair. Perhaps a more experienced rust dev like @nicklan or @OussamaSaoudi could chime in here as a tie breaker?

All I want is for us kernel Rust developers to agree on one best practice and stick to it.

@sanujbasu sanujbasu force-pushed the stack/create_table_3 branch from 1cbde24 to 352203b on January 23, 2026 23:55
@github-actions github-actions bot removed the breaking-change (Change that require a major version bump) label Jan 24, 2026
@sanujbasu sanujbasu force-pushed the stack/create_table_3 branch from 352203b to 484a0f2 on January 24, 2026 00:13
ensure_table_does_not_exist(storage.as_ref(), &delta_log_url, &self.path)?;
// Parse and validate table properties
let config = TablePropertyProtocolConfig::try_from(self.table_properties)?;
config.validate_for_create()?;
Collaborator

I'm not convinced this is the right abstraction for property/feature management. I think the right approach is something that takes:

  • initial metadata (if present)
  • initial protocol (if present)
  • Table property updates
  • dataLayout (whenever we ship it)

And produces

  • metadata
  • protocol

Given new user-defined properties, we should:

  • enable table features. These features may enable other features through auto-enablement
  • add new properties to the table (ex: inCommitTimestampEnablementVersion and inCommitTimestampEnablementTimestamp)
  • enforce requirements of properties/features (ex: minReaderVersion=2 will enforce that any updates we performed above do not upgrade this beyond version 2)
  • update the schema according to column mapping mode and IcebergWriterCompatV1.

We take an initial P&M because this should also be shared with alter table/updates.

@scottsand-db would love input on this since I reckon you're familiar with this part of the code.
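
A rough sketch of the shape being proposed, with simplified stand-in types (every name here is hypothetical, not the kernel's API; the deletion-vectors check stands in for the real feature/auto-enablement logic):

```rust
use std::collections::HashMap;

// Simplified stand-ins for the kernel's Metadata and Protocol actions.
#[derive(Debug, Default)]
struct Metadata {
    properties: HashMap<String, String>,
}

#[derive(Debug)]
struct Protocol {
    min_reader_version: i32,
    min_writer_version: i32,
    writer_features: Vec<String>,
}

impl Default for Protocol {
    fn default() -> Self {
        Protocol { min_reader_version: 3, min_writer_version: 7, writer_features: vec![] }
    }
}

/// One entry point that takes initial P&M (absent on create, present on alter)
/// plus property updates, and produces a reconciled, validated P&M pair.
fn resolve_protocol_and_metadata(
    initial_metadata: Option<Metadata>,
    initial_protocol: Option<Protocol>,
    property_updates: HashMap<String, String>,
) -> Result<(Metadata, Protocol), String> {
    let mut metadata = initial_metadata.unwrap_or_default();
    let mut protocol = initial_protocol.unwrap_or_default();
    for (key, value) in property_updates {
        // Enablement properties also mutate the protocol; auto-enablement,
        // derived properties, and schema rewrites would hook in here.
        if key == "delta.enableDeletionVectors" && value == "true" {
            protocol.writer_features.push("deletionVectors".to_string());
        }
        metadata.properties.insert(key, value);
    }
    // A final validation pass (as in TableConfiguration::try_new) would go
    // here, before the pair is ever written out.
    Ok((metadata, protocol))
}
```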

Comment on lines +71 to +72
let mut user_reader_version: Option<i32> = None;
let mut user_writer_version: Option<i32> = None;
Collaborator

Can we factor out reader/writer version extraction? We have a hashmap that we can directly query. The code blocks below are also mostly duplicated

            } else if key == MIN_WRITER_VERSION_PROP {
                // Parse delta.minWriterVersion
                let version: i32 = value.parse().map_err(|_| {
                    Error::generic(format!(
                        "Invalid value '{}' for '{}'. Must be an integer.",
                        value, key
                    ))
                })?;
                // Validate immediately: if specified, must be the required version
                if version != TABLE_FEATURES_MIN_WRITER_VERSION {
                    return Err(Error::generic(format!(
                        "Invalid value '{}' for '{}'. Only '{}' is supported.",
                        version, MIN_WRITER_VERSION_PROP, TABLE_FEATURES_MIN_WRITER_VERSION
                    )));
                }
                user_writer_version = Some(version);

Comment on lines +124 to +126
} else {
// Pass through to table properties
table_properties.insert(key, value);
Collaborator

We can still pass through any old delta.* property. We only filter for SET_TABLE_FEATURE_SUPPORTED_PREFIX and the min reader/writer version keys. Example:

  • delta.enableDeletionVectors would be added to table_properties. We should probably track that very closely.

Collaborator

I propose that we make three sets: delta.supported.* props, all other delta.* props, and everything else.

for the delta.* props, we should probably make that an allowlist as well. So we'll have two allowlists: one for features, and another for other delta table properties.

Comment on lines +139 to +144
let protocol = Protocol::try_new(
user_reader_version.unwrap_or(TABLE_FEATURES_MIN_READER_VERSION),
user_writer_version.unwrap_or(TABLE_FEATURES_MIN_WRITER_VERSION),
Some(reader_features.iter()),
Some(all_features.iter()),
)?;
Collaborator

We should ensure that for a generated/constructed protocol+metadata that they are valid. We currently do this in the TableConfiguration try_new constructor. We shouldn't be writing out protocol and metadata without doing this validation. This is another reason why generating both metadata and protocol here is a good idea :)

Below is a snippet of the checks we do in TableConfiguration.

        let schema = Arc::new(metadata.parse_schema()?);
        let table_properties = metadata.parse_table_properties();
        let column_mapping_mode = column_mapping_mode(&protocol, &table_properties);

        // validate column mapping mode -- all schema fields should be correctly (un)annotated
        validate_schema_column_mapping(&schema, column_mapping_mode)?;

        validate_timestamp_ntz_feature_support(&schema, &protocol)?;

        validate_variant_type_feature_support(&schema, &protocol)?;
