Conversation

@jordepic

The DynamicIcebergSink does not currently allow providing a set of reasonable default table properties at table creation. This change applies these properties when a table is created (not when it is updated). This helps with specifying things like the Iceberg format version, merge-on-read vs. copy-on-write, table location, and others.

@jordepic jordepic force-pushed the FLINK_INITIAL_TABLE_PROPERTIES branch from 06425db to 1ddd45d Compare November 12, 2025 23:01
@pvary
Contributor

pvary commented Nov 13, 2025

I’ve added a few comments on the PR, but the bigger question is deciding what functionality we actually want.
Here are the ideas I’ve heard so far:

  • Apply the same table properties to every table at creation (current PR).
  • Set table properties at creation based on the DynamicRecord for each table individually.
  • Define the table location during creation (per table).
  • Update table properties to ensure a specific set of properties is always present.
  • Allow V2 → V3 migration (convert DVs) — related only in that operators need up-to-date knowledge of table properties.

We need to draw a clear line: what should the Dynamic Sink support, and what should be handled by external operators outside the Dynamic Iceberg Sink?

To make this decision, I’d like input from actual users: @jordepic, @mxm, @Guosmilesmile, or anyone else interested. Maybe someone could even start a discussion on the dev list to reach a wider audience.

If some changes are handled by external operators, we may need a mechanism for those operators to notify the Dynamic Iceberg Sink to refresh its caches. One option could be something like the Flink Kubernetes Operator's restart nonce (https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/job-management/#application-restarts-without-spec-change). If the cache detects a new nonce (greater than the current one), it could invalidate the current table cache and reload the data.
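
A minimal sketch of the nonce idea described above, assuming the sink can somehow observe a monotonically increasing nonce. The class `NonceAwareCache` and its methods are hypothetical names for illustration; they are not part of the Iceberg or Flink APIs, and how the nonce would actually reach the sink is left open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a cache that drops all entries when it sees a larger nonce.
// Not an Iceberg or Flink class; names are assumptions for illustration only.
final class NonceAwareCache<K, V> {
  private final Map<K, V> entries = new HashMap<>();
  private final Function<K, V> loader;
  private long currentNonce;

  NonceAwareCache(Function<K, V> loader, long initialNonce) {
    this.loader = loader;
    this.currentNonce = initialNonce;
  }

  // Invalidates the cache only if the observed nonce is greater than the current one.
  // Returns true when an invalidation happened.
  synchronized boolean observeNonce(long nonce) {
    if (nonce <= currentNonce) {
      return false;
    }
    currentNonce = nonce;
    entries.clear();
    return true;
  }

  // Returns the cached value, loading it on first access or after an invalidation.
  synchronized V get(K key) {
    return entries.computeIfAbsent(key, loader);
  }
}
```

With this shape, an external operator that changes table properties would only need to bump the nonce; the next lookup for each table would then reload its metadata through the loader.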

@jordepic
Author

I think our requirements, for now, are twofold:

  1. Sensible default properties on table creation
  2. Configurable locations per table (also just on creation)

I don't think we'll ever need to dynamically update table properties based on a DynamicRecord, though!

This change only addresses number 1! I'm happy to expand it to cover number 2 as well, though I'm less sure of the business logic that everybody needs there, which is why I excluded it for now.

@jordepic jordepic force-pushed the FLINK_INITIAL_TABLE_PROPERTIES branch from 1ddd45d to 3805827 Compare November 13, 2025 17:33
@mxm
Contributor

mxm commented Nov 14, 2025

Let's start with a narrow scope for this feature. We could always extend it to DynamicRecord, but I'm not convinced the feature should live in DynamicRecord because table properties aren't connected directly to the table data (that's why I kept it outside in #13883). The majority of use cases will only require setting table properties on table creation. Concerning (2), we've had users request setting the table location as well, so I would suggest adding this feature alongside the table properties on table creation.

@pvary
Contributor

pvary commented Nov 14, 2025

Do we need different properties for different tables, or can we use/set the same properties for every table?

@mxm
Contributor

mxm commented Nov 14, 2025

I think users will need the table identifier.
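
To illustrate why the table identifier matters, here is a rough sketch of a creation-time hook that derives both properties and location from it. Everything here is hypothetical (`TableCreationSettings`, `TableCreationSettingsProvider`, the `audit.` prefix rule, the warehouse path layout) and is not the interface added by this PR; only the property keys `format-version`, `write.delete.mode`, and `write.update.mode` are existing Iceberg table properties.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical value holder for what should be applied when a table is created.
final class TableCreationSettings {
  final Map<String, String> properties;
  final Optional<String> location;

  TableCreationSettings(Map<String, String> properties, Optional<String> location) {
    this.properties = Map.copyOf(properties);
    this.location = location;
  }
}

// Hypothetical hook: consulted once per table, at creation time only.
@FunctionalInterface
interface TableCreationSettingsProvider {
  TableCreationSettings forTable(String tableIdentifier);
}

final class CreationSettingsExample {
  // Builds a provider with shared defaults plus one per-table override.
  static TableCreationSettingsProvider defaults(String warehouseRoot) {
    return identifier -> {
      Map<String, String> props = new HashMap<>();
      props.put("format-version", "2");
      props.put("write.delete.mode", "merge-on-read");
      props.put("write.update.mode", "merge-on-read");

      // Assumed rule for illustration: tables in the "audit" namespace use copy-on-write.
      if (identifier.startsWith("audit.")) {
        props.put("write.delete.mode", "copy-on-write");
        props.put("write.update.mode", "copy-on-write");
      }

      // Assumed layout: one directory per namespace level under the warehouse root.
      String location = warehouseRoot + "/" + identifier.replace('.', '/');
      return new TableCreationSettings(props, Optional.of(location));
    };
  }
}
```

A provider that ignores the identifier covers the "same properties for every table" case, so a per-identifier hook is a superset of the original proposal.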

@jordepic jordepic force-pushed the FLINK_INITIAL_TABLE_PROPERTIES branch from 3805827 to a1aca93 Compare November 14, 2025 15:53
@pvary
Contributor

pvary commented Nov 14, 2025

Please rename the PR to match the newly added feature.

@jordepic jordepic force-pushed the FLINK_INITIAL_TABLE_PROPERTIES branch from a1aca93 to 59e291a Compare November 14, 2025 16:06
@jordepic jordepic changed the title from "Flink: Set table properties on DynamicIcebergSink table creation" to "Flink: Add TableCreator interface to set table properties/location on DynamicIcebergSink table creation" Nov 14, 2025
@jordepic
Author

> Please rename the PR to match the new feature added

Done.

The DynamicIcebergSink does not currently allow providing
a set of reasonable default table properties at creation time.
This change will apply these properties when a table is created
(not updated). This helps for specifying things like iceberg
version, merge-on-read vs. copy-on-write, table location,
and others.
@jordepic jordepic force-pushed the FLINK_INITIAL_TABLE_PROPERTIES branch from 59e291a to 274d118 Compare November 14, 2025 18:02

4 participants