Flink: add append capability to dynamic iceberg sink (#14526) #14559
Conversation
    Arrays.stream(result.dataFiles()).forEach(appendFiles::appendFile);
  }
}
String description = "append";
nit: newline after the block. See: https://iceberg.apache.org/contribute/#block-spacing
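For reference, a tiny sketch of the linked block-spacing convention (hypothetical names), i.e. a blank line after the closing brace of a control-flow block:

if (shouldCommit) {
  commit();
}

return result;  // blank line separates the block from the next statement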
    branch,
    appendFiles,
    summary,
    description,
Can we use the string value here? We don't reuse the description variable anywhere else.
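I.e., something along these lines (the call shape is taken from the snippet later in this thread):

commitOperation(
    table,
    branch,
    appendFiles,
    summary,
    "append",
    newFlinkJobId,
    operatorId,
    checkpointId);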
Arrays.stream(result.dataFiles()).forEach(rowDelta::addRows);
Arrays.stream(result.deleteFiles()).forEach(rowDelta::addDeletes);
if (summary.deleteFilesCount() == 0) {
  // To be compatible with iceberg format V1.
Could we change this comment to describe correctly why we do this?
Please add test cases for the new feature to ensure the fix isn’t accidentally reverted in the future. Also, implement the changes only in the latest Flink version (Flink 2.1). This will speed up the review process since you won’t need to ensure every requested change is merged across all versions. We’ll create a separate backport PR later to apply the agreed changes to older Flink versions.
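A minimal sketch of what such a test could assert, assuming a harness that runs the dynamic sink over append-only input against a Table named table (DataOperations.APPEND is the constant for the "append" snapshot operation):

// After the sink commits append-only input, the resulting snapshot should be
// an "append" rather than an "overwrite" / row-delta operation.
table.refresh();
Snapshot snapshot = table.currentSnapshot();
assertThat(snapshot.operation()).isEqualTo(DataOperations.APPEND);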
mxm left a comment
Thank you for the PR @bezdomniy! Initially, we only change code in the latest Flink version (2.1). After merging the changes, we backport to the older versions. Could you remove the 1.20 and 2.0 changes?
I've left some suggestions inline.
// the position delete files that are being committed.
Arrays.stream(result.dataFiles()).forEach(rowDelta::addRows);
Arrays.stream(result.deleteFiles()).forEach(rowDelta::addDeletes);
if (summary.deleteFilesCount() == 0) {
I'm not sure about the granularity of this value, as every pending result could contain or not contain deletes. Probably best to check the WriteResults directly.
The deleteFilesCount should be correct.
This is how it is calculated:
iceberg/flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/DynamicCommitter.java, line 223 in 059310e:

summary.addAll(pendingResults);
And internally:
iceberg/flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/sink/CommitSummary.java, lines 46 to 66 in 059310e:
public void addAll(NavigableMap<Long, List<WriteResult>> pendingResults) {
  pendingResults.values().forEach(writeResults -> writeResults.forEach(this::addWriteResult));
}

private void addWriteResult(WriteResult writeResult) {
  dataFilesCount.addAndGet(writeResult.dataFiles().length);
  Arrays.stream(writeResult.dataFiles())
      .forEach(
          dataFile -> {
            dataFilesRecordCount.addAndGet(dataFile.recordCount());
            dataFilesByteCount.addAndGet(dataFile.fileSizeInBytes());
          });
  deleteFilesCount.addAndGet(writeResult.deleteFiles().length);
  Arrays.stream(writeResult.deleteFiles())
      .forEach(
          deleteFile -> {
            deleteFilesRecordCount.addAndGet(deleteFile.recordCount());
            long deleteBytes = ScanTaskUtil.contentSizeInBytes(deleteFile);
            deleteFilesByteCount.addAndGet(deleteBytes);
          });
}
The value is correct, but we commit at checkpoint level while the delete file count is aggregated across checkpoints. Strictly speaking, pendingResults could contain both append-only checkpoints and overwrite checkpoints.
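To illustrate with hypothetical numbers (the per-checkpoint check below has the same semantics as the loop proposed later in this thread):

// Hypothetical pendingResults spanning two checkpoints:
//   checkpoint 1 -> WriteResult(2 data files, 0 delete files)  // append-only
//   checkpoint 2 -> WriteResult(1 data file, 3 delete files)   // has deletes
// summary.deleteFilesCount() == 3 across both checkpoints, so a check on the
// cross-checkpoint summary would commit checkpoint 1 as a RowDelta even
// though it is append-only. Checking each checkpoint's WriteResults directly
// avoids that:
boolean appendOnly =
    writeResults.stream().allMatch(result -> result.deleteFiles().length == 0);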
Interesting decision.
In IcebergSink we commit multiple checkpoints together in a single commit if we happen to accumulate multiple of them.
When we fixed #14182, we decided not to do that. In fact, you proposed not to do that: #14182 (comment) 🤓
IMHO this kind of optimization is a bit premature. In practice it is rare to even have multiple pending checkpoints.
Also, as I have mentioned in the comment, this could cause issues if "replacePartition" is used.
So we should commit them one by one.
    String operatorId) {
  for (Map.Entry<Long, List<WriteResult>> e : pendingResults.entrySet()) {
    long checkpointId = e.getKey();
    List<WriteResult> writeResults = e.getValue();
Can we keep the loop structure? We will need it for both types of snapshots. This should work:
for (Map.Entry<Long, List<WriteResult>> e : pendingResults.entrySet()) {
  long checkpointId = e.getKey();
  List<WriteResult> writeResults = e.getValue();

  // A checkpoint can be committed as a plain append only if none of its
  // write results carry delete files.
  boolean appendOnly = true;
  for (WriteResult writeResult : writeResults) {
    if (writeResult.deleteFiles().length > 0) {
      appendOnly = false;
      break;
    }
  }

  final SnapshotUpdate<?> snapshotUpdate;
  if (appendOnly) {
    AppendFiles appendFiles = table.newAppend().scanManifestsWith(workerPool);
    for (WriteResult result : writeResults) {
      Arrays.stream(result.dataFiles()).forEach(appendFiles::appendFile);
    }
    snapshotUpdate = appendFiles;
  } else {
    RowDelta rowDelta = table.newRowDelta().scanManifestsWith(workerPool);
    for (WriteResult result : writeResults) {
      Arrays.stream(result.dataFiles()).forEach(rowDelta::addRows);
      Arrays.stream(result.deleteFiles()).forEach(rowDelta::addDeletes);
    }
    snapshotUpdate = rowDelta;
  }

  commitOperation(
      table,
      branch,
      snapshotUpdate,
      summary,
      appendOnly ? "append" : "rowDelta",
      newFlinkJobId,
      operatorId,
      checkpointId);
}
This looks good. The only thing left is the checkState for result.referencedDataFiles().length == 0 (which exists in IcebergSink), which I will add to the loop that checks whether the checkpoint is append-only.
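A sketch of that addition (the Preconditions.checkState call and message mirror IcebergSink; placing it inside the detection loop is my reading of the comment above):

boolean appendOnly = true;
for (WriteResult writeResult : writeResults) {
  if (writeResult.deleteFiles().length > 0) {
    appendOnly = false;
    break;
  }

  // Mirrors the guard in IcebergSink: an append-only result must not carry
  // referenced data files (those come from position delete files).
  Preconditions.checkState(
      writeResult.referencedDataFiles().length == 0,
      "Should have no referenced data files for append.");
}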
if (summary.deleteFilesCount() == 0) {
  // Use append snapshot operation where possible
  AppendFiles appendFiles = table.newAppend().scanManifestsWith(workerPool);
  for (List<WriteResult> resultList : pendingResults.values()) {
Can we move this loop up one level, as in https://github.com/apache/iceberg/pull/14559/files#r2513719871? This avoids repeating it.
Added "append" snapshot operation to org.apache.iceberg.flink.sink.dynamic.DynamicIcebergSink
Used the same logic as currently in org.apache.iceberg.flink.sink.FlinkSink
Tested with org.apache.iceberg.flink.source.IcebergSource using StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT and it resolves the issue described in #14526
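For reference, a minimal sketch of the source setup used for that verification (assuming an existing TableLoader named loader; the builder methods are part of the public IcebergSource API):

// Streaming read that starts from the latest snapshot; before this PR,
// append-only commits from the dynamic sink surfaced as overwrites and
// broke this starting strategy (#14526).
IcebergSource<RowData> source =
    IcebergSource.forRowData()
        .tableLoader(loader)
        .streaming(true)
        .streamingStartingStrategy(StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT)
        .build();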