-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async Deletion of Previous Metadata and Statistics Files #312
base: main
Are you sure you want to change the base?
Conversation
tableMetadata.snapshots().stream() | ||
.map(Snapshot::manifestListLocation) | ||
.forEach(fileIO::deleteFile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we've already handled the manifest files and manifest list files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we've already handled the manifest files and manifest list files.
Thank you for bringing that up! I’ve just noticed that these are already handled in the ManifestFileCleanupTaskHandler. Have removed theses redundant deletion in the latest commit. Appreciate your input.
tableMetadata.previousFiles().stream() | ||
.map(TableMetadata.MetadataLogEntry::file) | ||
.forEach(fileIO::deleteFile); | ||
tableMetadata.statisticsFiles().stream() | ||
.map(StatisticsFile::path) | ||
.forEach(fileIO::deleteFile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should either add retries for these or submit them as separate tasks. As is, if one of these files fails to delete, we'll retry the whole task and resubmit a new task for each manifest. If the manifests are already deleted when we retry, we'll get stuck in a retry loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, Michael! I could refactor the deletion process by adding retry strategy, similar to the tryDelete
in ManifestFileCleanupTaskHandler
.
Personally, I think using a retry mechanism here might be more effective than creating separate tasks. With separate tasks,, there’s still a risk of issues like task creation failures, which could result in skipping the entire task (which contains multiple files). By using retries within the ManifestFileCleanupTaskHandler, we can manage failure handling at the file level, ensuring that each file is retried independently. This way, if a file deletion fails, we can retry just that file without needing to resubmit or skip the others files. This approach can offers more granular control over handling failures.
I’m open to your thoughts on this! Does this seem aligned with what you were suggesting, or do you see potential advantages in the separate task approach that I might be overlooking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, do you mean adding retry within the TableCleanupTaskHandler.java
? If so, that's fine with me. I'd at least use an Executor to attempt the execution in parallel (i.e., so file2 isn't blocked during the delay between retries for file1).
TBH, I'm not very familiar with how Iceberg generates the statistic files. Does there tend to be one per snapshot? one per data file? If the latter, we could be talking about a very large number of files. If that's the case, I think submitting separate tasks with batches of files makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are per snapshots, https://iceberg.apache.org/spec/#table-statistics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there only one statistics file per snapshot? The spec is not clear:
A table can contain many statistics files associated with different table snapshots.
Unlike the partition statistics file, which is very clear:
Each table snapshot may be associated with at most one partition statistics file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be multiple statistics files as the following code shows:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are per snapshots, https://iceberg.apache.org/spec/#table-statistics.
Could be multiple statistics files as the following code shows:
Thank you for providing the context, @flyrain :)
Sorry, do you mean adding retry within the
TableCleanupTaskHandler.java
? If so, that's fine with me. I'd at least use an Executor to attempt the execution in parallel (i.e., so file2 isn't blocked during the delay between retries for file1).TBH, I'm not very familiar with how Iceberg generates the statistic files. Does there tend to be one per snapshot? one per data file? If the latter, we could be talking about a very large number of files. If that's the case, I think submitting separate tasks with batches of files makes sense.
@collado-mike Yes, if we go for retries, the logic will be within TableCleanupTaskHandler.java
. But considering the information provided by Yufei, maybe separate tasks will be a more appropriate approach for stats file deletion?
@@ -49,6 +51,7 @@ public class TableCleanupTaskHandler implements TaskHandler { | |||
private final TaskExecutor taskExecutor; | |||
private final MetaStoreManagerFactory metaStoreManagerFactory; | |||
private final Function<TaskEntity, FileIO> fileIOSupplier; | |||
private static final int BATCH_SIZE = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tentatively set the BATCH_SIZE as 10, please feel free to let me know if there is a better option
Hi @collado-mike @flyrain , I’ve refactored the deletion logic to perform asynchronous deletion in batches, with each batch handled in separate tasks. I’d appreciate any feedback or suggestions you might have :) |
Hi @collado-mike @flyrain @eric-maynard I want to kindly request any feedback or suggestions about my new changes (deletion logic refactor with asynchronous batch deletions) whenever you have a chance. Thank you in advance for your time :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @danielhumanmod for working on it! left some coments.
fileIO, | ||
cleanupTask, | ||
metaStoreManager, | ||
polarisCallContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could merge these two method calls. How about collecting the file first, then dispatching them into tasks for deletion? We don't have to distinguish file types while collecting them.
/** | ||
* {@link TaskHandler} responsible for deleting previous metadata and statistics files of a table. | ||
*/ | ||
public class TableContentCleanupTaskHandler implements TaskHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new class is largely same as class ManifestFileCleanupTaskHandler
. Can we try to merge it with ManifestFileCleanupTaskHandler
?
LOGGER.error("Unable to delete data files from manifest {}", manifestFile.path(), e); | ||
return false; | ||
} | ||
private boolean cleanUpManifestFile( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the lots of changes here, but don’t worry—it’s mainly because I refactored the deletion logic for the manifest and all its data into a new method; no other changes were made in lines 91-135.
TableIdentifier tableId = cleanupTask.getTableId(); | ||
try (FileIO authorizedFileIO = fileIOSupplier.apply(task)) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combine manifest and content file cleanup in the same task, but ensure that each task processes only one type of data (either manifest or content files) at a time.
* files in a manifest and the manifest itself. Since data files may be present in multiple | ||
* manifests across different snapshots, we assume a data file that doesn't exist is missing because | ||
* it was already deleted by another task. 2. Table content files: It contains previous metadata and | ||
* statistics files, which are grouped and deleted in batch | ||
*/ | ||
public class ManifestFileCleanupTaskHandler implements TaskHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming this task will triger lots of relevent changes. If a rename is needed, we may want to handle it in a separate PR to avoid too much changes
@@ -226,4 +233,228 @@ public void deleteFile(String location) { | |||
assertThatPredicate((String f) -> TaskUtils.exists(f, fileIO)).rejects(dataFile2Path); | |||
} | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test cases in this file primarily verify theses cleanup:
- straightforward cleanup
- cleanup when files do not exist
- cleanup with retries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test cases in this file primarily verify theses scenarios:
- Cleanup of a single metadata file with a single snapshot
- Cleanup of a single metadata file across multiple snapshots
- Cleanup of multiple metadata files across multiple snapshots (New)
@@ -76,7 +79,13 @@ public void testTableCleanup() throws IOException { | |||
TestSnapshot snapshot = | |||
TaskTestUtils.newSnapshot(fileIO, "manifestList.avro", 1, snapshotId, 99L, manifestFile); | |||
String metadataFile = "v1-49494949.metadata.json"; | |||
TaskTestUtils.writeTableMetadata(fileIO, metadataFile, snapshot); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Cleanup of a single metadata file with a single snapshot
@@ -303,7 +323,20 @@ public void testTableCleanupMultipleSnapshots() throws IOException { | |||
manifestFile1, | |||
manifestFile3); // exclude manifest2 from the new snapshot | |||
String metadataFile = "v1-295495059.metadata.json"; | |||
TaskTestUtils.writeTableMetadata(fileIO, metadataFile, snapshot, snapshot2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Cleanup of a single metadata file across multiple snapshots
} | ||
} | ||
|
||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Cleanup of multiple metadata files across multiple snapshots (New)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this file primarily attach previous metadata and statistics filewhen writing metadata
Thanks for the suggestions @flyrain , I've updated the code based on your comments and added some comment to help the code easier to understand. I’d appreciate any further feedback, and thanks for your time and patience! |
Description
Currently, when handling DROP TABLE PURGE, Polaris only deletes data files and the current metadata in
TableCleanupTaskHandler
. However, other related files, such as historical metadata.json files, stat files, and others, are not deleted.Iceberg has addressed this issue (see Iceberg PR #9305), but Polaris doesn’t adopt this solution due to factors like asynchronous deletion. This PR introduces additional deletions for the following files:
Fixes #289
Included Changes
TableCleanupTaskHandler.java
to schedule previous metadata and statistics deletion in batchManifestFileCleanupTaskHandler
to delete metadata and statistics file batchTaskTestUtils.java
to write metadata with previous metadata and statistics filesTableCleanupTaskHandlerTest.java
andTableContentCleanupTaskHandlerTest
Recommended Review Order:
TableCleanupTaskHandler.java
->ManifestFileCleanupTaskHandler.java
->TaskTestUtils.java
->TableCleanupTaskHandlerTest.java
andManifestFileCleanupTaskHandlerTest
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Unit Test:
Checklist:
Please delete options that are not relevant.