[SPARK-52008] [SS] Add StateStore TaskCompletionListener to abort store and throw error #50795
What changes were proposed in this pull request?
Added a TaskCompletionListener for both the HDFS and RocksDB StateStore that checks whether the store is still in the UPDATING state (not committed or aborted) and, if so, aborts it. If the task isn't failed or interrupted, it will fail the task with STATE_STORE_UPDATING_AFTER_TASK_COMPLETION.
Why are the changes needed?
As explained in SPARK-52008, when a user defines a foreachBatch function that does not completely consume the passed-in iterator, state stores are opened but never committed when the batch finishes, and no error is thrown. The next batch then fails with a "changelog/delta file not found" error, which confuses users.
Instead, the TaskCompletionListener should abort any state stores still in the updating state and throw an exception to fail the task (if the task is not already failed or interrupted).
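The behavior described above can be illustrated with a small toy model. This is only a sketch in plain Python, not the Spark implementation: ToyStateStore, ToyTaskContext, rows_with_commit, and run_task are hypothetical names invented for illustration, and the real listener lives in the Scala state store providers. The sketch assumes the commit happens lazily once the row iterator is exhausted, which is why partial consumption leaves the store in the UPDATING state.

```python
from enum import Enum


class StoreState(Enum):
    UPDATING = "UPDATING"
    COMMITTED = "COMMITTED"
    ABORTED = "ABORTED"


class ToyStateStore:
    """Hypothetical stand-in for a state store; not Spark's StateStore API."""

    def __init__(self):
        self.state = StoreState.UPDATING

    def commit(self):
        self.state = StoreState.COMMITTED

    def abort(self):
        self.state = StoreState.ABORTED


class ToyTaskContext:
    """Hypothetical task context that runs listeners when the task ends."""

    def __init__(self):
        self.listeners = []
        self.failed = False

    def add_completion_listener(self, fn):
        self.listeners.append(fn)

    def mark_completed(self):
        for fn in self.listeners:
            fn(self)


def rows_with_commit(rows, store):
    """Lazily yield rows; the commit runs only once every row was consumed."""
    for row in rows:
        yield row
    store.commit()


def run_task(rows, consume, store):
    """Run a toy task whose completion listener guards the store."""
    ctx = ToyTaskContext()

    def on_completion(context):
        # Abort a store that was opened but neither committed nor aborted.
        if store.state is StoreState.UPDATING:
            store.abort()
            # Only fail the task if it has not already failed.
            if not context.failed:
                raise RuntimeError("STATE_STORE_UPDATING_AFTER_TASK_COMPLETION")

    ctx.add_completion_listener(on_completion)
    try:
        consume(rows_with_commit(rows, store))
    except Exception:
        ctx.failed = True
        raise
    finally:
        ctx.mark_completed()


# Full consumption: the commit runs and the listener has nothing to do.
full = ToyStateStore()
run_task([1, 2, 3], lambda it: list(it), full)

# Partial consumption: the listener aborts the store and fails the task.
partial = ToyStateStore()
try:
    run_task([1, 2, 3], lambda it: next(it), partial)
except RuntimeError as err:
    print(f"task failed: {err}")
```

In this model, a task that already failed with a user error still has its store aborted, but the listener does not raise a second exception, so the original error is the one that surfaces.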
Does this PR introduce any user-facing change?
Yes, throws STATE_STORE_UPDATING_AFTER_TASK_COMPLETION instead of FileNotFound error.
How was this patch tested?
New foreachBatch (FEB) integration test and unit test.
Was this patch authored or co-authored using generative AI tooling?
No