fix: remove forced table update from python writer #3515
base: main
b043c71 to b675bac
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3515      +/-   ##
==========================================
- Coverage   74.25%   74.21%   -0.04%
==========================================
  Files         150      150
  Lines       44739    44739
  Branches    44739    44739
==========================================
- Hits        33220    33204      -16
- Misses       9374     9378       +4
- Partials     2145     2157      +12
☔ View full report in Codecov by Sentry.
b675bac to 14dd901
@ion-elgreco done! Let me know if that covers what you wanted to see :) FWIW running the new test cases against main has two of them fail (as expected, the reason I opened this PR):
Signed-off-by: Ohan Fillbach <[email protected]>
14dd901 to 32ec073
It's a breaking change in behavior, but the new behavior is more correct.
Hi @roeap, anything I can do to help move this along?
@ohanf - currently I am traveling and therefore unfortunately have no bandwidth to do proper reviews. I'll be back next week and have capacity again. Sorry for the delay!
no worries, safe travels!
Description
Performing (concurrent) idempotent blind appends by comparing transaction identifiers is currently not possible from Python, because the internal table state is updated automatically before the actual write is performed. To perform idempotent blind appends (adapted from Spark's strategy for the same), a writer loads a table snapshot and compares its app ID and version to what it finds in the snapshot. If the "local" version is greater than the one in the snapshot, the write can proceed, and any race conditions for that specific app ID and version are handled by the conflict checker and atomic commits. However, when the table state is updated just before the write, the snapshot the app logic used for the comparison is replaced, so the conflict check for updated transactions always passes and the write is always attempted.
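To make the race concrete, here is a minimal pure-Python sketch of the mechanism described above. It is a toy model, not delta-rs code: `ToyLog`, `should_write`, `blind_append`, `CommitConflict`, and the `refresh_before_write` flag are all hypothetical names, and the conflict check is simplified to "any unseen commit carrying the same app ID is a conflict".

```python
from dataclasses import dataclass, field


class CommitConflict(Exception):
    """Raised when an unseen commit already recorded a transaction for the app ID."""


@dataclass
class ToyLog:
    """Hypothetical stand-in for a table log: each commit maps app_id -> txn version."""
    commits: list = field(default_factory=list)

    def latest_version(self) -> int:
        return len(self.commits) - 1

    def txn_version(self, app_id: str, upto: int):
        """Highest txn version recorded for app_id in commits 0..upto, or None."""
        versions = [c[app_id] for c in self.commits[: upto + 1] if app_id in c]
        return max(versions, default=None)


def should_write(log: ToyLog, snapshot_version: int, app_id: str, local_version: int) -> bool:
    """Application-side idempotency check against the snapshot the app loaded."""
    recorded = log.txn_version(app_id, snapshot_version)
    return recorded is None or local_version > recorded


def blind_append(log: ToyLog, snapshot_version: int, app_id: str,
                 local_version: int, refresh_before_write: bool) -> int:
    """Commit an append; the flag models the automatic state update (an assumption)."""
    # With the refresh, the writer forgets which snapshot the app actually compared against.
    read_version = log.latest_version() if refresh_before_write else snapshot_version
    # Simplified conflict check: inspect only commits newer than the read version.
    for commit in log.commits[read_version + 1:]:
        if app_id in commit:
            raise CommitConflict(f"{app_id} already committed by a concurrent writer")
    log.commits.append({app_id: local_version})
    return log.latest_version()


if __name__ == "__main__":
    for refresh in (True, False):
        log = ToyLog(commits=[{}])  # version 0: empty table
        # Writers A and B both load version 0 and both decide to write.
        assert should_write(log, 0, "app", 1)
        blind_append(log, 0, "app", 1, refresh)       # writer A commits
        try:
            blind_append(log, 0, "app", 1, refresh)   # writer B, stale snapshot
            outcome = "duplicate append committed"
        except CommitConflict:
            outcome = "second write aborted"
        print(f"refresh_before_write={refresh}: {outcome}")
```

In this model, refreshing moves the read version past writer A's commit, so writer B's conflict check has nothing left to inspect and the duplicate goes through. Keeping the snapshot the application compared against lets the check see A's commit and abort.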
This PR simply removes the automatic snapshot update from the Python writer.
Running the following example script at the same time in two different windows (but same directory) should illustrate the issue:
Screenshot from running against main: [image]
And against this branch: [image]
This should probably be considered a bug fix, but for some it may be "breaking", as writes were previously always attempted and now they may be aborted. However, the scope is relatively small, as the impact is limited to writes that included Transaction properties. If we remove the commit properties from the demo script, we see warnings on write that there is a mismatch between the expected and written versions, but no hard failures:
Related Issue(s)
Documentation