-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recovery fails to commit dangling transactions if there are no in-flight transactions #53
Comments
I think a possible patch for this could be to change the value which BitronixTransactionManager#getOldestInFlightTransactionTimestamp() returns when there are no in-flight transactions. Instead of returning Long.MIN_VALUE, it should return Long.MAX_VALUE, because you want to recovery all dangling transactions.
|
In version 3.0.0-SNAPSHOT, see Line 313 of https://github.com/bitronix/btm/blob/master/btm/src/main/java/bitronix/tm/BitronixTransactionManager.java |
This specific check in BitronixTransactionManager.getOldestInFlightTransactionTimestamp() that makes it return Long.MIN_VALUE was added for a purpose. It took me time to remember why, but I think I found the original reason back. The Recoverer is the only consumer of this API, and it uses this timestamp to avoid stepping on the toes of the 2PC engine while it's still running on in-flight transactions. Basically, any transaction with a timestamp older than or equal to the oldest in-flight transaction's timestamp is considered in-flight and ignored by the recoverer. There's of course a special case: when there is no in-flight transaction. If we blindly returned Long.MAX_VALUE then all transaction would become a target for the recoverer, including the ones that started right after getOldestInFlightTransactionTimestamp() is read by the recoverer but before the recoverer started its job. This would create a race condition where the recoverer could conflict with in-flight transactions. This explains why Long.MIN_VALUE is returned in such case: to avoid that race condition. As you noticed, there is that ill-effect that if there isn't any activity, the recoverer will postpone its job forever. An easy fix would be to return MonotonicClock.currentTimeMillis() instead of Long.MIN_VALUE and explicitly state that in getOldestInFlightTransactionTimestamp()'s javadoc. |
Test Case (BTM 2.1.4):
The result is that the resource never has "commit" called again, unless BTM is restarted, or randomly a transaction is in-flight during recovery.
The text was updated successfully, but these errors were encountered: