-
Notifications
You must be signed in to change notification settings - Fork 6.1k
planner: activate stale-read txn when autocommit=0 #65908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Skipping CI for Draft Pull Request. |
|
Skipping CI for Draft Pull Request. |
2857898 to
da39e55
Compare
|
/test all |
|
/retest |
da39e55 to
ea4473c
Compare
|
/test all |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #65908 +/- ##
================================================
- Coverage 77.7756% 77.4591% -0.3166%
================================================
Files 2001 1923 -78
Lines 545524 538914 -6610
================================================
- Hits 424285 417438 -6847
- Misses 119577 121461 +1884
+ Partials 1662 15 -1647
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
ea4473c to
a487a74
Compare
|
/test all |
|
/retest |
1 similar comment
|
/retest |
|
/retest |
When `autocommit=0` and stale read is enabled, the first SELECT now activates a proper stale-read transaction instead of just replacing the provider. This ensures: 1. `InTxn()` becomes true after the first SELECT 2. Subsequent reads reuse the same snapshot (StartTS) 3. Write statements are rejected (existing behavior) 4. COMMIT/ROLLBACK ends the transaction normally
a487a74 to
3fa273c
Compare
pkg/sessiontxn/txn_context_test.go
Outdated
| tk.MustExec("create table t (id int primary key, v int)") | ||
| tk.MustExec("insert into t values (1, 10)") | ||
|
|
||
| time.Sleep(2 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use mock/injected ts allocator or failpoint so the Sleep here could be avoided?
Sleeping in CI test case may slow down CI processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried rewriting this using a mock or failpoint to eliminate the Sleep, but neither approach worked out.
I also looked at other similar tests, and they seem also rely on a sleep to ensure stale read returns the expected data (please let me know if I missed a better pattern).
As a small improvement, we could shorten the sleep to 1.2s; it passes reliably in repeated runs on my machine. Would you be OK with this change?
| require.Equal(t, uint64(99), count) | ||
| } | ||
|
|
||
| func TestStaleReadTxnWithAutocommitOff(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to also cover the plan cache path(so is the prepared statement) as key information of stale read is processed during planning statge.
Former critical issue encounterred #54652.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added test cases for this.
| tk.MustExec("create table t2 (id int)") | ||
| require.False(t, sessVars.InTxn(), "DDL should implicitly end the stale-read txn") | ||
| tk.MustExec("drop table t2") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should select as of timestamp be verified here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, using select as of timestamp together with autocommit = 0 yields uncertain results.
According to the Stale Read Product Design, SELECT ... AS OF TIMESTAMP
- Can not be used together with other Stale Read features or variables.
- If the variable tidb_read_staleness is set at the same time, it will be ignored and the tso in SELECT ... AS OF TIMESTAMP will take effect.
- If the variable tidb_external_ts is set at the same time,
- Only used in the implicit transaction with autocommit = 1.
Statement level stale read can not be used in an explicit transaction. Otherwise, there may be different staleness in one transaction.
There is a separate pr #65960 to resolve this, making the behavior conform to the intended design.
pkg/planner/core/preprocess.go
Outdated
| enterType := sessiontxn.EnterNewTxnWithReplaceProvider | ||
| if !p.sctx.GetSessionVars().IsAutocommit() { | ||
| // start a stale-read transaction so that subsequent reads reuse the same snapshot. | ||
| enterType = sessiontxn.EnterNewTxnWithBeginStmt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EnterNewTxnWithBeginStmt seems only be used by by begin / start tarnsaction only. How about still to use EnterNewTxnWithReplaceProvider and check the IsAutocommit in EnterNewTxnWithReplaceProvider:: enterNewStaleTxnWithReplaceProvider and if IsAutocommit returns true, you can call activateStaleTxn.
tidb/pkg/sessiontxn/staleread/provider.go
Lines 78 to 88 in bbeaafd
| func (p *StalenessTxnContextProvider) OnInitialize(ctx context.Context, tp sessiontxn.EnterNewTxnType) error { | |
| p.ctx = ctx | |
| switch tp { | |
| case sessiontxn.EnterNewTxnDefault, sessiontxn.EnterNewTxnWithBeginStmt: | |
| return p.activateStaleTxn() | |
| case sessiontxn.EnterNewTxnWithReplaceProvider: | |
| return p.enterNewStaleTxnWithReplaceProvider() | |
| default: | |
| return errors.Errorf("Unsupported type: %v", tp) | |
| } | |
| } |
tidb/pkg/sessiontxn/staleread/provider.go
Lines 141 to 155 in bbeaafd
| func (p *StalenessTxnContextProvider) enterNewStaleTxnWithReplaceProvider() error { | |
| if p.is == nil { | |
| is, err := GetSessionSnapshotInfoSchema(p.sctx, p.ts) | |
| if err != nil { | |
| return err | |
| } | |
| p.is = is | |
| } | |
| txnCtx := p.sctx.GetSessionVars().TxnCtx | |
| txnCtx.TxnScope = kv.GlobalTxnScope | |
| txnCtx.IsStaleness = true | |
| txnCtx.InfoSchema = p.is | |
| return nil | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, maybe we should call ActivateTxn and call SetInTxn(true) in GetStmtReadTS if autocommit=0 to keep the autocommit behavior consistency with normal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And seems there is bug (not introduced by this PR):
> start transaction read only as of timestamp now()- interval 1 second;
> select * from information_schema.tables limit 1; -- the transaction will be closed after this statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, maybe we should call
ActivateTxnand callSetInTxn(true)inGetStmtReadTSifautocommit=0to keep the autocommit behavior consistency with normal.
👍 Good suggestion! This approach is indeed more elegant, and I've verified that it works.
I have one question: I noticed another method, GetSnapshotWithStmtReadTS, which is similar to GetStmtReadTS. However, it seems to be used only in PointGet / BatchPointGet during the execution phase and some tests. I assume we don't need to apply the same handling there, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
select * from information_schema.tables limit 1;
@lcwangchao I can confirm the bug. Considering the current PR's scope, would you prefer me to open a new issue to document this?
|
/retest |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/retest |
What problem does this PR solve?
Issue Number: close #64198
Problem Summary:
When
autocommit=0andtidb_read_stalenessis set, eachSELECTstatement recalculates its ownread_tsinstead of reusing the same snapshot. This causes inconsistent reads across multiple statements within what users expect to be a single transaction.What changed and how does it work?
Changed
EnterNewTxnWithReplaceProvidertoEnterNewTxnWithBeginStmtwhen!IsAutocommit()inupdateStateFromStaleReadProcessor(). This triggersactivateStaleTxn()inStalenessTxnContextProvider.This ensures:InTxn()becomes true after the firstSELECTStartTS)COMMIT/ROLLBACKends the transaction normallyCheck List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.