memdb: prevent iterator invalidation #1563

Merged
ti-chi-bot[bot] merged 15 commits into tikv:master from ekexium:memdb-prevent-iter-invalidation on Feb 19, 2025
@ekexium ekexium commented Jan 23, 2025

ref pingcap/tidb#59153

To prevent potential misuse and iterator invalidation, modify the iterators provided by ART memdb as follows:

  1. Iter and IterReverse now come with an extra check: any write operation to the memdb after the iterator is created invalidates it immediately. Attempting to use an invalidated iterator results in a panic.
  2. SnapshotIter and SnapshotIterReverse will be replaced by BatchedSnapshotIter.
    2.1. SnapshotIter differs from Iter in that it remains valid after write operations; it becomes invalid only if a write operation modifies the "snapshot".
    2.2. We need to introduce BatchedSnapshotIter instead of directly modifying SnapshotIter because SnapshotIter maintains internal state and pointers. Consider a write operation that changes the internal data structure and invalidates those pointers, even though the snapshot itself should remain valid.
    2.3. SnapshotIter and SnapshotIterReverse are not removed yet, for compatibility.

RBT is unchanged as it is no longer used.
As before, Pipelined MemDB does not support iterators.
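
For illustration, the write-invalidation check on Iter described in item 1 can be sketched as below. This is a minimal sketch, not the code in this PR: `toyDB`, `toyIter`, `writeSeq`, and `usedAfterWritePanics` are hypothetical names, and a sorted key slice stands in for the ART.

```go
package main

import (
	"fmt"
	"sort"
)

// toyDB is a hypothetical stand-in for the memdb; it is not the ART implementation.
type toyDB struct {
	data     map[string]string
	writeSeq uint64 // bumped on every write
}

func newToyDB() *toyDB { return &toyDB{data: map[string]string{}} }

// Set writes a key and advances the write sequence number.
func (db *toyDB) Set(k, v string) {
	db.data[k] = v
	db.writeSeq++
}

// toyIter remembers the write sequence number observed at creation.
type toyIter struct {
	db   *toyDB
	keys []string
	pos  int
	seq  uint64
}

// Iter returns an iterator over the keys in sorted order.
func (db *toyDB) Iter() *toyIter {
	keys := make([]string, 0, len(db.data))
	for k := range db.data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return &toyIter{db: db, keys: keys, seq: db.writeSeq}
}

// checkValid panics if any write happened after the iterator was created.
func (it *toyIter) checkValid() {
	if it.seq != it.db.writeSeq {
		panic("iterator invalidated by a write to the memdb")
	}
}

func (it *toyIter) Valid() bool { it.checkValid(); return it.pos < len(it.keys) }
func (it *toyIter) Key() string { it.checkValid(); return it.keys[it.pos] }
func (it *toyIter) Next()       { it.checkValid(); it.pos++ }

// usedAfterWritePanics reports whether using an iterator after a write panics.
func usedAfterWritePanics() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	db := newToyDB()
	db.Set("a", "1")
	it := db.Iter()
	db.Set("b", "2") // any write after creation invalidates the iterator
	it.Valid()
	return false
}

func main() {
	fmt.Println(usedAfterWritePanics()) // prints "true"
}
```

Every accessor performs the check, so misuse fails loudly at the first call after the write instead of silently reading stale structure.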

Performance

Iterator microbenchmark

go test -run=^$ -bench=BenchmarkSnapshotIter -benchtime=3s
goos: linux
goarch: amd64
pkg: github.com/tikv/client-go/v2/internal/unionstore
cpu: AMD Ryzen 9 9900X 12-Core Processor            
BenchmarkSnapshotIter/RBT-SnapshotIter-24                     1783           1826587 ns/op             144 B/op               2 allocs/op
BenchmarkSnapshotIter/ART-SnapshotIter-24                     1870           1718771 ns/op             496 B/op              11 allocs/op
BenchmarkSnapshotIter/ART-BatchedSnapshotIter-24              1120           2964170 ns/op          417461 B/op             370 allocs/op
BenchmarkSnapshotIter/ART-ForEachInSnapshot-24                1771           1832084 ns/op             496 B/op              11 allocs/op
PASS
ok          github.com/tikv/client-go/v2/internal/unionstore        14.598s

TiDB union scan executor
BatchedSnapshotIter

go test -run=^$ -bench=BenchmarkUnionScanRead -benchtime=10s
    3999           3063035 ns/op          793197 B/op           17578 allocs/op

SnapshotIter

go test -run=^$ -bench=BenchmarkUnionScanRead -benchtime=10s
    3862           2990516 ns/op          386942 B/op           17434 allocs/op

@ti-chi-bot ti-chi-bot bot added the do-not-merge/work-in-progress, dco-signoff: yes, and size/XXL labels Jan 23, 2025
@ekexium ekexium force-pushed the memdb-prevent-iter-invalidation branch from 8ac9c7a to 1a6d100 Compare January 23, 2025 08:02
@ekexium ekexium force-pushed the memdb-prevent-iter-invalidation branch from faf4853 to 4c16a14 Compare January 23, 2025 11:30
Signed-off-by: ekexium <[email protected]>
@ekexium ekexium force-pushed the memdb-prevent-iter-invalidation branch from 916238e to e1a3b5a Compare January 23, 2025 12:27
Signed-off-by: ekexium <[email protected]>
@ekexium ekexium force-pushed the memdb-prevent-iter-invalidation branch from 74a0617 to 29fc98e Compare January 24, 2025 05:54
@ekexium ekexium marked this pull request as ready for review January 24, 2025 05:54
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress label Jan 24, 2025
@ekexium ekexium requested review from cfzjywxk and you06 January 24, 2025 05:54
Signed-off-by: ekexium <[email protected]>
@ekexium ekexium force-pushed the memdb-prevent-iter-invalidation branch from 7faf0e1 to 941d94b Compare February 10, 2025 06:56
@ekexium ekexium force-pushed the memdb-prevent-iter-invalidation branch from a5e313d to b15c57a Compare February 12, 2025 08:14
Signed-off-by: ekexium <[email protected]>
@ti-chi-bot ti-chi-bot bot requested a review from you06 February 12, 2025 08:50
Contributor

@you06 you06 left a comment

rest LGTM

)
}

it.db.RLock()
Contributor

Shall we move the RLock operation before seqno check?

Contributor Author

This was discussed above. SnapshotSeqNo is not supposed to be accessed concurrently; that is a property it must have. If concurrent access ever happens, it is better to expose the race.

Contributor

Oh, I get it: SnapshotSeqNo changes very infrequently.

Contributor

@ekexium
It would be better to add some comments explaining the use of RLock here; others may not be familiar with the upper MemDB layer, the underlying ART implementation, or how they are used.

Contributor

@cfzjywxk cfzjywxk left a comment

LGTM

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm and approved labels Feb 19, 2025
@ti-chi-bot ti-chi-bot bot added the lgtm label Feb 19, 2025
ti-chi-bot bot commented Feb 19, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cfzjywxk, you06


@ti-chi-bot ti-chi-bot bot removed the needs-1-more-lgtm label Feb 19, 2025
ti-chi-bot bot commented Feb 19, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-02-19 06:38:18.325182833 +0000 UTC m=+1029740.721404894: ☑️ agreed by cfzjywxk.
  • 2025-02-19 07:59:29.789850576 +0000 UTC m=+1034612.186072637: ☑️ agreed by you06.

@ti-chi-bot ti-chi-bot bot merged commit ddec823 into tikv:master Feb 19, 2025
11 checks passed
serprex added a commit to PeerDB-io/tikv-client-go that referenced this pull request Jun 30, 2025
* *: bump pd client (tikv#1575)

 

Signed-off-by: Ryan Leung <[email protected]>

* OWNERS: Auto Sync OWNERS files from community membership (tikv#1576)

 

Signed-off-by: Ti Chi Robot <[email protected]>

* sync etcd endpoints immediately after initializing the client (tikv#1573)

 

Signed-off-by: Vlad Dmitriev <[email protected]>

* p-dml: resolve locks concurrently (tikv#1584)

close tikv#1577

Signed-off-by: you06 <[email protected]>

* memdb: prevent iterator invalidation (tikv#1563)

ref pingcap/tidb#59153

Signed-off-by: ekexium <[email protected]>

* locate: refactor RegionRequestSender.SendReqCtx (tikv#1565)

 

Signed-off-by: zyguan <[email protected]>

* locate: fix the default settings of circuit breaker (tikv#1593)

ref tikv/pd#8678

Signed-off-by: Ryan Leung <[email protected]>

* util: define and implement core interfaces for async api (tikv#1591)

ref tikv#1586

Signed-off-by: zyguan <[email protected]>

* Add a retry when getting ts from PD for validating read ts (tikv#1600)

 

Signed-off-by: MyonKeminta <[email protected]>

* pdclient: Add caller info to pd client (tikv#1516)

ref tikv/pd#8593

Signed-off-by: okJiang <[email protected]>

* locate: fix TestTiKVClientReadTimeout (tikv#1601)

 

Signed-off-by: zyguan <[email protected]>

* *: update pd client (tikv#1605)

 

Signed-off-by: Ryan Leung <[email protected]>

* Validate ts only for stale read (tikv#1607)

ref pingcap/tidb#59402

Signed-off-by: ekexium <[email protected]>

* execdetails: export scheduler write details (tikv#1606)

 

Signed-off-by: Neil Shen <[email protected]>

* Update pd client (tikv#1615)

Signed-off-by: disksing <[email protected]>

* ci: allow use label to skip integration tests (tikv#1616)

 

Signed-off-by: disksing <[email protected]>

* remove useless metric tidb_tikvclient_cop_duration_seconds_bucket (tikv#1602)

 

Signed-off-by: XuHuaiyu <[email protected]>

* client: implement SendRequestAsync for RPCClient (tikv#1604)

ref tikv#1586

Signed-off-by: zyguan <[email protected]>

* execdetails: export grpc process and wait time to time details (tikv#1614)

 

Signed-off-by: Neil Shen <[email protected]>

Co-authored-by: Bisheng Huang <[email protected]>

* Refine pessimistic lock related metrics and stats (tikv#1620)

 

Signed-off-by: yibin87 <[email protected]>

* metrics: adjust bucket count to reduce metrics data (tikv#1609)

 

Signed-off-by: Lynn <[email protected]>

* update tidb for integration tests (tikv#1621)

 

Signed-off-by: disksing <[email protected]>

* support redact key in logs (tikv#1612)

ref pingcap/tidb#59279

Signed-off-by: tangenta <[email protected]>

Co-authored-by: you06 <[email protected]>

* update integration_test/go.mod (tikv#1624)

 

Signed-off-by: tangenta <[email protected]>

* memdb: introduce snapshot interface (tikv#1623)

 

Signed-off-by: you06 <[email protected]>

Co-authored-by: ekexium <[email protected]>

* pd: enable OutputMustContainAllKeyRange (tikv#1632)

 

Signed-off-by: lhy1024 <[email protected]>

* Fix backoff lose info when forked (tikv#1627)

ref pingcap/tidb#60271

Signed-off-by: yibin87 <[email protected]>

* tikv: disable health-feedback in next-gen (tikv#1635)

 

Signed-off-by: zyguan <[email protected]>

* enable ts validation for normal read (tikv#1619)

 

Signed-off-by: ekexium <[email protected]>

* Add txn write conflict metrics (tikv#1551)

close tikv#1550

Signed-off-by: sujuntao <[email protected]>

Co-authored-by: sujuntao <[email protected]>

* apicodec: fix a typo when encoding request for CmdMvccGetByKey (tikv#1638)

 

Signed-off-by: tiancaiamao <[email protected]>

Co-authored-by: cfzjywxk <[email protected]>

* *: update kvproto version (tikv#1636)

ref tikv#1631

Signed-off-by: Chao Wang <[email protected]>

* txn: provide more information in commit RPC / log mvcc debug info when commit failed for `TxnLockNotFound` (tikv#1640)

ref tikv#1631

Signed-off-by: Chao Wang <[email protected]>

* txn: handle undetermined error in client go (tikv#1642)

close tikv#1641

Signed-off-by: Chao Wang <[email protected]>

* txn: fix the implemention of undetermined error (tikv#1644)

close tikv#1641

Signed-off-by: Chao Wang <[email protected]>

* locate: implement SendReqAsync for RegionRequestSender (tikv#1618)

ref tikv#1586

Signed-off-by: zyguan <[email protected]>

* update pd client for resource group and keyspace (tikv#1645)

 

Signed-off-by: lhy1024 <[email protected]>

* tests: bump tidb to fix integration tests (tikv#1650)

 

Signed-off-by: zyguan <[email protected]>

* Fix some metrics that miss const labels (tikv#1652)

 

Signed-off-by: yibin87 <[email protected]>

* Replace etcd safe point with txn safe point for read safety check (tikv#1634)

 

Signed-off-by: MyonKeminta <[email protected]>

* Fix stale read metrics (tikv#1649)

close tikv#1648

Signed-off-by: you06 <[email protected]>

* *: support async batch get (tikv#1646)

ref tikv#1586

Signed-off-by: zyguan <[email protected]>

* Update kvproto dependancy and set keyspace name for rpc context (tikv#1667)

close tikv#1668

Signed-off-by: yibin87 <[email protected]>

* ci: add next-gen integration tests (tikv#1661)

 

Signed-off-by: ekexium <[email protected]>

* snapshot: set `ReplicaRead` to false when `ReplicaReadType` fallbacks to `ReplicaReadLeader` (tikv#1663)

ref pingcap/tidb#61745

Signed-off-by: you06 <[email protected]>

* resource_control: support collecting cross AZ traffic in ru consumption (tikv#1669)

 

Signed-off-by: glorv <[email protected]>

* txnkv: prevent some actions from being interrupted by kill (tikv#1665)

fix pingcap/tidb#61454

Signed-off-by: zyguan <[email protected]>

* region_cache: add ForceRefreshAllStores function (tikv#1686)

 

Signed-off-by: guo-shaoge <[email protected]>

* upgrade gRPC to allow consumption by peerdb

* Bump gprc version to 1.73.0

Remove references to `grpc.NewSharedBufferPool()`, it was removed from
the grpc package and causes build to fail - it's always on.
(https://pkg.go.dev/google.golang.org/grpc/experimental#WithBufferPool)

Signed-off-by: Tiago Scolari <[email protected]>

* go mod tidy

---------

Signed-off-by: Ryan Leung <[email protected]>
Signed-off-by: Ti Chi Robot <[email protected]>
Signed-off-by: Vlad Dmitriev <[email protected]>
Signed-off-by: you06 <[email protected]>
Signed-off-by: ekexium <[email protected]>
Signed-off-by: zyguan <[email protected]>
Signed-off-by: MyonKeminta <[email protected]>
Signed-off-by: okJiang <[email protected]>
Signed-off-by: Neil Shen <[email protected]>
Signed-off-by: disksing <[email protected]>
Signed-off-by: XuHuaiyu <[email protected]>
Signed-off-by: yibin87 <[email protected]>
Signed-off-by: Lynn <[email protected]>
Signed-off-by: tangenta <[email protected]>
Signed-off-by: lhy1024 <[email protected]>
Signed-off-by: Chao Wang <[email protected]>
Signed-off-by: glorv <[email protected]>
Signed-off-by: guo-shaoge <[email protected]>
Signed-off-by: Tiago Scolari <[email protected]>
Co-authored-by: Ryan Leung <[email protected]>
Co-authored-by: Ti Chi Robot <[email protected]>
Co-authored-by: Vlad Dmitriev <[email protected]>
Co-authored-by: you06 <[email protected]>
Co-authored-by: ekexium <[email protected]>
Co-authored-by: zyguan <[email protected]>
Co-authored-by: MyonKeminta <[email protected]>
Co-authored-by: okJiang <[email protected]>
Co-authored-by: Neil Shen <[email protected]>
Co-authored-by: disksing <[email protected]>
Co-authored-by: HuaiyuXu <[email protected]>
Co-authored-by: Bisheng Huang <[email protected]>
Co-authored-by: yibin <[email protected]>
Co-authored-by: Lynn <[email protected]>
Co-authored-by: tangenta <[email protected]>
Co-authored-by: lhy1024 <[email protected]>
Co-authored-by: JT <[email protected]>
Co-authored-by: sujuntao <[email protected]>
Co-authored-by: tiancaiamao <[email protected]>
Co-authored-by: cfzjywxk <[email protected]>
Co-authored-by: 王超 <[email protected]>
Co-authored-by: glorv <[email protected]>
Co-authored-by: guo-shaoge <[email protected]>
Co-authored-by: Kevin Biju <[email protected]>
Co-authored-by: Tiago Scolari <[email protected]>
Labels

approved, dco-signoff: yes, lgtm, size/XXL

3 participants