Skip to content

internal/manifest: specialize LevelIterator seek methods#4681

Merged
jbowens merged 3 commits intocockroachdb:masterfrom
jbowens:leveliterator-seek-opt
May 8, 2025
Merged

internal/manifest: specialize LevelIterator seek methods#4681
jbowens merged 3 commits intocockroachdb:masterfrom
jbowens:leveliterator-seek-opt

Conversation

@jbowens
Copy link
Contributor

@jbowens jbowens commented May 2, 2025

Previously LevelIterator's seek methods accepted an opaque
func(*TableMetadata)bool to perform comparisons. This commit specializes each
instance as separate functions, ensuring comparisons can be inlined where
possible. Special attention is given to SeekGE, which is the hot path for point
lookups.

goos: darwin
goarch: arm64
pkg: github.com/cockroachdb/pebble/internal/manifest
cpu: Apple M1 Pro
                       │   old.txt   │             newest2.txt             │
                       │   sec/op    │   sec/op     vs base                │
LevelIteratorSeekGE-10   144.3n ± 2%   110.0n ± 1%  -23.74% (p=0.000 n=10)

@jbowens jbowens requested a review from a team as a code owner May 2, 2025 20:27
@jbowens jbowens requested a review from annrpom May 2, 2025 20:27
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! :lgtm:

Consider adding some unit tests for the new iterator methods

Reviewable status: 0 of 4 files reviewed, 8 unresolved discussions (waiting on @annrpom)


internal/manifest/btree.go line 1099 at r2 (raw file):

// seekSeqNumL0 seeks an iterator over L0 files (ordered by sequence number) to
// the provided table metadata if it exists.

[nit] And if it doesn't? Next file in L0 ordering?


internal/manifest/btree.go line 1127 at r2 (raw file):

			return
		}
		i.descend(i.n, i.pos)

descend() pushes onto i.s and we don't need that here, we could just inline the part we need


internal/manifest/btree.go line 1131 at r2 (raw file):

}

// seekLargest repositions the iterator over the first table whose largest key

seekLargestGE


internal/manifest/btree.go line 1133 at r2 (raw file):

// seekLargest repositions the iterator over the first table whose largest key
// is an upper bound for the given user key. seekLargest requires the iterator's
// B-Tree to be ordered by user keys (i.e, L1+ or a single sublevel of L0).

[nit] If there is no such table, positions the iterator before the first table.


internal/manifest/btree.go line 1170 at r2 (raw file):

}

// seekSmallest repositions the iterator over the first table whose smallest key

seekSmallestGT


internal/manifest/btree.go line 1171 at r2 (raw file):

// seekSmallest repositions the iterator over the first table whose smallest key
// is a lower bound for the given user key. seekSmallest requires the iterator's

is NOT a lower bound! (better yet just say it is > userKey)

You can also consider making this return the last table with smallest key < userKey, and don't do Prev() in the calling code. We'd have to modify the code below like this:

			h := int(uint(j+k+1) >> 1) // avoid overflow when computing h
			// j < h ≤ k
			if cmp(i.n.items[h].Smallest().UserKey, userKey) < 0 {
				j = h  // preserves INVARIANT A
			} else {
				k = h-1 // preserves INVARIANT B
			}

internal/manifest/btree.go line 1173 at r2 (raw file):

// is a lower bound for the given user key. seekSmallest requires the iterator's
// B-Tree to be ordered by user keys (i.e, L1+ or a single sublevel of L0).
func (i *iterator) seekSmallest(cmp base.Compare, userKey []byte) {

If there is no such table, positions the iterator after the last table.


internal/manifest/btree.go line 1188 at r2 (raw file):

			// j ≤ h < k
			if cmp(i.n.items[h].Smallest().UserKey, userKey) < 0 {
				j = h + 1 // preserves INVARIANT A

Are you sure about this? This table could still be the one we're looking for if the next table starts after userKey? Feels like this should be:

			h := int(uint(j+k+1) >> 1) // avoid overflow when computing h
			// j ≤ h < k
			if cmp(i.n.items[h].Smallest().UserKey, userKey) <= 0 {
				j = h  // preserves INVARIANT A
			} else {
				k = h-1 // preserves INVARIANT B
			}

@jbowens jbowens force-pushed the leveliterator-seek-opt branch from bb69b0f to dac3e5d Compare May 5, 2025 16:06
Copy link
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 7 unresolved discussions (waiting on @annrpom and @RaduBerinde)


internal/manifest/btree.go line 1099 at r2 (raw file):

Previously, RaduBerinde wrote…

[nit] And if it doesn't? Next file in L0 ordering?

Updated this to use the btree node find func, so that all calls to Find can make use.


internal/manifest/btree.go line 1127 at r2 (raw file):

Previously, RaduBerinde wrote…

descend() pushes onto i.s and we don't need that here, we could just inline the part we need

We do need the stack bit because the expectation is that the iterator is now positioned over the provided file.


internal/manifest/btree.go line 1131 at r2 (raw file):

Previously, RaduBerinde wrote…

seekLargestGE

I ended up inlining this func because it's only called once, and the semantics of the seekSmallestLT func are closely intertwined with the semantics of LevelIterator.SeekLT


internal/manifest/btree.go line 1170 at r2 (raw file):

Previously, RaduBerinde wrote…

seekSmallestGT

I ended up inlining this func because it's only called once, and the semantics of this func are closely intertwined with the semantics of LevelIterator.SeekLT


internal/manifest/btree.go line 1171 at r2 (raw file):

Previously, RaduBerinde wrote…

is NOT a lower bound! (better yet just say it is > userKey)

You can also consider making this return the last table with smallest key < userKey, and don't do Prev() in the calling code. We'd have to modify the code below like this:

			h := int(uint(j+k+1) >> 1) // avoid overflow when computing h
			// j < h ≤ k
			if cmp(i.n.items[h].Smallest().UserKey, userKey) < 0 {
				j = h  // preserves INVARIANT A
			} else {
				k = h-1 // preserves INVARIANT B
			}

If our seek key is less than all tables smallest keys, we want to ultimately end up before the first table. This is also complicated by the iterator start and end bounds which are both inclusive

@jbowens jbowens force-pushed the leveliterator-seek-opt branch 2 times, most recently from b518a4e to d43a20e Compare May 5, 2025 16:22
Copy link
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RaduBerinde i made some material changes if u want to take another look

Reviewable status: 0 of 5 files reviewed, 7 unresolved discussions (waiting on @annrpom and @RaduBerinde)

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 5 files reviewed, 8 unresolved discussions (waiting on @annrpom and @jbowens)


internal/manifest/level_metadata.go line 588 at r7 (raw file):

		if i.iter.n.leaf {
			if i.iter.pos == i.iter.n.count {
				i.iter.next()

[nit] I guess it doesn't matter, but it would be more intuitive if we set i.iter.pos to i.iter.n.count-1 in this case.

jbowens added 3 commits May 8, 2025 15:34
Add a new microbenchmark measuring seeking within a level.
Previously LevelIterator's seek methods accepted an opaque
func(*TableMetadata)bool to perform comparisons. This commit specializes each
instance as separate functions, ensuring comparisons can be inlined where
possible. Special attention is given to SeekGE, which is the hot path for point
lookups.

```
goos: darwin
goarch: arm64
pkg: github.com/cockroachdb/pebble/internal/manifest
cpu: Apple M1 Pro
                       │   old.txt   │             newest2.txt             │
                       │   sec/op    │   sec/op     vs base                │
LevelIteratorSeekGE-10   144.3n ± 2%   110.0n ± 1%  -23.74% (p=0.000 n=10)
```
@jbowens jbowens force-pushed the leveliterator-seek-opt branch from d43a20e to af82ded Compare May 8, 2025 19:37
@jbowens
Copy link
Contributor Author

jbowens commented May 8, 2025

TFTR!

@jbowens jbowens merged commit 3487ac2 into cockroachdb:master May 8, 2025
6 checks passed
@jbowens jbowens deleted the leveliterator-seek-opt branch May 8, 2025 20:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants