internal/manifest: specialize LevelIterator seek methods #4681

Open: jbowens wants to merge 2 commits into master

@jbowens jbowens commented May 2, 2025

Previously, LevelIterator's seek methods accepted an opaque
func(*TableMetadata) bool to perform comparisons. This commit specializes each
use as a separate function so that comparisons can be inlined where
possible. Special attention is given to SeekGE, which is on the hot path for
point lookups.

goos: darwin
goarch: arm64
pkg: github.com/cockroachdb/pebble/internal/manifest
cpu: Apple M1 Pro
                       │   old.txt   │             newest.txt              │
                       │   sec/op    │   sec/op     vs base                │
LevelIteratorSeekGE-10   171.2n ± 2%   134.6n ± 1%  -21.36% (p=0.000 n=20)
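
For readers unfamiliar with the pattern, the following is a minimal sketch of the idea, not the code from this PR. The `table` type and both function names are hypothetical stand-ins, and a sorted slice replaces the B-tree. The first function receives its comparison as a function value; the second hard-codes the comparison in the loop, which gives the compiler a chance to inline it.

```go
package main

import (
	"bytes"
	"fmt"
)

// table is a hypothetical stand-in for TableMetadata; only the user-key
// bounds matter for this sketch.
type table struct {
	smallest, largest []byte
}

// seekOpaque returns the index of the first table for which pred is true,
// assuming pred is false for a prefix of tables and true for the rest.
// The comparison hides behind a function value, so the compiler generally
// cannot inline it into the search loop.
func seekOpaque(tables []table, pred func(*table) bool) int {
	lo, hi := 0, len(tables)
	for lo < hi {
		h := int(uint(lo+hi) >> 1) // avoid overflow when computing h
		if !pred(&tables[h]) {
			lo = h + 1
		} else {
			hi = h
		}
	}
	return lo
}

// seekLargestGE is the same search specialized for one predicate:
// "largest key >= userKey". The comparison is written directly in the loop.
func seekLargestGE(tables []table, userKey []byte) int {
	lo, hi := 0, len(tables)
	for lo < hi {
		h := int(uint(lo+hi) >> 1) // avoid overflow when computing h
		if bytes.Compare(tables[h].largest, userKey) < 0 {
			lo = h + 1
		} else {
			hi = h
		}
	}
	return lo
}

func main() {
	tables := []table{
		{[]byte("a"), []byte("c")},
		{[]byte("d"), []byte("f")},
		{[]byte("g"), []byte("k")},
	}
	key := []byte("e")
	viaClosure := seekOpaque(tables, func(t *table) bool {
		return bytes.Compare(t.largest, key) >= 0
	})
	// Both searches land on the same table.
	fmt.Println(viaClosure, seekLargestGE(tables, key))
}
```

Both calls return the same index; only the way the comparison reaches the loop differs. How much of the measured speedup comes from inlining versus other changes in the PR is not something this sketch can show.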

jbowens added 2 commits May 2, 2025 16:24
Add a new microbenchmark measuring seeking within a level.
@jbowens jbowens requested a review from a team as a code owner May 2, 2025 20:27
@jbowens jbowens requested a review from annrpom May 2, 2025 20:27
@RaduBerinde RaduBerinde left a comment

Nice! :lgtm:

Consider adding some unit tests for the new iterator methods.

Reviewable status: 0 of 4 files reviewed, 8 unresolved discussions (waiting on @annrpom)


internal/manifest/btree.go line 1099 at r2 (raw file):

// seekSeqNumL0 seeks an iterator over L0 files (ordered by sequence number) to
// the provided table metadata if it exists.

[nit] And if it doesn't exist? Does the iterator land on the next file in L0 ordering?


internal/manifest/btree.go line 1127 at r2 (raw file):

			return
		}
		i.descend(i.n, i.pos)

descend() pushes onto i.s, which we don't need here; we could just inline the part we need.


internal/manifest/btree.go line 1131 at r2 (raw file):

}

// seekLargest repositions the iterator over the first table whose largest key

Consider renaming this to seekLargestGE.


internal/manifest/btree.go line 1133 at r2 (raw file):

// seekLargest repositions the iterator over the first table whose largest key
// is an upper bound for the given user key. seekLargest requires the iterator's
// B-Tree to be ordered by user keys (i.e, L1+ or a single sublevel of L0).

[nit] If there is no such table, positions the iterator before the first table.


internal/manifest/btree.go line 1170 at r2 (raw file):

}

// seekSmallest repositions the iterator over the first table whose smallest key

Consider renaming this to seekSmallestGT.


internal/manifest/btree.go line 1171 at r2 (raw file):

// seekSmallest repositions the iterator over the first table whose smallest key
// is a lower bound for the given user key. seekSmallest requires the iterator's

is NOT a lower bound! (better yet just say it is > userKey)

You could also consider making this return the last table with smallest key < userKey, so the calling code doesn't need to call Prev(). We'd have to modify the code below like this:

			h := int(uint(j+k+1) >> 1) // avoid overflow when computing h
			// j < h ≤ k
			if cmp(i.n.items[h].Smallest().UserKey, userKey) < 0 {
				j = h  // preserves INVARIANT A
			} else {
				k = h-1 // preserves INVARIANT B
			}
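
The loop suggested above can be tried in isolation. The sketch below is an illustration only: it searches a sorted slice of smallest keys rather than a B-tree node, the name `lastSmallestLT` is hypothetical, and the `-1` return value for "no such table" is an assumption of this sketch. It shows why the midpoint rounds up: since `j = h` does not advance past `h`, rounding up guarantees `h > j`, so the range shrinks on every iteration.

```go
package main

import (
	"bytes"
	"fmt"
)

// lastSmallestLT returns the index of the last key in smallest (sorted in
// ascending order) that is strictly less than userKey, or -1 if there is none.
func lastSmallestLT(smallest [][]byte, userKey []byte) int {
	// INVARIANT A: every index <= j holds a key < userKey (vacuous at j == -1).
	// INVARIANT B: every index > k holds a key >= userKey.
	j, k := -1, len(smallest)-1
	for j < k {
		h := j + (k-j+1)/2 // rounds up, so j < h <= k
		if bytes.Compare(smallest[h], userKey) < 0 {
			j = h // preserves INVARIANT A
		} else {
			k = h - 1 // preserves INVARIANT B
		}
	}
	return j
}

func main() {
	smallest := [][]byte{[]byte("a"), []byte("d"), []byte("g")}
	for _, key := range []string{"a", "d", "e", "z"} {
		fmt.Printf("%s -> %d\n", key, lastSmallestLT(smallest, []byte(key)))
	}
}
```

With a strict `<`, a key equal to a table's smallest key ("d" here) selects the table before it.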

internal/manifest/btree.go line 1173 at r2 (raw file):

// is a lower bound for the given user key. seekSmallest requires the iterator's
// B-Tree to be ordered by user keys (i.e, L1+ or a single sublevel of L0).
func (i *iterator) seekSmallest(cmp base.Compare, userKey []byte) {

If there is no such table, positions the iterator after the last table.


internal/manifest/btree.go line 1188 at r2 (raw file):

			// j ≤ h < k
			if cmp(i.n.items[h].Smallest().UserKey, userKey) < 0 {
				j = h + 1 // preserves INVARIANT A

Are you sure about this? This table could still be the one we're looking for if the next table starts after userKey. It feels like this should be:

			h := int(uint(j+k+1) >> 1) // avoid overflow when computing h
			// j < h ≤ k
			if cmp(i.n.items[h].Smallest().UserKey, userKey) <= 0 {
				j = h  // preserves INVARIANT A
			} else {
				k = h-1 // preserves INVARIANT B
			}
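
The boundary case behind this question is a userKey that equals a table's smallest key. The sketch below is a hedged illustration over a sorted slice of smallest keys, not the B-tree code under review, and all function names are hypothetical. It uses the standard library's `sort.Search` to show that "last table with smallest <= userKey" is the table just before "first table with smallest > userKey", while the strict `<` variant steps back one table further on an exact match.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// firstSmallestGT returns the index of the first key in smallest (sorted in
// ascending order) that is > userKey, or len(smallest) if there is none.
func firstSmallestGT(smallest [][]byte, userKey []byte) int {
	return sort.Search(len(smallest), func(i int) bool {
		return bytes.Compare(smallest[i], userKey) > 0
	})
}

// firstSmallestGE is the same search with >= instead of >.
func firstSmallestGE(smallest [][]byte, userKey []byte) int {
	return sort.Search(len(smallest), func(i int) bool {
		return bytes.Compare(smallest[i], userKey) >= 0
	})
}

// lastSmallestLE is the entry just before the first key > userKey (-1 if none).
func lastSmallestLE(smallest [][]byte, userKey []byte) int {
	return firstSmallestGT(smallest, userKey) - 1
}

// lastSmallestLT is the entry just before the first key >= userKey (-1 if none).
func lastSmallestLT(smallest [][]byte, userKey []byte) int {
	return firstSmallestGE(smallest, userKey) - 1
}

func main() {
	smallest := [][]byte{[]byte("a"), []byte("d"), []byte("g")}
	for _, key := range []string{"d", "e"} {
		k := []byte(key)
		fmt.Printf("key=%s lastLE=%d lastLT=%d\n",
			key, lastSmallestLE(smallest, k), lastSmallestLT(smallest, k))
	}
}
```

For a key strictly between two smallest keys ("e") both variants agree; they differ only on an exact match ("d"), where `<=` keeps the table that starts at the key.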
