@fulmicoton-dd
Collaborator

No description provided.

@fulmicoton-dd fulmicoton-dd force-pushed the paul.masurel/skip-first-doc-on-intersection branch from 3bc26fa to 4800401 Compare December 30, 2025 17:45
@fulmicoton fulmicoton requested a review from PSeitz December 30, 2025 17:46
@fulmicoton-dd fulmicoton-dd force-pushed the paul.masurel/skip-first-doc-on-intersection branch 5 times, most recently from eac4c95 to 1898828 Compare December 30, 2025 22:30
@fulmicoton fulmicoton marked this pull request as ready for review December 30, 2025 22:41
@fulmicoton-dd fulmicoton-dd force-pushed the paul.masurel/skip-first-doc-on-intersection branch 7 times, most recently from 20ef4f2 to f98ba14 Compare December 31, 2025 17:15
tantivy requires a Scorer to be positioned on a DocId at all times.
This decision is not performance neutral.

When we intersect a heavy DocSet with a lighter one,
forcing each DocSet to be positioned on its first doc up front is needlessly expensive.

This PR fixes this by introducing a seek_doc parameter in the weight function that creates the Scorer.
Weights may then skip over documents while creating the Scorer, instead of always positioning it on the first doc.
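
To illustrate the idea, here is a minimal, self-contained sketch. It does not use tantivy's actual API: `VecWeight`, `VecScorer`, `intersect`, and the choice of `DocId::MAX` as the `TERMINATED` sentinel are all hypothetical names picked for this example. A binary search over a sorted vector stands in for skipping blocks of a real posting list.

```rust
type DocId = u32;
/// Sentinel for an exhausted scorer (value chosen for this sketch only).
const TERMINATED: DocId = DocId::MAX;

/// Hypothetical scorer over a sorted list of doc ids.
struct VecScorer {
    docs: Vec<DocId>,
    cursor: usize,
}

impl VecScorer {
    /// Creates a scorer directly positioned on the first doc >= `seek_doc`,
    /// instead of positioning on the very first doc and seeking afterwards.
    fn new(docs: Vec<DocId>, seek_doc: DocId) -> Self {
        let cursor = docs.partition_point(|&doc| doc < seek_doc);
        VecScorer { docs, cursor }
    }

    /// Current doc, or `TERMINATED` once the list is exhausted.
    fn doc(&self) -> DocId {
        self.docs.get(self.cursor).copied().unwrap_or(TERMINATED)
    }

    /// Advances to the first doc >= `target` and returns it.
    fn seek(&mut self, target: DocId) -> DocId {
        let rest = &self.docs[self.cursor..];
        self.cursor += rest.partition_point(|&doc| doc < target);
        self.doc()
    }
}

/// Hypothetical stand-in for a weight: it builds scorers on demand.
struct VecWeight {
    docs: Vec<DocId>,
}

impl VecWeight {
    fn scorer(&self, seek_doc: DocId) -> VecScorer {
        VecScorer::new(self.docs.clone(), seek_doc)
    }
}

/// Intersects a light weight with a heavy one.
fn intersect(light: &VecWeight, heavy: &VecWeight) -> Vec<DocId> {
    let mut left = light.scorer(0);
    // The heavy scorer is created at the light scorer's first candidate,
    // so it never has to materialize its own first doc.
    let mut right = heavy.scorer(left.doc());
    let mut hits = Vec::new();
    let mut candidate = left.doc();
    while candidate != TERMINATED {
        let found = right.seek(candidate);
        if found == candidate {
            hits.push(candidate);
            candidate = left.seek(candidate + 1);
        } else {
            candidate = left.seek(found);
        }
    }
    hits
}
```

In this sketch the saving comes from `heavy.scorer(left.doc())`: the heavy side starts at the first useful candidate rather than at its own first doc.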
@fulmicoton-dd fulmicoton-dd force-pushed the paul.masurel/skip-first-doc-on-intersection branch from f98ba14 to 98be1a5 Compare December 31, 2025 17:20
@fulmicoton-dd fulmicoton-dd force-pushed the paul.masurel/skip-first-doc-on-intersection branch from d954b74 to 601541e Compare January 2, 2026 12:07
}

/// Returns a priority number used to sort weights when running an
/// intersection.
Collaborator

We should add a comment here explaining what higher and lower numbers mean.

///
/// You can either call `load_at_start` to load its first block,
/// or skip a few blocks by calling `seek_and_load`.
pub(crate) struct BlockSegmentPostingsNotLoaded(BlockSegmentPostings);
Collaborator

Can't we require the docid seek on the BlockSegmentPostings instead and drop this intermediate struct?
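
For context, the intermediate struct under discussion reads like a typestate wrapper: the postings cannot be read until one of the two loading methods consumes the wrapper. The sketch below shows that pattern in isolation. It is not the PR's code: `BlockPostings`, `BLOCK_LEN`, the `new` constructor, and the method bodies are hypothetical, and only the names `load_at_start` and `seek_and_load` come from the doc comment above.

```rust
type DocId = u32;
/// Block size chosen for this sketch only.
const BLOCK_LEN: usize = 4;

/// Hypothetical block-based postings reader.
struct BlockPostings {
    docs: Vec<DocId>,
    block_idx: usize,
}

impl BlockPostings {
    /// Doc ids of the currently loaded block (empty if past the end).
    fn block(&self) -> &[DocId] {
        let start = (self.block_idx * BLOCK_LEN).min(self.docs.len());
        let end = (start + BLOCK_LEN).min(self.docs.len());
        &self.docs[start..end]
    }
}

/// Wrapper that exposes no read access: callers must pick a loading strategy.
struct BlockPostingsNotLoaded(BlockPostings);

impl BlockPostingsNotLoaded {
    fn new(docs: Vec<DocId>) -> Self {
        BlockPostingsNotLoaded(BlockPostings { docs, block_idx: 0 })
    }

    /// Loads the first block.
    fn load_at_start(self) -> BlockPostings {
        self.0
    }

    /// Skips every block whose last doc is below `target`, then loads.
    fn seek_and_load(mut self, target: DocId) -> BlockPostings {
        let skipped = self
            .0
            .docs
            .chunks(BLOCK_LEN)
            .take_while(|block| block.last().map_or(false, |&last| last < target))
            .count();
        self.0.block_idx = skipped;
        self.0
    }
}
```

The alternative raised in the comment would fold the seek target into the construction of the postings type itself. That removes one type, at the cost of every caller having to supply a target.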

3 participants