Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update lastDoc in ScoreCachingWrappingScorer #13987

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,7 @@ private ScoreCachingWrappingScorer(Scorable scorer) {
public float score() throws IOException {
if (lastDoc != curDoc) {
curScore = in.score();
curDoc = lastDoc;
lastDoc = curDoc;
}

return curScore;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -114,8 +114,9 @@ public void testNegativeScores() throws Exception {
Collector c = new PositiveScoresOnlyCollector(tdc);
LeafCollector ac = c.getLeafCollector(ir.leaves().get(0));
ac.setScorer(s);
while (s.iterator().nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
ac.collect(0);
int docId;
while ((docId = s.iterator().nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
ac.collect(docId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, good catch! So this was a latent pre-existing test bug, and with your above bug fix this test is now (correctly!) failing, and with your fix to this separate test bug, the test now passes?

}
TopDocs td = tdc.topDocs();
ScoreDoc[] sd = td.scoreDocs;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -157,4 +157,44 @@ public void testGetScores() throws Exception {
ir.close();
directory.close();
}

private static class CountingScorable extends FilterScorable {
int count;

public CountingScorable(Scorable in) {
super(in);
}

@Override
public float score() throws IOException {
count++;
return in.score();
}
}

public void testRepeatedCollectReusesScore() throws Exception {
Scorer s = new SimpleScorer();
CountingScorable countingScorable = new CountingScorable(s);
// We're going to collect every document twice, so we need to allocate a 2x larger array.
ScoreCachingCollector scc = new ScoreCachingCollector(scores.length * 2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment about why we need * 2 here?

LeafCollector lc = scc.getLeafCollector(null);
lc.setScorer(countingScorable);

// We need to iterate on the scorer so that its doc() advances.
int doc;
while ((doc = s.iterator().nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
lc.collect(doc);
lc.collect(doc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's illegal to call collect() multiple times on the same doc, this shouldn't be necessary as collect() already calls Scorable#score() multiple times?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah -- that's why I'd hesitate to really call this a "bug fix".

I'm pretty sure you meant to write lastDoc = currDoc in #12407, but functionally currDoc = lastDoc works fine as long as you don't call collect() multiple times on the same doc.

I only noticed this because I was copy/pasting the logic to adapt it for LeafBucketCollector in OpenSearch and IntelliJ warned me that lastDoc could be final.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm why are we collecting the same doc twice? (I thought this is smelly -- we fixed the other two tests to stop doing that?).

Oh I see, is so score() happens more than once while on the same doc? Actually, since ScoreCachingCollector's LeafCollector is calling .score() three times for each .collect(), you should be able to do only one lc.collect(doc) here?

Does this new test case fail if you revert the bug fix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new test does fail if I revert the fix.

As I mentioned in #13987 (comment), the condition for this being an actual bug is probably not really valid (i.e. you need to call collect() on the ScoreCachingWrappingLeafCollector multiple times for the same document).

The caching does work if the wrapped LeafCollector calls score() multiple times, which can happen in MultiCollector (for example). The layering is roughly:

ScoreCachingWrappingLeafCollector wraps
  Some other LeafCollector whose score is
    ScoreCachingWrappingScorer, which wraps
      The real Scorable

The multiple calls to score() from Some other LeafCollector work just fine. Again, it's the call to collect() on ScoreCachingWrappingLeafCollector that was invalidating the cache.

Those other two tests broke because my "fix" causes the caching to work (but they didn't expect it).

}

// Verify that each score was collected twice.
for (int i = 0; i < scores.length; i++) {
assertEquals(scores[i], scc.mscores[i * 2], 0f);
assertEquals(scores[i], scc.mscores[i * 2 + 1], 0f);
}

// But we only call score() on the underlying scorer once per document, because scores are
// cached.
assertEquals(scores.length, countingScorable.count);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -359,7 +359,7 @@ public void testTotalHitsWithScore() throws Exception {
leafCollector.collect(1);

scorer.score = 4;
leafCollector.collect(1);
leafCollector.collect(2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another pre-existing test bug fixed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the change, the call to leafCollector.collect(1) would get a score of 3 (since it's cached for document 1), instead of the 4 that the test expected.


TopDocs topDocs = collector.topDocs();
assertEquals(totalHitsThreshold < 4, scorer.minCompetitiveScore != null);
Expand Down