Skip to content

Conversation

@tarzanek
Copy link
Contributor

@tarzanek tarzanek commented Oct 7, 2025

No description provided.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Oct 7, 2025
@tarzanek
Copy link
Contributor Author

tarzanek commented Oct 7, 2025

see comments in #4570

public BitIntsHolder reduce(Collection<SuggestResultCollector> collectors) {
BitIntsHolder reduced = documentIds;
for (SuggestResultCollector collector : collectors) {
documentIds.or(collector.documentIds); //TODO fix as per https://github.com/apache/lucene/pull/766/files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be addressed for this PR ? I don't like adding TODOs in the code.

Copy link
Contributor

@idodeclare idodeclare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Lubos. A couple of comments and questions, mostly concerning paging handling

collector = TopScoreDocCollector.create(totalHits, Short.MAX_VALUE);
searcher.search(query, collector);
collectorManager = new TopScoreDocCollectorManager(totalHits, Short.MAX_VALUE);
hits = searcher.search(query, collectorManager).scoreDocs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the new version is compelling this slightly intrusive revision in order to use CollectorManager instead of Collector, may we rework at this time to call search() only once whether paging is true or false? That is, avoid the double search() on line 210 and 217 when paging is false (and go back to only setting hits once)?

Also that way we could make sure search() only happens prior to the Statistics.report() on line 212; in the current code, the Statistics are not reflecting the search() happening on line 217.

searcher.search(query, collector);
totalHits = collector.getTotalHits();
hits = searcher.search(query, collectorManager).scoreDocs;
totalHits = searcher.count(query);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formerly totalhits would have had its accuracy bounded by the Short.MAX_VALUE value of totalHitsThreshold (on the TopScoreDocCollector). For very large indexes, is there any possible performance penalty getting the entire, "fully accurate" int value of totalHits?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

OCA Verified All contributors have signed the Oracle Contributor Agreement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants