-
Notifications
You must be signed in to change notification settings - Fork 135
Open
Description
The interaction between threshold and topk is confusing, it's almost "stop the search whichever comes first", but not quite, because we don't actually start accumulating results towards topk until we clear the threshold.
I'd like to simplify this to
- Find a local maximum in the graph using an approach similar to TwoPhaseTracker
- Pull out the top k
However, this doesn't currently work because there's a ton of noise introduced by compression. We can fix this by performing step 1 using exact similarities for the top candidate.
We know that that introduces a lot of expensive full-resolution comparisons. But we might be able to make up for that if we push overquery into the search process. (So if user asks for 100 results we pass topk=100 into the search and let it deal with the noise.)
Metadata
Metadata
Assignees
Labels
No labels