Using --top_hits_only with usearch_global is very dangerous, because reference databases contain taxonomic misannotations.
For example, if hit A has an identity of 99.124% and hit B has 99.123%, --top_hits_only keeps only hit A. However, I have frequently found that A is taxonomically mislabelled while B appears to be correct.
Currently, my strategy is to set a low identity threshold, such as --id 0.6, to obtain as many hits as possible, and then keep only the top N hits per query. The remaining hits are of no use to me.
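The post-filtering step of this workaround could look roughly like the sketch below. It assumes the hits were written as a tab-separated table with the query label in column 1 and the percent identity in column 3 (the blast6-style layout); the function name `top_n_hits` is hypothetical and not part of any tool.

```python
from collections import defaultdict


def top_n_hits(lines, n):
    """Keep the n highest-identity hits per query.

    Assumes tab-separated rows: column 1 = query, column 3 = % identity.
    """
    hits = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        query, identity = fields[0], float(fields[2])
        hits[query].append((identity, line))

    kept = []
    for rows in hits.values():
        # Stable sort by identity, highest first; ties keep input order.
        rows.sort(key=lambda row: row[0], reverse=True)
        kept.extend(line for _, line in rows[:n])
    return kept
```

With `n = 2`, both hit A (99.124%) and hit B (99.123%) from the example above survive, so the taxonomy of B can still be inspected. The drawback is that every hit above the low threshold has to be computed and written out before most of them are thrown away.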
I would be glad to have an option such as --top_N_hits_only N, so that the existing --top_hits_only would be equivalent to --top_N_hits_only 1.