Implement IndexFlatL2Panorama
#4645
Conversation
```python
"""Test when n_levels doesn't evenly divide dimension"""
test_cases = [(65, 4), (63, 8), (100, 7)]
```

```python
# TODO(aknayar): Test functions like get_single_code().
```
Will add these tests in a follow-up PR.
I rebuilt with AVX2 on Linux and was unable to reproduce the failing tests seen here; any ideas what may have happened?
Hey, let me just copy what @mnorris11 said and we can resume the thread from there.
Do you have faiss installed with numpy2? It's a recent integration, and that could be the reason for the difference. Let me know the conda steps you took to repro!
@limqiying Thank you for the ideas! It seems like it was, in fact, some weird case involving a tie. Panorama also suffers from floating-point imprecision due to how we calculate squared L2 norm: Faiss does
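The tail of the comment above is cut off, but the floating-point issue it alludes to can be illustrated generically: two algebraically equivalent ways of computing a squared L2 distance typically disagree in the last float32 bits, which is enough to flip near-ties in top-k results. A minimal NumPy sketch (illustrative only, not Faiss's actual kernel):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64).astype(np.float32)
y = rng.standard_normal(64).astype(np.float32)

# Direct formulation: sum of squared differences.
direct = np.sum((x - y) ** 2)

# Decomposed formulation: ||x||^2 + ||y||^2 - 2<x, y>.
decomposed = np.sum(x ** 2) + np.sum(y ** 2) - 2.0 * np.dot(x, y)

# Algebraically identical, but float32 rounding generally makes the two
# results differ in the last bits -- enough to reorder tied neighbors.
```

Tests that compare against brute-force ground truth therefore need a tolerance (or tie-aware recall) rather than exact equality.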
```cpp
          n_levels(n_levels),
          batch_size(batch_size),
          pano(code_size, n_levels, batch_size) {
    FAISS_THROW_IF_NOT(metric == METRIC_L2);
```
This will be relaxed when we add inner product support.
```cpp
    cum_sums.clear();
}

void IndexFlatPanorama::reconstruct(idx_t key, float* recons) const {
```
I can implement these in a follow-up PR since this one is quite big already.
@limqiying has imported this pull request. If you are a Meta employee, you can view this in D86234773.
This PR adds `IndexFlatL2Panorama`, integrating Panorama (as specified in the paper) into `IndexFlatL2`. This is the first step in creating an `IndexRefinePanorama`, which will use `IndexFlatL2Panorama` (or an `IndexPreTransform` with an `IndexFlatL2Panorama`) as its `refine_index`.

### Refactoring

Since the bulk of Panorama's refinement logic would be duplicated between `IndexFlatL2Panorama` and `IndexIVFFlatPanorama`, it has been factored out into a new `Panorama` struct. This struct contains key parameters (`batch_size`, `d`, etc.) and the following utility functions:

- `copy_codes_to_level_layout`: writes new vectors to `codes` following Panorama's storage layout
- `compute_cumulative_sums`: computes the cumulative sums for new vectors
- `compute_query_cum_sums`: computes the cumulative sums for a new query
- `progressive_filter_batch`: performs Panorama refinement on a batch of vectors

These utilities will be shared by most Panorama indexes, which is why I have refactored them into their own utility.
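As a rough illustration of the level layout and cumulative sums these utilities manage, here is a NumPy sketch. All names here are hypothetical, and the real implementation is the C++ `Panorama` struct; this only shows the underlying arithmetic:

```python
import numpy as np

def level_layout_and_cum_sums(x, n_levels):
    """Illustrative only: split each vector into n_levels contiguous
    segments ("levels") and compute, per vector, the tail energy
    (sum of squared entries from each level onward). Function and
    variable names are hypothetical, not the C++ API.
    """
    n, d = x.shape
    # Pad so n_levels divides d (mirrors the uneven-dimension test cases).
    pad = (-d) % n_levels
    xp = np.pad(x, ((0, 0), (0, pad)))
    levels = xp.reshape(n, n_levels, -1)        # (n, n_levels, d / n_levels)
    energy = (levels ** 2).sum(axis=2)          # per-level squared norm
    # tail[i, l] = energy of vector i remaining from level l onward;
    # this is what lets refinement bound the unprocessed distance.
    tail = np.cumsum(energy[:, ::-1], axis=1)[:, ::-1]
    return levels, tail
```

`tail[i, 0]` is the full squared norm of vector `i`, and the entries shrink monotonically across levels, which is the property progressive filtering exploits.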
### IndexRefinePanorama

While the `IndexFlatL2Panorama` implemented in this PR technically contains all the functionality needed to implement `IndexRefinePanorama` (performing `search` on a subset of indices), it is not ready to be used as a `refine_index`. The current implementation is not optimized for the case of `IndexRefine`, where we perform search on a very small subset of the datapoints. This leads to vastly scattered memory accesses during the `search`, to the point where the overhead of maintaining `active_indices` and `exact_distances` can thwart Panorama's speedups.

As such, to optimize for `IndexRefine` we will need a standalone implementation of `search_subset` which instead does the following:

- For each subset index `i`, compute its distance alone by Panorama refinement (essentially having `batch_size` = 1. In fact, for this very reason I have made `batch_size` a parameter in the constructor; `IndexRefine` will require it to be 1 due to noncontiguous memory accesses, but typical workloads would benefit from 128-1024.)

This will unfortunately mean we cannot reuse the search utilities in the `Panorama` struct in this specific case, but will allow us to squeeze 2-5x speedups during the reordering phase of `IndexRefine`.
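A minimal sketch of what such a per-point refinement could look like. This is hypothetical Python, assuming a Cauchy-Schwarz-style tail bound built from the cumulative sums; the actual implementation would be C++ and may differ:

```python
import numpy as np

def refine_single(q, x, q_tail, x_tail, threshold):
    """Hypothetical per-point refinement (batch_size = 1): accumulate
    the exact partial distance level by level, and prune as soon as a
    lower bound on the full distance exceeds `threshold` (e.g. the
    current k-th best). q_tail[l] / x_tail[l] hold the squared norm of
    levels l onward (length n_levels + 1, last entry 0), as the
    cumulative-sum utilities would provide.
    """
    n_levels = len(q_tail) - 1
    step = len(q) // n_levels
    partial = 0.0
    for l in range(n_levels):
        seg = slice(l * step, (l + 1) * step)
        diff = q[seg] - x[seg]
        partial += float(diff @ diff)
        # Cauchy-Schwarz bound on the unprocessed tail:
        # ||x_r - q_r||^2 >= (||x_r|| - ||q_r||)^2
        lower = partial + (np.sqrt(x_tail[l + 1]) - np.sqrt(q_tail[l + 1])) ** 2
        if lower > threshold:
            return None  # pruned without touching the remaining levels
    return partial  # exact squared L2 distance
```

Because each candidate is refined independently, this avoids the `active_indices` / `exact_distances` bookkeeping of the batched path at the cost of losing its SIMD-friendly layout.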
### Testing

- Added `tests/test_flat_l2_panorama.py`.
- Benchmarked with `benchs/bench_flat_l2_panorama.py`, yielding the following results:



The recall being less than 1.0 is perhaps due to discrepancies between faiss results and the `ground_truth` values.