@mulugetam
Contributor

This change adds AVX-512 implementations of the compute_cost and cost_update functions used in polysemous training. Benchmarks of the training phase with benchs/bench_polysemous_sift1m.py show a ~1.5x speedup over the existing scalar implementation, which relied only on partial auto-vectorization.
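
For readers unfamiliar with these two functions, here is a minimal scalar sketch of the kind of loop being vectorized. It assumes the objective is a weighted squared difference between target distances and Hamming distances of the permuted codes. The function names mirror the ones mentioned above, but the signatures, the flat row-major layout, and the brute-force `cost_update` are illustrative assumptions, not the Faiss C++ API or the code in this PR.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def compute_cost(perm, target_dis, weights):
    """Weighted squared error between target and Hamming distances.

    perm       -- perm[i] is the code assigned to centroid i (assumed layout)
    target_dis -- flat n*n list of target distances, row-major (assumed layout)
    weights    -- flat n*n list of per-pair weights, row-major (assumed layout)
    """
    n = len(perm)
    cost = 0.0
    for i in range(n):
        for j in range(n):
            actual = hamming(perm[i], perm[j])
            diff = target_dis[i * n + j] - actual
            cost += weights[i * n + j] * diff * diff
    return cost


def cost_update(perm, iw, jw, target_dis, weights):
    """Change in cost if entries iw and jw of perm were swapped.

    Reference version only: it recomputes the full cost twice instead of
    visiting just the affected rows and columns. perm is not modified.
    """
    swapped = list(perm)
    swapped[iw], swapped[jw] = swapped[jw], swapped[iw]
    return (compute_cost(swapped, target_dis, weights)
            - compute_cost(perm, target_dis, weights))
```

The inner loop is a popcount, a subtraction, and a multiply-accumulate per pair, which is the sort of pattern that maps naturally onto wide SIMD lanes. Because an optimizer that proposes many candidate swaps calls `cost_update` once per proposal, these loops are a plausible hot spot during training.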

Once the SIMD support overhaul is complete, I hope this can be considered for integration.
cc: @mdouze @subhadeepkaran

@meta-cla meta-cla bot added the CLA Signed label Sep 10, 2025
@bshethmeta
Contributor

@mnorris11 @subhadeepkaran Do you have enough context to review this?

@subhadeepkaran

> @mnorris11 @subhadeepkaran Do you have enough context to review this?

Yep, you can assign it to me. The change can be reviewed and merged once dynamic dispatch lands.
