feat: add aarch64 neon simd support for distance calculations #127
Closes #126
Added SIMD distance calculations (cosine and L2) using AArch64 NEON intrinsics. Tested on a MacBook M1 Max, an AWS EC2 t4g.medium, and even a Raspberry Pi 4. Benchmarks showed 3-10x performance improvements, depending on the machine. Since many cloud-managed databases run on ARM (AWS RDS is all Graviton3 chips, IIRC), NEON SIMD support would go a long way.
Added a test that compares the new function against the unoptimized version to ensure the distance calculations are equivalent. All tests pass.
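For readers unfamiliar with the approach, here is a minimal sketch of a NEON-accelerated squared L2 distance with a scalar fallback. It is not the code from this PR: the language (Rust) and the names `l2_sq`, `l2_sq_neon`, and `l2_sq_scalar` are assumptions chosen for illustration. The intrinsics come from `std::arch::aarch64`; on other architectures the sketch compiles down to the scalar version.

```rust
/// Scalar reference implementation (hypothetical name): squared L2 distance.
fn l2_sq_scalar(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// NEON version (hypothetical name): processes four f32 lanes per iteration.
#[cfg(target_arch = "aarch64")]
fn l2_sq_neon(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::aarch64::*;
    assert_eq!(a.len(), b.len());
    let chunks = a.len() / 4;
    // SAFETY: every load reads 4 floats starting at i * 4, and
    // i * 4 + 4 <= a.len() == b.len() for all i < chunks.
    let mut sum = unsafe {
        let mut acc = vdupq_n_f32(0.0);
        for i in 0..chunks {
            let va = vld1q_f32(a.as_ptr().add(i * 4));
            let vb = vld1q_f32(b.as_ptr().add(i * 4));
            let d = vsubq_f32(va, vb);
            // acc += d * d (fused multiply-add per lane)
            acc = vfmaq_f32(acc, d, d);
        }
        // Horizontal add of the four lanes.
        vaddvq_f32(acc)
    };
    // Scalar tail for lengths that are not a multiple of 4.
    for i in chunks * 4..a.len() {
        let d = a[i] - b[i];
        sum += d * d;
    }
    sum
}

/// Dispatch: NEON on aarch64, scalar everywhere else.
#[cfg(target_arch = "aarch64")]
pub fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    l2_sq_neon(a, b)
}

#[cfg(not(target_arch = "aarch64"))]
pub fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    l2_sq_scalar(a, b)
}
```

An equivalence test in the spirit of the one described above would call both `l2_sq` and `l2_sq_scalar` on the same inputs and compare the results within a small tolerance, since the SIMD version sums in a different order and uses fused multiply-add, so the two can differ in the last few bits.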