Feature Request: Support for FP4 and FP8 Quantization in FAISS #4538
immortalshadow007
started this conversation in
Ideas
Replies: 1 comment
@immortalshadow007 The Faiss team welcomes code donations from the community.
Background
FAISS currently supports a variety of quantization formats such as FP16 (SQfp16) and signed INT8 (QT_8bit_direct_signed) for memory-efficient similarity search. These have been extremely valuable for large-scale vector databases and retrieval tasks.
With the rise of ultra-low-precision formats such as FP8 (already adopted in NVIDIA Hopper/Blackwell architectures) and FP4 (supported by the second-generation Transformer Engine in NVIDIA GB200), the community is increasingly looking for vector search systems that natively support these datatypes.
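To make the precision trade-off concrete, here is a rough NumPy simulation of FP8 E4M3 rounding (the 4-exponent-bit, 3-mantissa-bit format used by Hopper). `quantize_e4m3` is a hypothetical helper for illustration only: it clamps to E4M3's finite range and rounds to 4 significant binary digits, but does not model subnormal rounding or NaN encoding exactly:

```python
import numpy as np

def quantize_e4m3(x):
    """Approximate FP8 E4M3 round-trip in float32 (illustrative sketch).

    E4M3: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits;
    largest finite value is 448, smallest subnormal is 2**-9.
    """
    x = np.asarray(x, dtype=np.float32)
    x = np.clip(x, -448.0, 448.0)            # saturate at the finite range
    mant, exp = np.frexp(x)                  # x = mant * 2**exp, |mant| in [0.5, 1)
    mant = np.round(mant * 16) / 16          # keep 1 implicit + 3 mantissa bits
    y = np.ldexp(mant, exp)
    # Flush anything below the smallest subnormal to zero; proper subnormal
    # rounding is not modeled in this sketch.
    y = np.where(np.abs(y) < 2.0 ** -9, 0.0, y)
    return y.astype(np.float32)
```

For example, 0.3 rounds to 0.3125 and 1000 saturates to 448, which is the kind of quantization error a native FP8 index would have to account for.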
Motivation:
Proposed Features:
Potential Benefits:
References: