Description
Question 1: options.useFloat16 does not reduce GPU memory usage
```python
import numpy as np
import faiss

embeddings = np.random.rand(5_000_000, 2048).astype(np.float32)

quantizer = faiss.IndexFlatIP(2048)
index = faiss.IndexIVFFlat(quantizer, 2048, 32768)

options = faiss.GpuMultipleClonerOptions()
options.shard = True
options.useFloat16 = True

gpu_index = faiss.index_cpu_to_all_gpus(index, co=options, ngpu=1)
gpu_index.train(embeddings[:1_000_000])
gpu_index.add(embeddings)
```

Whether or not I set `options.useFloat16 = True`, the GPU memory usage stays the same. Does IndexIVF (e.g. IndexIVFFlat, IndexIVFPQ) require all vectors to be stored in fp32?
Question 2: How to further reduce GPU memory usage
I have a huge (56,000,000, 2048) embedding matrix, similar to #4502, and I use 8 A100 80GB GPUs to perform index.add. The raw vectors alone consume 56,000,000 × 2048 × 4 bytes, approximately 427 GiB, and because useFloat16 has no effect and there is additional GPU memory overhead, my total of 640GB across 8 A100 GPUs is still not enough to complete the index.add operation.
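The back-of-the-envelope numbers look like this (raw vector storage only, before faiss's own overhead such as the coarse quantizer, inverted-list bookkeeping, and temporary GPU memory; the PQ figure assumes a hypothetical 64-bytes-per-vector code):

```python
# Dataset size and GPU count taken from the question above.
n, d = 56_000_000, 2048
gib = 1024 ** 3

fp32_bytes = n * d * 4   # IndexIVFFlat stores full float32 codes
fp16_bytes = n * d * 2   # fp16 storage would halve the vector payload
pq64_bytes = n * 64      # e.g. IndexIVFPQ at 64 bytes per vector

print(f"fp32 total : {fp32_bytes / gib:7.1f} GiB "
      f"({fp32_bytes / gib / 8:5.1f} GiB per GPU across 8 GPUs)")
print(f"fp16 total : {fp16_bytes / gib:7.1f} GiB")
print(f"PQ-64 total: {pq64_bytes / gib:7.1f} GiB")
```

So fp32 codes alone already eat roughly two thirds of the 640GB, leaving little headroom for everything else, while compressed codes would be far below the budget.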
Are there any methods to reduce GPU memory usage, or is it simply impossible to use IndexIVFFlat in this situation?