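import faiss
import numpy as np

# 5M random float32 vectors of dimension 2048
embeddings = np.random.randn(5000000, 2048).astype(np.float32)

# Case 1: flat (brute-force) inner-product index on the GPU
index = faiss.IndexFlatIP(2048)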
options = faiss.GpuMultipleClonerOptions()
gpu_index = faiss.index_cpu_to_all_gpus(index, co=options, ngpu=1)
gpu_index.add(embeddings)
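
# Case 2: IVF index with nlist=32768, using a fresh flat quantizer.
# The metric is passed explicitly because IndexIVFFlat defaults to L2.
quantizer = faiss.IndexFlatIP(2048)
index = faiss.IndexIVFFlat(quantizer, 2048, 32768, faiss.METRIC_INNER_PRODUCT)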
options = faiss.GpuMultipleClonerOptions()
gpu_index = faiss.index_cpu_to_all_gpus(index, co=options, ngpu=1)
gpu_index.train(embeddings[:1000000])
gpu_index.add(embeddings)

With the code above, IndexFlatIP occupies about 40 GB of GPU memory, while IndexIVFFlat occupies about 60 GB. I'd like to understand why IndexIVFFlat uses so much more. In theory, the only extra storage should be the coarse centroids: 32768 (nlist) * 2048 * 4 bytes, which is about 0.25 GB, right?
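
For reference, here is the arithmetic behind that estimate written out as a small script. It is only a rough sketch: the helper names are made up for illustration, the 8 bytes per stored vector ID is an assumption rather than something taken from the Faiss GPU implementation, and allocator overhead and temporary scratch memory are ignored.

```python
# Back-of-the-envelope memory estimate for the two indexes above.
# Counts raw payload only; ignores allocator overhead and scratch space.

N_VECTORS = 5_000_000
DIM = 2048
NLIST = 32768
BYTES_PER_FLOAT32 = 4


def raw_vector_bytes(n, d):
    """Bytes needed to store n float32 vectors of dimension d."""
    return n * d * BYTES_PER_FLOAT32


def centroid_bytes(nlist, d):
    """Bytes needed for the nlist coarse-quantizer centroids."""
    return nlist * d * BYTES_PER_FLOAT32


def id_bytes(n, bytes_per_id=8):
    """Bytes for per-vector IDs (assumption: 8 bytes per ID)."""
    return n * bytes_per_id


flat_total = raw_vector_bytes(N_VECTORS, DIM)
ivf_total = (
    flat_total + centroid_bytes(NLIST, DIM) + id_bytes(N_VECTORS)
)

print(f"flat vectors : {flat_total / 1e9:.2f} GB")
print(f"centroids    : {centroid_bytes(NLIST, DIM) / 2**30:.2f} GiB")
print(f"vector IDs   : {id_bytes(N_VECTORS) / 1e9:.2f} GB")
print(f"IVF estimate : {ivf_total / 1e9:.2f} GB")
```

By this count the IVF index should need roughly 41.3 GB, so close to 19 GB of the observed usage is not explained by the raw data, centroids, and IDs.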