Support for parallel Faiss CAGRA index builds on single GPU #4588

@rchitale7

Description

Hi,

I am trying to build two GPU indexes of the same size in two parallel processes, using GpuIndexCagra and the Python API for Faiss. I've noticed that this takes roughly 2x as long as building a single GPU index in one process. However, if I set GpuIndexCagraConfig.device to a different value in each script, I get roughly the same performance as building a single GPU index. I believe this device setting can only be varied on multi-GPU hardware. On hardware with a single GPU, is a ~2x slowdown expected when building two GPU indexes in parallel? I would like to use as much of my single GPU's memory as possible, so I'm curious whether there is a way to get a performance boost when building multiple GPU indexes at the same time.

I am creating a 1,000,000 x 768 numpy array for this test. The script I used is here: https://gist.github.com/rchitale7/8e0995e8231eec5657f42627d2cc1228
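For readers who don't want to open the gist, here is a minimal sketch of what a single build along these lines might look like. It is not the gist script itself: the function names, the random data, the fixed seed, and the L2 metric are all assumptions made for illustration, and the timing only covers the train() call.

```python
import time


def parse_device_id(argv):
    """Return the GPU device id given on the command line, defaulting to 0."""
    return int(argv[1]) if len(argv) > 1 else 0


def build_cagra_index(device_id, num_vectors=1_000_000, dim=768):
    """Build one CAGRA index on the given GPU and return the build time in seconds."""
    # Imported lazily so parse_device_id can be used on a machine
    # without numpy or a GPU build of Faiss installed.
    import numpy as np
    import faiss

    # Random float32 vectors stand in for the real dataset (assumption).
    rng = np.random.default_rng(0)
    vectors = rng.random((num_vectors, dim), dtype=np.float32)

    res = faiss.StandardGpuResources()
    config = faiss.GpuIndexCagraConfig()
    config.device = device_id  # the setting varied between scripts on multi-GPU hardware

    # METRIC_L2 is an assumption; the issue does not say which metric was used.
    index = faiss.GpuIndexCagra(res, dim, faiss.METRIC_L2, config)

    start = time.perf_counter()
    index.train(vectors)  # train() is the call that builds the CAGRA graph from the vectors
    return time.perf_counter() - start
```

Calling `build_cagra_index(parse_device_id(sys.argv))` from two terminals at once would roughly mirror the test described here.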

My Setup

I used an EC2 g6.12xlarge machine (for the multi-GPU test) and a g5.2xlarge (for the single-GPU test), with the Deep Learning Base OSS Nvidia Driver GPU AMI (Amazon Linux 2023) 20250912: https://docs.aws.amazon.com/dlami/latest/devguide/aws-deep-learning-base-gpu-ami-amazon-linux-2023.html

output of nvidia-smi:

Tue Sep 23 19:00:56 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.172.08             Driver Version: 570.172.08     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A10G                    On  |   00000000:00:1E.0 Off |                    0 |
|  0%   23C    P8             15W /  300W |       0MiB /  23028MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Reproduction instructions

  1. On a server with GPUs and conda installed, set up the following conda environment:
conda create -n faiss_test_new -c conda-forge -c pytorch -c nvidia -c rapidsai python=3.12 faiss-gpu-cuvs=1.12.0
  2. Activate the conda env:
conda activate faiss_test_new
  3. Download the test script: https://gist.github.com/rchitale7/8e0995e8231eec5657f42627d2cc1228
  4. Run the test script, supplying a device id (only 0 can be used if there is one GPU on the instance):
python faiss_cagra_test.py 0
  5. In a separate terminal session, activate the conda env and launch the same test script:
conda activate faiss_test_new && python faiss_cagra_test.py 0
  6. Once both test scripts reach the pdb breakpoint, press c in both at the same time to continue execution and run the scripts in parallel.
  7. After both scripts have completed, compare the results to running only one script at a time.
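The parallel-versus-single comparison in the last two steps could also be driven from one launcher instead of two terminals. The sketch below is only a rough alternative, under two assumptions: the helper names are hypothetical, and the script is a variant with the pdb breakpoint removed (otherwise each child would block waiting for input). Unlike the breakpoint approach, it times each whole script run, including startup and data generation, not just the index build.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def script_cmd(device_id, script="faiss_cagra_test.py"):
    """Command line for one run of the test script on the given device."""
    return [sys.executable, script, str(device_id)]


def run_timed(cmd):
    """Run one command to completion; return (return code, elapsed seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd)
    return completed.returncode, time.perf_counter() - start


def run_parallel(cmds):
    """Launch all commands at roughly the same time, one thread per command."""
    with ThreadPoolExecutor(max_workers=len(cmds)) as pool:
        return list(pool.map(run_timed, cmds))


def run_serial(cmds):
    """Run the commands one after another as the baseline."""
    return [run_timed(cmd) for cmd in cmds]
```

For the single-GPU case, `run_parallel([script_cmd(0), script_cmd(0)])` can be compared against `run_serial([script_cmd(0)])`. Each thread only waits on its own child process, so the per-command timings are independent of the order in which the processes finish.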
