Description
Hi,
I am trying to build two GPU indexes of the same size in two parallel processes, using GpuIndexCagra and the Python API for faiss. I've noticed that this takes roughly 2x the time of building a single GPU index in one process. But if I set GpuIndexCagraConfig.device to a different value for each script, then I get roughly the same performance as building a single GPU index. However, I believe this device setting can only be changed on multi-GPU hardware. For hardware with a single GPU, is it expected behavior to see a ~2x slowdown when building two GPU indexes in parallel? I would like to utilize the memory of my single GPU instance as much as possible, so I'm curious whether there's a way to get a performance boost when building multiple GPU indexes at the same time.
I am creating a 1,000,000 x 768 numpy array for this test. The script I used is here: https://gist.github.com/rchitale7/8e0995e8231eec5657f42627d2cc1228
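For reference, the core of what each process does looks roughly like this (a minimal sketch assuming the faiss-gpu-cuvs 1.12.0 Python bindings, where GpuIndexCagra builds its graph during train(); variable names are illustrative, see the gist for the exact script):

import sys
import time

import faiss
import numpy as np

d, n = 768, 1_000_000

# Synthetic float32 dataset matching the 1,000,000 x 768 array above
rng = np.random.default_rng(0)
xb = rng.random((n, d), dtype=np.float32)

# Pin the build to a specific GPU via the config's device field
config = faiss.GpuIndexCagraConfig()
config.device = int(sys.argv[1])

res = faiss.StandardGpuResources()
index = faiss.GpuIndexCagra(res, d, faiss.METRIC_L2, config)

t0 = time.perf_counter()
index.train(xb)  # CAGRA builds the graph here
print(f"build time on device {config.device}: {time.perf_counter() - t0:.1f}s")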
My Setup
I used an EC2 g6.12xlarge machine (for the multi-GPU test) and a g5.2xlarge (for the single-GPU test), with the Deep Learning Base OSS Nvidia Driver GPU AMI (Amazon Linux 2023) 20250912: https://docs.aws.amazon.com/dlami/latest/devguide/aws-deep-learning-base-gpu-ami-amazon-linux-2023.html
Output of nvidia-smi:
Tue Sep 23 19:00:56 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.172.08 Driver Version: 570.172.08 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 |
| 0% 23C P8 15W / 300W | 0MiB / 23028MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Reproduction instructions
- On a server with GPUs and conda installed, set up the following conda environment:
conda create -n faiss_test_new -c conda-forge -c pytorch -c nvidia -c rapidsai python=3.12 faiss-gpu-cuvs=1.12.0
- Activate the conda env:
conda activate faiss_test_new
- Download the test script: https://gist.github.com/rchitale7/8e0995e8231eec5657f42627d2cc1228
- Run the test script, supplying a device id (only 0 can be used if there is one GPU on the instance):
python faiss_cagra_test.py 0
- In a separate terminal session, activate the conda env and launch the same test script:
conda activate faiss_test_new && python faiss_cagra_test.py 0
- Once both test scripts reach the pdb breakpoint, press c in both sessions at the same time to continue execution and run the scripts in parallel (see the sketch after this list).
- After both scripts have completed, compare the results to running only one script at a time.
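The breakpoint just acts as a manual start barrier so the two builds overlap. A minimal sketch of that pattern (assuming the gist pauses with pdb.set_trace(); the sleep is a placeholder standing in for the actual CAGRA build):

import pdb
import time

# ... data loading and index setup would happen here ...

pdb.set_trace()  # both processes pause here; press 'c' in each terminal together

t0 = time.perf_counter()
time.sleep(5)  # placeholder for index.train(xb), the GPU build
print(f"elapsed after the barrier: {time.perf_counter() - t0:.1f}s")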