Different DDP ranks have varying batch normalization (BN) statistics after the PreciseBN hook because precise_bn in fvcore does not synchronize the batch size across ranks.

Bug Report: Different Batch Normalization (BN) Statistics After PreciseBN Hook
Steps to Reproduce:

Clone the repository:
bash
Copy code
git clone https://github.com/facebookresearch/moco.git
cd moco/detection
Run the following command:
bash
Copy code
python train_net.py \
--config-file configs/pascal_voc_R_50_C4_24k.yaml \
--num-gpus 8 \
OUTPUT_DIR "temp/train" \
SEED 0 \
SOLVER.MAX_ITER 1
Observations:

During update_bn_stats, the batch sizes differ across Distributed Data Parallel (DDP) ranks.
This lack of synchronization causes inconsistent BN statistics on different ranks after the PreciseBN hook.
Expected Behavior:

The BN statistics should be consistent across all ranks after the PreciseBN hook.
Environment:

OS: Linux
Python Version: 3.8.0
Framework: Detectron2 v0.6
PyTorch: 2.4.1+cu124
GPU: NVIDIA GeForce RTX 4090 (8 GPUs)
CUDA: Version 12.4
Other dependencies:
fvcore: 0.1.5.post20221221
iopath: 0.1.9
Command to Collect Full Environment Details:

bash
Copy code
wget -nc -q https://github.com/facebookresearch/detectron2/raw/main/detectron2/utils/collect_env.py
python collect_env.py
Note:
No private dataset is required to reproduce this bug. All required resources are publicly available.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Different DDP ranks have varying batch normalization (BN) statistics after the PreciseBN hook because precise_bn in fvcore does not synchronize the batch size across ranks. #5409

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Different DDP ranks have varying batch normalization (BN) statistics after the PreciseBN hook because precise_bn in fvcore does not synchronize the batch size across ranks. #5409

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions