[ICCV 25] OMNI-DC: Highly Robust Depth Completion with Multiresolution Depth Integration

Yiming Zuo · Willow Yang · Zeyu Ma · Jia Deng

Princeton Vision & Learning Lab (PVL)


TL;DR

We present a depth completion model that works well on unseen datasets and various depth patterns (zero-shot). It can be used to regularize Gaussian Splatting models to achieve better rendering quality, or work with LiDARs for dense mapping.

Change Logs

  • 11/3/2025 Access pretrained weights through Huggingface 🤗.
  • 6/25/2025 Added a demo script for easy testing on custom data.
  • 6/25/2025 Paper accepted to ICCV 2025!
  • 3/21/2025 v1.1 released.

V1.1

We find that concatenating the output of Depth Anything v2 with the sparse depth map consistently improves results, especially when the input depth is sparse (error reduced by 50% on NYU with 5 points compared to v1.0). Usage is the same as in v1.0.
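Conceptually, the v1.1 change feeds the relative depth predicted by Depth Anything v2 to the network as an extra guidance channel next to the sparse depth. The PyTorch sketch below only illustrates this idea; the tensor names and the model call are placeholders, not the actual OMNI-DC code.

import torch

# Illustrative shapes only: B x C x H x W tensors.
rgb          = torch.rand(1, 3, 480, 640)
sparse_depth = torch.zeros(1, 1, 480, 640)   # 0 = unknown depth
mono_depth   = torch.rand(1, 1, 480, 640)    # relative depth from Depth Anything v2

# v1.0 guidance: sparse depth only.
# v1.1 guidance: sparse depth concatenated with the monocular depth prior.
guidance = torch.cat([sparse_depth, mono_depth], dim=1)   # B x 2 x H x W
# pred = model(rgb, guidance)   # hypothetical model interface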

Detailed Accuracy Comparison Between v1.0 and v1.1

ETH3D: RMSE/REL; KITTIDC: RMSE/MAE

| Method | ETH3D_SfM_In | ETH3D_SfM_Out | KITTIDC_64 | KITTIDC_16 | KITTIDC_8 |
|--------|--------------|---------------|------------|------------|-----------|
| v1.0 | 0.605/0.090 | 1.069/0.053 | 1.191/0.270 | 1.682/0.441 | 2.058/0.597 |
| v1.1 | 0.488/0.061 | 1.092/0.035 | 1.235/0.276 | 1.659/0.430 | 1.897/0.546 |

VOID: RMSE/MAE

| Method | VOID_1500 | VOID_500 | VOID_150 |
|--------|-----------|----------|----------|
| v1.0 | 0.555/0.150 | 0.551/0.164 | 0.650/0.211 |
| v1.1 | 0.540/0.143 | 0.512/0.144 | 0.581/0.171 |

NYU: RMSE/REL

| Method | NYU_500 | NYU_200 | NYU_100 | NYU_50 | NYU_5 |
|--------|---------|---------|---------|--------|-------|
| v1.0 | 0.111/0.014 | 0.147/0.021 | 0.180/0.029 | 0.225/0.041 | 0.536/0.142 |
| v1.1 | 0.112/0.014 | 0.143/0.019 | 0.167/0.024 | 0.193/0.029 | 0.330/0.070 |

Virtual Depth Patterns (averaged over 4 datasets): RMSE/REL

| Method | 0.7% | 0.1% | 0.03% | 5% Noise | 10% Noise |
|--------|------|------|-------|----------|-----------|
| v1.0 | 0.135/0.010 | 0.211/0.020 | 0.289/0.034 | 0.141/0.010 | 0.147/0.011 |
| v1.1 | 0.126/0.009 | 0.172/0.016 | 0.213/0.025 | 0.130/0.010 | 0.134/0.010 |

| Method | ORB | SIFT | LiDAR-64 | LiDAR-16 | LiDAR-8 |
|--------|-----|------|----------|----------|---------|
| v1.0 | 0.247/0.045 | 0.211/0.037 | 0.121/0.008 | 0.164/0.014 | 0.231/0.023 |
| v1.1 | 0.176/0.028 | 0.161/0.024 | 0.114/0.008 | 0.146/0.012 | 0.180/0.017 |

Environment Setup

We recommend creating a Python environment with Anaconda.

conda create -n OMNIDC python=3.8
conda activate OMNIDC
# For CUDA Version == 11.3
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install mmcv==1.4.4 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html 
pip install mmsegmentation==0.22.1 
pip install timm tqdm thop tensorboardX tensorboard opencv-python ipdb h5py ipython Pillow==9.5.0 plyfile einops
pip install huggingface_hub

NVIDIA Apex

We used NVIDIA Apex for multi-GPU training. Apex can be installed as follows:

git clone https://github.com/NVIDIA/apex
cd apex
git reset --hard 4ef930c1c884fdca5f472ab2ce7cb9b505d26c1a
conda install cudatoolkit-dev=11.3 -c conda-forge
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./ 

You may encounter the error ImportError: cannot import name 'container_abcs' from 'torch._six'. In this case, change line 14 of apex/apex/_amp_state.py to import collections.abc as container_abcs and re-install Apex.

Checkpoints

Option 1: Huggingface

Access the model here: https://huggingface.co/zuoym15/OMNI-DC. See demo.py for an example usage.
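For programmatic downloads, a minimal sketch using huggingface_hub (installed in the environment setup above) is shown here. The checkpoint file names inside the Hugging Face repo are not listed in this README, so the load call is left as a commented placeholder; demo.py is the reference for actual usage.

from huggingface_hub import snapshot_download

# Download (and cache) the full model repo from the Hugging Face Hub.
ckpt_dir = snapshot_download(repo_id="zuoym15/OMNI-DC")
print("checkpoints downloaded to:", ckpt_dir)

# Loading a specific file depends on the repo contents; see demo.py.
# state_dict = torch.load(f"{ckpt_dir}/<checkpoint_name>.pt", map_location="cpu")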

Option 2: Manually Download the Weights

Backbone Initialization Files

Download these .pt files to src/pretrained:

https://drive.google.com/drive/folders/1z2sOkIJHtg1zTYiSRhZRzff0AANprx4O?usp=sharing

Pretrained Checkpoints

Download from

# v1.0
https://drive.google.com/file/d/1SBRfdhozd-3j6uorjKOMgYGmrd578NvG/view?usp=sharing

# v1.1
https://drive.google.com/file/d/1ssJYFB3rQD5JEYgG7W6tRJg1hpQKvqPD/view?usp=sharing

and put the downloaded checkpoints under the checkpoints folder.

(v1.1 only) Depth Anything Checkpoint

Download the Depth Anything checkpoint to src/depth_models/depth_anything_v2/checkpoints.

Demo

Run

cd src
sh testing_scripts/demo.sh

Note: Do NOT run demo.py directly. The model configs differ from the defaults, and running with the defaults can cause a shape mismatch when loading the checkpoints.

We provide example images under the figures folder. Results are saved under experiments.

To run with your own data, prepare an RGB image and the corresponding sparse depth map saved as an .npy file: 1) the depth map must have the same resolution as the image; 2) values of 0 indicate unknown depth.
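For example, if your measurements come as a list of (row, column, depth) samples, a minimal NumPy sketch for writing the sparse depth .npy (file names and sample values here are placeholders) is:

import numpy as np
from PIL import Image

rgb = np.array(Image.open("my_image.png"))            # H x W x 3
H, W = rgb.shape[:2]

# 0 everywhere means "depth unknown"; fill in only the measured pixels (in meters).
sparse_depth = np.zeros((H, W), dtype=np.float32)
measurements = [(120, 200, 2.35), (310, 415, 4.10)]   # (row, col, depth) placeholders
for v, u, d in measurements:
    sparse_depth[v, u] = d

np.save("my_sparse_depth.npy", sparse_depth)          # same resolution as the image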

Reproduce Results in the Paper

Prepare the datasets

We save all evaluation datasets in a unified format (uniformat), and you can directly download it from here. Put all npy files under the datasets/uniformat_release folder:

uniformat_release
  ├── ARKitScenes_test_100
  │   ├── 000000.npy
  │   ├── 000001.npy
  │   └── ...
  └── ...

We also provide instructions on how to process the original datasets into these npy files; check this link.

Testing

cd src

# v1.1
sh testing_scripts/test_v1.1.sh

# v1.0, the real and virtual patterns on the 5 datasets reported in Tab. 1 and Tab. 2 of the paper
sh testing_scripts/test_robust_DC.sh

# v1.0, additional results on the VOID and NYUv2 datasets
sh testing_scripts/test_void_nyu.sh

Test on Your Own Dataset

We recommend writing a dataloader to save your own dataset into the unified format we use (uniformat). You will need to provide an RGB image and a sparse depth map (with 0 indicating invalid pixels). A good starting point is the ibims dataloader.

Then follow the instructions here to convert your dataset into uniformat. Specifically, look into src/robust_dc_protocol/save_uniformat_dataset.py.

Once done, you can run evaluation just as on any other dataset we tested on. See examples under src/testing_scripts.

If you want to use our model for view synthesis (e.g., Gaussian Splatting), you may find the instructions here helpful. The ETH3D section describes how to convert COLMAP models to sparse depth maps.
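As a rough illustration of what that conversion involves (this is a generic sketch, not the script from the linked instructions), COLMAP 3D points can be projected into a per-image sparse depth map given pinhole intrinsics K and a world-to-camera pose (R, t):

import numpy as np

def colmap_points_to_sparse_depth(points_xyz, K, R, t, height, width):
    """Project Nx3 world points into a sparse depth map (0 = unknown)."""
    cam = points_xyz @ R.T + t              # world -> camera coordinates
    in_front = cam[:, 2] > 0                # keep points in front of the camera
    cam = cam[in_front]

    uv = cam @ K.T                          # pinhole projection
    u = np.round(uv[:, 0] / uv[:, 2]).astype(int)
    v = np.round(uv[:, 1] / uv[:, 2]).astype(int)
    z = cam[:, 2]

    depth = np.zeros((height, width), dtype=np.float32)
    valid = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    for ui, vi, zi in zip(u[valid], v[valid], z[valid]):
        # Keep the nearest point when several project to the same pixel.
        if depth[vi, ui] == 0 or zi < depth[vi, ui]:
            depth[vi, ui] = zi
    return depth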

Train from Scratch

Resource requirements

Training takes roughly 6 days on 10x 48GB GPUs (e.g., RTX A6000). You can adjust the batch size depending on the memory and the number of GPUs you have.

Dataset preparation

See here for instructions.

Training

cd src
sh training_scripts/train_full.sh 

Citation

If you find our work helpful, please consider citing our paper:

@inproceedings{zuo2025omni,
  title={Omni-dc: Highly robust depth completion with multiresolution depth integration},
  author={Zuo, Yiming and Yang, Willow and Ma, Zeyu and Deng, Jia},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={9287--9297},
  year={2025}
}
