
Commit 4d112a0

unimatch release
0 parents, commit 4d112a0


83 files changed: +491940 -0 lines

DATASETS.md (+74 lines)

# Datasets

## Optical Flow

The datasets used to train and evaluate our GMFlow model are as follows:

- [FlyingChairs](https://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs.en.html#flyingchairs)
- [FlyingThings3D](https://lmb.informatik.uni-freiburg.de/resources/datasets/SceneFlowDatasets.en.html)
- [Sintel](http://sintel.is.tue.mpg.de/)
- [Virtual KITTI 2](https://europe.naverlabs.com/research/computer-vision/proxy-virtual-worlds-vkitti-2/)
- [KITTI](http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=flow)
- [HD1K](http://hci-benchmark.iwr.uni-heidelberg.de/)

By default the dataloader [dataloader/flow/datasets.py](dataloader/flow/datasets.py) assumes the datasets are located in the `datasets` directory.

It is recommended to symlink your dataset root to `datasets`:

```
ln -s $YOUR_DATASET_ROOT datasets
```

Otherwise, you may need to change the corresponding paths in [dataloader/flow/datasets.py](dataloader/flow/datasets.py).

## Stereo Matching

The datasets used to train and evaluate our GMStereo model are as follows:

- [Scene Flow](https://lmb.informatik.uni-freiburg.de/resources/datasets/SceneFlowDatasets.en.html)
- [Virtual KITTI 2](https://europe.naverlabs.com/research/computer-vision/proxy-virtual-worlds-vkitti-2/)
- [KITTI](https://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=stereo)
- [TartanAir](https://github.com/castacks/tartanair_tools)
- [Falling Things](https://research.nvidia.com/publication/2018-06_Falling-Things)
- [HR-VS](https://drive.google.com/file/d/1SgEIrH_IQTKJOToUwR1rx4-237sThUqX/view)
- [CREStereo Dataset](https://github.com/megvii-research/CREStereo/blob/master/dataset_download.sh)
- [InStereo2K](https://github.com/YuhuaXu/StereoDataset)
- [Middlebury](https://vision.middlebury.edu/stereo/data/)
- [Sintel Stereo](http://sintel.is.tue.mpg.de/stereo)
- [ETH3D](https://www.eth3d.net/datasets#low-res-two-view-training-data)

By default the dataloader [dataloader/stereo/datasets.py](dataloader/stereo/datasets.py) assumes the datasets are located in the `datasets` directory.

It is recommended to symlink your dataset root to `datasets`:

```
ln -s $YOUR_DATASET_ROOT datasets
```

Otherwise, you may need to change the corresponding paths in [dataloader/stereo/datasets.py](dataloader/stereo/datasets.py).

## Depth Estimation

The datasets used to train and evaluate our GMDepth model are as follows:

- [DeMoN](https://github.com/lmb-freiburg/demon)
- [ScanNet](http://www.scan-net.org/)

We provide scripts to download and extract the DeMoN dataset: [dataloader/depth/download_demon_train.sh](dataloader/depth/download_demon_train.sh), [dataloader/depth/download_demon_test.sh](dataloader/depth/download_demon_test.sh), [dataloader/depth/prepare_demon_train.sh](dataloader/depth/prepare_demon_train.sh) and [dataloader/depth/prepare_demon_test.sh](dataloader/depth/prepare_demon_test.sh).
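
For example, a minimal sketch of preparing DeMoN from the repository root (assuming the scripts need no extra arguments; see the scripts themselves for details):

```
# download the DeMoN training and test archives, then extract them into the expected layout
bash dataloader/depth/download_demon_train.sh
bash dataloader/depth/download_demon_test.sh
bash dataloader/depth/prepare_demon_train.sh
bash dataloader/depth/prepare_demon_test.sh
```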

By default the dataloader [dataloader/depth/datasets.py](dataloader/depth/datasets.py) assumes the datasets are located in the `datasets` directory.

It is recommended to symlink your dataset root to `datasets`:

```
ln -s $YOUR_DATASET_ROOT datasets
```

Otherwise, you may need to change the corresponding paths in [dataloader/depth/datasets.py](dataloader/depth/datasets.py).

MODEL_ZOO.md (+71 lines)

# Model Zoo

- The models are named as `model-dataset`.
- Model definition: `scale1` denotes the 1/8 feature resolution model, `scale2` denotes the 1/8 & 1/4 model, and `scaleX-regrefineY` denotes the `X`-scale model with `Y` additional local regression refinements.
- The inference time is averaged over 100 runs, measured with batch size 1 on a single NVIDIA A100 GPU.
- All pretrained models can be downloaded together at [pretrained.zip](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained.zip) (see the sketch below), or individually as listed below.
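
A minimal sketch for fetching all weights at once (assuming the archive unpacks into a `pretrained/` directory in the repository root, which is where the provided scripts look for checkpoints):

```
# download and unpack all pretrained models
wget https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained.zip
unzip pretrained.zip
```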

## Optical Flow

- The inference time is measured for Sintel resolution: 448x1024.
- The `*-mixdata` models are trained on several mixed public datasets and are recommended for in-the-wild use cases.

| Model | Params (M) | Time (ms) | Download |
| --------------------------------- | :--------: | :-------: | :----------------------------------------------------------: |
| GMFlow-scale1-things | 4.7 | 26 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale1-things-e9887eda.pth) |
| GMFlow-scale1-mixdata | 4.7 | 26 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale1-mixdata-train320x576-4c3a6e9a.pth) |
| GMFlow-scale2-things | 4.7 | 66 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale2-things-36579974.pth) |
| GMFlow-scale2-sintel | 4.7 | 66 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale2-sintel-3ed1cf48.pth) |
| GMFlow-scale2-mixdata | 4.7 | 66 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale2-mixdata-train320x576-9ff1c094.pth) |
| GMFlow-scale2-regrefine6-things | 7.4 | 122 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale2-regrefine6-things-776ed612.pth) |
| GMFlow-scale2-regrefine6-sintelft | 7.4 | 122 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale2-regrefine6-sintelft-6e39e2b9.pth) |
| GMFlow-scale2-regrefine6-kitti | 7.4 | 122 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale2-regrefine6-kitti15-25b554d7.pth) |
| GMFlow-scale2-regrefine6-mixdata | 7.4 | 122 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale2-regrefine6-mixdata-train320x576-4e7b215d.pth) |
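
To fetch a single checkpoint instead, for example the mixdata flow model recommended above for in-the-wild use, a sketch using the URL from the table (saved into the `pretrained/` directory assumed by the scripts):

```
mkdir -p pretrained
wget -P pretrained https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmflow-scale2-regrefine6-mixdata-train320x576-4e7b215d.pth
```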

## Stereo Matching

- The inference time is measured for KITTI resolution: 384x1248.
- The `*-resumeflowthings-*` models are initialized with a GMFlow model that was trained on the Chairs and Things datasets for the optical flow task.
- The `*-mixdata` models are trained on several mixed public datasets and are recommended for in-the-wild use cases.

| Model | Params (M) | Time (ms) | Download |
| ------------------------------------------------------ | :--------: | :-------: | :--------: |
| GMStereo-scale1-sceneflow | 4.7 | 23 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale1-sceneflow-124a438f.pth) |
| GMStereo-scale1-resumeflowthings-sceneflow | 4.7 | 23 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale1-resumeflowthings-sceneflow-16e38788.pth) |
| GMStereo-scale2-sceneflow | 4.7 | 58 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale2-sceneflow-ab93ba6a.pth) |
| GMStereo-scale2-resumeflowthings-sceneflow | 4.7 | 58 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale2-resumeflowthings-sceneflow-48020649.pth) |
| GMStereo-scale2-regrefine3-sceneflow | 7.4 | 86 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale2-regrefine3-sceneflow-2dd12e97.pth) |
| GMStereo-scale2-regrefine3-resumeflowthings-sceneflow | 7.4 | 86 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale2-regrefine3-resumeflowthings-sceneflow-f724fee6.pth) |
| GMStereo-scale2-regrefine3-resumeflowthings-kitti | 7.4 | 86 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale2-regrefine3-resumeflowthings-kitti15-04487ebf.pth) |
| GMStereo-scale2-regrefine3-resumeflowthings-middlebury | 7.4 | 86 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale2-regrefine3-resumeflowthings-middleburyfthighres-a82bec03.pth) |
| GMStereo-scale2-regrefine3-resumeflowthings-eth3dft | 7.4 | 86 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale2-regrefine3-resumeflowthings-eth3dft-a807cb16.pth) |
| GMStereo-scale2-regrefine3-resumeflowthings-mixdata | 7.4 | 86 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmstereo-scale2-regrefine3-resumeflowthings-mixdata-train320x640-ft640x960-e4e291fd.pth) |

## Depth Estimation

- The inference time is measured for ScanNet resolution: 480x640.
- The `*-resumeflowthings-*` models are initialized with a pretrained GMFlow model that was trained on the Chairs and Things datasets for the optical flow task.

| Model | Params (M) | Time (ms) | Download |
| -------------------------------------------------- | :--------: | :-------: | :----------------------------------------------------------: |
| GMDepth-scale1-scannet | 4.7 | 17 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmdepth-scale1-scannet-d3d1efb5.pth) |
| GMDepth-scale1-resumeflowthings-scannet | 4.7 | 17 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmdepth-scale1-resumeflowthings-scannet-5d9d7964.pth) |
| GMDepth-scale1-regrefine1-resumeflowthings-scannet | 4.7 | 17 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmdepth-scale1-regrefine1-resumeflowthings-scannet-90325722.pth) |
| GMDepth-scale1-demon | 7.3 | 20 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmdepth-scale1-demon-bd64786e.pth) |
| GMDepth-scale1-resumeflowthings-demon | 7.3 | 20 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmdepth-scale1-resumeflowthings-demon-a2fe127b.pth) |
| GMDepth-scale1-regrefine1-resumeflowthings-demon | 7.3 | 20 | [download](https://s3.eu-central-1.amazonaws.com/avg-projects/unimatch/pretrained/gmdepth-scale1-regrefine1-resumeflowthings-demon-7c23f230.pth) |

README.md (+132 lines)

<p align="center">
  <h1 align="center">Unifying Flow, Stereo and Depth Estimation</h1>
  <p align="center">
    <a href="https://haofeixu.github.io/">Haofei Xu</a>
    ·
    <a href="https://scholar.google.com/citations?user=9jH5v74AAAAJ">Jing Zhang</a>
    ·
    <a href="https://jianfei-cai.github.io/">Jianfei Cai</a>
    ·
    <a href="https://scholar.google.com/citations?user=VxAuxMwAAAAJ">Hamid Rezatofighi</a>
    ·
    <a href="https://www.yf.io/">Fisher Yu</a>
    ·
    <a href="https://scholar.google.com/citations?user=RwlJNLcAAAAJ">Dacheng Tao</a>
    ·
    <a href="http://www.cvlibs.net/">Andreas Geiger</a>
  </p>
  <h3 align="center"><a href="https://arxiv.org/abs/2211.xxxxx">Paper</a> | <a href="https://haofeixu.github.io/unimatch/">Project Page</a> | <a>Colab (Coming Soon)</a></h3>
  <div align="center"></div>
</p>

<p align="center">
  <a href="">
    <img src="./assets/teaser.svg" alt="Logo" width="70%">
  </a>
</p>

<p align="center">
  A unified model for three motion and 3D perception tasks.
</p>

This project is developed based on our previous works:

- [GMFlow: Learning Optical Flow via Global Matching, CVPR 2022, Oral](https://github.com/haofeixu/gmflow)
- [High-Resolution Optical Flow from 1D Attention and Correlation, ICCV 2021, Oral](https://github.com/haofeixu/flow1d)
- [AANet: Adaptive Aggregation Network for Efficient Stereo Matching, CVPR 2020](https://github.com/haofeixu/aanet)

## Installation

Our code is developed with PyTorch 1.9.0, CUDA 10.2 and Python 3.8; newer PyTorch versions are expected to work as well.

We recommend using [conda](https://www.anaconda.com/distribution/) for installation:

```
conda env create -f conda_environment.yml
conda activate unimatch
```

Alternatively, we also support installing with pip:

```
bash pip_install.sh
```
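
After either install, a quick sanity check (a sketch, not part of the provided scripts) is to confirm that PyTorch and CUDA are visible:

```
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```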

## Model Zoo

All pretrained models for flow, stereo and depth on different datasets are available in [MODEL_ZOO.md](MODEL_ZOO.md).

We assume the downloaded weights are located under the `pretrained` directory.

Otherwise, you may need to change the corresponding paths in the scripts.

## Demo

Given an image pair or a video sequence, our code supports generating prediction results of optical flow, disparity and depth.

Please refer to [scripts/gmflow_demo.sh](scripts/gmflow_demo.sh), [scripts/gmstereo_demo.sh](scripts/gmstereo_demo.sh) and [scripts/gmdepth_demo.sh](scripts/gmdepth_demo.sh) for example usages.
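
For instance, a minimal sketch of running the flow demo (assuming the corresponding checkpoint has been downloaded and any input or checkpoint paths inside the script have been adjusted to your setup):

```
bash scripts/gmflow_demo.sh
```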

## Datasets

The datasets used to train and evaluate our models for all three tasks are given in [DATASETS.md](DATASETS.md).

## Evaluation

The evaluation scripts used to reproduce the numbers in our paper are given in [scripts/gmflow_evaluate.sh](scripts/gmflow_evaluate.sh), [scripts/gmstereo_evaluate.sh](scripts/gmstereo_evaluate.sh) and [scripts/gmdepth_evaluate.sh](scripts/gmdepth_evaluate.sh).

For submission to the KITTI, Sintel, Middlebury and ETH3D online test sets, you can run [scripts/gmflow_submission.sh](scripts/gmflow_submission.sh) and [scripts/gmstereo_submission.sh](scripts/gmstereo_submission.sh) to generate the prediction results, which can be submitted directly.

## Training

All training scripts for different model variants on different datasets can be found in [scripts/*_train.sh](scripts).

We support using TensorBoard to monitor and visualize the training process. You can first start a TensorBoard session with

```
tensorboard --logdir checkpoints
```

and then access [http://localhost:6006](http://localhost:6006/) in your browser.

## Citation

If you find our work useful in your research, please consider citing our paper:

```
@article{xu2022unifying,
  title={Unifying Flow, Stereo and Depth Estimation},
  author={Xu, Haofei and Zhang, Jing and Cai, Jianfei and Rezatofighi, Hamid and Yu, Fisher and Tao, Dacheng and Geiger, Andreas},
  journal={arXiv preprint arXiv:2211.xxxxx},
  year={2022}
}
```

## Acknowledgements

This project would not have been possible without relying on some awesome repos: [RAFT](https://github.com/princeton-vl/RAFT), [LoFTR](https://github.com/zju3dv/LoFTR), [DETR](https://github.com/facebookresearch/detr), [Swin](https://github.com/microsoft/Swin-Transformer), [mmdetection](https://github.com/open-mmlab/mmdetection) and [Detectron2](https://github.com/facebookresearch/detectron2/blob/main/projects/TridentNet/tridentnet/trident_conv.py). We thank the original authors for their excellent work.

assets/teaser.svg (+1 line)

conda_environment.yml (+108 lines)

name: unimatch
channels:
  - pytorch
  - defaults
dependencies:
  - blas=1.0=mkl
  - brotli=1.0.9=ha925a31_2
  - ca-certificates=2022.07.19=haa95532_0
  - certifi=2022.6.15=py38haa95532_0
  - cloudpickle=2.0.0=pyhd3eb1b0_0
  - cudatoolkit=10.2.89=h74a9793_1
  - cycler=0.11.0=pyhd3eb1b0_0
  - cytoolz=0.11.0=py38he774522_0
  - dask-core=2022.7.0=py38haa95532_0
  - fonttools=4.25.0=pyhd3eb1b0_0
  - freetype=2.10.4=hd328e21_0
  - fsspec=2022.7.1=py38haa95532_0
  - icc_rt=2019.0.0=h0cc432a_1
  - icu=58.2=ha925a31_3
  - imageio=2.9.0=pyhd3eb1b0_0
  - intel-openmp=2022.0.0=haa95532_3663
  - jpeg=9b=hb83a4c4_2
  - kiwisolver=1.4.2=py38hd77b12b_0
  - libpng=1.6.37=h2a8f88b_0
  - libtiff=4.2.0=he0120a3_1
  - libuv=1.40.0=he774522_0
  - libwebp=1.2.2=h2bbff1b_0
  - locket=1.0.0=py38haa95532_0
  - lz4-c=1.9.3=h2bbff1b_1
  - matplotlib=3.5.1=py38haa95532_1
  - matplotlib-base=3.5.1=py38hd77b12b_1
  - mkl=2020.2=256
  - mkl-service=2.3.0=py38h196d8e1_0
  - mkl_fft=1.3.0=py38h46781fe_0
  - mkl_random=1.1.1=py38h47e9c7a_0
  - munkres=1.1.4=py_0
  - networkx=2.8.4=py38haa95532_0
  - ninja=1.10.2=haa95532_5
  - ninja-base=1.10.2=h6d14046_5
  - numpy=1.19.2=py38hadc3359_0
  - numpy-base=1.19.2=py38ha3acd2a_0
  - openssl=1.1.1q=h2bbff1b_0
  - packaging=21.3=pyhd3eb1b0_0
  - partd=1.2.0=pyhd3eb1b0_1
  - pillow=9.0.1=py38hdc2b20a_0
  - pip=21.2.2=py38haa95532_0
  - pyparsing=3.0.4=pyhd3eb1b0_0
  - pyqt=5.9.2=py38hd77b12b_6
  - python=3.8.13=h6244533_0
  - python-dateutil=2.8.2=pyhd3eb1b0_0
  - pytorch=1.9.0=py3.8_cuda10.2_cudnn7_0
  - pywavelets=1.3.0=py38h2bbff1b_0
  - pyyaml=6.0=py38h2bbff1b_1
  - qt=5.9.7=vc14h73c81de_0
  - scikit-image=0.19.2=py38hf11a4ad_0
  - scipy=1.6.2=py38h14eb087_0
  - sip=4.19.13=py38hd77b12b_0
  - six=1.16.0=pyhd3eb1b0_1
  - sqlite=3.38.5=h2bbff1b_0
  - tifffile=2020.10.1=py38h8c2d366_2
  - tk=8.6.12=h2bbff1b_0
  - toolz=0.11.2=pyhd3eb1b0_0
  - torchvision=0.10.0=py38_cu102
  - tornado=6.1=py38h2bbff1b_0
  - typing_extensions=4.1.1=pyh06a4308_0
  - vc=14.2=h21ff451_1
  - vs2015_runtime=14.27.29016=h5e58377_2
  - wheel=0.37.1=pyhd3eb1b0_0
  - wincertstore=0.2=py38haa95532_2
  - xz=5.2.5=h8cc25b3_1
  - yaml=0.2.5=he774522_0
  - zlib=1.2.12=h8cc25b3_2
  - zstd=1.5.2=h19a0ad4_0
  - pip:
      - absl-py==1.1.0
      - cachetools==5.2.0
      - charset-normalizer==2.1.0
      - cmapy==0.6.6
      - colorama==0.4.5
      - configargparse==1.5.3
      - google-auth==2.9.0
      - google-auth-oauthlib==0.4.6
      - grpcio==1.47.0
      - h5py==3.7.0
      - idna==3.3
      - imageio-ffmpeg==0.4.7
      - importlib-metadata==4.12.0
      - joblib==1.2.0
      - lz4==4.0.2
      - markdown==3.3.7
      - oauthlib==3.2.0
      - opencv-python==4.6.0.66
      - path==16.5.0
      - protobuf==3.19.4
      - pyasn1==0.4.8
      - pyasn1-modules==0.2.8
      - requests==2.28.1
      - requests-oauthlib==1.3.1
      - rsa==4.8
      - scikit-video==1.1.11
      - setuptools==59.5.0
      - tensorboard==2.9.1
      - tensorboard-data-server==0.6.1
      - tensorboard-plugin-wit==1.8.1
      - tqdm==4.64.1
      - urllib3==1.26.9
      - werkzeug==2.1.2
      - zipp==3.8.0

dataloader/__init__.py

Whitespace-only changes.

dataloader/depth/__init__.py

Whitespace-only changes.
