Commit 07919b1: Initial commit (0 parents)

46 files changed: +3541, -0 lines

.gitignore

+115
@@ -0,0 +1,115 @@
data/*
!data/download.sh
model/output/

#===========================================================================

# Created by https://www.gitignore.io/api/python
# Edit at https://www.gitignore.io/?templates=python

### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# Mr Developer
.mr.developer.cfg
.project
.pydevproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# End of https://www.gitignore.io/api/python

LICENSE

+21
@@ -0,0 +1,21 @@
MIT License

Copyright 2020 Institute for Automotive Engineering of RWTH Aachen University.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

+219
@@ -0,0 +1,219 @@
# <img src="assets/logo.png" width=50> Cam2BEV

<img src="assets/teaser.gif" align="right" width=320 height=200>

This repository contains the official implementation of our methodology for the computation of a semantically segmented bird's eye view (BEV) image given the images of multiple vehicle-mounted cameras as presented in our paper:

> **A Sim2Real Deep Learning Approach for the Transformation of Images from Multiple Vehicle-Mounted Cameras to a Semantically Segmented Image in Bird’s Eye View** ([arXiv](https://arxiv.org/abs/XXXX.XXXXX))
>
> Lennart Reiher, [Bastian Lampe](https://www.ika.rwth-aachen.de/en/institute/staff/bastian-lampe-m-sc.html), and [Lutz Eckstein](https://www.ika.rwth-aachen.de/en/institute/management/univ-prof-dr-ing-lutz-eckstein.html)
> [Institute for Automotive Engineering (ika), RWTH Aachen University](https://www.ika.rwth-aachen.de/en/)

> _**Abstract**_ — Accurate environment perception is essential for automated driving. When using monocular cameras, the distance estimation of elements in the environment poses a major challenge. Distances can be more easily estimated when the camera perspective is transformed to a bird's eye view (BEV). For flat surfaces, _Inverse Perspective Mapping_ (IPM) can accurately transform images to a BEV. Three-dimensional objects such as vehicles and vulnerable road users are distorted by this transformation making it difficult to estimate their position relative to the sensor. This paper describes a methodology to obtain a corrected 360° BEV image given images from multiple vehicle-mounted cameras. The corrected BEV image is segmented into semantic classes and includes a prediction of occluded areas. The neural network approach does not rely on manually labeled data, but is trained on a synthetic dataset in such a way that it generalizes well to real-world data. By using semantically segmented images as input, we reduce the reality gap between simulated and real-world data and are able to show that our method can be successfully applied in the real world. Extensive experiments conducted on the synthetic data demonstrate the superiority of our approach compared to IPM.

- [Repository Structure](#repository-structure)
- [Installation](#installation)
- [Data](#data)
- [Preprocessing](#preprocessing)
- [Training](#training)
- [Neural Network Architectures](#neural-network-architectures)
- [Customization](#customization)

## Repository Structure

```
Cam2BEV
├── data                      # where our synthetic datasets are downloaded to by default
├── model                     # training scripts and configurations
│   ├── architecture          # TensorFlow implementations of neural network architectures
│   └── one_hot_conversion    # files defining the one-hot encoding of semantically segmented images
└── preprocessing             # preprocessing scripts
    ├── camera_configs        # files defining the intrinsics/extrinsics of the cameras used in our datasets
    ├── homography_converter  # script to convert an OpenCV homography for usage within the uNetXST SpatialTransformers
    ├── ipm                   # script for generating a classical homography image by means of IPM
    └── occlusion             # script for introducing an occluded class to the BEV images
```

## Installation

We suggest setting up a **Python 3.7** virtual environment (e.g. using _virtualenv_ or _conda_). Inside the virtual environment, you can then use _pip_ to install all package dependencies. The most important packages are _TensorFlow 2.1_ and _OpenCV 4.2_.
```bash
pip install -r requirements.txt
```
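
For example, a minimal setup with _virtualenv_ could look like the following sketch (the environment name `venv` is arbitrary and not part of this repository):
```bash
# create and activate a Python 3.7 virtual environment (name "venv" is just an example)
virtualenv --python=python3.7 venv
source venv/bin/activate

# install the package dependencies inside the environment
pip install -r requirements.txt
```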

## Data

We provide two synthetic datasets, which can be used to train the neural networks. The datasets are hosted in the [Cam2BEV Data Repository](https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data). Both datasets were used to produce the results presented in our paper:
- [*Dataset 1_FRLR*](https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/tree/master/1_FRLR): images from four vehicle-mounted cameras, ground-truth BEV image centered above the ego vehicle
- [*Dataset 2_F*](https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/tree/master/2_F): images from one frontal vehicle camera; ground-truth BEV image left-aligned with ego vehicle

For more information regarding the data, please refer to the [repository's README](https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data).

Both datasets can easily be downloaded and extracted by running the provided download script:
```bash
./data/download.sh
```

_**Note**: The download size is approximately 3.7 GB; the uncompressed size of both datasets is approximately 7.7 GB._

## Preprocessing

Our paper describes two preprocessing techniques:
(1) introducing an _occluded_ class to the label images and
(2) generating the homography image.

### 1) Dealing with Occlusions

Traffic participants and static obstacles may occlude parts of the environment, making predictions for those areas in a BEV image mostly impossible. In order to formulate a well-posed problem, an additional semantic class needs to be introduced to the label images for areas in the BEV that are occluded in the camera perspectives. To this end, [preprocessing/occlusion](preprocessing/occlusion/) can be used. See below for an example of the occlusion preprocessing.

![original](preprocessing/occlusion/assets/example-original.png) ![occluded](preprocessing/occlusion/assets/example-occluded.png)

Run the following command to process the original label images of _dataset 1_FRLR_ and introduce an _occluded_ class. You need to provide camera intrinsics/extrinsics for the drone camera and all vehicle-attached cameras (in the form of yaml files).

_**Note**: In batch mode, this script utilizes multiprocessing. It can however still take quite some time to process the entire dataset. Therefore, we also provide already preprocessed data._

```bash
cd preprocessing/occlusion
```
```bash
./occlusion.py \
  --batch ../../data/1_FRLR/train/bev \
  --output ../../data/1_FRLR/train/bev+occlusion \
  ../camera_configs/1_FRLR/drone.yaml \
  ../camera_configs/1_FRLR/front.yaml \
  ../camera_configs/1_FRLR/rear.yaml \
  ../camera_configs/1_FRLR/left.yaml \
  ../camera_configs/1_FRLR/right.yaml
```

See [preprocessing/occlusion/README.md](preprocessing/occlusion/README.md) for more information.

### 2) Projective Preprocessing

To incorporate the Inverse Perspective Mapping (IPM) technique into our methods, the homographies, i.e. the projective transformations between the vehicle camera frames and the BEV, need to be computed. As a preprocessing step to the first variation of our approach (Section III-C), IPM is applied to all images from the vehicle cameras. The transformation is set up to capture the same field of view as the ground truth BEV image. To this end, [preprocessing/ipm](preprocessing/ipm) can be used. See below for an example homography image computed from images of four vehicle-mounted cameras.

![ipm](preprocessing/ipm/assets/example.png)
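
Conceptually, IPM warps each camera image onto the ground plane with a 3x3 homography. The snippet below is only an illustrative OpenCV sketch of such a warp; the file name, matrix values, and output size are placeholders, and the actual transformation for our datasets is performed by `ipm.py` as shown further down.
```python
import cv2
import numpy as np

# placeholder 3x3 homography mapping camera pixels to BEV pixels;
# in practice it is derived from the camera intrinsics/extrinsics (yaml files)
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

img = cv2.imread("front.png")                  # semantically segmented camera image (placeholder path)
bev = cv2.warpPerspective(img, H, (512, 256))  # warp onto the BEV ground plane (placeholder output size)
cv2.imwrite("front_bev.png", bev)
```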

Run the following command to compute a homography BEV image from all camera images of _dataset 1_FRLR_. You need to provide camera intrinsics/extrinsics for the drone camera and all vehicle-attached cameras (in the form of yaml files).

_**Note**: To save time, we also provide already preprocessed data._

```bash
cd preprocessing/ipm
```
```bash
./ipm.py --batch --cc \
  --output ../../data/1_FRLR/train/homography \
  --drone ../camera_configs/1_FRLR/drone.yaml \
  ../camera_configs/1_FRLR/front.yaml \
  ../../data/1_FRLR/train/front \
  ../camera_configs/1_FRLR/rear.yaml \
  ../../data/1_FRLR/train/rear \
  ../camera_configs/1_FRLR/left.yaml \
  ../../data/1_FRLR/train/left \
  ../camera_configs/1_FRLR/right.yaml \
  ../../data/1_FRLR/train/right
```

See [preprocessing/ipm/README.md](preprocessing/ipm/README.md) for more information.

## Training

Use the scripts [model/train.py](model/train.py), [model/evaluate.py](model/evaluate.py), and [model/predict.py](model/predict.py) to train a model, evaluate it on validation data, and make predictions on a testing dataset.

Input directories, training parameters, and more can be set via CLI arguments or in a config file. Run the scripts with the `--help` flag or see one of the provided exemplary config files for reference. We provide config files for each combination of network and dataset:
- [model/config.1_FRLR.deeplab-mobilenet.yml](model/config.1_FRLR.deeplab-mobilenet.yml)
- [model/config.1_FRLR.deeplab-xception.yml](model/config.1_FRLR.deeplab-xception.yml)
- [model/config.1_FRLR.unetxst.yml](model/config.1_FRLR.unetxst.yml)
- [model/config.2_F.deeplab-mobilenet.yml](model/config.2_F.deeplab-mobilenet.yml)
- [model/config.2_F.deeplab-xception.yml](model/config.2_F.deeplab-xception.yml)
- [model/config.2_F.unetxst.yml](model/config.2_F.unetxst.yml)

The following commands will guide you through training _uNetXST_ on _dataset 1_FRLR_.

### Training

Start training _uNetXST_ by passing the provided config file [model/config.1_FRLR.unetxst.yml](model/config.1_FRLR.unetxst.yml). Training stops automatically once the MIoU score on the validation dataset no longer improves.

```bash
cd model/
```
```bash
./train.py -c config.1_FRLR.unetxst.yml
```

You can visualize training progress by pointing *TensorBoard* to the output directory (`model/output` by default). Training metrics will also be printed to `stdout`.

### Evaluation

Before evaluating your trained model, set the parameter `model-weights` to point to the `best_weights.hdf5` file in the `Checkpoints` folder of your model directory. Then run evaluation to compute a confusion matrix and class IoU scores.

```bash
./evaluate.py -c config.1_FRLR.unetxst.yml --model-weights output/<YOUR-TIMESTAMP>/Checkpoints/best_weights.hdf5
```

The evaluation results will be printed at the end of evaluation and also exported to the `Evaluation` folder in your model directory.

### Testing

To actually see the predictions your network makes, run it on unseen input images, such as the validation dataset. The predicted BEV images are exported to the directory specified by the parameter `output-dir-testing`.

```bash
./predict.py -c config.1_FRLR.unetxst.yml --model-weights output/<YOUR-TIMESTAMP>/Checkpoints/best_weights.hdf5 --prediction-dir output/<YOUR-TIMESTAMP>/Predictions
```

## Neural Network Architectures

We provide implementations of the neural network architectures _DeepLab_ and _uNetXST_ in [model/architecture](model/architecture). _DeepLab_ comes with a choice of two backbone networks: _MobileNetV2_ or _Xception_.

### DeepLab

The _DeepLab_ models take the homography images computed by Inverse Perspective Mapping ([preprocessing/ipm](preprocessing/ipm)) as input.

#### Configuration
- set `model` to `architecture/deeplab_mobilenet.py` or `architecture/deeplab_xception.py`
- set `input-training` and the other input directory parameters to the folders containing the homography images
- comment out `unetxst-homographies` in the config file, or simply don't supply it via CLI (an example training call is shown below)
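
Training then works exactly like for _uNetXST_. For example, assuming the provided config file [model/config.1_FRLR.deeplab-mobilenet.yml](model/config.1_FRLR.deeplab-mobilenet.yml) already points the input directories at the homography images from [preprocessing/ipm](preprocessing/ipm):
```bash
cd model/
./train.py -c config.1_FRLR.deeplab-mobilenet.yml
```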

### uNetXST

The _uNetXST_ model contains SpatialTransformer units, which perform IPM inside the network. Therefore, when building the network, the homographies that transform the images from each camera need to be provided.

#### Configuration
- set `model` to `architecture/uNetXST.py`
- set `input-training` and the other input directory parameters to a list of folders containing the images from each camera (e.g. `[data/front, data/rear, data/left, data/right]`)
- set `unetxst-homographies` to a Python file containing the homographies as a list of NumPy arrays stored in a variable `H` (see the sketch after this list)
  - we provide these homographies for our two datasets in [preprocessing/homography_converter/uNetXST_homographies/1_FRLR.py](preprocessing/homography_converter/uNetXST_homographies/1_FRLR.py) and [preprocessing/homography_converter/uNetXST_homographies/2_F.py](preprocessing/homography_converter/uNetXST_homographies/2_F.py)
  - in order to compute these homographies for different camera configurations, follow the instructions in [preprocessing/homography_converter](preprocessing/homography_converter)
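
Such a homography file might look like the following minimal sketch. The identity matrices are placeholders only; use the provided files or the homography converter to obtain the actual values, presumably listed in the same camera order as the input directories.
```python
# uNetXST homographies: one 3x3 matrix per camera (placeholder values shown)
import numpy as np

H = [
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),  # front
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),  # rear
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),  # left
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),  # right
]
```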

## Customization

#### _I want to set different training hyperparameters_

Run the training script with the `--help` flag or have a look at one of the provided exemplary config files to see which parameters you can easily set.

#### _I want the networks to work on more/fewer semantic classes_

The image datasets we provide include all 30 _CityScapes_ class colors. How these are reduced to, say, 10 classes is defined in the one-hot conversion files in [model/one_hot_conversion](model/one_hot_conversion). Use the training parameters `--one-hot-palette-input` and `--one-hot-palette-label` to choose one of the files. You can easily create your own one-hot conversion file; the format is quite self-explanatory.
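
For example, a training call selecting specific conversion files could look like this (the `<YOUR-...>` placeholders stand for whichever palette files you want to use):
```bash
./train.py -c config.1_FRLR.unetxst.yml \
  --one-hot-palette-input one_hot_conversion/<YOUR-INPUT-PALETTE>.xml \
  --one-hot-palette-label one_hot_conversion/<YOUR-LABEL-PALETTE>.xml
```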

If you adjust `--one-hot-palette-label`, you will also need to modify `--loss-weights`. Either omit the parameter to weight all output classes evenly, or compute new suitable loss weights. The weights found in the provided config files were computed (from the `model` directory) with the following Python snippet.
```python
import numpy as np
import utils
palette = utils.parse_convert_xml("one_hot_conversion/convert_9+occl.xml")
dist = utils.get_class_distribution("../data/1_FRLR/train/bev+occlusion", (256, 512), palette)
weights = np.log(np.reciprocal(list(dist.values())))
print(weights)
```

#### _I want to use my own data_

You will need to run the preprocessing methods on your own data. A rough outline of what you need to consider:
- specify camera intrinsics/extrinsics similar to the files found in [preprocessing/camera_configs](preprocessing/camera_configs)
- run [preprocessing/occlusion/occlusion.py](preprocessing/occlusion/occlusion.py)
- run [preprocessing/ipm/ipm.py](preprocessing/ipm/ipm.py)
- compute uNetXST-compatible homographies by following the instructions in [preprocessing/homography_converter](preprocessing/homography_converter)
- adjust or create a new one-hot conversion file ([model/one_hot_conversion](model/one_hot_conversion))
- set all training parameters in a dedicated config file
- start training

assets/logo.png

312 KB

assets/teaser.gif

6.09 MB

data/download.sh

+26
@@ -0,0 +1,26 @@
#!/usr/bin/env bash

URL="https://gitlab.ika.rwth-aachen.de/cam2bev/cam2bev-data/-/archive/master/cam2bev-data-master.tar.gz"
FILE="cam2bev-data.tar.gz"

set -e
cd $(dirname $0)

echo "Downloading Cam2BEV Data from $URL to $(realpath $FILE) ..."
wget -q --show-progress -c -O $FILE $URL

echo -n "Extracting $FILE to $(pwd) ... "
tar -xzf $FILE
rm $FILE
mv cam2bev-data-master/* .
echo "done"

for f in $(find . -name "*.tar.gz")
do
  echo -n "  Extracting $f ... "
  tar -xzf $f -C "$(dirname $f)"
  rm $f
  echo "done"
done

echo "Successfully downloaded Cam2BEV Data to $(pwd)"
