This repository contains the code for 3D shape reconstruction from monocular images taken from a vehicular camera.
Vehicular shape reconstruction and pose estimation are vital steps for autonomous driving tasks. While many SOTA works are based on bounding box detection, shape reconstruction has recently gained interest for tasks such as digital twins. However, the lack of an extensive labeled dataset with a wide range of classes makes shape reconstruction a challenging task. While recent models have used the limited datasets for supervised training, this project focuses on moving towards a semi-supervised / self-supervised learning methodology, which would pave the way for unsupervised techniques.
The Apollo3D dataset is an autonomous driving dataset.
- The car instance set for shape reconstruction contains more information about the dataset, including the API and links to download the dataset.
- The Kaggle challenge for shape reconstruction contains models in .json format.
This project uses the BAAM model as the baseline, which gives SOTA performance on 3D vehicular reconstruction and pose estimation. Accordingly, the "apollo3D - car instance dataset" should be organized as follows.
${CODE Root}
└── data
    └── apollo
        └── train
            └── apollo_annot
            └── images
        └── test
            └── apollo_annot
            └── images
The images folder contains all the images, and the apollo_annot folder has all the annotations combined into a single JSON file. More information can be found here.
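Before training, it can help to confirm that the folders from the layout above are actually in place. The following is a minimal sketch, not part of this repository: the helper name `missing_dataset_dirs` is hypothetical, and the default root `data/apollo` is an assumption taken from the tree above. It checks only that the folders exist, not their contents.

```python
from pathlib import Path

# Sub-folders named in the layout above, relative to the dataset root
# (the folder that holds `train` and `test`).
EXPECTED_DIRS = [
    Path(split) / sub
    for split in ("train", "test")
    for sub in ("apollo_annot", "images")
]


def missing_dataset_dirs(dataset_root):
    """Return the expected sub-folders that do not exist under dataset_root.

    Hypothetical helper for illustration; it is not provided by BAAM.
    """
    root = Path(dataset_root)
    return [d.as_posix() for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # "data/apollo" follows the tree above; adjust if your data lives elsewhere.
    missing = missing_dataset_dirs("data/apollo")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

An empty list means all four folders were found under the given root.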
To download the data directly into this format, run the following command:
./BAAM/tools/download_data.sh $APOLLO_PATH
This organizes the data into the $APOLLO_PATH/BAAM folder.
The model is built with BAAM as the baseline.
- The quick run guide can be found in the BAAM repository.
- Alternatively, the docker file provided here can be used to build the image and run it.
- Clone the repository and download the pretrained weights as follows.
git clone --recurse-submodules https://gitlab.vision.in.tum.de/s0056/vehicular-3d-shape-reconstruction.git
- Download the pre-trained model weights from here or by running the command below (optional step)
cd vehicular-3d-shape-reconstruction/BAAM && ./BAAM/tools/download_weights.sh
- Set up the environment and install the required packages, or build the docker image as follows
docker build -t baam .
- Run the docker container, making sure that you set the correct $APOLLO_PATH to the dataset and $WEIGHT_PATH to the pre-trained models.
docker run --gpus all --rm -it \
-p 8888:8888 \
-e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix/ \
-v $APOLLO_PATH:/mnt/dataset \
-v $WEIGHT_PATH:/mnt/weights \
baam bash
- Once the container is running, activate the environment
source activate baam
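The `docker run` step above relies on `$APOLLO_PATH` and `$WEIGHT_PATH` being set on the host. If either is empty, the corresponding `-v` argument is left without a host path. A small pre-flight check such as the sketch below can catch this before the container is started. The helper name `check_mount_paths` is hypothetical and not part of this repository.

```python
import os
from pathlib import Path


def check_mount_paths(env=None, names=("APOLLO_PATH", "WEIGHT_PATH")):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration. `env` defaults to the current
    process environment; pass a dict to check other values.
    """
    env = os.environ if env is None else env
    problems = []
    for name in names:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{name} does not point to a directory: {value}")
    return problems


if __name__ == "__main__":
    # Print every problem found; no output means both paths look usable.
    for problem in check_mount_paths():
        print(problem)
```

Run it on the host, in the same shell session where the two variables are exported, before calling `docker run`.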
The training scripts are based on the BAAM implementation. The pretrained backbone (bbox and keypoint extractor) is based on COCO 2017 weights. You can download the pre-trained 2D module weights (res2net_bifpn.pth) from here or here.
Run the command below for training
python main.py --train --config $TRAIN_CONFIG
$TRAIN_CONFIG is the configuration used for training with the pretrained res2net_bifpn backbone. The configuration files for the experiments conducted can be found in the configs folder.
Alternatively, run the train_model(cfg, model) code snippet in main.ipynb.
Train the model, or download pre-trained weights to the root directory. Then run the command below.
python main.py --config $TEST_CONFIG
Alternatively, run the test_model(cfg, model) code snippet in main.ipynb.
First, obtain the results through training or evaluation. These should be saved in the $OUTPUT/res directory. Then run the command below.
python evaluation/eval.py --light --gt_dir data/apollo/BAAM/test/apollo_annot --test_dir outputs/res --res_file outputs/test_results.txt
By default, the A3DP results are written to test_results.txt.
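When the evaluation is launched from a script or notebook, the command above can be assembled programmatically and run only once results are present. The sketch below is an illustration under stated assumptions: `build_eval_command` and `has_results` are hypothetical helpers that are not part of this repository, and the flags are copied unchanged from the command above.

```python
import subprocess
from pathlib import Path


def build_eval_command(gt_dir, test_dir, res_file):
    """Assemble the evaluation command shown above as an argument list."""
    return [
        "python", "evaluation/eval.py", "--light",
        "--gt_dir", str(gt_dir),
        "--test_dir", str(test_dir),
        "--res_file", str(res_file),
    ]


def has_results(test_dir):
    """Return True if test_dir exists and contains at least one entry."""
    path = Path(test_dir)
    return path.is_dir() and any(path.iterdir())


if __name__ == "__main__":
    # Paths are the ones used in the command above.
    cmd = build_eval_command(
        "data/apollo/BAAM/test/apollo_annot",
        "outputs/res",
        "outputs/test_results.txt",
    )
    if has_results("outputs/res"):
        subprocess.run(cmd, check=True)  # raises if eval.py exits non-zero
    else:
        print("No results found in outputs/res; run training or testing first.")
```

Passing the command as a list avoids shell quoting issues if any of the paths contain spaces.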
The figure below shows a sample input and the prediction with the BAAM model.
Image | Prediction |
---|---|
The baseline model used has the following shortcomings:
- Dependence on keypoint labels
- Dependence on template shapes
- 3D shape labels required for supervised training
The following objectives were set to overcome these limitations:
- Remove dependency on keypoint labels and template shapes.
- Update model to achieve similar performance without additional features.
- Move towards semi-supervised / unsupervised learning techniques for training.
A detailed report on how these objectives were achieved through multiple experiments can be found here.