C++ implementations of the exercises of the "Vision Algorithms for Mobile Robotics" course at ETH Zurich (The Course Website).
If there are any questions, feel free to open an issue in this GitHub repository.
- Developed on Ubuntu 20.04
- Docker
- NVIDIA container runtime for Docker
- NVIDIA GPU and the latest NVIDIA driver compatible with CUDA 11.4 (tested with driver 470)
  - I only know how to run a GUI application within Docker with NVIDIA GPUs.
  - I have also started to implement the algorithms in CUDA.
  - DISCLAIMER: I am still learning GPU programming and CUDA.
- Python 3 and pip
- NO MATLAB:
  - The official course does all of the exercises in MATLAB, but since I do not have a MATLAB license, I am doing them in C++. The second reason is that jobs in robotics require strong C++ knowledge, so this is also C++ practice.
- The following libraries are used and are already installed in the Docker image:
  - Eigen for linear algebra.
  - OpenCV and VTK, only for reading/writing image and video files and for 2D/3D visualization.
`cli.py` is a helper command-line tool to build and run the Docker image.

```bash
# to install cli.py's dependencies
python -m pip install -r requirements.txt
```
- Build the `cudagl:11.7.0-devel-ubuntu20.04` Docker image.

  Note: I am not familiar with `buildx`, which is used in the `cuda` repo. This is the only way I could build the image without pushing it to a Docker registry.

  ```bash
  # clone this repo wherever you like, preferably not inside this repo
  git clone [email protected]:nvidia/container-images/cuda.git
  cd cuda
  sed -i "s/\ --pull\ /\ /g" build.sh
  sed -i "s/run_cmd docker buildx create --use/\echo \"\"#/g" build.sh
  ./build.sh --image-name cudagl --cuda-version 11.7.0 --os ubuntu --os-version 20.04 --arch x86_64 --cudagl
  # to remove intermediate images
  docker rmi $(docker images --filter=reference="cudagl/build-intermediate:*" -q)
  docker rmi $(docker images --filter=reference="cudagl:*base*" -q)
  docker rmi $(docker images --filter=reference="cudagl:*runtime*" -q)
  # YOUR_DOCKER_HUB_USER=yosoufe
  # sed -i "s/\/build-intermediate/_build-intermediate/g" build.sh
  # ./build.sh --image-name ${YOUR_DOCKER_HUB_USER}/cgl --cuda-version 11.7.0 --os ubuntu --os-version 20.04 --arch x86_64 --cudagl --push
  ```
  Now you should have `cudagl:11.7.0-devel-ubuntu20.04` in your `docker image ls`.

- Download the CUDNN ".deb" file from the NVIDIA website into the `docker_extra` folder. This is tested with CUDNN 8.4.1.50.
- Download NVIDIA Nsight from https://developer.nvidia.com/nsight-systems into the `docker_extra` folder. I have tested with `Nsight Systems 2022.2.1 (Linux Host .deb Installer)`.
- Now run `python cli.py build` in the root of this project.
We are using Docker: all the requirements are installed inside the Docker image and everything runs inside the container.
The container includes Terminator, in which you can open multiple terminals as tabs or split windows. Read the Terminator documentation to learn the shortcut keys.
```bash
# to run the docker image and get a terminal inside the container;
# we compile everything in the container.
# It will pull my image if you do not build it yourself.
python cli.py run
```

```bash
cd VAMR
mkdir -p output/ex{01..09}
python cli.py run # this should start the container
```

```bash
# now inside the container
cd exercises
mkdir build
cd build
cmake ..
make -j`nproc`
```
Exercise statements can be found at `exercises/statements/<Exercise directory>/statement.pdf`.
For example, for exercise 2 the file is at `exercises/statements/Exercise 2 - PnP/statement.pdf`.
The input data is not provided in this repo. You can download it from the course webpage under the section "Course Program, Slides, and Additional Reading Material". It should be placed in `data/exXX/`. For example, the `images` directory for exercise 1 should be placed at `data/ex01/images`. You can also check each exercise's main file to see where it expects the input files.
The main function of each exercise is implemented in `exercises/exerciseXX.cpp`; for example, the main file for exercise 1 is `exercises/exercise01.cpp`. Usually the algorithms are implemented as libraries and used from the main file. You can check the header files included in each `exerciseXX.cpp` to find the name of the library; the library is implemented in a directory with the same name as the header file.
The CUDA implementations use `cuda` as their namespace and are implemented in `*.cu` and `*.cuh` files.
This exercise is about camera and distortion models.
- Output Videos:
This exercise is about the PnP (Perspective-n-Point) problem. We basically find the position and orientation of a calibrated camera from known points in the world and their known correspondences in the image frame.
- Problem statement: `exercises/statements/Exercise 2 - PnP/statement.pdf`
- Solution: `exercises/exercise02.cpp`
- Output Video: The following video shows the calculated position and orientation of the camera relative to the pattern of AprilTags.
- Problem statement: `exercises/statements/Exercise 3 - Simple Keypoint Tracker/statement.pdf`
- Solution: `exercises/exercise03.cpp`
- Output Videos:
Tracking:
The following image shows the Harris and Shi-Tomasi scores, keypoints, and descriptors for the first frame of the dataset.
- Problem statement: `exercises/statements/Exercise 4 - simple SIFT/statement.pdf`
- Solution: `exercises/exercise04.cpp`

⚠️ I suspect there are still some bugs in my code ⚠️, but because of the lack of time and the relatively good results, I am moving on to the next exercise for now. I also skipped the optional part of the exercise and might come back to it later. The descriptor matching could also be optimized later.
- Problem statement: `exercises/statements/Exercise 5 - Stereo Dense Reconstruction`
- Solution: `exercises/exercise05.cpp`
- Output Videos:
  - Rough point cloud from disparity: ex05-pointcloud_from_disparity-rough-lowQ.mp4
  - Point cloud from disparity with sub-pixel accuracy: ex05-pointcloud_from_disparity-subpixel.mp4
  - Complete point cloud from all pairs of frames (better-quality video in `exercises/statements/outputs/ex05-complete_point_cloud.mp4`): ex05-complete_point_cloud_LQ.mp4
- Problem statement: `exercises/statements/Exercise 6 - Two-view Geometry`
- Solution: `exercises/exercise06.cpp`
- I developed unit tests using the Google Test framework, similar to the MATLAB test scripts provided with the exercise, in `exercises/tests/test_two_view_geometry.cpp`. To execute them after compilation, in the build directory:

```bash
./tests/two_view_geometry_tests --gtest_filter=Two_View_Geometry.linear_triangulation
./tests/two_view_geometry_tests --gtest_filter=Two_View_Geometry.eight_point
# or the following to run all of the tests for exercise 06
./tests/two_view_geometry_tests
```
```bash
# convert to gif
ffmpeg -ss 0 -t 5 -i input.mp4 -vf "fps=10,scale=320:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 output.gif
# reduce the size and quality
ffmpeg -i input.mp4 -vcodec libx264 -crf 28 output.mp4
```
`cv::Viz3d` is used for 3D visualizations and point cloud visualization. These shortcut keys are useful for navigating the view:
```
| Help:
-------
p, P : switch to a point-based representation
w, W : switch to a wireframe-based representation (where available)
s, S : switch to a surface-based representation (where available)
j, J : take a .PNG snapshot of the current window view
k, K : export scene to Wavefront .obj format
ALT + k, K : export scene to VRML format
c, C : display current camera/window parameters
F5 : enable/disable fly mode (changes control style)
e, E : exit the interactor
q, Q : stop and call VTK's TerminateApp
+/- : increment/decrement overall point size
+/- [+ ALT] : zoom in/out
r, R [+ ALT] : reset camera [to viewpoint = {0, 0, 0} -> center_{x, y, z}]
ALT + s, S : turn stereo mode on/off
ALT + f, F : switch between maximized window mode and original size
```