
Vision Algorithms for Mobile Robotics

🚧 Under Construction 🚧

C++ implementations of the exercises of the "Vision Algorithms for Mobile Robotics" course at ETH Zurich (The Course Website).

If you have any questions, feel free to open an issue in this GitHub repository.

Requirements

  • Developed on Ubuntu 20.04
  • Docker
  • NVIDIA Docker container runtime
  • NVIDIA GPU and the latest NVIDIA driver compatible with CUDA 11.4 (tested with driver 470)
    • I only know how to run GUI applications within Docker with NVIDIA GPUs.
    • I have also started to implement the algorithms in CUDA.
      • DISCLAIMER: I am only learning GPU programming and CUDA.
  • Python 3 and pip
  • NO MATLAB:
    • The official course does all the exercises in MATLAB, but since I do not have a MATLAB license, I am doing them in C++. A second reason: robotics jobs generally require strong C++ knowledge, so this is also C++ practice.
  • The following libraries are used and already installed in the Docker Image:
    • Eigen library for linear algebra.
    • OpenCV and VTK, only for reading and writing image/video files and for 2D/3D visualization.

Setup and Build Docker Images

cli.py is a helper command-line tool to build and run the Docker image.

# to install the cli.py's dependencies
python -m pip install -r requirements.txt
  1. Build the cudagl:11.7.0-devel-ubuntu20.04 Docker image

    Note: I am not familiar with buildx, which is used in the cuda repo. This is the only way I could build the image without pushing it to a Docker registry.

    # clone the cuda repo wherever you like, preferably not inside this repo
    git clone git@gitlab.com:nvidia/container-images/cuda.git
    cd cuda
    
    # drop the --pull flag and disable the buildx setup in the build script
    sed -i "s/\ --pull\ /\ /g" build.sh
    sed -i "s/run_cmd docker buildx create --use/\echo \"\"#/g" build.sh
    ./build.sh --image-name cudagl --cuda-version 11.7.0 --os ubuntu --os-version 20.04 --arch x86_64 --cudagl
    # to remove intermediate images
    docker rmi $(docker images --filter=reference="cudagl/build-intermediate:*" -q)
    docker rmi $(docker images --filter=reference="cudagl:*base*" -q)
    docker rmi $(docker images --filter=reference="cudagl:*runtime*" -q)
    
    
    # alternatively, build the image and push it to your own Docker Hub account:
    # YOUR_DOCKER_HUB_USER=yosoufe
    # sed -i "s/\/build-intermediate/_build-intermediate/g" build.sh
    # ./build.sh --image-name ${YOUR_DOCKER_HUB_USER}/cgl --cuda-version 11.7.0 --os ubuntu --os-version 20.04 --arch x86_64 --cudagl --push

    Now cudagl:11.7.0-devel-ubuntu20.04 should appear in the output of docker image ls.

  2. Download the cuDNN .deb file from the NVIDIA website into the docker_extra folder. This is tested with cuDNN 8.4.1.50.

  3. Download NVIDIA Nsight Systems from https://developer.nvidia.com/nsight-systems into the docker_extra folder. I have tested with Nsight Systems 2022.2.1 (Linux Host .deb Installer).

  4. Now run python cli.py build in the root of this project.

Usage

We use Docker: all the requirements are installed inside the Docker image, and everything runs inside the container.

The container includes terminator, in which you can open multiple terminals as tabs or split windows. Read the terminator documentation to learn its keyboard shortcuts.

# run the docker image to get a terminal inside the container; we compile everything in the container
# it will pull my image if you have not built it yourself
python cli.py run

Compile

cd VAMR
mkdir -p output/ex{01..09}
python cli.py run # this should start the container
# now inside the container
cd exercises
mkdir build
cd build
cmake ..
make -j`nproc`

Exercises

Directory Structure

Exercise statements can be found at exercises/statements/<Exercise directory>/statement.pdf. For example, for exercise 2 the file is at exercises/statements/Exercise 2 - PnP/statement.pdf.

The input data is not provided in this repo. You can download it from the course webpage, under the section "Course Program, Slides, and Additional Reading Material". It should be placed in data/exXX/. For example, the images directory for exercise 1 should be placed at data/ex01/images. You can also check each exercise's main file to see where it expects the input files.

The main function of each exercise is implemented in exercises/exerciseXX.cpp; for example, the main file for exercise 1 is exercises/exercise01.cpp. Usually the algorithms are implemented as libraries and used from the main file. You can check the header files included in each exerciseXX.cpp to find out the name of the library; the library is implemented in a directory with the same name as the header file.

The CUDA implementations use cuda as their namespace and are implemented in *.cu and *.cuh files.

Exercise 1 - Augmented Reality Wireframe Cube

This exercise is about camera and lens distortion models: a wireframe cube is projected onto the images using the calibrated camera model.

Output
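Under the hood this boils down to one projection function. Below is a minimal sketch of projecting a world point through a pinhole camera with two-parameter radial distortion, written with Eigen; the function and parameter names are my own for illustration, not necessarily the repo's.

#include <Eigen/Dense>

// Project a world point Pw to pixel coordinates with a pinhole model
// plus two-parameter radial distortion.
// K: 3x3 intrinsics, R and t: world-to-camera pose, k1/k2: radial coefficients.
Eigen::Vector2d projectPoint(const Eigen::Matrix3d &K,
                             const Eigen::Matrix3d &R,
                             const Eigen::Vector3d &t,
                             double k1, double k2,
                             const Eigen::Vector3d &Pw)
{
  // Transform into the camera frame and normalize by the depth.
  Eigen::Vector3d Pc = R * Pw + t;
  Eigen::Vector2d xn = Pc.head<2>() / Pc.z();

  // Radial distortion: x_d = (1 + k1*r^2 + k2*r^4) * x_n.
  double r2 = xn.squaredNorm();
  Eigen::Vector2d xd = (1.0 + k1 * r2 + k2 * r2 * r2) * xn;

  // Map the distorted normalized coordinates to pixels.
  Eigen::Vector3d uv = K * xd.homogeneous();
  return uv.head<2>();
}

Drawing the cube is then just projecting its eight corners and connecting them with lines.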

Exercise 2 - PnP Problem

This exercise is about the PnP (Perspective-n-Point) problem: estimating the position and orientation of a calibrated camera from known points in the world and their known correspondences in the image.

  • Problem statement: exercises/statements/Exercise 2 - PnP/statement.pdf.
  • Solution: exercises/exercise02.cpp.
  • Output Video:

The following video shows the calculated pose of the camera relative to the AprilTag pattern.

Output
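For reference, the standard linear approach to this problem is the DLT: stack two equations per correspondence, solve with an SVD, and project the result back onto a valid pose. The sketch below is my own illustrative Eigen version of that textbook formulation, not necessarily the repo's exact code; it assumes the image points are already normalized by K⁻¹.

#include <Eigen/Dense>
#include <vector>

// Estimate the 3x4 pose matrix M = [R|t] from n >= 6 correspondences between
// normalized image points p_i = K^-1 * pixel_i and world points P_i.
Eigen::Matrix<double, 3, 4> estimatePoseDLT(
    const std::vector<Eigen::Vector2d> &p,
    const std::vector<Eigen::Vector3d> &P)
{
  const int n = static_cast<int>(p.size());
  Eigen::MatrixXd Q(2 * n, 12);
  for (int i = 0; i < n; ++i)
  {
    Eigen::RowVector4d Ph(P[i].x(), P[i].y(), P[i].z(), 1.0);
    // Each correspondence yields two linear equations in the entries of M.
    Q.row(2 * i)     << Ph, Eigen::RowVector4d::Zero(), -p[i].x() * Ph;
    Q.row(2 * i + 1) << Eigen::RowVector4d::Zero(), Ph, -p[i].y() * Ph;
  }
  // The solution is the right singular vector of the smallest singular value.
  Eigen::JacobiSVD<Eigen::MatrixXd> svd(Q, Eigen::ComputeThinV);
  Eigen::VectorXd m = svd.matrixV().col(11);
  Eigen::Matrix<double, 3, 4> M =
      Eigen::Map<Eigen::Matrix<double, 4, 3>>(m.data()).transpose();
  if (M(2, 3) < 0) M = -M;  // points must lie in front of the camera

  // Project the left 3x3 block onto a rotation (a full implementation
  // would also guard against det(R) = -1) and rescale the translation.
  Eigen::JacobiSVD<Eigen::Matrix3d> svdR(
      M.leftCols<3>(), Eigen::ComputeFullU | Eigen::ComputeFullV);
  Eigen::Matrix3d Rot = svdR.matrixU() * svdR.matrixV().transpose();
  double alpha = Rot.norm() / M.leftCols<3>().norm();
  M.col(3) *= alpha;
  M.leftCols<3>() = Rot;
  return M;
}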

Exercise 3 - Simple Keypoint Tracker

Tracking:

Output

The following image shows the Harris and Shi-Tomasi scores, keypoints, and descriptors for the first frame of the dataset.

Output
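For context, both scores come from the same structure tensor of the image gradients: Harris uses det(M) − κ·trace(M)², Shi-Tomasi the smaller eigenvalue of M. Below is a minimal sketch of the Harris response; it leans on OpenCV filtering for brevity, whereas the repo restricts OpenCV to I/O and visualization, so treat it as illustrative rather than the repo's implementation.

#include <opencv2/imgproc.hpp>

// Compute the Harris response for every pixel of a grayscale image.
cv::Mat harrisResponse(const cv::Mat &gray, int blockSize = 3, double kappa = 0.04)
{
  // Image gradients.
  cv::Mat Ix, Iy;
  cv::Sobel(gray, Ix, CV_64F, 1, 0, 3);
  cv::Sobel(gray, Iy, CV_64F, 0, 1, 3);

  // Average the gradient products over a local window (structure tensor M).
  cv::Mat Ixx, Iyy, Ixy;
  cv::boxFilter(Ix.mul(Ix), Ixx, CV_64F, cv::Size(blockSize, blockSize));
  cv::boxFilter(Iy.mul(Iy), Iyy, CV_64F, cv::Size(blockSize, blockSize));
  cv::boxFilter(Ix.mul(Iy), Ixy, CV_64F, cv::Size(blockSize, blockSize));

  // Harris: det(M) - kappa * trace(M)^2; Shi-Tomasi would take the
  // smaller eigenvalue of M instead.
  cv::Mat det = Ixx.mul(Iyy) - Ixy.mul(Ixy);
  cv::Mat trace = Ixx + Iyy;
  return det - kappa * trace.mul(trace);
}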

Exercise 4 - Simple SIFT Keypoint Detection and Matching

  • Problem statement: exercises/statements/Exercise 4 - simple SIFT/statement.pdf.

  • Solution: exercises/exercise04.cpp.

    • ⚠️ I suspect there are still some bugs in my code ⚠️, but given the lack of time and the relatively good results, I am moving on to the next exercise for now. I also skipped the optional part of the exercise and might come back to it later. The descriptor matching could also be optimized later (a simple baseline sketch follows below).

    Output
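Since the matching is flagged above as a candidate for optimization, here is the O(n²) brute-force baseline with Lowe's ratio test for comparison; an illustrative sketch with hypothetical names, with descriptors stored as rows of Eigen matrices.

#include <Eigen/Dense>
#include <cmath>
#include <limits>
#include <vector>

// For each query descriptor, return the index of its best match in `train`,
// or -1 if Lowe's ratio test rejects the match as ambiguous.
std::vector<int> matchDescriptors(const Eigen::MatrixXd &query,
                                  const Eigen::MatrixXd &train,
                                  double maxRatio = 0.8)
{
  std::vector<int> matches(query.rows(), -1);
  for (int i = 0; i < query.rows(); ++i)
  {
    double best = std::numeric_limits<double>::max(), second = best;
    int bestIdx = -1;
    for (int j = 0; j < train.rows(); ++j)
    {
      double d = (query.row(i) - train.row(j)).squaredNorm();
      if (d < best) { second = best; best = d; bestIdx = j; }
      else if (d < second) { second = d; }
    }
    // Keep the match only if it is clearly better than the runner-up
    // (sqrt because the distances above are squared).
    if (bestIdx >= 0 && std::sqrt(best) < maxRatio * std::sqrt(second))
      matches[i] = bestIdx;
  }
  return matches;
}

A k-d tree, or computing all pairwise distances as one matrix product, would be the usual next optimization step.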

Exercise 5 - Stereo Dense Reconstruction

  • Problem statement: exercises/statements/Exercise 5 - Stereo Dense Reconstruction.

  • Solution: exercises/exercise05.cpp.

    • First image of the left camera

    • Disparity image computed from the left and right images of the first frame

    • Rough Point Cloud from Disparity

      ex05-pointcloud_from_disparity-rough-lowQ.mp4
    • Point Cloud from Disparity with sub-pixel accuracy

      ex05-pointcloud_from_disparity-subpixel.mp4
    • Complete point cloud from all of the frame pairs (a better-quality video is at exercises/statements/outputs/ex05-complete_point_cloud.mp4)

      ex05-complete_point_cloud_LQ.mp4
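The geometry behind these point clouds is the standard rectified-stereo back-projection: the depth is Z = f·b/d, and the pixel's viewing ray is scaled to that depth. A minimal sketch with illustrative names, assuming both cameras share the intrinsics K:

#include <Eigen/Dense>

// Back-project one pixel of the left image to a 3D point in the left
// camera frame. baseline is the camera separation in meters and
// d = u_left - u_right is the disparity in pixels.
Eigen::Vector3d disparityToPoint(const Eigen::Matrix3d &K, double baseline,
                                 double u, double v, double d)
{
  double f = K(0, 0);           // focal length in pixels
  double Z = f * baseline / d;  // depth from the stereo geometry
  // Scale the normalized viewing ray to the recovered depth.
  Eigen::Vector3d ray = K.inverse() * Eigen::Vector3d(u, v, 1.0);
  return Z * ray;
}

Sub-pixel accuracy, as in the second video, comes from refining d before back-projecting, e.g. by fitting a parabola to the matching cost around the best integer disparity.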

Exercise 6 - Two-view Geometry

  • Problem statement: exercises/statements/Exercise 6 - Two-view Geometry.
  • Solution: exercises/exercise06.cpp.
  • I developed unit tests using the Google Test framework, similar to the MATLAB test scripts provided with the exercise, in exercises/tests/test_two_view_geometry.cpp. To execute them after compilation, from the build directory:
    ./tests/two_view_geometry_tests --gtest_filter=Two_View_Geometry.linear_triangulation
    ./tests/two_view_geometry_tests --gtest_filter=Two_View_Geometry.eight_point
    # or the following to run all of the tests for exercise 06.
    ./tests/two_view_geometry_tests
    • 3D point cloud and camera poses calculated by the 8-point algorithm from given perfect feature matches (top view) Point Cloud
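For reference, a minimal sketch of the linear triangulation that the first test above exercises: each observation contributes the cross-product constraint [p]× M P = 0, and the stacked system is solved by SVD. Names are my own, not necessarily the repo's.

#include <Eigen/Dense>

// Triangulate a 3D point from two homogeneous observations p1, p2 and the
// corresponding 3x4 projection matrices M1, M2.
Eigen::Vector3d linearTriangulation(const Eigen::Vector3d &p1,
                                    const Eigen::Vector3d &p2,
                                    const Eigen::Matrix<double, 3, 4> &M1,
                                    const Eigen::Matrix<double, 3, 4> &M2)
{
  auto skew = [](const Eigen::Vector3d &v) {
    Eigen::Matrix3d S;
    S <<     0, -v.z(),  v.y(),
         v.z(),      0, -v.x(),
        -v.y(),  v.x(),      0;
    return S;
  };
  // Stack the cross-product constraints from both views.
  Eigen::Matrix<double, 6, 4> A;
  A.topRows<3>() = skew(p1) * M1;
  A.bottomRows<3>() = skew(p2) * M2;

  // Least-squares solution: right singular vector of the smallest
  // singular value, de-homogenized.
  Eigen::JacobiSVD<Eigen::Matrix<double, 6, 4>> svd(A, Eigen::ComputeFullV);
  Eigen::Vector4d Ph = svd.matrixV().col(3);
  return Ph.head<3>() / Ph.w();
}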

Useful Commands

# convert to gif
ffmpeg -ss 0 -t 5 -i input.mp4 -vf "fps=10,scale=320:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 output.gif

# reduce the size and quality
ffmpeg -i input.mp4 -vcodec libx264 -crf 28 output.mp4

cv::Viz3d Key commands

cv::Viz3d is used for 3D visualization, including point clouds. The following keyboard shortcuts are useful for navigating the view.
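A minimal usage sketch (assuming your OpenCV build includes the viz module); the shortcuts listed below work inside the window it opens:

#include <opencv2/viz.hpp>
#include <vector>

int main()
{
  // Three dummy points one meter in front of the camera.
  std::vector<cv::Point3f> points = {{0.f, 0.f, 1.f}, {0.1f, 0.f, 1.f}, {0.f, 0.1f, 1.f}};
  cv::viz::Viz3d window("point cloud");
  window.showWidget("cloud", cv::viz::WCloud(points, cv::viz::Color::white()));
  window.showWidget("frame", cv::viz::WCoordinateSystem());
  window.spin();  // blocks; press q or e in the window to exit
  return 0;
}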

| Help:
-------
          p, P   : switch to a point-based representation
          w, W   : switch to a wireframe-based representation (where available)
          s, S   : switch to a surface-based representation (where available)

          j, J   : take a .PNG snapshot of the current window view
          k, K   : export scene to Wavefront .obj format
    ALT + k, K   : export scene to VRML format
          c, C   : display current camera/window parameters
          F5     : enable/disable fly mode (changes control style)

          e, E   : exit the interactor
          q, Q   : stop and call VTK's TerminateApp

           +/-   : increment/decrement overall point size
     +/- [+ ALT] : zoom in/out

    r, R [+ ALT] : reset camera [to viewpoint = {0, 0, 0} -> center_{x, y, z}]

    ALT + s, S   : turn stereo mode on/off
    ALT + f, F   : switch between maximized window mode and original size
