This repository provides a PoseCNN implementation for 6D object pose estimation. It includes a step-by-step Jupyter notebook for training, testing, and visualizing model predictions.
We use the PROPS dataset from the University of Michigan's PROGRESS Lab. The dataset contains a variety of objects captured from multiple viewpoints, each with corresponding 6D pose annotations. Below is an example of how the dataset appears after preprocessing:
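For readers who want to inspect samples programmatically, here is a minimal loader sketch. The class name `PROPSPoseDataset`, the directory layout (`rgb/`, `label/`, `pose/`), and the `.npy` pose format are assumptions for illustration, not the repository's actual API:

```python
import os

import numpy as np
from PIL import Image
from torch.utils.data import Dataset


class PROPSPoseDataset(Dataset):
    """Minimal PROPS-style loader sketch. The directory layout, file names,
    and pose format below are assumptions, not the repo's actual code."""

    def __init__(self, root, split="train"):
        self.rgb_dir = os.path.join(root, split, "rgb")
        self.label_dir = os.path.join(root, split, "label")
        self.pose_dir = os.path.join(root, split, "pose")
        self.names = sorted(os.path.splitext(f)[0]
                            for f in os.listdir(self.rgb_dir))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        rgb = np.asarray(Image.open(os.path.join(self.rgb_dir, name + ".png")))
        # Per-pixel object IDs (0 = background).
        label = np.asarray(Image.open(os.path.join(self.label_dir, name + ".png")))
        # Assumed format: one 3x4 [R|t] matrix per visible object, stacked in a .npy file.
        RTs = np.load(os.path.join(self.pose_dir, name + ".npy"))
        return {"rgb": rgb, "label": label, "RTs": RTs}
```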
The segmentation branch fuses feature maps extracted at different stages of the backbone network to segment the objects in the scene. After training, segmentation inference should resemble the following results:
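To make the fusion concrete, here is a minimal PyTorch sketch in the spirit of PoseCNN's segmentation head: two backbone feature maps at different strides are projected to a common width, the deeper one is upsampled and summed with the shallower one, and a 1x1 convolution classifies each pixel. The channel sizes, class count (assumed to be 10 PROPS objects plus background), and layer names are assumptions for illustration:

```python
import torch.nn as nn
import torch.nn.functional as F


class SegmentationBranch(nn.Module):
    """Sketch of a fusion-style segmentation head: project two backbone
    feature maps to a common width, upsample the deeper one, sum, classify."""

    def __init__(self, c_mid=512, c_deep=512, hidden=64, num_classes=11):
        super().__init__()
        self.proj_mid = nn.Conv2d(c_mid, hidden, kernel_size=1)
        self.proj_deep = nn.Conv2d(c_deep, hidden, kernel_size=1)
        self.classify = nn.Conv2d(hidden, num_classes, kernel_size=1)

    def forward(self, feat_mid, feat_deep):
        # feat_mid: (B, c_mid, H/8, W/8); feat_deep: (B, c_deep, H/16, W/16)
        fused = self.proj_mid(feat_mid) + F.interpolate(
            self.proj_deep(feat_deep), size=feat_mid.shape[-2:],
            mode="bilinear", align_corners=False)
        logits = self.classify(F.relu(fused))
        # Upsample the logits back to the full input resolution.
        return F.interpolate(logits, scale_factor=8.0,
                             mode="bilinear", align_corners=False)
```

Summing the projected features rather than concatenating them keeps the head lightweight, and the final bilinear upsampling restores full-resolution per-pixel labels.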
PoseCNN predicts translation and rotation in separate branches: a Hough voting layer aggregates per-pixel center-direction predictions to localize each object's 2D center, from which the 3D translation is recovered using the predicted center depth, while a dedicated branch regresses a quaternion for the rotation. Once the model is trained, inference results should look like this:
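The sketch below illustrates the voting idea, simplified from PoseCNN's actual inlier-counting implementation; the function names and the dense accumulator are assumptions for illustration. Each pixel inside a predicted object mask casts votes along its regressed unit direction toward the object center, the accumulator peak gives the 2D center, and the translation follows from the predicted center depth via the pinhole camera model:

```python
import numpy as np


def hough_center_vote(mask, directions, max_vote_dist=120):
    """Each foreground pixel votes along its predicted unit direction toward
    the object center; the accumulator peak is taken as the 2D center.
    Simplified sketch of PoseCNN-style Hough voting."""
    H, W = mask.shape
    acc = np.zeros((H, W), dtype=np.int32)
    for y, x in zip(*np.nonzero(mask)):
        dx, dy = directions[:, y, x]  # unit vector pointing at the center
        for t in range(1, max_vote_dist):
            vx, vy = int(round(x + t * dx)), int(round(y + t * dy))
            if 0 <= vx < W and 0 <= vy < H:
                acc[vy, vx] += 1
    cy, cx = np.unravel_index(acc.argmax(), acc.shape)
    return cx, cy


def center_to_translation(cx, cy, depth, K):
    """Back-project the voted 2D center with its predicted depth using the
    pinhole model: t = depth * K^{-1} [cx, cy, 1]^T."""
    return depth * (np.linalg.inv(K) @ np.array([cx, cy, 1.0]))
```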


