A ROS package that applies multiple object tracking (MOT) to YOLO detections
leggedrobotics/darknet_ros: A ROS package for YOLO detection
abewley/sort: Simple Online and Realtime Tracking (SORT)
- train YOLO with your own data
- launch darknet_ros with the trained YOLO model
- run the ros_iou_tracking node
The ROS package usb_cam is used to launch the webcam. The camera image topic name also needs to be changed in darknet_ros/darknet_ros/config/ros.yaml so that it matches the topic published by usb_cam (see the example below).
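For reference, the camera topic is set under the image subscriber entry of darknet_ros/darknet_ros/config/ros.yaml. The snippet below is only an illustration (the exact keys may differ between darknet_ros versions), with the topic pointed at the one usb_cam typically publishes:

```yaml
subscribers:
  camera_reading:
    topic: /usb_cam/image_raw   # image topic published by usb_cam
    queue_size: 1
```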
roslaunch usb_cam usb_cam-test.launch
roslaunch darknet_ros darknet_ros.launch
rosrun ros_iou_tracking iou_tracker.py
roslaunch ros_iou_tracking rosbag_pbr_test.launch
- train YOLO with a small set of annotations
- automate annotation augmentation using offline object tracking (see the sketch after this list)
- correct automatic annotations if necessary
- retrain YOLO with previous and augmented data
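A rough sketch of the offline augmentation step referenced above, assuming the adapted sort.py keeps the abewley/sort interface (Sort(max_age, min_hits) and update() on rows of [x1, y1, x2, y2, score]); the import path and detections_per_frame are hypothetical placeholders, and loading/saving of annotation files is omitted:

```python
import numpy as np
from sort import Sort  # the adapted sort.py in this package (path assumed)

def augment_annotations(detections_per_frame, max_age=30):
    """Propagate sparse detections into per-frame track boxes offline."""
    tracker = Sort(max_age=max_age, min_hits=1)
    augmented = []
    for dets in detections_per_frame:
        # Each frame is an (N, 5) array of [x1, y1, x2, y2, score];
        # an empty frame becomes a (0, 5) array so the tracker can coast over it.
        dets = np.asarray(dets, dtype=float).reshape(-1, 5)
        tracks = tracker.update(dets)  # rows of [x1, y1, x2, y2, track_id]
        augmented.append(tracks)
    return augmented
```

The returned track boxes can then be reviewed and corrected before being converted back into training annotations.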
Node: /iou_tracker_node
Publications:
- /iou_tracker/bounding_boxes [darknet_ros_msgs/BoundingBoxes]
- /iou_tracker/detection_image [sensor_msgs/Image]
Subscriptions:
- /bounding_boxes_drop [darknet_ros_msgs/BoundingBoxes]
- /darknet_ros/detection_image [sensor_msgs/Image]
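For illustration, a node with the same topic interface could be skeletonized as follows; this is only a sketch, not the actual iou_tracker.py, which additionally runs the SORT tracker between the subscriber callbacks and the publishers:

```python
#!/usr/bin/env python
# Sketch of the /iou_tracker_node topic interface (not the real implementation).
import rospy
from sensor_msgs.msg import Image
from darknet_ros_msgs.msg import BoundingBoxes


class IouTrackerNode(object):
    def __init__(self):
        # Publications
        self.boxes_pub = rospy.Publisher(
            "/iou_tracker/bounding_boxes", BoundingBoxes, queue_size=1)
        self.image_pub = rospy.Publisher(
            "/iou_tracker/detection_image", Image, queue_size=1)
        # Subscriptions
        rospy.Subscriber("/bounding_boxes_drop", BoundingBoxes, self.boxes_cb)
        rospy.Subscriber("/darknet_ros/detection_image", Image, self.image_cb)

    def boxes_cb(self, msg):
        # The real node feeds these detections to the tracker; this sketch
        # simply republishes them under the tracker topic.
        self.boxes_pub.publish(msg)

    def image_cb(self, msg):
        # The real node would overlay the tracked boxes before republishing.
        self.image_pub.publish(msg)


if __name__ == "__main__":
    rospy.init_node("iou_tracker_node")
    IouTrackerNode()
    rospy.spin()
```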
- bochinski/iou-tracker
The original IoU tracker also uses IoU information, but it has no Kalman filter (or Hungarian algorithm) for tracking and relies on IoU alone for target association. This imposes two strict requirements:
(1) there must be no detection gaps between frames; otherwise the tracked target is removed.
(2) the frame rate must be high enough that the IoU threshold can be met between successive frames.
The extended work tolerates detection gaps by letting bounding boxes without new detections stay at their last positions for several frames. However, it fills detection gaps both forward and backward, which makes it less suitable for realtime tracking. Moreover, when the camera motion is smooth, predicting an undetected bounding box with a Kalman filter that accounts for its velocity, as SORT does, is a better approach than keeping it at its last position.
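As a small numeric illustration of that argument (plain NumPy, not the Kalman filter used in sort.py), a constant-velocity prediction keeps a missed box close to a smoothly moving target, while a frozen box drifts away from it:

```python
import numpy as np

def predict_missed_box(box, velocity, n_missed):
    """Shift an [x1, y1, x2, y2] box by its last per-frame velocity for
    n_missed undetected frames; freezing the box would just return box."""
    return np.asarray(box, dtype=float) + n_missed * np.asarray(velocity, dtype=float)

# A box moving about 5 px/frame to the right, undetected for 3 frames:
print(predict_missed_box([100, 50, 180, 120], [5, 0, 5, 0], 3))
# -> [115. 50. 195. 120.], still overlapping the target, unlike the frozen box
```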
[sort.py] is adapted from abewley/sort to support large detection gaps.