This folder contains data and code samples for object detection and object segmentation. The original code was adapted from the PyTorch TorchVision Object Detection Finetuning Tutorial and David Macêdo's GitHub repository. The code is intended as programming practice that covers all stages of the object detection and segmentation pipeline, although not every aspect is covered. It uses pre-trained models from PyTorch and the Penn-Fudan Database.
- Python 3, [PyTorch](https://pytorch.org/).
- Mask R-CNN
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- Models and pre-trained weights
- David Macêdo
- TorchVision Object Detection Finetuning Tutorial. Good explanation about the use of classes for custom datasets.
- TorchVision Instance Segmentation Finetuning Tutorial
- Instance Segmentation with PyTorch and Mask R-CNN
- Object Detection using PyTorch Faster R-CNN MobileNetV3.
- Object Detection Tutorial with Torchvision.
- Object detection reference training scripts.
- torchvision
- PyTorch visualization utils
- Example gallery
- An Introduction to PyTorch Visualization Utilities
- Visualization utilities
- torchvision - read_image()
- Repurposing masks into bounding boxes
- Transforming and augmenting images
- Introduction to PyTorch Tensors (with video)
- torch.Tensor
- PyTorch PIL to Tensor and vice versa
- PyTorch: converting tensors to images
- A good tutorial about NumPy: Introduction to NumPy and OpenCV
- Data transfer to and from PyTorch
- Beginners guide to Tensor operations in PyTorch.
- PIL.Image to Tensor. Converting an image to a Torch Tensor in Python
- Numpy to PIL. Convert a NumPy array to an image
- Plot torch.Tensor using OpenCV: How do I display a single image in PyTorch?
Using the PyTorch library to show images and masks.
File | Description |
---|---|
torchvision_01.py | Uses the torchvision library to read a .PNG image from PennFudanPed, applies transformations on GPU/CPU, and shows the result on screen. |
torchvision_02.py | Takes instance segmentation mask images, converts them from Tensor to Pillow images, and then merges the masks into one image. |
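
The mask handling described for torchvision_02.py can be sketched roughly as follows. This is a minimal sketch, not the script itself. It assumes the Penn-Fudan convention that a mask is a single-channel image in which 0 is background and each positive integer marks one person. The function names `split_instance_mask` and `merge_masks` are hypothetical, and a small synthetic tensor stands in for a real mask file.

```python
import torch


def split_instance_mask(mask: torch.Tensor) -> torch.Tensor:
    """Turn an (H, W) instance-id mask into (N, H, W) boolean masks."""
    obj_ids = torch.unique(mask)
    obj_ids = obj_ids[obj_ids != 0]  # id 0 is assumed to be background
    # Broadcasting compares every pixel against every instance id.
    return mask == obj_ids[:, None, None]


def merge_masks(masks: torch.Tensor) -> torch.Tensor:
    """Merge (N, H, W) boolean masks into one (H, W) foreground mask."""
    return masks.any(dim=0)


# Synthetic 2x3 mask with two instances (ids 1 and 2).
instance_mask = torch.tensor([[0, 1, 1],
                              [2, 0, 1]])
masks = split_instance_mask(instance_mask)
merged = merge_masks(masks)
print(masks.shape)
print(merged)
```

To view the merged mask as a Pillow image, one option is `torchvision.transforms.functional.to_pil_image(merged.to(torch.uint8) * 255)`.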
Basic examples of the image transforms offered by torchvision.transforms. Two ways to perform the same Tensor-to-PIL conversion: the functional form and the class form.
import torchvision.transforms.functional as F
p_img_01 = F.to_pil_image(tensor_img)
p_img_01.show()
import torchvision.transforms as T
transform = T.ToPILImage()
p_img_01 = transform(tensor_img.to(device))
File | Description |
---|---|
tensor_conversion_pytorch.py | Read images using read_image() conversion, basic pipeline. |
tensor_conversion_pil.py | Read images using PIL.Image.open() conversion, basic pipeline. |
tensor_conversion_opencv.py | Read images using OpenCV cv2.imread() conversion, basic pipeline. |
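
The OpenCV conversion in the table above can be illustrated with a short round trip. This is only a sketch under stated assumptions. The function names `bgr_to_tensor` and `tensor_to_bgr` are hypothetical. A synthetic array stands in for the output of cv2.imread(). The channel reversal is done with NumPy slicing instead of `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` so that the example runs without OpenCV installed.

```python
import numpy as np
import torch


def bgr_to_tensor(bgr: np.ndarray) -> torch.Tensor:
    """(H, W, 3) uint8 BGR array -> (3, H, W) float RGB tensor in [0, 1]."""
    # Reverse the channel axis (BGR -> RGB). The copy is needed because
    # torch.from_numpy does not accept arrays with negative strides.
    rgb = np.ascontiguousarray(bgr[:, :, ::-1])
    return torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0


def tensor_to_bgr(tensor: torch.Tensor) -> np.ndarray:
    """(3, H, W) float RGB tensor in [0, 1] -> (H, W, 3) uint8 BGR array."""
    scaled = (tensor.detach().cpu().clamp(0, 1) * 255.0).round().to(torch.uint8)
    rgb = scaled.permute(1, 2, 0).numpy()
    return np.ascontiguousarray(rgb[:, :, ::-1])


# Synthetic 2x3 pure-blue image in OpenCV's BGR layout.
bgr_image = np.zeros((2, 3, 3), dtype=np.uint8)
bgr_image[:, :, 0] = 255

tensor_img = bgr_to_tensor(bgr_image)
restored = tensor_to_bgr(tensor_img)
print(tensor_img.shape, restored.shape)
```

The tensor layout (channels first, floats in [0, 1]) is what torchvision detection models expect as input, which is why the conversion goes through it.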
Connecting tensor conversion with deep learning models. The examples use Mask R-CNN (from torchvision.models.detection import maskrcnn_resnet50_fpn, created with maskrcnn_resnet50_fpn(pretrained=True)). The model output is converted into a binary mask.
File | Description |
---|---|
tensor_conversion_01.py | Read images using read_image() conversion. |
tensor_conversion_02.py | Read images using PIL.Image.open() conversion. |
tensor_conversion_03.py | Read images using cv2.imread() conversion. |
tensor_conversion_opencv_fasterrcnn.py | Reads images with cv2.imread(), converts them for the Faster R-CNN model, and converts the result back to OpenCV format. A good example of conversions in a pipeline with models. |
tensor_conversion_opencv_fasterrcnn_02.py | Reads images with cv2.imread(), converts them for the Faster R-CNN V2 model, and converts the result back to OpenCV format. A good example of conversions in a pipeline with models. |
tensor_conversion_opencv_maskrcnn.py | Reads images with cv2.imread(), converts them for the Mask R-CNN model, and converts the result back to OpenCV format. A good example of conversions in a pipeline with models. |
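
The step from model output to binary mask can be sketched as below. torchvision's Mask R-CNN returns one dictionary per image with the keys `boxes`, `labels`, `scores` and `masks`, where `masks` has shape (N, 1, H, W) and holds soft values in [0, 1]. The function name `binary_masks` and both thresholds of 0.5 are assumptions made for illustration, and a hand-made dictionary stands in for a real prediction so that no weights need to be downloaded.

```python
import torch


def binary_masks(prediction: dict, score_thr: float = 0.5,
                 mask_thr: float = 0.5) -> torch.Tensor:
    """Return (K, H, W) uint8 masks (0 or 255) for confident detections."""
    keep = prediction["scores"] >= score_thr
    soft = prediction["masks"][keep]          # (K, 1, H, W), values in [0, 1]
    hard = soft.squeeze(1) > mask_thr         # (K, H, W) boolean
    return hard.to(torch.uint8) * 255         # 0/255 is convenient for OpenCV


# Hand-made stand-in for one Mask R-CNN prediction with two detections.
fake_prediction = {
    "scores": torch.tensor([0.9, 0.2]),
    "masks": torch.tensor([
        [[[0.8, 0.1], [0.6, 0.4]]],
        [[[0.9, 0.9], [0.9, 0.9]]],
    ]),
}
result = binary_masks(fake_prediction)
print(result)
```

With a real model, `prediction` would be the first element of the list returned by calling the model in eval mode on a list of image tensors.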
This link explains data type conversion.
File | Description |
---|---|
./train_scripts/main_free_gpu_cache.py | Tool for clearing GPU memory. |
./train_scripts/main_training_code.py | Code to train a people detector using the PennFudanPed/ dataset. This script produces a weights file in .pth format. |
./train_scripts/tv-training-code_corrected.py | Original code to train a people detector using the PennFudanPed/ dataset. This script produces a weights file in .pth format. |
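
Clearing GPU memory from Python usually comes down to two calls. The sketch below shows the general idea and is not taken from main_free_gpu_cache.py. The function name `free_gpu_cache` is hypothetical, and the function returns zeros when no CUDA device is present.

```python
import gc

import torch


def free_gpu_cache() -> dict:
    """Release cached GPU memory and report current usage in bytes."""
    gc.collect()  # drop unreachable Python objects that may still hold tensors
    if not torch.cuda.is_available():
        return {"allocated": 0, "reserved": 0}
    torch.cuda.empty_cache()  # hand unused cached blocks back to the driver
    return {
        "allocated": torch.cuda.memory_allocated(),
        "reserved": torch.cuda.memory_reserved(),
    }


stats = free_gpu_cache()
print(stats)
```

Note that `torch.cuda.empty_cache()` only releases memory that PyTorch has cached but is no longer using. Tensors that are still referenced keep their memory until the references are deleted.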
Testing bounding box (BBOX) models and mask segmentation (MASK) models on the PennFudanPed/ dataset.
File | Description |
---|---|
eval_pennfudanpen_bbox_01.py | Detects people in the PennFudanPed/ dataset with the pretrained torchvision.models.detection.fasterrcnn_resnet50_fpn model. |
eval_pennfudanpen_mask_01.py | Detects people in the PennFudanPed/ dataset with the pretrained torchvision.models.detection.maskrcnn_resnet50_fpn model. |
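
When a COCO-pretrained detector is used to find people, the predictions have to be filtered by label and score. The sketch below shows one way to do that. It assumes that label 1 means "person", as in the COCO label map used by torchvision's pretrained detection models. The function name `person_boxes` and the 0.7 score threshold are hypothetical, and a hand-made dictionary replaces a real model output.

```python
import torch

PERSON_LABEL = 1  # "person" in the COCO label map (assumed for pretrained models)


def person_boxes(prediction: dict, score_thr: float = 0.7) -> list:
    """Return integer [x1, y1, x2, y2] boxes for confident person detections."""
    keep = (prediction["labels"] == PERSON_LABEL) & (prediction["scores"] >= score_thr)
    # Round to whole pixels so the boxes can be passed to drawing functions.
    return prediction["boxes"][keep].round().to(torch.int64).tolist()


# Hand-made stand-in for one detector output with three detections.
fake_prediction = {
    "boxes": torch.tensor([[10.2, 20.7, 50.0, 80.4],
                           [0.0, 0.0, 5.0, 5.0],
                           [1.0, 1.0, 2.0, 2.0]]),
    "labels": torch.tensor([1, 18, 1]),
    "scores": torch.tensor([0.95, 0.99, 0.30]),
}
people = person_boxes(fake_prediction)
print(people)
```

Only the first detection survives: the second has a different label and the third falls below the score threshold.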
Testing bounding box (BBOX) models and mask segmentation (MASK) models on a normal image.
File | Description |
---|---|
eval_story_rgb_bbox_01.py | Detects people in the story_rgb/ dataset with the pretrained torchvision.models.detection.fasterrcnn_resnet50_fpn model. |
eval_story_rgb_mask_01.py | Detects apples in the story_rgb/ dataset with the pretrained torchvision.models.detection.maskrcnn_resnet50_fpn model. |
IMPORTANT! eval_story_rgb_mask_02.py | Detects apples in the story_rgb/ dataset with the pretrained torchvision.models.detection.maskrcnn_resnet50_fpn model and saves the results in an output/ folder. |
Checking the trained weights in a .pth file with a Mask R-CNN model.
File | Description |
---|---|
main_evaluate_pennfudanpen_code.py | Detects people in random images from the PennFudanPed/ dataset, using the torchvision.models.detection.maskrcnn_resnet50_fpn model with trained weights loaded from a .pth file. |
main_evaluate_people_code.py | Detects people in test images, using the torchvision.models.detection.maskrcnn_resnet50_fpn model with trained weights loaded from a .pth file. |
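
Saving weights to a .pth file and loading them back can be sketched as follows. The sketch assumes the file holds a state dict written with `torch.save(model.state_dict(), path)`, which is a common convention but is not confirmed here for the training scripts. A tiny `nn.Linear` stands in for the detector, and the file name `weights.pth` is hypothetical.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)  # stand-in for the trained detector

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.pth")  # hypothetical file name
    torch.save(model.state_dict(), path)

    # The receiving model must have the same architecture as the saved one.
    restored = nn.Linear(4, 2)
    state = torch.load(path, map_location="cpu")
    restored.load_state_dict(state)

restored.eval()  # switch to inference mode before evaluating

same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print("weights match:", same)
```

For a Mask R-CNN model the same pattern applies, provided the model is built with the same number of classes that was used during training before `load_state_dict` is called.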
File | Description |
---|---|
webcam_basic_loop_01.py | Basic loop to extract frames from a webcam, without object detection. |
webcam_obj_detect_01.py | A simple object detector; its performance is limited. |
webcam_obj_detect_02.py | A demo of object detection with bounding boxes (BBOX). It reads a stream from a webcam and detects objects. |
webcam_obj_detect_pre_bbox.py | A demo of object detection with bounding boxes (BBOX) using the default pre-trained Mask R-CNN model. |
webcam_obj_detect_pre_mask.py | A demo of object detection with masks (MASK) using the default pre-trained Mask R-CNN model. |
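
The frame loop behind these demos, together with a simple way to measure its speed, can be sketched as below. This is an illustrative sketch, not the code of the scripts. `run_loop` and `FakeCapture` are hypothetical names. The loop only relies on a `read()` method that returns `(ok, frame)`, which is the interface of `cv2.VideoCapture`, so a fake capture object is used here to keep the example runnable without a camera.

```python
import time


def run_loop(capture, process, max_frames=None):
    """Read frames until the source ends; return (frame count, frames per second)."""
    count = 0
    start = time.perf_counter()
    while max_frames is None or count < max_frames:
        ok, frame = capture.read()
        if not ok:  # the source has no more frames or the camera failed
            break
        process(frame)  # e.g. run a detector and draw the result
        count += 1
    elapsed = time.perf_counter() - start
    fps = count / elapsed if elapsed > 0 else 0.0
    return count, fps


class FakeCapture:
    """Stand-in for cv2.VideoCapture that yields a fixed number of dummy frames."""

    def __init__(self, n_frames):
        self.remaining = n_frames

    def read(self):
        if self.remaining == 0:
            return False, None
        self.remaining -= 1
        return True, [[0, 0], [0, 0]]  # dummy 2x2 "frame"


seen = []
count, fps = run_loop(FakeCapture(3), seen.append)
print(count, "frames processed")
```

With a real webcam, `capture` would be `cv2.VideoCapture(0)` and `process` would typically call `cv2.imshow(...)` followed by `cv2.waitKey(1)`, with `capture.release()` and `cv2.destroyAllWindows()` after the loop. The measured frame rate gives a concrete number for comparing the detectors.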
- Ubuntu 20.04.3 LTS 64 bits.
- Windows 10
- Intel® Core™ i7-8750H CPU @ 2.20GHz × 12.
- GeForce GTX 1050 Ti Mobile.
- Python 3.8.10
python3 -m pip install python-venv
pip3 install python-venv
python -m venv ./object_detector_tutorial_venv
source ./object_detector_tutorial_venv/bin/activate
python --version
pip install --upgrade pip
pip install -r requirements_windows.txt
pip install opencv-python
Install Python tools
sudo apt install python3-pip
sudo apt install python3.8-venv
sudo rm /etc/apt/sources.list.d/cuda*
sudo apt remove --autoremove nvidia-cuda-toolkit
sudo apt remove --autoremove nvidia-*
sudo rm -rf /usr/local/cuda*
sudo apt-get purge nvidia*
sudo apt-get update
sudo apt-get autoremove
sudo apt-get autoclean
Download the current CUDA toolkit from NVIDIA.
sudo apt-get update
sudo ubuntu-drivers autoinstall
nvidia-driver-470
nvcc --version
nvidia-smi
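
The two commands above report on the toolkit and the driver. Whether PyTorch itself can use the GPU can be checked from Python, for example with the sketch below. The function name `cuda_report` is hypothetical. On a machine without a usable GPU the report simply shows the CPU.

```python
import torch


def cuda_report() -> dict:
    """Collect basic facts about the PyTorch build and the visible GPU."""
    available = torch.cuda.is_available()
    return {
        "torch": torch.__version__,
        "cuda_available": available,
        "cuda_build": torch.version.cuda,  # None on CPU-only builds
        "device": torch.cuda.get_device_name(0) if available else "cpu",
    }


report = cuda_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `cuda_available` is False even though nvidia-smi lists the GPU, a likely cause is a CPU-only PyTorch build or a mismatch between the driver and the CUDA version PyTorch was built with.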