
This folder contains data and various code samples related to using object detectors and object segmentation.

juancarlosmiranda/object_detector_tutorial

Object detection and segmentation using the PennFudanPed dataset

This folder contains data and various code samples related to using object detectors and object segmentation. The original code was adapted from the PyTorch TorchVision Object Detection Finetuning Tutorial and from David Macêdo's GitHub repository. The intent of this code is to cover, as a programming practice, all stages of the object detection and segmentation pipeline, although not every aspect can be covered. It uses pre-trained models from PyTorch and the Penn-Fudan Database (PennFudanPed).

Models and tools used

Datasets

Deep learning courses

Links to tutorials, useful information

PyTorch visualization utils (Torchvision)

PyTorch models and pre-trained weights

PyTorch tensors

Conversions between image formats

Installing tools

Torchvision utilities and Tensors

Torchvision examples

Using the PyTorch library to show images and masks.

| File | Description |
| --- | --- |
| torchvision_01.py | Uses the torchvision library to read a .PNG image from PennFudanPed, applies transformations on the GPU or CPU, and shows the result on screen. |
| torchvision_02.py | Takes instance segmentation mask images, converts them from tensors to Pillow images, and then merges the masks into one image. |
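
The mask handling described for torchvision_02.py can be sketched roughly as follows. This is a minimal illustration, not the script itself. It assumes a PennFudanPed-style mask in which 0 is background and each pedestrian has its own integer id. The helper names split_instance_mask and merge_masks are hypothetical, and a small synthetic tensor stands in for a real mask file.

```python
import torch


def split_instance_mask(mask: torch.Tensor) -> torch.Tensor:
    """Split an (H, W) id mask into (N, H, W) boolean masks, one per instance."""
    obj_ids = torch.unique(mask)
    obj_ids = obj_ids[obj_ids != 0]          # assumption: id 0 is background
    return mask == obj_ids[:, None, None]    # broadcast compare -> (N, H, W)


def merge_masks(masks: torch.Tensor) -> torch.Tensor:
    """Merge (N, H, W) boolean masks into one (H, W) uint8 image (0 or 255)."""
    return masks.any(dim=0).to(torch.uint8) * 255


# Synthetic stand-in for a mask file: two instances with ids 1 and 2.
id_mask = torch.tensor([[0, 1, 1, 0],
                        [0, 0, 2, 2],
                        [0, 0, 0, 2]], dtype=torch.uint8)

instance_masks = split_instance_mask(id_mask)
merged = merge_masks(instance_masks)
print(instance_masks.shape)
print(merged)
```

With a real single-channel mask PNG, torchvision.io.read_image(path) returns a (1, H, W) tensor, so read_image(path)[0] could be passed in place of id_mask.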

Use of tensors and transformation of tensors and images

transform_examples

Basic examples using the image transforms offered by torchvision.transforms.functional. There are two ways to perform the same conversion: the functional form and the class-based form. Both snippets assume that tensor_img and device are already defined.

import torchvision.transforms.functional as F
p_img_01 = F.to_pil_image(tensor_img)  # functional form
p_img_01.show()

import torchvision.transforms as T
transform = T.ToPILImage()  # class-based form, a reusable callable
p_img_01 = transform(tensor_img.to(device))

| File | Description |
| --- | --- |
| tensor_conversion_pytorch.py | Reads images using read_image(); conversion, basic pipeline. |
| tensor_conversion_pil.py | Reads images using PIL.Image.open(); conversion, basic pipeline. |
| tensor_conversion_opencv.py | Reads images using OpenCV cv2.imread(); conversion, basic pipeline. |
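
The OpenCV round trip is the conversion with the most pitfalls: cv2.imread() returns a (H, W, 3) uint8 array in BGR order, while torchvision models expect (3, H, W) float tensors in RGB order with values in [0, 1]. The sketch below shows one way to convert in both directions. The helper names bgr_to_tensor and tensor_to_bgr are hypothetical and are not taken from the scripts above, and a synthetic array stands in for a loaded image.

```python
import numpy as np
import torch


def bgr_to_tensor(img_bgr: np.ndarray) -> torch.Tensor:
    """(H, W, 3) uint8 BGR array -> (3, H, W) float32 RGB tensor in [0, 1]."""
    # Reversing the channel axis yields negative strides, so copy before from_numpy.
    img_rgb = np.ascontiguousarray(img_bgr[..., ::-1])
    tensor = torch.from_numpy(img_rgb).permute(2, 0, 1)  # HWC -> CHW
    return tensor.float() / 255.0


def tensor_to_bgr(tensor: torch.Tensor) -> np.ndarray:
    """(3, H, W) float RGB tensor in [0, 1] -> (H, W, 3) uint8 BGR array."""
    img_rgb = (tensor.detach().cpu().clamp(0, 1) * 255).round().to(torch.uint8)
    img_rgb = img_rgb.permute(1, 2, 0).numpy()           # CHW -> HWC
    return np.ascontiguousarray(img_rgb[..., ::-1])      # RGB -> BGR


# Synthetic 2x2 image standing in for the result of cv2.imread().
frame = np.array([[[255, 0, 0], [0, 255, 0]],
                  [[0, 0, 255], [10, 20, 30]]], dtype=np.uint8)

t = bgr_to_tensor(frame)
restored = tensor_to_bgr(t)
print(t.shape, t.dtype)
print(np.array_equal(frame, restored))
```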

Connecting tensor conversion with deep learning models. The examples use Mask R-CNN (from torchvision.models.detection import maskrcnn_resnet50_fpn, then maskrcnn_resnet50_fpn(pretrained=True)). The model output is converted to a binary mask. In torchvision 0.13 and later, pretrained=True is deprecated in favor of the weights argument.

| File | Description |
| --- | --- |
| tensor_conversion_01.py | Reads images using read_image() and converts them. |
| tensor_conversion_02.py | Reads images using PIL.Image.open() and converts them. |
| tensor_conversion_03.py | Reads images using cv2.imread() and converts them. |
| tensor_conversion_opencv_fasterrcnn.py | Reads images using cv2.imread(), converts them for the Faster R-CNN model, and returns the result in OpenCV format. A good example of conversions in a pipeline with models. |
| tensor_conversion_opencv_fasterrcnn_02.py | Reads images using cv2.imread(), converts them for the Faster R-CNN V2 model, and returns the result in OpenCV format. A good example of conversions in a pipeline with models. |
| tensor_conversion_opencv_maskrcnn.py | Reads images using cv2.imread(), converts them for the Mask R-CNN model, and returns the result in OpenCV format. A good example of conversions in a pipeline with models. |
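
As a rough illustration of the last step, the sketch below turns a Mask R-CNN style output into a single binary mask in OpenCV format. In evaluation mode, torchvision detection models return one dict per image with the keys boxes, labels, scores and, for Mask R-CNN, masks of shape (N, 1, H, W) with values in [0, 1]. The function name masks_to_binary_image and both thresholds are assumptions made for this example, and a hand-made dict stands in for real model output.

```python
import numpy as np
import torch


def masks_to_binary_image(output: dict, score_thr: float = 0.5,
                          mask_thr: float = 0.5) -> np.ndarray:
    """Combine the confident instance masks into one (H, W) uint8 image (0 or 255)."""
    keep = output["scores"] >= score_thr
    soft_masks = output["masks"][keep]             # (K, 1, H, W) probabilities
    binary = (soft_masks > mask_thr).squeeze(1)    # (K, H, W) booleans
    merged = binary.any(dim=0)                     # union of the kept instances
    return merged.to(torch.uint8).mul(255).cpu().numpy()


# Hand-made stand-in for model([image])[0]; not real model output.
fake_output = {
    "scores": torch.tensor([0.9, 0.2]),
    "masks": torch.zeros(2, 1, 2, 3),
}
fake_output["masks"][0, 0, 0, :2] = 0.8   # confident instance
fake_output["masks"][1, 0, 1, :] = 0.9    # low-score instance, filtered out

binary_mask = masks_to_binary_image(fake_output)
print(binary_mask)
```

The returned array is single-channel uint8, which is the layout OpenCV functions such as cv2.imwrite() accept for grayscale images.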

This link explains data type conversion.

Model pipelines for bounding box (BBOX) and mask segmentation (MASK)

Training models

| File | Description |
| --- | --- |
| ./train_scripts/main_free_gpu_cache.py | Tool for clearing GPU memory. |
| ./train_scripts/main_training_code.py | Code to train a people detector using the PennFudanPed/ dataset. This script produces a weights file in .pth format. |
| ./train_scripts/tv-training-code_corrected.py | Original code to train a people detector using the PennFudanPed/ dataset. This script produces a weights file in .pth format. |

Evaluation

Testing bounding box models (BBOX) and mask segmentation models (MASK) on image sequences in PennFudanPed/

| File | Description |
| --- | --- |
| eval_pennfudanpen_bbox_01.py | Detects people in the PennFudanPed/ dataset with the pretrained torchvision.models.detection.fasterrcnn_resnet50_fpn model. |
| eval_pennfudanpen_mask_01.py | Detects people in the PennFudanPed/ dataset with the pretrained maskrcnn_resnet50_fpn model from torchvision.models.detection. |

Testing bounding box models (BBOX) and mask segmentation models (MASK) on a normal image.

| File | Description |
| --- | --- |
| eval_story_rgb_bbox_01.py | Detects people in the story_rgb/ dataset with the pretrained torchvision.models.detection.fasterrcnn_resnet50_fpn model. |
| eval_story_rgb_mask_01.py | Detects apples in the story_rgb/ dataset with the pretrained maskrcnn_resnet50_fpn model from torchvision.models.detection. |
| eval_story_rgb_mask_02.py (IMPORTANT!) | Detects apples in the story_rgb/ dataset with the pretrained maskrcnn_resnet50_fpn model from torchvision.models.detection, saving data in an output/ folder. |
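
A pretrained detector reports every class it knows, so an evaluation script typically keeps only the class of interest above some confidence. The sketch below shows one way to do that for people. It assumes the COCO label map used by torchvision's pretrained detection models, in which label 1 is person. The function name select_detections and the 0.7 threshold are hypothetical, and the dict is a hand-made stand-in for real model output.

```python
import torch

PERSON_LABEL = 1  # 'person' in the COCO label map of torchvision's pretrained detectors


def select_detections(output: dict, label: int = PERSON_LABEL,
                      score_thr: float = 0.7) -> torch.Tensor:
    """Return the (K, 4) boxes [x1, y1, x2, y2] of one class above score_thr."""
    keep = (output["labels"] == label) & (output["scores"] >= score_thr)
    return output["boxes"][keep]


# Hand-made stand-in for model([image])[0]; not real model output.
fake_output = {
    "boxes": torch.tensor([[10.0, 20.0, 50.0, 120.0],
                           [30.0, 40.0, 60.0, 90.0],
                           [5.0, 5.0, 15.0, 25.0]]),
    "labels": torch.tensor([1, 1, 18]),
    "scores": torch.tensor([0.95, 0.40, 0.99]),
}

person_boxes = select_detections(fake_output)
print(person_boxes)
```

With a real model, the dict would come from calling model.eval() and then model([image_tensor])[0] inside a torch.no_grad() block.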

Checking the trained weights stored in a .pth file with a Mask R-CNN model.

| File | Description |
| --- | --- |
| main_evaluate_pennfudanpen_code.py | Detects people in random images from the PennFudanPed/ dataset, using the maskrcnn_resnet50_fpn model from torchvision.models.detection with trained weights loaded from a .pth file. |
| main_evaluate_people_code.py | Detects people in test images, using the maskrcnn_resnet50_fpn model from torchvision.models.detection with trained weights loaded from a .pth file. |
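
Saving and reloading a .pth file usually goes through the model's state_dict. The sketch below shows that round trip under simplifying assumptions: a tiny nn.Linear stands in for the trained detector, and the file name weights.pth is hypothetical. The same calls would apply to a Mask R-CNN model, provided the model is rebuilt with the same architecture and number of classes before the weights are loaded.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)   # tiny stand-in for the trained detector

with tempfile.TemporaryDirectory() as tmp_dir:
    weights_path = os.path.join(tmp_dir, "weights.pth")  # hypothetical file name
    torch.save(model.state_dict(), weights_path)

    restored = nn.Linear(4, 2)   # must match the saved architecture
    state_dict = torch.load(weights_path, map_location="cpu")
    restored.load_state_dict(state_dict)
    restored.eval()              # inference mode before evaluation

same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```

Passing map_location="cpu" lets weights trained on a GPU be loaded on a machine without one.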

Webcam examples RGB cameras

webcam_bbox_mask

| File | Description |
| --- | --- |
| webcam_basic_loop_01.py | Basic loop to extract frames from a webcam, without object detection. |
| webcam_obj_detect_01.py | A simple object detector; its performance is limited. |
| webcam_obj_detect_02.py | A demo of object detection for BBOX. It reads a stream from a webcam and detects objects. |
| webcam_obj_detect_pre_bbox.py | A demo of object detection for BBOX with the default pretrained Mask R-CNN model. |
| webcam_obj_detect_pre_mask.py | A demo of object detection for MASK with the default pretrained Mask R-CNN model. |
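
The structure shared by these webcam scripts is a read, process, release loop. The sketch below isolates that loop so it can run without a camera. It relies only on the (ok, frame) contract of cv2.VideoCapture.read() and on release(). The names run_capture_loop and FakeCapture are hypothetical and are not taken from the scripts.

```python
from typing import Callable, List


def run_capture_loop(capture, process_frame: Callable, max_frames: int) -> List:
    """Read up to max_frames frames from a cv2.VideoCapture-like object."""
    results = []
    try:
        for _ in range(max_frames):
            ok, frame = capture.read()   # same contract as cv2.VideoCapture.read()
            if not ok:                   # stream ended or camera unavailable
                break
            results.append(process_frame(frame))
    finally:
        capture.release()                # always free the device
    return results


class FakeCapture:
    """Stand-in for cv2.VideoCapture so the sketch runs without a camera."""

    def __init__(self, frames):
        self._frames = list(frames)
        self.released = False

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

    def release(self):
        self.released = True


fake_camera = FakeCapture(["frame0", "frame1", "frame2"])
processed = run_capture_loop(fake_camera, process_frame=str.upper, max_frames=10)
print(processed)
```

With a real webcam, cv2.VideoCapture(0) would replace FakeCapture, and process_frame would be a function that runs the detector on the frame and draws the results.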

Requirements

Hardware and software stack used

  • Ubuntu 20.04.3 LTS 64-bit.
  • Windows 10
  • Intel® Core™ i7-8750H CPU @ 2.20GHz × 12.
  • GeForce GTX 1050 Ti Mobile.
  • Python 3.8.10

Editing tools

Python stack environment

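Create the environment

The venv module ships with Python 3, so it does not need to be installed with pip. On Ubuntu it is provided by the python3.8-venv package.

python3 -m venv ./object_detector_tutorial_venv
source ./object_detector_tutorial_venv/bin/activate
python --version
pip install --upgrade pip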

Installing libraries

pip install -r requirements_windows.txt

Installing in Windows 10

pip install opencv-python

Installing on Ubuntu 20.04 LTS

Install Python tools

sudo apt install python3-pip
sudo apt install python3.8-venv

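Installing the CUDA toolkit (Linux notes)

Removing any existing NVIDIA packages and data

These commands remove every installed NVIDIA driver and CUDA package, so run them only when a clean reinstall is intended. The package patterns are quoted so that the shell passes them to apt unchanged.

sudo rm /etc/apt/sources.list.d/cuda*
sudo apt remove --autoremove nvidia-cuda-toolkit
sudo apt remove --autoremove 'nvidia-*'
sudo rm -rf /usr/local/cuda*
sudo apt-get purge 'nvidia*'
sudo apt-get update
sudo apt-get autoremove
sudo apt-get autoclean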

Install nvidia-cuda-toolkit

Download the current toolkit available from NVIDIA here

Installing driver

sudo apt-get update
sudo ubuntu-drivers autoinstall
nvidia-driver-470

Checking CUDA version installed

nvcc --version
nvidia-smi
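
The commands above report the toolkit and driver versions. Whether PyTorch itself can use the GPU can be checked from Python, as in the short sketch below. It uses only standard torch calls and falls back to the CPU when CUDA is not available.

```python
import torch

# Pick the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("CUDA version PyTorch was built with:", torch.version.cuda)  # None on CPU-only builds
if device.type == "cuda":
    print("GPU:", torch.cuda.get_device_name(0))

# Small allocation on the selected device as a smoke test.
x = torch.ones(2, 2, device=device)
print(x.sum().item())
```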
