Metrics

Jaccard index (intersection over union)

$$ JI = IOU = \frac{|A \cap B|}{|A \cup B|} $$

Jaccard coefficient (the same quantity, with the union expanded by inclusion-exclusion)

$$ JC = \frac{|A \cap B|}{|A|+|B|-|A \cap B|} $$
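For axis-aligned bounding boxes the ratio above reduces to a few `min`/`max` operations. The following is a minimal sketch, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function names `box_area` and `iou` are illustrative and not taken from the repository.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2); 0 if degenerate."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = box_area((ix1, iy1, ix2, iy2))
    # |A| + |B| - |A ∩ B|, i.e. the Jaccard coefficient denominator.
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Returning 0 for an empty union is a design choice made here to avoid a division by zero on degenerate boxes.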

detection precision and recall

$TP$: $IOU>\alpha$ and $c_{predicted}=c_{ground \: truth}$
$FP$: $IOU \leq \alpha$ or $c_{predicted} \neq c_{ground \: truth}$
$FN$: a ground-truth box without a matching prediction: the algorithm either predicted a box that misses the ground-truth box of the object ($IOU \leq \alpha$) or did not predict a box at all
$TN$: the algorithm correctly predicted no box in an area that contains no object

$$ precision = \frac{TP}{TP+FP} $$

$$ recall = \frac{TP}{TP+FN} $$

$$ F_{\beta} = (1+\beta^{2})\frac{precision \cdot recall}{(\beta^{2} \cdot precision)+recall} $$

$TP$ - number of targets hit (correct detections)
$FN$ - number of missed targets
$FP$ - number of false alarms
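The three formulas above can be evaluated directly from the counts. This is a small sketch; the function name and the convention of returning 0 when a denominator is zero are assumptions made for the example.

```python
def precision_recall_fbeta(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F_beta) computed from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = beta ** 2 * precision + recall
    f_beta = (1 + beta ** 2) * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f_beta


# 8 hits, 2 false alarms, 8 missed targets.
p, r, f1 = precision_recall_fbeta(tp=8, fp=2, fn=8)
```

With `beta > 1` the score weights recall more heavily, with `beta < 1` it weights precision more heavily.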

detection metrics

mAP (mean Average Precision)

$$ mAP = \frac{1}{n} \sum_{i=1}^{n}{AP_{i}} $$

where $AP_{i}$ is the average precision for class $c_{i}$ and $n$ is the number of classes

For one class:
AP (Average Precision)

$$ AP = \sum_{i=1}^{N-1}{(R_{i+1}-R_{i})P_{i}} \approx \int_{0}^{1}{precision(recall)d(recall)} $$

$N$ - the number of predictions for this class

$$ scores = \{s_{1},...,s_{N}\}, s_{i} \geq s_{i+1} $$

$$ precision_{i}^{ \star} = precision^{\star}(bboxes[1:i],labels[1:i]) $$

$$ recall_{i}^{\star} = recall^{\star}(bboxes[1:i],labels[1:i]) $$

$$ precision(recall) = \{(precision_{i}^{\star},recall_{i}^{\star}),i=\overline{1,N}\} $$

The $precision^{\star}$ is then defined as the number of true positives divided by the number of all detected boxes, and the $recall^{\star}$ as the number of true positives divided by the number of all ground-truth boxes.
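The procedure above (sort by score, evaluate precision and recall on each prefix, sum over recall steps) can be sketched as follows. Two assumptions are made here: each detection has already been marked as a true or false positive by the IOU and class test, and the sum uses the common right-endpoint form $\sum_{i}(R_{i}-R_{i-1})P_{i}$ with $R_{0}=0$, which differs slightly in indexing from the sum written above. The function names are illustrative.

```python
def average_precision(detections, num_gt):
    """AP for one class.

    detections: list of (score, is_true_positive) pairs.
    num_gt: number of ground-truth boxes of this class.
    """
    if num_gt == 0:
        return 0.0
    # Highest-confidence detections first, so prefix i is bboxes[1:i].
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = 0
    prev_recall = 0.0
    ap = 0.0
    for i, (_, is_tp) in enumerate(ranked, start=1):
        tp += int(is_tp)
        precision = tp / i       # true positives / all detected boxes so far
        recall = tp / num_gt     # true positives / all ground-truth boxes
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (detections, num_gt)."""
    aps = [average_precision(dets, n) for dets, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical toy data: class names and scores are made up for illustration.
toy = {
    "chair": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "table": ([(0.5, True)], 1),
}
toy_map = mean_average_precision(toy)
```

Benchmark implementations such as the COCO evaluator additionally interpolate the precision curve and average over several IOU thresholds; that is not reproduced in this sketch.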

general scheme

graph LR
title[<u>CollectingADataset</u>]
3DScene --- Blender
Blender --- ManualLabeling
3DScene --- UnrealEngine
UnrealEngine --- ManualLabeling
UnrealEngine --- AutomaticLabeling
RealImages --- ManualLabeling
AnyImage --- LabelingByML

graph LR
title[<u>Segmentation</u>]
Dataset --- RawImages
Dataset --- TrueSegmentationMasks
TrainPipeline --- TrainSegmentationModel
RawImages --- TrainSegmentationModel
TrueSegmentationMasks --- TrainSegmentationModel

graph LR
title[<u>Detection</u>]
Dataset --- RawImages
Dataset --- TrueBoundingBoxes,LabelsOfClasses
RawImages --- TrainDetectionModel
TrainPipeline --- TrainDetectionModel
TrueBoundingBoxes,LabelsOfClasses --- TrainDetectionModel

the scheme of solving the problem

graph TD

COCO2017 --> MakeCOCOFormatPipeline
MakeCOCOFormatPipeline --> COCOFormatPipeline
COCOFormatPipeline --> TestAFewModels
TrainPipeline --> TestAFewModels -->Model
TrainPipeline --> TargetModel
TargetDataset --> TargetModel
RealImages --> AssessmentOfTheAbilityToGeneralize
TargetModel --> AssessmentOfTheAbilityToGeneralize

the scheme of solving the problem with transfer learning

graph TD

FurnitureImagesDataset --> ModelClassifyingFurniture
ModelClassifyingFurniture --> ClassificationHeadForDetectionModel --> TheSchemeOfSolvingTheProblem

solving the problem with automatic labeling

graph TD 
UnrealEngineOrUnity --> Real-TimeSubstitutionOfObjectTextures --> SegmentationMasks --> Dataset --> TrainSegmentationModel

graph TD
UnrealEngineOrUnity --> Get3dBoundingBoxes --> ProjectToTheCamera --> BBoxesOnImage --> TrainDetectionModel
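The projection step in the graph above can be illustrated with a pinhole camera model. This is only a sketch under stated assumptions: the 3D box is axis-aligned and already expressed in camera coordinates (x right, y down, z forward), the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical values, and no Unreal Engine or Unity API is used. The output is in COCO `[x, y, width, height]` form.

```python
from itertools import product


def box_corners(center, size):
    """Eight corners of an axis-aligned 3D box in camera coordinates."""
    x0, y0, z0 = center
    sx, sy, sz = size
    return [
        (x0 + dx * sx / 2, y0 + dy * sy / 2, z0 + dz * sz / 2)
        for dx, dy, dz in product((-1, 1), repeat=3)
    ]


def project_box(corners_cam, fx, fy, cx, cy, width, height):
    """Project 3D corners with a pinhole model and return the enclosing
    2D box as [x, y, w, h], clipped to the image, or None if not visible."""
    # Skip boxes that reach behind the camera plane (simplifying assumption).
    if any(z <= 0 for _, _, z in corners_cam):
        return None
    us = [fx * x / z + cx for x, _, z in corners_cam]
    vs = [fy * y / z + cy for _, y, z in corners_cam]
    x1, x2 = max(0.0, min(us)), min(float(width), max(us))
    y1, y2 = max(0.0, min(vs)), min(float(height), max(vs))
    if x2 <= x1 or y2 <= y1:
        return None
    return [x1, y1, x2 - x1, y2 - y1]


# A 2x2x2 box centred 5 units in front of a 640x480 camera.
corners = box_corners(center=(0.0, 0.0, 5.0), size=(2.0, 2.0, 2.0))
bbox = project_box(corners, fx=100.0, fy=100.0, cx=320.0, cy=240.0,
                   width=640, height=480)
```

Taking the min and max over all eight projected corners gives the tightest axis-aligned 2D box around the projected 3D box, which is usually somewhat looser than a box drawn around the visible object silhouette.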

automatic detection of overlapping objects

changing the image registration conditions

creating a 3d scene

graph LR 
blenderkit --> fbx --> UnityAsset --> AssetLoader --> SceneStateGenerator

COCO format for detection

Dataset:

{
    "info": {...},
    "licenses": [...],
    "images": [...],
    "annotations": [...],
    "categories": [...]
}

Components:

"info": {
    "description": "COCO 2017 Dataset",
    "url": "http://cocodataset.org",
    "version": "1.0",
    "year": 2017,
    "contributor": "COCO Consortium",
    "date_created": "2017/09/01"
}
"licenses": [
    {
        "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
        "id": 1,
        "name": "Attribution-NonCommercial-ShareAlike License"
    },
    {
        "url": "http://creativecommons.org/licenses/by-nc/2.0/",
        "id": 2,
        "name": "Attribution-NonCommercial License"
    },
    ...
]
"images": [
    {
        "license": 4,
        "file_name": "000000397133.jpg",
        "coco_url": "http://images.cocodataset.org/val2017/000000397133.jpg",
        "height": 427,
        "width": 640,
        "date_captured": "2013-11-14 17:02:52",
        "flickr_url": "http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg",
        "id": 397133
    },
    {
        "license": 1,
        "file_name": "000000037777.jpg",
        "coco_url": "http://images.cocodataset.org/val2017/000000037777.jpg",
        "height": 230,
        "width": 352,
        "date_captured": "2013-11-14 20:55:31",
        "flickr_url": "http://farm9.staticflickr.com/8429/7839199426_f6d48aa585_z.jpg",
        "id": 37777
    },
    ...
]

"categories": [
    {"supercategory": "person","id": 1,"name": "person"},
    {"supercategory": "vehicle","id": 2,"name": "bicycle"},
    {"supercategory": "vehicle","id": 3,"name": "car"},
    {"supercategory": "vehicle","id": 4,"name": "motorcycle"},
    {"supercategory": "vehicle","id": 5,"name": "airplane"},
    ...
    {"supercategory": "indoor","id": 89,"name": "hair drier"},
    {"supercategory": "indoor","id": 90,"name": "toothbrush"}
]

"annotations": [
    {
        "segmentation": [[510.66,423.01,511.72,420.03,...,510.45,423.01]],
        "area": 702.1057499999998,
        "iscrowd": 0,
        "image_id": 289343,
        "bbox": [473.07,395.93,38.65,28.67],
        "category_id": 18,
        "id": 1768
    },
    ...
    {
        "segmentation": {
            "counts": [179,27,392,41,…,55,20],
            "size": [426,640]
        },
        "area": 220834,
        "iscrowd": 1,
        "image_id": 250282,
        "bbox": [0,34,639,388],
        "category_id": 1,
        "id": 900100250282
    }
]
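A short sketch of reading this structure: it groups annotations by `image_id`, resolves `category_id` to a name, and converts the COCO `[x, y, width, height]` box into corner form. The inline JSON is a tiny stand-in built from the crowd annotation quoted above; `example.jpg` and the function names are hypothetical.

```python
import json
from collections import defaultdict

# Minimal stand-in for an annotation file. For a real file one would use
# json.load(open(path)) with the path to the annotations JSON.
COCO_JSON = """
{
  "images": [{"id": 250282, "file_name": "example.jpg", "height": 426, "width": 640}],
  "categories": [{"supercategory": "person", "id": 1, "name": "person"}],
  "annotations": [
    {"id": 900100250282, "image_id": 250282, "category_id": 1,
     "bbox": [0, 34, 639, 388], "area": 220834, "iscrowd": 1}
  ]
}
"""


def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def index_annotations(coco):
    """Group annotations by image id, with category ids resolved to names."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    per_image = defaultdict(list)
    for ann in coco["annotations"]:
        per_image[ann["image_id"]].append({
            "box": xywh_to_xyxy(ann["bbox"]),
            "label": names[ann["category_id"]],
            "iscrowd": ann["iscrowd"],
        })
    return dict(per_image)


coco = json.loads(COCO_JSON)
targets = index_annotations(coco)
```

Note that category ids are not contiguous (the list above runs up to 90 for 80 classes), so a model that expects labels in `0..n-1` needs an explicit id-to-index mapping.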

Results

link to dataset

class distribution

Values of the loss function during gradient descent

$$ L = \alpha L_{classification} + \beta L_{regression} ,\alpha =1,\beta=1 $$
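As a sketch of how this weighted sum might be formed in a training loop: the script names suggest torchvision's `ssd300_vgg16`, whose training-mode forward pass returns a dictionary of losses with the keys `"classification"` and `"bbox_regression"`. That the repository uses this model and these keys is an assumption; the function below only combines whatever values are stored under those keys.

```python
def weighted_loss(loss_dict, alpha=1.0, beta=1.0):
    """L = alpha * L_classification + beta * L_regression.

    Works with plain floats and with tensors, since it only uses * and +.
    The key names are assumed, not taken from the repository.
    """
    return alpha * loss_dict["classification"] + beta * loss_dict["bbox_regression"]


# Plain numbers stand in for the per-batch loss values.
total = weighted_loss({"classification": 2.0, "bbox_regression": 0.5})
```

With $\alpha = \beta = 1$, as stated above, this is simply the sum of the two terms.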

ep - epoch index

post processing

graph LR
image --> model --> MulticlassNMS --> NMS
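The two suppression stages in the graph above can be sketched in plain Python: first greedy NMS within each class, then a class-agnostic pass over the survivors. The greedy algorithm, the `(x1, y1, x2, y2)` box format and the 0.5 threshold are assumptions; the source does not state which variant or parameters were used.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def multiclass_nms(boxes, scores, labels, iou_threshold=0.5):
    """Run NMS separately per class; return kept indices sorted by score."""
    keep = []
    for label in set(labels):
        idx = [i for i, l in enumerate(labels) if l == label]
        kept = nms([boxes[i] for i in idx], [scores[i] for i in idx], iou_threshold)
        keep.extend(idx[k] for k in kept)
    return sorted(keep, key=lambda i: scores[i], reverse=True)


# Hypothetical predictions: class names and scores are made up.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.6]
labels = ["chair", "chair", "table", "table"]

stage1 = multiclass_nms(boxes, scores, labels)
# Second, class-agnostic pass over the boxes that survived stage 1.
stage2 = [stage1[k] for k in nms([boxes[i] for i in stage1],
                                 [scores[i] for i in stage1])]
```

In this toy input the per-class pass removes the duplicate chair box, and the class-agnostic pass then removes the lower-scoring table box that coincides with the chair box.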

model predictions on train data

predictions

mAP(train dataset)

model: SSD300_VGG16

map	0.5197	global mean average precision
map_small	0.6083	mean average precision for small objects
map_medium	0.5975	mean average precision for medium objects
map_large	0.6817	mean average precision for large objects
mar_1	0.4407	mean average recall for 1 detection per image
mar_10	-1.0	mean average recall for 10 detections per image
mar_100	0.2553	mean average recall for 100 detections per image
mar_small	0.4952	mean average recall for small objects
mar_medium	0.5443	mean average recall for medium objects
mar_large	0.5443	mean average recall for large objects
map_50	-1.0	(-1 if 0.5 not in the list of iou thresholds), mean average precision at IoU=0.50
map_75	0.7025	(-1 if 0.75 not in the list of iou thresholds), mean average precision at IoU=0.75
map_per_class	0.4645	(-1 if class metrics are disabled), mean average precision per observed class
mar_100_per_class	0.2914	(-1 if class metrics are disabled), mean average recall for 100 detections per image per observed class

problems of transferring a model trained on synthetic data

Insufficient noise in the training data. When collecting synthetic data, objects must be captured from all plausible distances. Artifacts should also be added to the training sample artificially, for example text, small objects, or images of other classes.
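One way such artifacts could be injected is to paste random distractor patches into the rendered images before training. The sketch below uses NumPy and solid-colour rectangles as stand-ins for small objects; the function name, patch count and patch sizes are all hypothetical, and pasting text or crops of other classes would follow the same pattern.

```python
import numpy as np


def add_distractors(image, rng, count=5, max_size=20):
    """Return a copy of an HxWx3 uint8 image with `count` random
    solid-colour rectangles pasted at random positions."""
    out = image.copy()
    h, w = out.shape[:2]
    for _ in range(count):
        bh = int(rng.integers(2, max_size + 1))    # patch height
        bw = int(rng.integers(2, max_size + 1))    # patch width
        y = int(rng.integers(0, max(1, h - bh)))   # top-left corner
        x = int(rng.integers(0, max(1, w - bw)))
        colour = rng.integers(0, 256, size=3, dtype=np.uint8)
        out[y:y + bh, x:x + bw] = colour           # broadcast over the patch
    return out


rng = np.random.default_rng(0)
clean = np.zeros((64, 64, 3), dtype=np.uint8)
noisy = add_distractors(clean, rng)
```

Because the distractors carry no labels, the ground-truth boxes stay unchanged; a patch that covers most of an annotated object would make that annotation misleading, so the patch size should stay small relative to the objects.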

instructions for reproducing the result

unzip DATASET.zip
set up the conf.py file
run data_manip.py file
run TRAIN_ssd300_vgg16.py file
run TEST_ssd300_vgg16.py file

RepnikovPavel/FurnitureDetection
