1- # Motivation
2-
3- ## Mapillary dataset
41
52In this project we use a set of images provided
63by [Mapillary](https://www.mapillary.com/), in order to investigate on the
@@ -21,8 +18,6 @@ There are 18000 images in the training set, 2000 images in the validation set,
2118and 5000 images in the testing set. The testing set is proposed only for a
2219model test purpose, it does not contain filtered versions of images.
2320
24- ## Shape dataset
25-
2621To complete the project, and make the test easier, a randomly-generated shape model is also
2722available. In this dataset, some simple coloured geometric shapes are inserted into each picture,
2823on a total random mode. There can be one rectangle, one circle and/or one triangle per image, or
@@ -33,170 +28,169 @@ The picture below shows an example of image generated in this way:
3328
3429![Example of shape image](./images/shape_00000.png)
3530
36- # Dependencies
37-
38- This project needs to load the following Python dependencies:
39-
40- + cv2
41- + logging
42- + matplotlib
43- + numpy
44- + pandas
45- + PIL
46- + tensorflow
47-
48- These dependencies are stored in ` requirements.txt ` located at the project root. As a remark, the
49- code has been run with Python3 (version 3.5).
50-
5131# Content
5232
53- The project contains some Python materials designed to illustrate the Tensorflow library (snippets
54- and notebooks)
33+ The project contains some Python materials designed to illustrate the Tensorflow and Keras
34+ libraries (snippets and notebooks)
5535
5636+ [article](./article) contains the original text of articles that have been published
5737 on [Oslandia blog](http://oslandia.com/en/blog/) on this topic
38+ + [deeposlandia](./deeposlandia) contains main Python modules to train and test convolutional
39+ neural networks
5840+ [images](./images) contains some example images to illustrate the Mapillary dataset as well as
5941 some preprocessing analysis results
6042+ [notebooks](./notebooks) contains some Jupyter notebooks that aim at describing data or basic
6143 neural network construction
62- + [sources](./sources) contains Python modules that train a convolutional neural network based on
63- the Mapillary street image dataset
44+ + [tests](./tests) contains some test modules to guarantee the functioning of a bunch of snippets;
45+ it uses the `pytest` framework.
6446
65- Additionally, running the code may generate extra repositories:
47+ Additionally, running the code may generate extra subdirectories in the data repository.
6648
67- + [ checkpoints] ( ./checkpoints ) refers to trained model back-ups, they are
68- organized with respect to models
69- + [ graphs] ( ./graphs ) is organized like ` checkpoints ` repository, it contains
70- ` Tensorflow ` graphs corresponding to each neural network
71- + [ chronos] ( ./chronos ) allows to store some training execution times, if wanted
49+ # Installation
7250
73- These repository are located at the data repository root.
51+ ## Requirements
7452
75- # Running the code
53+ This project needs to load the following Python dependencies:
7654
77- This project supposes that you have downloaded the Mapillary image dataset. The
78- following program calls are supposed to be made from the ` source ` repository.
55+ + cv2
56+ + keras
57+ + h5py
58+ + logging
59+ + matplotlib
60+ + numpy
61+ + pandas
62+ + PIL
63+ + tensorflow
7964
80- ## Printing Mapillary glossary
65+ Note that the code has been run with Python 3 (version 3.5). These dependencies are also listed in
66+ the `setup.py` file, and additional development dependencies are listed in
67+ `requirements-dev.txt`.
8168
82- First of all, the Mapillary glossary can be printed for information purpose
83- with the following command:
69+ ## From source
8470
8571```
86- python3 train.py -g -d mapillary -s 256 -dp ./any-data-path
72+ $ git clone https://github.com/Oslandia/deeposlandia
73+ $ cd deeposlandia
74+ $ virtualenv -p /usr/bin/python3 venv
75+ $ source venv/bin/activate
76+ (venv)$ pip install -r requirements-dev.txt
8777```
8878
89- The ` -g ` argument makes the program recover the data glossary that corresponds to the dataset
90- indicated by ` -d ` command (the program expects ` mapillary ` or ` shapes ` ). By default, the program
91- will look for the glossary in ` ../data ` repository (* i.e.* it hypothesizes that the data repository
92- is at the project root, or that a symbolic link points to it). This behavior may be changed through
93- ` -dp ` argument. By default, the image characteristics are computed starting from resized images of
94- 512 * 512 pixels, that can be modified with the ` -s ` argument.
79+ # Running the code
9580
96- As an easter-egg feature, label popularity (proportion of images where the label appears in the
97- dataset) is also printed for each label.
81+ This project assumes that you have downloaded the Mapillary image dataset.
9882
99- ## Model training
83+ ## Data preprocessing
10084
101- Then the model training itself may be undertaken:
85+ First of all, preprocessed versions of the raw Mapillary dataset have to be generated before any neural
86+ network training:
10287
10388```
104- python3 train.py -dp ../data -d mapillary -n mapcnn -s 512 -e 5
89+ python deeposlandia/datagen.py -D mapillary -s 224 -a -p ./any-data-path -t 18000 -v 2000 -T 5000
10590```
10691
107- In this example, 512* 512 images will be exploited (either after a
108- pre-processing step for ` mapillary ` dataset, or after random image generations
109- for ` shape ` dataset). A network called ` mapcnn ` will be built (` cnnmapil ` is
110- the default value). The network name is useful for checkpoints, graphs and
111- results naming. Here the training will take place for five epoches, as
112- indicated by the ` -e ` argument. One epoch refers to the scan of every training
113- image.
114-
115- Some other arguments may be parametrized for running this program:
116- + ` -a ` : aggregate labels (* e.g.* ` car ` , ` truck ` or ` caravan ` ... into a ` vehicle ` labels)
117- + ` -b ` : indicate the batch size (number of images per training batch, 20 by
118- default)
119- + ` -c ` : indicates if training time must be measured
120- + ` -do ` : percentage of dropped out neurons during training process
121- + ` -h ` : show the help message
122- + ` -it ` : number of training images (default to 18000, according to the Mapillary dataset)
123- + ` -iv ` : number of validation images (default to 200, regarding computing memory limitation, as
124- validation is done at once)
125- + ` -l ` : IDs of considered labels during training (between 1 and 65 if
126- ` mapillary ` dataset is considered)
127- + ` -ls ` : log periodicity during training (print dashboard on log each ` ss `
128- steps)
129- + ` -m ` : monitoring level on TensorBoard, either 0 (no monitoring), 1 (monitor main scalar tensor),
130- 2 (monitor all scalar tensors), or 3 (full-monitoring, including histograms and images, mainly
131- for a debugging purpose)
132- + ` -ns ` : neural network size for feature detection problem, either ` small ` (default value), or
133- ` medium ` , the former being composed of 3 convolution+pooling operation and 1 fully-connected
134- layer, whilst the latter is composed of 6 convolution+pooling operation plus 2 fully-connected
135- layers.
136- + ` -r ` : decaying learning rate components; can be one floating number (constant
137- learning rate) or three ones (starting learning rate, decay steps and decay
138- rate) if learning rate has to decay during training process
139- + ` -ss ` : back-up periodicity during training (back-up the TensorFlow model into a ` checkpoints `
140- sub-directory each ` ss ` steps)
141- + ` -t ` : training limit, measured as a number of iteration; overload the epoch
142- number if specified
143- + ` -vs ` : validation periodicity during training (run the validation phase on the whole validation
144- dataset each ` ss ` steps)
92+ The previous command generates a set of 224 * 224 images based on the Mapillary dataset. The raw
93+ dataset must be in `./any-data-path/input`. If the `-a` argument is specified, the preprocessed
94+ dataset will be stored in `./any-data-path/preprocessed/224_aggregated`, otherwise it will be
95+ stored in `./any-data-path/preprocessed/224_full`. The aggregation is applied to the dataset labels,
96+ which can be grouped in the Mapillary case (and only in the Mapillary case) to reduce their number from 65
97+ to 11.
14598
146- ## Model testing
99+ Additionally, the preprocessed dataset may contain fewer images than the raw dataset: the `-t`, `-v`
100+ and `-T` arguments refer respectively to the numbers of training, validation and testing images. The
101+ amounts given in the example correspond to the raw dataset size.
147102
148- Trained models may be tested after the training process. Once a model is trained, a checkpoint
149- structure is recorded in ` <datapath>/<dataset>/checkpoints/<network-name> ` . It is the key for
150- inference, as the model state after training is stored into it.
103+ In the Shapes dataset case, this preprocessing step generates a bunch of images from scratch.
151104
152- The model testing is done as follows:
105+ As an easter-egg feature, label popularity is also printed by this command (proportion of images
106+ where each label appears in the preprocessed dataset).
107+
108+ ## Model training
109+
110+ Then the model training itself may be undertaken:
153111
154112```
155- python3 test.py -dp ../data -d mapillary -n mapcnn_256_small -i 1000 -b 100 -ls 100
113+ python deeposlandia/train.py -M feature_detection -D mapillary -s 512 -e 5
156114```
157115
158- + ` -b ` : testing image batch size (default to 20)
159- + ` -d ` : dataset (either ` mapillary ` or ` shapes ` )
160- + ` -dp ` : data path in which the data are stored onto the computer (the dataset content is located
161- at ` <datapath>/<dataset> ` )
162- + ` -i ` : number of testing images (default to 5000, according to the Mapillary dataset)
163- + ` -ls ` : log periodicity during training (print dashboard on log each ` ss `
164- steps)
165- + ` -n ` : instance name, under the format ` <netname>_<imsize>_<netsize> ` , that allows to recover the
166- model trained with the network name ` <netname> ` , image size of ` <imsize>*<imsize> ` pixels and a
167- neural network of size ` <netsize> ` (either ` small ` or ` medium ` ).
168-
169- # TensorBoard
116+ In this example, 512 * 512 Mapillary images will be used for training a feature detection
117+ model. Here the training will take place for five epochs. An inference step is always undertaken
118+ at the end of the training.
119+
120+ Here are the parameters handled by this program:
121+ + `-a`: aggregate labels (*e.g.* `car`, `truck` or `caravan`... into a `vehicle` label); does
122+ nothing if applied to the `shapes` dataset.
123+ + `-b`: batch size (number of images per training batch, 50 by default).
124+ + `-D`: dataset (either `mapillary` or `shapes`).
125+ + `-d`: percentage of dropped out neurons during training process. Default value=1.0, no dropout.
126+ + `-e`: number of epochs (one epoch refers to the scan of every training image). Default value=0,
127+ the model is not trained, inference is done starting from the last trained model.
128+ + `-h`: show the help message.
129+ + `-ii`: number of testing images (default to 5000, according to the Mapillary dataset).
130+ + `-it`: number of training images (default to 18000, according to the Mapillary dataset).
131+ + `-iv`: number of validation images (default to 2000, according to the Mapillary dataset).
132+ + `L`: starting learning rate. Default to 0.001.
133+ + `l`: learning rate decay (according to
134+ the [Adam optimizer definition](https://keras.io/optimizers/#adam)). Default to 1e-4.
135+ + `-M`: considered research problem, either `feature_detection` (determining if some labelled
136+ objects are on an image) or `semantic_segmentation` (classifying each pixel of an image).
137+ + `-N`: neural network architecture, either `simple` (default value) or `vgg16` for the feature
138+ detection problem; `simple` is the only handled architecture for semantic segmentation.
139+ + `-n`: neural network name, used for checkpoint path naming. Default to `cnn`.
140+ + `-p`: path to datasets, on the file system. Default to `./data`.
141+ + `-s`: image size, in pixels (height = width). Default to 256.
170142
171- The model monitoring is ensured through Tensorboard usage. For more details
172- about this tool and downloading instructions, please check on the
173- corresponding [ Github project] ( https://github.com/tensorflow/tensorboard ) or
174- the
175- [ TensorFlow documentation] ( https://www.tensorflow.org/get_started/summaries_and_tensorboard ) .
143+ ## Model testing
176144
177- The network graph is created under ` <datapath>/<dataset>/graph/<network-name> ` (* e.g.*
178- ` ../data/mapillary/graph/mapcnn ` ).
145+ Trained models may be tested after the training process. Once a model is trained, a checkpoint
146+ structure is recorded in `<datapath>/<dataset>/output/<problem>/checkpoints/<instance-name>`. It is
147+ the key point for inference.
179148
180- To check the training process, a simple command must be done on your command prompt:
149+ The model testing is done as follows:
181150
182151```
183- tensorboard --port 6006 --logdir=<datapath>/<dataset>/graph/<network-name>
152+ python deeposlandia/inference.py -D shapes -i ./data/shapes/preprocessed/64_full/testing/images/shape_00000.png
184153```
185154
186- Be careful, if the path given to ` --logdir ` argument do not correspond to those created within the
187- training, the Tensorboard dashboard won't show anything. As a remark, several run can be showed at
188- the same time; in such a case, ` --logdir ` argument is composed of several path separated by commas,
189- and graph instances may be named as follows:
155+ In this example, a label prediction will be done on a single image, for the `shapes` dataset in the
156+ feature detection case. The trained model will be recovered by default in
157+ `<datapath>/<dataset>/output/<problem>/checkpoints/`, by supposing that an optimized model (*e.g.*
158+ regarding hyperparameters) has been produced. If the hyperparameters are specified (training batch
159+ size, dropout rate, starting learning rate, learning rate decay, model architecture and even model
160+ name), knowing that the image size is given by the first tested image, the trained model is
161+ recovered in `<datapath>/<dataset>/output/<problem>/checkpoints/<instance>/`, where `<instance>` is
162+ defined as:
190163
191164```
192- tensorboard --port 6006 --logdir=n1:<datapath>/<dataset>/graph/<network-name-1>,n2:<datapath>/<dataset>/graph/<network-name-2>
165+ <model_name>-<image_size>-<network_architecture>-<batch_size>-<aggregation_mode>-<dropout>-<start_lr>-<lr_decay>
193166```
194167
195- An example of visualization for scalar variables (* e.g.* loss, learning rate,
196- true positives...) is provided in the following figure:
197-
198- ![ -> tensorboard example] ( ./images/tensorboard_example.png )
168+ If no trained model can be found in the computed path, the label prediction is done from scratch
169+ (and will be rather inaccurate...).
170+
171+ The list of handled parameters is as follows:
172+ + `-a`: aggregate labels. Used to select the appropriate configuration file, so as to get the
173+ number of labels in the dataset.
174+ + `-b`: training image batch size. Default to `None` (aims at identifying trained model).
175+ + `-D`: dataset (either `mapillary` or `shapes`).
176+ + `-d`: percentage of dropped out neurons during training process. Default to `None` (aims at
177+ identifying trained model).
178+ + `-i`: path to tested images, may handle regex for multi-image selection.
179+ + `L`: starting learning rate. Default to `None` (aims at identifying trained model).
180+ + `l`: learning rate decay (according to
181+ the [Adam optimizer definition](https://keras.io/optimizers/#adam)). Default to `None` (aims at
182+ identifying trained model).
183+ + `-M`: considered research problem, either `feature_detection` (determining if some labelled
184+ objects are on an image) or `semantic_segmentation` (classifying each pixel of an image).
185+ + `-N`: trained model neural network architecture. Default to `None` (aims at identifying trained
186+ model).
187+ + `-n`: neural network name. Default to `None` (aims at identifying trained model).
188+ + `-p`: path to datasets, on the file system. Default to `./data`.
189+
190+ # License
191+
192+ The program license is described in [LICENSE.md](./LICENSE.md).
199193
200194___
201195
202- Oslandia, March 2018
196+ Oslandia, April 2018