A TensorFlow implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
Accuracy: 93.45% on the test dataset after about 14 hours of training
[Figure: Training and Test panels; images not shown]
Note: a digit label of "10" means no digit.
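The "10 means no digit" convention can be illustrated with a small decoding helper. This is only a sketch: the names `decode_digits` and `NO_DIGIT`, and the assumption that a prediction arrives as a list of per-position integer labels, are illustrative and not taken from the repository.

```python
NO_DIGIT = 10  # label value described in the note above as "no digit"


def decode_digits(labels):
    """Turn a list of per-position labels into a number string.

    Hypothetical helper: assumes each label is an int in 0..10 and that
    NO_DIGIT marks an empty position.
    """
    digits = []
    for label in labels:
        if label == NO_DIGIT:
            continue  # empty position, nothing to emit
        if not 0 <= label <= 9:
            raise ValueError("unexpected label: %r" % (label,))
        digits.append(str(label))
    return "".join(digits)


if __name__ == "__main__":
    print(decode_digits([1, 2, NO_DIGIT, NO_DIGIT, NO_DIGIT]))  # prints 12
```

Empty positions are skipped rather than treated as a terminator, so the helper does not depend on where the padding labels appear.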
- Python 2.7
- TensorFlow
- h5py

  In Ubuntu:

      $ sudo apt-get install libhdf5-dev
      $ sudo pip install h5py
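As a quick sanity check after installing the requirements, a short script along these lines can report which packages are still not importable. It is a minimal sketch: the helper name `missing_modules` is hypothetical, and the module names `tensorflow` and `h5py` are assumed from the list above.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names assumed from the requirement list above.
    print(missing_modules(["tensorflow", "h5py"]))
```

An empty list means both packages were found. The check only confirms that the imports succeed, not that the installed versions are compatible with this code.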
- Clone the source code

      $ git clone https://github.com/potterhsu/SVHNClassifier
      $ cd SVHNClassifier

- Download the SVHN dataset (Format 1)

- Extract it to the data folder; your folder structure should now look like this:

      SVHNClassifier
          - data
              - extra
                  - 1.png
                  - 2.png
                  - ...
                  - digitStruct.mat
              - test
                  - 1.png
                  - 2.png
                  - ...
                  - digitStruct.mat
              - train
                  - 1.png
                  - 2.png
                  - ...
                  - digitStruct.mat
- (Optional) Take a look at the original images with bounding boxes

  Open `draw_bbox.ipynb` in Jupyter

- Convert to TFRecords format

      $ python convert_to_tfrecords.py --data_dir ./data

- (Optional) Test reading the TFRecords files

  Open `read_tfrecords_sample.ipynb` in Jupyter

  Open `donkey_sample.ipynb` in Jupyter
- Train

      $ python train.py --data_dir ./data --train_logdir ./logs/train
- Retrain if needed

      $ python train.py --data_dir ./data --train_logdir ./logs/train2 --restore_checkpoint ./logs/train/latest.ckpt
- Evaluate

      $ python eval.py --data_dir ./data --checkpoint_dir ./logs/train --eval_logdir ./logs/eval

- Visualize

      $ tensorboard --logdir ./logs
- (Optional) Try running inference

  Open `inference_sample.ipynb` in Jupyter

  Open `inference_outside_sample.ipynb` in Jupyter

      $ python inference.py --image /path/to/image.jpg --restore_checkpoint ./logs/train/latest.ckpt

- Clean

      $ rm -rf ./logs
      or
      $ rm -rf ./logs/train2
      or
      $ rm -rf ./logs/eval
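Before running `convert_to_tfrecords.py`, it can help to confirm that the data folder matches the layout described in the extraction step. The script below is a rough sketch, not part of the repository: the helper name `check_data_dir` is hypothetical, and the expected subfolders and file names are assumed from the folder structure shown above.

```python
import os

# Layout assumed from the folder structure shown in the setup steps.
EXPECTED_SUBSETS = ("extra", "test", "train")


def check_data_dir(data_dir):
    """Return a list of human-readable problems found under data_dir."""
    problems = []
    for subset in EXPECTED_SUBSETS:
        subset_dir = os.path.join(data_dir, subset)
        if not os.path.isdir(subset_dir):
            problems.append("missing folder: " + subset_dir)
            continue
        mat_path = os.path.join(subset_dir, "digitStruct.mat")
        if not os.path.isfile(mat_path):
            problems.append("missing file: " + mat_path)
        # At least one image is expected next to digitStruct.mat.
        has_png = any(f.endswith(".png") for f in os.listdir(subset_dir))
        if not has_png:
            problems.append("no .png images in: " + subset_dir)
    return problems


if __name__ == "__main__":
    for problem in check_data_dir("./data"):
        print(problem)
```

Run it from the `SVHNClassifier` folder; no output means the three subsets look complete at a glance. It does not validate the contents of `digitStruct.mat`.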