Inspired by the famous example of the MNIST public database (60,000 labelled images of hand-written digits), we acknowledge the need for a well-known and representative data set to support the development of applications in the specific domain of Optical Music Recognition (OMR). Such a data set would provide:
- OMR samples for the training and testing of symbol classifiers
- Ground-truth material for the evaluation or comparison of OMR engines
Ultimately, once data structuring and content are sufficiently validated, we think this reference should preferably be hosted by the International Music Score Library Project (IMSLP).
Meanwhile, the purpose of this omr-dataset GitHub repository is to gather the material used to build preliminary versions of the target reference.
This project is built with the Gradle tool and can be driven from an IDE or from the command line.
[Note: noise-addition tools are not yet included in this Gradle build.]
From the command line, perform a full rebuild with:

```
gradle clean build
```
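If the repository includes the Gradle wrapper (an assumption to verify: look for a gradlew script at the repository root), the same build can be run without a local Gradle installation:

```
./gradlew clean build      # Linux/macOS
gradlew.bat clean build    # Windows
```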
To display the usage rules only, use:

```
gradle run
```
This will display:

```
Syntax:
   [OPTIONS] -- [INPUT_FILES]
@file:
   Content to be extended in line

Options:
 -clean              : Cleans up output
 -controls           : Generates control images
 -features           : Generates .csv and .dat files
 -help               : Displays general help then stops
 -mistakes           : Saves mistake images
 -model <.zip file>  : Defines path to model
 -names              : Prints all possible symbol names
 -nones              : Generates none symbols
 -output <folder>    : Defines output directory
 -subimages          : Generates subimages
 -training           : Trains classifier on features

Input file extensions:
 .xml: annotations file
```
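The @file entry above indicates that arguments can also be read from a text file (following the usual args4j convention; how this composes with the -PcmdLineArgs property is an assumption to verify). For instance, with a hypothetical file args.txt listing one argument per line, a run could be shortened to:

```
gradle run -PcmdLineArgs="@args.txt"
```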
To clean up output, use:

```
gradle run -PcmdLineArgs="-output,data/output,-clean"
```
To generate features, with all options enabled, using input from data/input-images, use:

```
gradle run -PcmdLineArgs="-output,data/output,-features,-nones,-controls,-subimages,--,data/input-images"
```
To launch training on the generated features, saving mistaken images and targeting a specific model file, use:

```
gradle run -PcmdLineArgs="-output,data/output,-training,-mistakes,-model,data/patch-classifier.zip"
```
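Putting it all together, the whole toy-example workflow is simply the three commands above run in sequence:

```
gradle run -PcmdLineArgs="-output,data/output,-clean"
gradle run -PcmdLineArgs="-output,data/output,-features,-nones,-controls,-subimages,--,data/input-images"
gradle run -PcmdLineArgs="-output,data/output,-training,-mistakes,-model,data/patch-classifier.zip"
```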
Remark: the training task takes about 15 minutes when run on the toy example in the data/input-images folder.
To monitor the neural network during training, open a browser at http://localhost:9000.
See the related wiki for more details.