22
33**aeneas** is a Python library and a set of tools to automagically synchronize audio and text.
44
5- * Version: 1.1.2
6- * Date: 2015-09-24
5+ * Version: 1.2.0
6+ * Date: 2015-09-27
77* Developed by: [ReadBeyond](http://www.readbeyond.it/)
88* Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
99* License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -17,7 +17,7 @@ and an audio file containing the narration of the (same) text.
1717
1818For example, given [this text file](aeneas/tests/res/container/job/assets/p001.xhtml)
1919and [this audio file](aeneas/tests/res/container/job/assets/p001.mp3),
20- **aeneas** computes the following map:
20+ **aeneas** computes the following abstract map:
2121
2222```
2323[00:00:00.000, 00:00:02.680] <=> 1
@@ -37,28 +37,28 @@ and [this audio file](aeneas/tests/res/container/job/assets/p001.mp3),
3737[00:00:48.000, 00:00:53.280] <=> To eat the world's due, by the grave and thee.
3838```
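
As a rough illustration of how the abstract map above could be consumed, the following Python sketch parses one line of that listing into begin/end times (in seconds) and the fragment text. The helper names `to_seconds` and `parse_map_line` are hypothetical and are not part of **aeneas**; the bracketed notation is only the illustrative format shown above, not one of the output file formats.

```python
import re

# One line of the abstract map, e.g.:
# "[00:00:00.000, 00:00:02.680] <=> 1"
LINE_RE = re.compile(
    r"^\[(\d+):(\d{2}):(\d{2}\.\d{3}), (\d+):(\d{2}):(\d{2}\.\d{3})\] <=> (.*)$"
)


def to_seconds(hours, minutes, seconds):
    """Convert HH, MM, SS.mmm string fields to a number of seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_map_line(line):
    """Return (begin, end, text) for one line of the abstract map.

    Hypothetical helper for illustration only (not an aeneas API).
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("not a sync map line: %r" % line)
    h1, m1, s1, h2, m2, s2, text = match.groups()
    return to_seconds(h1, m1, s1), to_seconds(h2, m2, s2), text


if __name__ == "__main__":
    print(parse_map_line("[00:00:00.000, 00:00:02.680] <=> 1"))
```

For real processing, one of the structured output formats listed below is likely a better starting point than this notation.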
3939
40- Moreover, the map can be output in several formats: SMIL for EPUB 3,
41- SRT/TTML/VTT for closed captioning, JS for Web usage,
40+ The map can be output to file in several formats: SMIL for EPUB 3,
41+ SRT/TTML/VTT for closed captioning, JSON/RBSE for Web usage,
4242or raw CSV/SSV/TSV/TXT/XML for further processing.
4343
4444
4545## System Requirements, Supported Platforms and Installation
4646
4747### System Requirements
4848
49- 1. 2 GB RAM (4 GB recommended), 2 GHz CPU (3 GHz 64bit recommended)
50- 2. `ffmpeg` and `ffprobe` executable available in your `$PATH` (`apt-get install ffmpeg*` from [`deb-multimedia`](http://www.deb-multimedia.org/))
51- 3. `espeak` executable available in your `$PATH` (`apt-get install espeak*`)
49+ 1. a reasonably recent machine (recommended 4 GB RAM, 2 GHz 64bit CPU)
50+ 2. `ffmpeg` and `ffprobe` executables available in your `$PATH`
51+ 3. `espeak` executable available in your `$PATH`
52524. Python 2.7.x
53- 5. Python optional modules `BeautifulSoup`, `lxml`, `numpy`, and `scikits.audiolab` (`pip install ...`)
54- 6. (Optional but strongly suggested) Python C headers to compile the Python C extensions (`apt-get install python-dev`)
53+ 5. Python modules `BeautifulSoup`, `lxml`, `numpy`, and `scikits.audiolab`
54+ 6. (Optional but strongly suggested) Python C headers to compile the Python C extensions
5555
5656Depending on the format(s) of audio files you work with,
5757you might need to install additional audio codecs for `ffmpeg`.
5858Similarly, you might need to install additional voices
5959for `espeak`, depending on the language(s) you work on.
6060(Installing _all_ the codecs and _all_ the voices available
61- in the Debian repository might be a good idea.)
61+ might be a good idea.)
6262
6363If installing the above dependencies proves difficult on your OS,
6464consider using the [Vagrant box](http://www.vagrantup.com)
@@ -68,87 +68,92 @@ created by [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
6868
6969**aeneas** has been developed and tested on **Debian 64bit**,
7070which is the **only supported OS** at the moment.
71- Other Linux distributions should be good too.
7271
73- However, it should work on Mac OS X and Windows as well,
74- once you make sure `ffmpeg`, `ffprobe` and `espeak`
72+ However, **aeneas** has been confirmed to work
73+ on other Linux distributions (Ubuntu, Slackware),
74+ on Mac OS X (with developer tools installed) and on Windows Vista/7/8.1/10.
75+
76+ Whatever your OS is, make sure
77+ `ffmpeg`, `ffprobe` (which is part of the `ffmpeg` distribution), and `espeak`
7578are properly installed and
7679callable by the `subprocess` Python module.
7780A way to ensure the latter consists
78- in adding the three executables to your `$PATH`.
79- Alternatively, you can use VirtualBox
81+ in adding these three executables to your `$PATH`.
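
The `$PATH` requirement above can be checked with a short script. The sketch below is a minimal, hypothetical helper (it is not the repository's `check_dependencies.py`, and the function names are assumptions): it looks for each executable in the directories listed in `$PATH`. It does not handle Windows `.exe` suffixes, so treat it as a starting point for Linux and Mac OS X only.

```python
import os


def find_executable(name, path=None):
    """Return the full path of `name` if found in `path`, else None.

    `path` defaults to the PATH environment variable.
    Hypothetical helper for illustration only.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names=("ffmpeg", "ffprobe", "espeak"), path=None):
    """Return the names of the tools that were not found in `path`."""
    return [name for name in names if find_executable(name, path) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found in PATH: " + ", ".join(missing))
    else:
        print("ffmpeg, ffprobe and espeak were all found in PATH")
```

Finding a file in `$PATH` does not prove that the tool runs correctly or supports the codecs and voices you need, so this only covers the first half of the requirement.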
82+
83+ If installing **aeneas** natively on your OS proves difficult,
84+ you can use VirtualBox and [Vagrant](http://www.vagrantup.com)
8085to run **aeneas** inside a virtualized Debian image,
81- for example using [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
86+ using [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
8287
8388### Installation
8489
85- ``` bash
86- $ git clone https://github.com/readbeyond/aeneas.git
87- $ cd aeneas
88- $ pip install -r requirements.txt
89- $ python setup.py build_ext --inplace
90- $ python check_dependencies.py
91- ```
90+ #### Linux and Mac OS X
9291
93- If the last command prints a success message,
94- you have all the required dependencies installed
95- and you can confidently run **aeneas** in production.
96-
97- If you are a user of a `deb`-based Linux distribution
98- (e.g., Debian, Ubuntu),
92+ 1. If you are a user of a `deb`-based Linux distribution
93+ (e.g., Debian or Ubuntu),
9994you can install all the dependencies by running
10095[the provided `install_dependencies.sh` script](install_dependencies.sh)
10196
102- ``` bash
103- $ sudo bash install_dependencies.sh
104- ```
97+ ```bash
98+ $ sudo bash install_dependencies.sh
99+ ```
100+
101+ 2. If you have another Linux distribution or Mac OS X,
102+ just make sure you have
103+ `ffmpeg`, `ffprobe` (part of the `ffmpeg` package),
104+ and `espeak` installed and available on your command line.
105+ You also need Python 2.x and its "developer" package
106+ containing the C headers.
107+
108+ 3. Run the following commands:
109+
110+ ``` bash
111+ $ git clone https://github.com/readbeyond/aeneas.git
112+ $ cd aeneas
113+ $ pip install -r requirements.txt
114+ $ python setup.py build_ext --inplace
115+ $ python check_dependencies.py
116+ ```
105117
106- Then, run `python setup.py build_ext --inplace` and `python check_dependencies.py` as above.
118+ If the last command prints a success message,
119+ you have all the required dependencies installed
120+ and you can confidently run **aeneas** in production.
107121
108- If you are a Windows user, please read the installation instructions
122+ #### Windows
123+
124+ Please read the installation instructions
109125contained in the
110- ["Using aeneas for Audio-Text Synchronization" PDF](http://software.sil.org/scriptureappbuilder/resources/)
126+ ["Using aeneas for Audio-Text Synchronization" PDF](http://software.sil.org/scriptureappbuilder/resources/),
111127based on
112128[these directions](https://groups.google.com/d/msg/aeneas-forced-alignment/p9cb1FA0X0I/8phzUgIqBAAJ),
113129written by Richard Margetts.
114130
115- If installing natively proves difficult on your OS,
116- consider using the [Vagrant box](http://www.vagrantup.com)
117- created by [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
118-
119131
120132## Usage
121133
122- 1. Clone this GitHub repo:
134+ 1. Install `aeneas` as described above. (Only the first time!)
123135
124- ```bash
125- $ git clone https://github.com/readbeyond/aeneas.git
126- ```
136+ 2. Open a command prompt/shell/terminal and go to the root directory
137+ of the aeneas repository, that is, the one containing this `README.md` file.
127138
128- 2. Enter the root directory:
139+ 3. To compute a synchronization map `map.json` for a pair
140+ (`audio.mp3`, `text.txt` in `plain` format), you can run:
129141
130142 ```bash
131- $ cd aeneas
143+ $ python -m aeneas.tools.execute_task audio.mp3 text.txt "task_language=en|os_task_file_format=json|is_text_type=plain" map.json
132144 ```
133145
134- 3. (Optional, but strongly suggested) Compile the Python C extensions:
135-
136- ```bash
137- $ python setup.py build_ext --inplace
138- ```
146+ The third parameter (the _configuration string_) can specify several parameters/options.
147+ See the [documentation](http://www.readbeyond.it/aeneas/docs/) for details.
139148
140- 4. To compute a SMIL synchronization map `map.smil` for a pair
141- (`audio.mp3`, `text.txt`), you can run:
149+ 4. To compute a synchronization map `map.smil` for a pair
150+ (`audio.mp3`, `page.xhtml` containing fragments marked by `id` attributes like `f001`),
151+ you can run:
142152
143153 ```bash
144- $ python -m aeneas.tools.execute_task audio.mp3 text.txt config_string map.smil
154+ $ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
145155 ```
146156
147- `config_string` is string containing all the
148- parameters to parse `text.txt` correctly and to
149- format `map.smil` as desired.
150- See the [documentation](http://www.readbeyond.it/aeneas/docs/) for details.
151-
1521575. If you have several tasks to run,
153158you can create a job container and a configuration file,
154159and run them all at once:
@@ -163,8 +168,8 @@ and run them all at once:
163168 and format the output sync map files.
164169 See the [documentation](http://www.readbeyond.it/aeneas/docs/) for details.
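
The configuration strings in the examples above are `key=value` pairs joined by `|`. When scripting many tasks, it may be convenient to build them programmatically. The sketch below is one possible way to do that; `build_config_string` and `build_command` are hypothetical helper names (not part of **aeneas**), and the sketch assumes the interpreter running it is the one where **aeneas** is installed.

```python
import sys


def build_config_string(options):
    """Join (key, value) pairs as "key=value|key=value|...".

    Hypothetical helper: it mirrors the format used in the examples
    above and does no validation or escaping of keys and values.
    """
    return "|".join("%s=%s" % (key, value) for key, value in options)


def build_command(audio, text, options, output):
    """Return the argument list for one execute_task invocation."""
    return [
        sys.executable, "-m", "aeneas.tools.execute_task",
        audio, text, build_config_string(options), output,
    ]


if __name__ == "__main__":
    command = build_command(
        "audio.mp3",
        "text.txt",
        [
            ("task_language", "en"),
            ("os_task_file_format", "json"),
            ("is_text_type", "plain"),
        ],
        "map.json",
    )
    # Only print the command here; pass the list to
    # subprocess.check_call(command) to actually run it.
    print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting issues with the `|` characters in the configuration string.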
165170
166- You might want to run the above modules without arguments
167- to get their manual:
171+ You might want to run `execute_task` or `execute_job`
172+ without arguments to get a usage message and some examples:
168173
169174` ` ` bash
170175$ python -m aeneas.tools.execute_task
@@ -202,20 +207,20 @@ Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.read
202207* Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
204209* Input audio file formats: all those supported by `ffmpeg`
205210* Batch processing
205- * Output sync map formats: CSV, JS, SMIL, TSV, TTML, TXT, VTT, XML
206- * Supported (= tested) languages: BG, CA, CY, DA, DE, EL, EN, ES, ET, FI, FR, GA, GRC, HR, HU, IS, IT, LA, LT, LV, NL, NO, RO, RU, PL, PT, SK, SR, SV, TR, UK
210+ * Output sync map formats: CSV, JSON, SMIL, SSV, TSV, TTML, TXT, VTT, XML
211+ * Tested languages: BG, CA, CY, DA, DE, EL, EN, ES, ET, FA, FI, FR, GA, GRC, HR, HU, IS, IT, LA, LT, LV, NL, NO, RO, RU, PL, PT, SK, SR, SV, SW, TR, UK
207212* Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
208213* Code suitable for a Web app deployment (e.g., on-demand AWS instances)
209214* Adjustable splitting times, including a max character/second constraint for CC applications
215+ * Automated detection of audio head/tail
210216* MFCC and DTW computed as Python C extensions to reduce the processing time
211217
212218
213219## Limitations and Missing Features
214220
215221* Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
216222* Audio is assumed to be spoken: not suitable/YMMV for song captioning
217- * DTW computation is memory hungry
218- * No protection against memory thrashing
223+ * No protection against memory thrashing if you feed extremely long audio files
219224
220225
221226## TODO List
@@ -228,7 +233,6 @@ Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.read
228233* Improving (removing?) dependency on `espeak`, `ffmpeg`, `ffprobe` executables
229234* Multilevel sync map granularity (e.g., multilevel SMIL output)
230235* Supporting input text encodings other than UTF-8
231- * Adding (i.e., testing) more languages
232236* Better documentation
233237* Testing other approaches, like HMM
234238* Publishing the package on PyPI
@@ -292,6 +296,8 @@ No copy rights were harmed in the making of this project.
292296
293297* **August 2015**: [Michele Gianella](https://plus.google.com/+michelegianella/about) partially sponsored the port of the MFCC/DTW code to C (v1.1.0)
294298
299+ * **September 2015**: friends in West Africa partially sponsored the development of the head/tail detection code (v1.2.0)
300+
295301### Supporting
296302
297303Would you like to support the development of **aeneas**?
311317
312318If you are able to contribute code directly,
313319that's great!
314- Feel free to open a pull request,
315- we will be glad to have a look at it.
320+
321+ Please do not work on the `master` branch.
322+ Instead, please create a new branch,
323+ and open a pull request from there.
324+ I will be glad to have a look at it!
316325
317326Please make your code consistent with
318327the existing code base style
@@ -366,6 +375,9 @@ and a Web application
366375**August 2015**: release of v1.1.0, including Python C extensions
367376to speed the computation of audio/text alignment up
368377
378+ **September 2015**: release of v1.2.0,
379+ including code to automatically detect the audio head/tail
380+
369381## Acknowledgments
370382
371383Many thanks to **Nicola Montecchio**,