Timeline

Panagiotis Antoniadis edited this page Aug 25, 2019 · 1 revision

Student application work (Mar 25 - Apr 09)

Get familiar with all the concepts of the project, read documentation and think about possible extensions. Some useful links follow:
- Sphinx
- SRILM
- spaCy
- scikit-learn
- Angular
- Flask
Community Bonding Period (May 6 - May 26) ✔️
- Get to know my mentors better and discuss the project more extensively.
- Implementation of the baseline part of the ASR system, based on the default acoustic and language models, which can be found here.
- Search for speech datasets (recordings along with their transcriptions) and organize them into the standard Sphinx format.
- Implementation of a domain-specific and a merged (specific + default) language model for each dataset.
- Evaluation of these models on both datasets. The results, obtained with the default acoustic model and dictionary, are here.
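
The evaluation step can be illustrated with a minimal sketch. It assumes the models are compared by word error rate (WER), the usual metric for ASR output; the page itself does not name the metric, and the function name `word_error_rate` is hypothetical rather than taken from the project code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("the" -> "an") and one deletion ("now") over 4 words.
    print(word_error_rate("send the email now", "send an email"))
```

A lower value means the transcription is closer to the reference, so the default, domain-specific, and merged language models could be ranked by averaging this score over each dataset.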
Phase 1 (May 27 - Jun 28) ✔️
- Extension of the default dictionary using Phonetisaurus.
- Adaptation of the acoustic model to all datasets and evaluation.
- Implementation of a system that extracts the emails from a user's account.
- Email classification.
- Email clustering.
- Create a personal email dataset for evaluation.
Phase 2 (Jun 29 - Jul 26) ✔️
- Improve email clustering.
- Implementation of domain-specific language models.
- Evaluation of all implemented models on my mentor's real email speech dataset.
- Implementation of the error detector of the correction system.
Phase 3 (Jul 27 - Aug 26) ✔️
- Implementation of the error corrector of the correction system.
- Implementation of the Flask API.
- Implementation of the Angular UI.