The script transcribes audio files in a given directory, calculates WER and generates an HTML report containing diffs, WER etc.
- It supports only Python3. First install Python3 if not already installed.
- Install virtualenv if not already installed. Run
python3 -m pip install --user virtualenv - Create a virtual environment for the script. Navigate to the cloned directory and run
python3 -m virtualenv --python=python3 venv. - Install Python package dependencies.
python3 -m pip install -r requirements.txtNow you are ready to run the script.
Before running the script, activate the created virtualenv using source venv/bin/activate (assuming you are in the repository directory where you cloned it).
- input-dir: The directory where your audio files and their corresponding ground truth transcripts live. Audio files and transcripts must use same filename. For example, audio1.wav file's transcript name must be audio1.txt
- audio-extension: wav or mp3, default - wav
- api-endpoint: The API endpoint URL where the ASR API is deployed. default given.
Run python evaluate_asr.py --input-dir <path-to-your-testset-directory> for executing the script.
The script will save the predicted transcript text files in input-dir with <audio_file_name>-predicted.txt.
It will output an HTML file outside of input-dir with WER report and diffs between ground truth transcripts and predicted transcripts.