The data preparation, training, and decoding commands are almost the same as in the ../conformer_ctc experiment, except for a few new options.
Please read the code and understand the following newly added options before running this experiment:
For data preparation:
Nothing new.
For streaming_conformer_ctc/train.py:
--dynamic-chunk-training
--short-chunk-proportion
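With --dynamic-chunk-training, the model is trained on randomly sized attention chunks so that a single model can later decode at many chunk sizes. Below is a minimal sketch of the idea, not the actual implementation; the function names, the "half the utterance" cap on short chunks, and the default proportion are illustrative assumptions:

```python
import random

def sample_chunk_size(num_frames: int, short_chunk_proportion: float = 0.5) -> int:
    """With probability `short_chunk_proportion`, pick a short random chunk;
    otherwise train on the full utterance (full context, as in offline training).
    The cap of num_frames // 2 for short chunks is an illustrative choice."""
    if random.random() < short_chunk_proportion:
        return random.randint(1, max(1, num_frames // 2))
    return num_frames

def chunk_attn_mask(seq_len: int, chunk_size: int) -> list[list[bool]]:
    """True = frame i may attend to frame j: every frame sees its own chunk
    and all earlier chunks, but never a future chunk."""
    return [[(j // chunk_size) <= (i // chunk_size) for j in range(seq_len)]
            for i in range(seq_len)]
```

Chunk-wise masking like this is what lets the same self-attention layers run both in full-context mode and in streaming mode.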
For streaming_conformer_ctc/streaming_decode.py:
--chunk-size
--tailing-num-frames
--simulate-streaming
The latest results with this streaming code are shown in the following table:
| chunk size | WER on test-clean | WER on test-other |
|---|---|---|
| full | 3.53 | 8.52 |
| 40 (1.60s) | 3.78 | 9.38 |
| 32 (1.28s) | 3.82 | 9.44 |
| 24 (0.96s) | 3.95 | 9.76 |
| 16 (0.64s) | 4.06 | 9.98 |
| 8 (0.32s) | 4.30 | 10.55 |
| 4 (0.16s) | 5.88 | 12.01 |
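The seconds in parentheses appear to be the chunk size converted to audio duration. Assuming the usual conformer 4x subsampling and a 10 ms input frame shift (both are assumptions; check the encoder code), the conversion is:

```python
SUBSAMPLING = 4       # assumed conformer encoder subsampling factor
FRAME_SHIFT_S = 0.01  # assumed 10 ms per input feature frame

def chunk_duration_seconds(chunk_size: int) -> float:
    """Convert a chunk size in (subsampled) encoder frames to seconds of audio."""
    return chunk_size * SUBSAMPLING * FRAME_SHIFT_S
```

For example, a chunk size of 16 corresponds to 16 x 4 x 0.01 = 0.64 s, matching the table.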
A trained model is also provided. To download it, run:
```
git clone https://huggingface.co/GuoLiyong/streaming_conformer streaming_models

# You may want to manually check the md5sum values of the downloaded files:
# 8e633bc1de37f5ae57a2694ceee32a93  trained_streaming_conformer.pt
# 4c0aeefe26c784ec64873cc9b95420f1  L.pt
# d1f91d81005fb8ce4d65953a4a984ee7  Linv.pt
# e1c1902feb7b9fc69cd8d26e663c2608  bpe.model
# 8617e67159b0ff9118baa54f04db24cc  tokens.txt
# 72b075ab5e851005cd854e666c82c3bb  words.txt
```
If any md5sum value differs, run

```
cd streaming_models
git lfs pull
```

and check the md5sum values again.
Finally, the following files will be downloaded:

```
streaming_models/
|-- lang_bpe
|   |-- L.pt
|   |-- Linv.pt
|   |-- bpe.model
|   |-- tokens.txt
|   `-- words.txt
`-- trained_streaming_conformer.pt
```
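Instead of eyeballing md5sum output, you can verify all files at once. A small helper sketch (the expected values are copied from the comments above; the `verify` helper itself is not part of the recipe):

```python
import hashlib
import os

# Expected md5 values, copied from the README above.
EXPECTED_MD5 = {
    "trained_streaming_conformer.pt": "8e633bc1de37f5ae57a2694ceee32a93",
    "lang_bpe/L.pt": "4c0aeefe26c784ec64873cc9b95420f1",
    "lang_bpe/Linv.pt": "d1f91d81005fb8ce4d65953a4a984ee7",
    "lang_bpe/bpe.model": "e1c1902feb7b9fc69cd8d26e663c2608",
    "lang_bpe/tokens.txt": "8617e67159b0ff9118baa54f04db24cc",
    "lang_bpe/words.txt": "72b075ab5e851005cd854e666c82c3bb",
}

def md5_of(path: str) -> str:
    """md5 of a file, read in 1 MiB blocks so large checkpoints fit in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def verify(model_dir: str) -> list[str]:
    """Return the relative paths whose file is missing or has a wrong checksum."""
    return [rel for rel, want in EXPECTED_MD5.items()
            if not os.path.exists(os.path.join(model_dir, rel))
            or md5_of(os.path.join(model_dir, rel)) != want]
```

`verify("streaming_models")` returns an empty list when every file matches.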
Then run the following commands to reproduce the results in the table above:
```
trained_models=/path/to/downloaded/streaming_models/
for chunk_size in 4 8 16 24 32 40 -1; do
  ./streaming_conformer_ctc/streaming_decode.py \
    --chunk-size=${chunk_size} \
    --trained-dir=${trained_models}
done
The results of the following command are identical to the previous ones, but the model consumes features chunk by chunk, i.e. in a streaming fashion:
```
trained_models=/path/to/downloaded/streaming_models/
for chunk_size in 4 8 16 24 32 40 -1; do
  ./streaming_conformer_ctc/streaming_decode.py \
    --simulate-streaming=True \
    --chunk-size=${chunk_size} \
    --trained-dir=${trained_models}
done
```
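Conceptually, --simulate-streaming replaces one forward pass over the whole utterance with a loop over chunks, carrying cached left context between calls. A toy sketch of that loop; `encoder_step` and its cache are hypothetical stand-ins for the real encoder:

```python
def streaming_forward(features, chunk_size, encoder_step):
    """Feed features chunk by chunk. `encoder_step(chunk, cache)` must return
    (outputs_for_chunk, new_cache), where the cache carries the left context
    (e.g. cached attention keys/values) forward to the next chunk."""
    outputs, cache = [], None
    for start in range(0, len(features), chunk_size):
        out, cache = encoder_step(features[start:start + chunk_size], cache)
        outputs.extend(out)
    return outputs
```

Because training used chunk masks of matching sizes, decoding this way should give nearly the same output as the non-simulated chunked decode.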