conformer_ctc

Introduction

Please visit https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html for how to run this recipe.

How to compute framewise alignment information

Step 1: Train a model

Please use conformer_ctc/train.py to train a model. See https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html for how to do it.

Step 2: Compute framewise alignment

Run

# Choose a checkpoint and determine the number of checkpoints to average
epoch=30
avg=15
./conformer_ctc/ali.py \
  --epoch $epoch \
  --avg $avg \
  --max-duration 500 \
  --bucketing-sampler 0 \
  --full-libri 1 \
  --exp-dir conformer_ctc/exp \
  --lang-dir data/lang_bpe_500 \
  --ali-dir data/ali_500

and you will get four files inside the folder data/ali_500:

$ ls -lh data/ali_500
total 546M
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:06 test_clean.pt
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:07 test_other.pt
-rw-r--r-- 1 kuangfangjun root 542M Sep 28 11:36 train-960.pt
-rw-r--r-- 1 kuangfangjun root 2.1M Sep 28 11:38 valid.pt
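
If you want to peek at one of these files, it is an ordinary PyTorch checkpoint. Below is a minimal inspection sketch; the structure of the saved object is an implementation detail of conformer_ctc/ali.py (roughly, a mapping from cut IDs to framewise token IDs, possibly together with metadata), so treat the assumptions in the comments as such:

#!/usr/bin/env python3
# Peek at one of the generated alignment files. We only assume the file can
# be loaded with torch.load and behaves like a dict; the exact layout is an
# implementation detail of conformer_ctc/ali.py and may differ.
import torch

ali = torch.load("data/ali_500/valid.pt", map_location="cpu")
print(type(ali))
if isinstance(ali, dict):
    # Print a few entries (metadata and/or cut-ID -> framewise token IDs).
    for key in list(ali)[:3]:
        print(key, ali[key])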

Note: It can take more than 3 hours to compute the alignment for the training dataset, which contains 960 * 3 = 2880 hours of data (the 960-hour LibriSpeech training set with 3-fold speed perturbation).

Caution: The model parameters in conformer_ctc/ali.py have to match those in conformer_ctc/train.py.

Caution: You have to set the parameter preserve_id to True for CutMix, so that the transform does not change the cut IDs; otherwise the alignments cannot be looked up by cut ID later. Search ./conformer_ctc/asr_datamodule.py for preserve_id.
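
For reference, the relevant change looks roughly like the following sketch. Only preserve_id=True matters here; cuts_musan stands in for whatever MUSAN CutSet your asr_datamodule.py already builds, and the other CutMix arguments should stay as they are in your file:

# In ./conformer_ctc/asr_datamodule.py, where the MUSAN CutMix transform is
# constructed. `cuts_musan` is the noise CutSet the data module already loads.
from lhotse.dataset import CutMix

transforms = [
    CutMix(
        cuts=cuts_musan,
        # Keep the original cut IDs so that the alignments computed by
        # conformer_ctc/ali.py can later be looked up by cut ID.
        preserve_id=True,
    )
]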

Step 3: Check your extracted alignments

There is a file test_ali.py in icefall/test that can be used to test your alignments. It uses the pre-computed alignments to modify a randomly generated nnet_output and checks that the correct transcripts can be decoded from the resulting nnet_output.

You should get something like the following if you run that script:

$ ./test/test_ali.py
['THE GOOD NATURED AUDIENCE IN PITY TO FALLEN MAJESTY SHOWED FOR ONCE GREATER DEFERENCE TO THE KING THAN TO THE MINISTER AND SUNG THE PSALM WHICH THE FORMER HAD CALLED FOR', 'THE OLD SERVANT TOLD HIM QUIETLY AS THEY CREPT BACK TO DWELL THAT THIS PASSAGE THAT LED FROM THE HUT IN THE PLEASANCE TO SHERWOOD AND THAT GEOFFREY FOR THE TIME WAS HIDING WITH THE OUTLAWS IN THE FOREST', 'FOR A WHILE SHE LAY IN HER CHAIR IN HAPPY DREAMY PLEASURE AT SUN AND BIRD AND TREE', "BUT THE ESSENCE OF LUTHER'S LECTURES IS THERE"]
['THE GOOD NATURED AUDIENCE IN PITY TO FALLEN MAJESTY SHOWED FOR ONCE GREATER DEFERENCE TO THE KING THAN TO THE MINISTER AND SUNG THE PSALM WHICH THE FORMER HAD CALLED FOR', 'THE OLD SERVANT TOLD HIM QUIETLY AS THEY CREPT BACK TO GAMEWELL THAT THIS PASSAGE WAY LED FROM THE HUT IN THE PLEASANCE TO SHERWOOD AND THAT GEOFFREY FOR THE TIME WAS HIDING WITH THE OUTLAWS IN THE FOREST', 'FOR A WHILE SHE LAY IN HER CHAIR IN HAPPY DREAMY PLEASURE AT SUN AND BIRD AND TREE', "BUT THE ESSENCE OF LUTHER'S LECTURES IS THERE"]
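
The idea behind that check can be illustrated with a small, self-contained sketch. This is not the actual test/test_ali.py; the alignment, the vocabulary size, the blank ID, and the use of simple greedy CTC decoding below are illustrative assumptions:

#!/usr/bin/env python3
# Illustration only: boost the aligned token at every frame of a random
# nnet_output and check that greedy CTC decoding recovers the alignment.
import torch

vocab_size = 500                     # assumed BPE vocabulary size
blank_id = 0                         # assumed blank token ID
ali = [5, 5, 0, 0, 7, 7, 7, 0, 3]    # a made-up framewise alignment

nnet_output = torch.rand(len(ali), vocab_size).log_softmax(dim=-1)
for t, token in enumerate(ali):
    nnet_output[t, token] += 10.0    # make the aligned token dominate this frame

# Greedy CTC decoding: argmax per frame, merge repeats, drop blanks.
best = nnet_output.argmax(dim=-1).tolist()
hyp = [t for i, t in enumerate(best) if t != blank_id and (i == 0 or t != best[i - 1])]
assert hyp == [5, 7, 3], hyp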

Step 4: Use your alignments in training

Please refer to conformer_mmi/train.py for usage; a rough sketch follows the list below. Some useful functions are:

  • load_alignments(): loads the alignments saved by conformer_ctc/ali.py
  • convert_alignments_to_tensor(): converts the loaded alignments to PyTorch tensors
  • lookup_alignments(): returns the alignments of a batch of utterances, looked up by their cut IDs
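
A rough sketch of how these pieces might fit together is shown below. The import path, signatures, return values, and variable names are assumptions made for illustration; conformer_mmi/train.py is the authoritative reference:

# Rough sketch only: signatures and return values below are assumptions.
# convert_alignments_to_tensor() and lookup_alignments() are defined in
# conformer_mmi/train.py itself, where this code would live.
import torch

from icefall.utils import load_alignments  # assumed location of the helper

device = torch.device("cuda", 0)

# 1. Load the framewise alignments produced by conformer_ctc/ali.py.
subsampling_factor, alignments = load_alignments("data/ali_500/train-960.pt")

# 2. Convert the loaded alignments to PyTorch tensors once, before training.
alignments = convert_alignments_to_tensor(alignments, device=device)

# 3. Inside the training loop: look up the alignments of the cuts in the
#    current batch by their IDs (this is why CutMix must preserve cut IDs).
#    `batch` comes from the dataloader and `num_frames` from the model output.
cut_ids = [cut.id for cut in batch["supervisions"]["cut"]]
ali = lookup_alignments(cut_ids=cut_ids, alignments=alignments, num_frames=num_frames)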