Multi language training #2214
-
Maybe you can use the "multilingual_cleaners" cleaner with your multi-language dataset?
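For what it's worth, a minimal sketch of what that could look like in a Coqui TTS config (assuming the `VitsConfig` / `text_cleaner` fields; adapt to your actual model config):

```python
# Hedged sketch: select the shared multilingual cleaner in the config
# instead of a single-language one. Field names follow the Coqui TTS
# config API; everything else here is an assumption.
from TTS.tts.configs.vits_config import VitsConfig

config = VitsConfig(
    text_cleaner="multilingual_cleaners",  # one cleaner for all languages
    use_phonemes=True,
)
```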
-
Looking forward to hearing how it evolves over time once you reach more steps. By the way, the answer from Edresson you linked is very interesting: I've always wondered how all those losses should be interpreted! You also wrote that you trained for nearly 1M steps. How long did that take, and what are your hardware and batch size? With an i5 2400 (~10 years old) / 16 GB RAM / RTX 3090 I reach ~80k steps per 24 h (batch size 32), so I'd have to wait ~13 days to hopefully reach the same quality as yours (the dataset is my own, with ~1500 samples).
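For reference, the 13-day figure is just target steps divided by throughput; a trivial sketch (nothing Coqui-specific):

```python
# ETA estimate from the numbers above: ~1M target steps at ~80k steps/day.
target_steps = 1_000_000
steps_per_day = 80_000  # measured throughput at batch size 32
print(f"~{target_steps / steps_per_day:.1f} days")  # -> ~12.5 days
```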
-
FWIW, on my machine it's closer to 90k steps per 24 h, as I've just measured it accurately on TensorBoard. So yeah, you've got room for improvement!
-
Can you incrementally add languages or speakers without losing the older languages?
-
No idea what I did wrong, but now it is working. I just used another checkpoint and then recomputed the phoneme cache dir. Now the German output is not corrupted anymore :) Thanks again @Ca-ressemble-a-du-fake for the support.
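In case it helps anyone hitting the same corruption: recomputing the cache just means deleting the directory so the trainer rebuilds it on the next run. A minimal sketch, assuming the path matches your config's `phoneme_cache_path` (the path below is hypothetical):

```python
import shutil
from pathlib import Path

# Hypothetical location; use the phoneme_cache_path from your own config.
phoneme_cache = Path("output/phoneme_cache")
if phoneme_cache.exists():
    shutil.rmtree(phoneme_cache)  # trainer re-phonemizes everything on the next run
```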
-
Hi @Bebaam, did you train with both German and English data, or did you just continue training on German data from an English checkpoint?
-
Hey everybody,
I am currently trying to train a multi-speaker, multi-language model with phonemes. I've read before that I can start with one language and add new languages over time by training explicitly on the corresponding dataset (I searched a lot but unfortunately can't find that thread, only #1859 (comment)). This would be perfect for me, because I could then set the proper phoneme language and a language-dependent cleaner for each stage.
However, after training the model for close to 1 million steps on German (with the Thorsten voice and one other), where the quality is OK, adding English with the LJSpeech dataset, using the English espeak phonemizer and an English cleaner, immediately destroys both the ability to speak German and the German speakers in general (passing any of the German speakers as 'speaker_idx' at inference makes no difference). Do I need to manually add something to force the model to keep the speaker/language information?
As a possible alternative, if the method above does not work, maybe it is better to train on the whole dataset, containing both German and English, from the start? As erogol pointed out in #1590 (comment), the phonemizer would be ready for that. But then I wouldn't be able to pick the cleaner per language, would I? (A rough sketch of this route follows below.)
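To make the alternative concrete, here is a hedged sketch of a combined German + English setup (field names per the Coqui TTS config API as I understand it; paths, formatters, and dimensions are assumptions, not a confirmed recipe):

```python
from TTS.config.shared_configs import BaseDatasetConfig
from TTS.tts.configs.vits_config import VitsConfig

# One dataset config per language; `language` tells the phonemizer
# which phoneme set to use for each dataset.
german = BaseDatasetConfig(formatter="thorsten", path="data/thorsten-de", language="de")
english = BaseDatasetConfig(formatter="ljspeech", path="data/LJSpeech-1.1", language="en")

config = VitsConfig(
    datasets=[german, english],
    text_cleaner="multilingual_cleaners",  # shared cleaner instead of per-language ones
    use_phonemes=True,
    phonemizer="espeak",
)
# Condition the model on language so German and English don't collide.
config.model_args.use_language_embedding = True
config.model_args.embedded_language_dim = 4
```

The trade-off in the question is real: a single shared cleaner means giving up per-language cleaning rules, but the per-dataset `language` field should still keep phonemization language-aware.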
Thanks in advance for any help and insights.