Difference between SSL and PPG-based methods? #6

Kristopher-Chen · 2022-09-15T08:27:13Z

Hi, I really appreciate your work; the demo sounds great.
I have also read papers on voice conversion (VC) based on phonetic posteriorgrams (PPGs), which use an ASR model to extract the PPGs. I am wondering how the SSL-based and PPG-based methods differ, since both seem to extract linguistic information. Have you compared them?
Thank you!

bshall · 2022-09-19T14:09:21Z

Hi @Kristopher-Chen, thanks for the feedback!

There are definite similarities between PPGs and the soft speech units we proposed. The main difference is that soft units don't require text transcriptions to train, which can be useful for building VC systems in languages without large corpora of annotated speech. Additionally, non-linguistic sounds such as laughter and breathing may be captured better by soft units than by PPGs. Unfortunately, I haven't compared the two approaches directly yet. I think it would be a useful benchmark, but I haven't had the chance to look into it.
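
To make the similarity concrete, here is a toy sketch (not taken from the soft-vc code, and heavily simplified). It assumes both representations can be viewed as a per-frame distribution over a set of classes, and that the only difference is where the classes come from: phoneme labels derived from transcripts for PPGs, versus cluster centroids found in unlabeled audio for soft units. The feature vectors, class vectors, `frame_posterior` helper, and temperature value are all made up for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def frame_posterior(frame, class_vectors, temperature=0.1):
    # Hypothetical helper: distribution over classes for one frame,
    # computed as a softmax of temperature-scaled cosine similarities.
    return softmax([cosine(frame, c) / temperature for c in class_vectors])

# Toy 2-D "acoustic features" for three frames (illustrative values only).
frames = [[1.0, 0.1], [0.6, 0.6], [0.1, 1.0]]

# PPG-style classes: phonemes, which need transcribed speech to define and train.
phoneme_vectors = {"AH": [1.0, 0.0], "S": [0.0, 1.0]}

# Soft-unit-style classes: cluster centroids, which need only unlabeled audio.
unit_centroids = {0: [0.9, 0.2], 1: [0.2, 0.9]}

ppg = [frame_posterior(f, list(phoneme_vectors.values())) for f in frames]
soft = [frame_posterior(f, list(unit_centroids.values())) for f in frames]

for name, rows in (("PPG-style", ppg), ("soft-unit-style", soft)):
    print(name, [[round(p, 3) for p in row] for row in rows])
```

In both cases the output is one probability vector per frame. A sound with no phoneme label, such as a breath, has no natural class in the first setup, but can still get its own cluster in the second.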
