Difference between SSL and PPG-based methods? #6

Open
Kristopher-Chen opened this issue Sep 15, 2022 · 1 comment

Comments

@Kristopher-Chen

Hi, I really appreciate your work; the demo sounds great.
I have also read papers on PPG-based VC, which use an ASR model to extract PPGs (phonetic posteriorgrams). I am curious about the difference between SSL-based and PPG-based methods, since both seem to extract linguistic information. Have you ever compared them?
Thank you!

@bshall
Owner

bshall commented Sep 19, 2022

Hi @Kristopher-Chen, thanks for the feedback!

There are some definite similarities between PPGs and the Soft Speech Units we proposed. The main difference is that soft units don't require text transcriptions to train. This can be useful for training VC systems in languages without large corpora of annotated speech. Additionally, things like laughter, breathing, etc. may be captured better by soft units than PPGs. Unfortunately, I haven't compared the approaches directly yet. I think it would be a useful benchmark but haven't had the chance to look into it.
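To make the "no transcriptions" point concrete, here is a rough toy sketch (not code from this repository). It assumes frame-level features are already available and uses random vectors as a stand-in for them. The helper names (`kmeans`, `nearest`, `softmax`) are made up for illustration, and the negative distances to the centroids only stand in for the logits of a trained classifier. The idea it shows: soft-unit training targets can be obtained by clustering audio features alone, whereas PPG targets would need phoneme labels from transcribed speech. In both cases the result is one distribution per frame.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(frame, centroids):
    """Index of the centroid closest to this frame."""
    return min(range(len(centroids)), key=lambda i: sq_dist(frame, centroids[i]))


def kmeans(frames, k, iters=10, seed=0):
    """Tiny k-means; it only looks at the features, never at text."""
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(frames, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for f in frames:
            buckets[nearest(f, centroids)].append(f)
        for i, b in enumerate(buckets):
            if b:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(b) for col in zip(*b)]
    return centroids


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy stand-in for frame-level features from a self-supervised speech model.
rng = random.Random(1)
frames = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]

# Soft-unit route: per-frame targets come from clustering the audio alone.
centroids = kmeans(frames, k=8)
unit_targets = [nearest(f, centroids) for f in frames]

# PPG route (not runnable here): per-frame targets would instead be phoneme
# labels derived from transcribed speech, which is exactly the annotation
# requirement that soft units avoid.

# Either way, the predictor outputs a distribution over classes per frame.
# Negative distances are an assumption standing in for learned logits.
posterior = softmax([-sq_dist(frames[0], c) for c in centroids])
```

The only part that differs between the two routes in this sketch is where the per-frame targets come from; the per-frame posterior has the same shape either way.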
