Thoughts on auxiliary audio losses using V-Diffusion #54
brentspell started this conversation in Ideas (0 replies)
First, thanks for creating this repo; it's a great resource for audio ML.
The Moûsai paper hints at additional perceptual losses in its Future Work section. I'm curious whether these would be possible in the V-Diffusion framework, since the denoiser predicts the "velocity" term rather than the clean audio directly. Do you know of a transformation that could be applied to the model outputs at training time, so they can be compared against the ground truth with an additional criterion?
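For context, one candidate transformation (not confirmed as the approach this repo would take) is the standard v-parameterization identity from Salimans & Ho's progressive-distillation paper: with x_t = α_t·x_0 + σ_t·ε and v = α_t·ε − σ_t·x_0, the clean-signal estimate follows as x̂_0 = α_t·x_t − σ_t·v̂ whenever α_t² + σ_t² = 1. A minimal numpy sketch, using a hypothetical cosine schedule and synthetic data in place of a real model:

```python
import numpy as np

def v_to_x0(x_t, v_pred, alpha_t, sigma_t):
    """Recover the clean-signal estimate from a v-prediction.

    Under the v-parameterization:
        x_t = alpha_t * x_0 + sigma_t * eps
        v   = alpha_t * eps - sigma_t * x_0
    which (given alpha_t**2 + sigma_t**2 == 1) implies:
        x_0 = alpha_t * x_t - sigma_t * v
    """
    return alpha_t * x_t - sigma_t * v_pred

# Sanity check with synthetic data: build x_t and the exact v from a
# known x_0 and noise sample, then verify the reconstruction.
rng = np.random.default_rng(0)
x0 = rng.standard_normal(16)   # stand-in for a clean audio frame
eps = rng.standard_normal(16)  # Gaussian noise
t = 0.3                        # arbitrary diffusion time in [0, 1]
alpha, sigma = np.cos(t * np.pi / 2), np.sin(t * np.pi / 2)

x_t = alpha * x0 + sigma * eps
v_true = alpha * eps - sigma * x0

x0_hat = v_to_x0(x_t, v_true, alpha, sigma)
assert np.allclose(x0_hat, x0)

# At training time, an auxiliary criterion (e.g. L1 or a spectral loss)
# could then be computed between x0_hat and the ground-truth audio.
aux_loss = np.abs(x0_hat - x0).mean()
```

With a real model the recovered x̂_0 would only be an estimate, so the auxiliary loss would be noisy at high noise levels; weighting it by the signal-to-noise ratio at each timestep might be one way to handle that.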