README.md (+1 -1)
@@ -1,7 +1,7 @@
## Model Description
We present a large classification model trained on a manually curated real-world dataset that can be used as a new benchmark for advancing research in voice toxicity detection and classification.
We started with the original weights from the [WavLM base plus](https://arxiv.org/abs/2110.13900) and fine-tuned it with 2,374 hours of voice chat audio clips for multilabel classification. The audio clips are automatically labeled using a synthetic data pipeline
-described in [our blog post](https://research.roblox.com/tech-blog/2024/07/deploying-ml-for-voice-safety). A single output can have multiple labels.
+described in [our blog post](https://research.roblox.com/tech-blog/2024/06/deploying-ml-for-voice-safety). A single output can have multiple labels.
The model outputs an n-by-6 tensor where the inferred labels are `Profanity`, `DatingAndSexting`, `Racist`,
`Bullying`, `Other`, `NoViolation`. `Other` consists of policy violation categories with low prevalence such as drugs
and alcohol or self-harm that are combined into a single category.
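Since the README only names the six labels and the n-by-6 output shape, a minimal inference sketch may help make the multilabel behaviour concrete. It assumes the checkpoint loads through Hugging Face transformers' `WavLMForSequenceClassification` (a natural fit for a fine-tuned WavLM base plus, but not confirmed by this diff); the model id, the dummy audio, and the 0.5 threshold are placeholders, not values from the source.

```python
import numpy as np
import torch
from transformers import AutoFeatureExtractor, WavLMForSequenceClassification

LABELS = ["Profanity", "DatingAndSexting", "Racist", "Bullying", "Other", "NoViolation"]

# Placeholder model id; substitute the actual published checkpoint.
model_id = "path/to/voice-toxicity-classifier"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = WavLMForSequenceClassification.from_pretrained(model_id)
model.eval()

# One second of dummy mono audio at 16 kHz stands in for a real voice chat clip.
waveforms = [np.random.randn(16000).astype(np.float32)]

inputs = feature_extractor(waveforms, sampling_rate=16000, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (n, 6), one row per clip

# Multilabel classification: score each label independently with a sigmoid
# and keep every label above the threshold, so one clip can carry several labels.
probs = torch.sigmoid(logits)
threshold = 0.5  # illustrative cutoff, not taken from the README
predictions = [
    [label for label, p in zip(LABELS, row) if p >= threshold]
    for row in probs.tolist()
]
print(predictions)
```

Because a single clip can violate several policies at once, each of the six logits is squashed independently with a sigmoid rather than normalised with a softmax, so any subset of labels can fire for a given row of the output tensor.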