For a beginner, combining separate models into one unified pipeline with this feature is difficult, and it is hard to know whether the result is reliable. This matters in many areas, such as phone calls and speech recordings, especially as large audio models become more popular and powerful.
We need:
- A unified TargetAwareDiarizer class that:
  - Accepts pre-registered speaker embeddings during initialization
  - Performs joint diarization + verification in a single pass
  - Outputs segments with two new attributes:
    - is_target: Boolean flag for verified speakers
    - speaker_id: Custom ID for registered speakers (e.g., "VIP_Customer")
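
To make the request concrete, here is a rough sketch of the interface I have in mind. It is only an illustration, not an existing API: the `Segment` dataclass, the `label` method, the `threshold` parameter, and the cosine-similarity matching are all my own assumptions. The diarization step itself is left out. The sketch assumes some upstream model already yields `(start, end, embedding)` turns.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class Segment:
    start: float
    end: float
    is_target: bool            # True if the turn matched a registered speaker
    speaker_id: Optional[str]  # custom ID of the matched speaker, else None


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TargetAwareDiarizer:
    """Hypothetical interface: labels diarized turns against registered speakers."""

    def __init__(self, registered: Dict[str, Sequence[float]], threshold: float = 0.7):
        # registered: custom speaker ID -> reference embedding (assumed precomputed)
        self.registered = dict(registered)
        self.threshold = threshold  # assumed value; would need tuning per model

    def label(self, turns: Sequence[Tuple[float, float, Sequence[float]]]) -> List[Segment]:
        # turns: (start, end, embedding) tuples from any upstream diarizer (assumption)
        segments = []
        for start, end, embedding in turns:
            best_id, best_score = None, self.threshold
            for speaker_id, reference in self.registered.items():
                score = cosine(embedding, reference)
                if score >= best_score:
                    best_id, best_score = speaker_id, score
            segments.append(Segment(start, end, best_id is not None, best_id))
        return segments


# Toy 2-D embeddings, purely for illustration.
diarizer = TargetAwareDiarizer({"VIP_Customer": [1.0, 0.0]})
segments = diarizer.label([(0.0, 2.5, [0.9, 0.1]), (2.5, 4.0, [0.0, 1.0])])
for seg in segments:
    print(seg)
```

In a real implementation the embeddings would come from the same speaker model used for diarization, so that verification does not need a second pass over the audio.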
I don't think this would be too hard for the development team, and I would be excited if the team could provide a reliable solution.