NeMo Feature Request: Integrated Speaker Verification and Diarization for Target Speaker Identification

For a beginner, combining separate models into one unified pipeline with this feature is difficult, and the result is not reliable. Target speaker identification matters in many areas, such as phone calls and speech recordings, especially as large audio models become more popular and powerful.

We need a unified TargetAwareDiarizer class that:

- Accepts pre-registered speaker embeddings during initialization
- Performs joint diarization + verification in a single pass
- Outputs segments with two new attributes:
  - is_target: Boolean flag for verified speakers
  - speaker_id: Custom ID for registered speakers (e.g., "VIP_Customer")
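
To make the requested interface concrete, here is a minimal sketch of what such a class could look like. It is not an existing NeMo API: the Segment dataclass, the verify and label methods, the threshold parameter, and the cosine-similarity matching are all assumptions made for illustration. It also assumes that the diarization segments and one embedding per segment are produced upstream by whatever diarizer and speaker embedding model are in use, so it only shows the output contract (is_target and speaker_id), not a true single-pass joint model.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Optional, Sequence


@dataclass
class Segment:
    """Hypothetical diarization segment with the two requested attributes."""
    start: float
    end: float
    cluster: str                      # anonymous label from the diarizer
    is_target: bool = False           # True if matched to a registered speaker
    speaker_id: Optional[str] = None  # custom ID of the matched speaker


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class TargetAwareDiarizer:
    """Illustrative only: matches segment embeddings to registered speakers."""

    def __init__(self, registered: Dict[str, Sequence[float]], threshold: float = 0.7):
        # registered maps a custom speaker ID to its enrollment embedding.
        self.registered = dict(registered)
        self.threshold = threshold  # assumed value; needs tuning per model

    def verify(self, embedding: Sequence[float]) -> Optional[str]:
        """Return the best-matching registered ID at or above the threshold."""
        best_id, best_score = None, self.threshold
        for sid, ref in self.registered.items():
            score = cosine(embedding, ref)
            if score >= best_score:
                best_id, best_score = sid, score
        return best_id

    def label(self, segments: List[Segment],
              embeddings: List[Sequence[float]]) -> List[Segment]:
        """Attach is_target and speaker_id to each segment."""
        labeled = []
        for seg, emb in zip(segments, embeddings):
            sid = self.verify(emb)
            labeled.append(Segment(seg.start, seg.end, seg.cluster,
                                   is_target=sid is not None, speaker_id=sid))
        return labeled


# Toy usage with made-up 3-dimensional embeddings.
diarizer = TargetAwareDiarizer({"VIP_Customer": [1.0, 0.0, 0.0]}, threshold=0.7)
segments = [Segment(0.0, 2.5, "speaker_0"), Segment(2.5, 4.0, "speaker_1")]
embeddings = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
labeled = diarizer.label(segments, embeddings)
```

In this toy run, the first segment is close to the enrolled embedding and would be tagged as the registered speaker, while the second is orthogonal to it and stays untagged. A real implementation would need a threshold calibrated for the chosen embedding model and a strategy for overlapping speech.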

I expect this would not be too hard for the development team, and I would be excited to see the team provide a reliable solution.

NeMo Feature Request: Integrated Speaker Verification and Diarization for Target Speaker Identification #14265

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

NeMo Feature Request: Integrated Speaker Verification and Diarization for Target Speaker Identification #14265

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions