Add pyannote orchestration #92

EduardoPach merged 3 commits into main from eduardo/pyannote-orchestration on Jan 19, 2026

@EduardoPach
Collaborator

What does this PR do?

This PR introduces a reusable PyannoteAI engine and adds support for PyannoteAI's new STT orchestration feature, which combines speaker diarization with transcription in a single API call.

Changes

New Engine (src/openbench/engine/pyannote_engine.py)

  • Created PyannoteAIApi engine class that supports both diarization-only and diarization+transcription modes
  • Added response models for API outputs:
    • PyannoteApiDiarizationOutput - diarization-only responses
    • PyannoteApiOrchestrationOutput - responses with wordLevelTranscription and turnLevelTranscription
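For readers unfamiliar with the split between the two response models, the sketch below illustrates one way such models could be laid out. It is not the code from pyannote_engine.py: only the names wordLevelTranscription and turnLevelTranscription come from this PR, while the Segment fields (speaker, start, end), the diarization key, and the parse_output helper are assumptions made for illustration, and plain dataclasses stand in for whatever model library the engine actually uses.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Segment:
    """One diarization segment. Field names are assumed, not taken from the API."""
    speaker: str
    start: float
    end: float


@dataclass
class DiarizationOutput:
    """Illustrative stand-in for PyannoteApiDiarizationOutput."""
    diarization: list[Segment]


@dataclass
class OrchestrationOutput(DiarizationOutput):
    """Illustrative stand-in for PyannoteApiOrchestrationOutput."""
    wordLevelTranscription: list[dict[str, Any]] = field(default_factory=list)
    turnLevelTranscription: list[dict[str, Any]] = field(default_factory=list)


def parse_output(payload: dict[str, Any]) -> DiarizationOutput:
    """Hypothetical helper: choose the model based on which keys are present."""
    segments = [Segment(**s) for s in payload.get("diarization", [])]
    has_transcript = (
        "wordLevelTranscription" in payload or "turnLevelTranscription" in payload
    )
    if has_transcript:
        return OrchestrationOutput(
            diarization=segments,
            wordLevelTranscription=payload.get("wordLevelTranscription", []),
            turnLevelTranscription=payload.get("turnLevelTranscription", []),
        )
    return DiarizationOutput(diarization=segments)
```

Making the orchestration model a subclass of the diarization model is a design choice of this sketch; it lets diarization-only consumers accept either response type.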

Refactored Diarization Pipeline

  • Updated PyannoteApiPipeline to use the new engine instead of duplicating API logic

New Pipelines

  • PyannoteTranscriptionPipeline - Uses PyannoteAI with STT but ignores speaker attribution (for transcription-only datasets)
  • PyannoteOrchestrationPipeline - Uses PyannoteAI with STT and includes speaker attribution
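The difference between the two pipelines comes down to what happens to the speaker label on each transcribed word. The sketch below illustrates that idea only: the Word shape (text, speaker, start, end) and both helper functions are hypothetical and are not taken from the pipelines added in this PR.

```python
from typing import TypedDict


class Word(TypedDict):
    """Assumed shape of one word-level transcription entry."""
    text: str
    speaker: str
    start: float
    end: float


def to_transcript(words: list[Word]) -> str:
    """Transcription-only view: keep the words, drop the speaker labels."""
    return " ".join(w["text"] for w in words)


def to_speaker_turns(words: list[Word]) -> list[tuple[str, str]]:
    """Orchestration view: merge consecutive words from the same speaker."""
    turns: list[tuple[str, str]] = []
    for w in words:
        if turns and turns[-1][0] == w["speaker"]:
            # Same speaker as the previous word: extend the current turn.
            turns[-1] = (w["speaker"], turns[-1][1] + " " + w["text"])
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append((w["speaker"], w["text"]))
    return turns


words: list[Word] = [
    {"text": "hello", "speaker": "A", "start": 0.0, "end": 0.4},
    {"text": "there", "speaker": "A", "start": 0.4, "end": 0.8},
    {"text": "hi", "speaker": "B", "start": 1.0, "end": 1.2},
]
print(to_transcript(words))     # hello there hi
print(to_speaker_turns(words))  # [('A', 'hello there'), ('B', 'hi')]
```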

Pipeline Aliases

  • pyannote-transcription - Transcription without speaker labels
  • pyannote-orchestration - Full diarization + transcription with speaker attribution

Usage

Transcription only (no speaker labels)

openbench-cli evaluate -p pyannote-transcription -d <dataset-name> -m wer

Orchestration (with speaker labels)

openbench-cli evaluate -p pyannote-orchestration -d <dataset-name> -m wer -m cpwer -m wder
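For context on the wer metric passed to both commands, word error rate is conventionally the word-level edit distance between reference and hypothesis divided by the number of reference words. The sketch below shows that textbook definition. It is not openbench's implementation, it applies no text normalization, and the function name is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Textbook WER: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("a b c d", "a x c"))  # 0.5 (one substitution, one deletion)
```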

@EduardoPach requested review from atiorh and dbrkn on January 13, 2026 18:33
@EduardoPach changed the title from "add: pyannote orchestration" to "Add pyannote orchestration" on Jan 13, 2026
Contributor

@dbrkn left a comment

lgtm

@EduardoPach merged commit 18c7372 into main on Jan 19, 2026
2 checks passed
@EduardoPach deleted the eduardo/pyannote-orchestration branch on January 19, 2026 14:29

2 participants