Unify benchmarks from whisperkittools #18

Merged
EduardoPach merged 3 commits into main from eduardo/unify-whisperkittools-benchmark
Jul 21, 2025

@EduardoPach
Collaborator

What does this PR do?

This PR is the first step toward moving the benchmarks previously run in whisperkittools into SDBench. It adds pipelines that existed only in whisperkittools and adds support for the main datasets used there.

  • Adds WhisperKit and Apple's SpeechAnalyzer as pipelines
  • Adds useful aliases for new pipelines
  • Adds useful aliases for the new datasets: common-voice, librispeech, and earnings22
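
For readers new to the alias idea, here is a minimal sketch of how short names could map to pipelines and datasets. It is not the code in this PR: `register_alias`, `resolve_alias`, the `example-org/...` identifiers, the alias keys `whisperkit` and `speech-analyzer`, and the `SpeechAnalyzerPipeline` class name are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AliasEntry:
    kind: str    # "dataset" or "pipeline"
    target: str  # placeholder identifier the alias expands to


# Hypothetical in-memory registry; SDBench's real mechanism may differ.
_REGISTRY: dict[str, AliasEntry] = {}


def register_alias(name: str, kind: str, target: str) -> None:
    """Store an alias under a normalized (trimmed, lowercase) key."""
    key = name.strip().lower()
    if key in _REGISTRY:
        raise ValueError(f"alias {key!r} is already registered")
    _REGISTRY[key] = AliasEntry(kind=kind, target=target)


def resolve_alias(name: str, kind: str) -> str:
    """Return the target for an alias of the given kind, or raise KeyError."""
    key = name.strip().lower()
    entry = _REGISTRY.get(key)
    if entry is None or entry.kind != kind:
        known = sorted(k for k, e in _REGISTRY.items() if e.kind == kind)
        raise KeyError(f"unknown {kind} alias {name!r}; known: {known}")
    return entry.target


# Dataset aliases named in the PR description; the targets are placeholders.
for dataset in ("common-voice", "librispeech", "earnings22"):
    register_alias(dataset, "dataset", f"example-org/{dataset}")

# Pipeline aliases; the alias keys and the second class name are assumptions.
register_alias("whisperkit", "pipeline", "WhisperKitTranscriptionPipeline")
register_alias("speech-analyzer", "pipeline", "SpeechAnalyzerPipeline")
```

Normalizing the key means `resolve_alias("LibriSpeech", "dataset")` and `resolve_alias("librispeech", "dataset")` return the same target, and an unknown name fails with a message listing the valid aliases of that kind.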

Contributor

@atiorh atiorh left a comment

Looks great. One nit comment.

@EduardoPach EduardoPach merged commit d43bf7c into main Jul 21, 2025
@EduardoPach EduardoPach deleted the eduardo/unify-whisperkittools-benchmark branch July 21, 2025 15:54
Contributor

@arda-argmax arda-argmax left a comment

LGTM! ✅

Suggestions for future improvement:

  • We can add support for additional pipelines available in whisperkittools, such as WhisperCPP and WhisperMLX.
  • It might also be helpful to enable support for local/custom models within the WhisperKitTranscriptionPipeline.

