Contributors:
@ZachNagengast @keith4ever @dylanangus @chen-argmax @flashno @bpkeene
What's Changed
From v0.3.0:
- Performance improvements
- Imports the standard HuggingFace tokenizer via C bindings from https://github.com/FL33TW00D/tokenizers-sys
- Updated example app
- Automatic model downloads from HuggingFace
- Transcribe from a file or from a microphone recording
- Selectable compute units: NPU, GPU, or CPU
- Thank you to @Acs176 for kickstarting this initiative
- Publishable Maven repository for Kotlin library
- Code style rules with auto-formatting via `make format`
With v0.3.2:
- Add multilingual model support with PerLayerKVDecoder, handling diverse encoder output names, int64 token bindings, and timestamp postprocessing; lift KV-cache logic into the decoder for improved modularity.
- Various SDK and build improvements: conditional Bazel download, updated API with segments, improved model download and handling, version bump to 0.3.2, and README cleanup.
- Example app published for open testing: https://play.google.com/store/apps/details?id=com.argmaxinc.whisperax