v0.3.2

Latest

bpkeene released this 02 Jun 17:24

8a6b737

Contributors:
@ZachNagengast @keith4ever @dylanangus @chen-argmax @flashno @bpkeene

What's Changed

From v0.3.0:

Performance improvements
Imports the standard HuggingFace tokenizer via C bindings from https://github.com/FL33TW00D/tokenizers-sys
Updated example app
- Automatic model downloads from HuggingFace
- Transcription from a file or from a microphone recording
- Selectable compute unit: NPU, GPU, or CPU
- Thank you to @Acs176 for kickstarting this initiative
Publishable Maven repository for Kotlin library
Code style rules with auto-formatting via make format

With v0.3.2:

Adds multilingual model support with PerLayerKVDecoder, including handling of diverse encoder output names, int64 token bindings, and timestamp postprocessing. KV-cache logic is lifted into the decoder for improved modularity.
Various SDK and build improvements: conditional Bazel download, updated API with segments, improved model download and handling, version bump to 0.3.2, and README cleanup.
Example app published for open testing: https://play.google.com/store/apps/details?id=com.argmaxinc.whisperax
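
To illustrate the timestamp postprocessing and segment-based output mentioned in the v0.3.2 notes above, here is a minimal sketch in Python. It is not the SDK's implementation. `Segment`, `split_segments`, and the `timestamp_begin` parameter are hypothetical names chosen for this example. The only outside fact it relies on is that Whisper's timestamp tokens advance in 0.02 s steps from the first timestamp token. The id of that first token depends on the model vocabulary, so the caller must supply it.

```python
from dataclasses import dataclass
from typing import List

# Whisper timestamp tokens advance in 20 ms steps from the first timestamp token.
SECONDS_PER_TIMESTAMP_TOKEN = 0.02


@dataclass
class Segment:
    """Hypothetical container: a time span plus the text token ids inside it."""
    start: float
    end: float
    tokens: List[int]


def split_segments(token_ids: List[int], timestamp_begin: int) -> List[Segment]:
    """Group decoded token ids into segments delimited by timestamp tokens.

    Assumptions (illustrative only):
    - every id >= timestamp_begin is a timestamp token;
    - other special tokens (start/end of transcript, language, task) were
      already filtered out by the caller;
    - text before the first timestamp and a trailing segment without a
      closing timestamp are dropped.
    """
    segments: List[Segment] = []
    start = None          # start time of the segment being built, if any
    text: List[int] = []  # text token ids collected for that segment

    for tok in token_ids:
        if tok >= timestamp_begin:
            t = round((tok - timestamp_begin) * SECONDS_PER_TIMESTAMP_TOKEN, 2)
            if start is None:
                start = t                      # opening timestamp
            elif text:
                segments.append(Segment(start, t, text))
                start, text = None, []         # closing timestamp
            else:
                start = t                      # consecutive timestamps: restart
        elif start is not None:
            text.append(tok)

    return segments
```

Calling `split_segments([1000, 5, 6, 1120, 1120, 7, 1250], timestamp_begin=1000)` with a made-up vocabulary offset of 1000 yields two segments, one covering 0.0 to 2.4 s and one covering 2.4 to 5.0 s. A real decoder would additionally detokenize each segment's token ids into text.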
