Releases · daanzu/kaldi-active-grammar

02 Nov 14:32

daanzu

v3.2.0

3240728

v3.2.0 Latest


Functionally, this release contains only one small bug fix, for the broken alternative dictation interface. However, the extensive build and infrastructure changes led me to make this a minor release rather than just a patch release, out of an abundance of caution.

Active development has resumed after a long break! (While development paused, the project was continuously maintained and actively used in production.) Look forward to more frequent releases in the hopefully-near future.

3.2.0 - 2025-11-02 - Changes: KaldiAG KaldiFork

Added

Comprehensive test suite with 80+ tests covering grammar compilation, plain dictation, and alternative dictation
Test infrastructure using pytest with TTS-generated test audio (Piper)
AGENTS.md documentation for AI coding agents with project architecture and development guidance
Exposed NativeWFST at package top-level for easier importing
Support for testing with multiple platforms and Python versions (3.9-3.13)

Changed

CI/CD Improvements:
- Implemented comprehensive caching of native binaries by commit hash
- Added caching of test setup data
- Updated build workflow to run on all pushes and PRs
- Modified macOS wheel builds to use delocate instead of ad-hoc manual library handling
- Improved Linux wheel build with cleaner output and better caching
- Updated CI to support latest GitHub Actions runners (Ubuntu 24.04, Windows 2025, macOS 13/15/26)
- Moved tests into main build workflow for faster feedback
- Added notices for built wheels in CI output
Relaxed Python package requirements version specifiers for better compatibility
Updated setup.py classifiers to include Python 3.11, 3.12, 3.13, 3.14
Dropped Python 2 from wheel tag (py3 instead of py2.py3), as Python 2 is no longer supported
Improved comments and cleanup in Justfile
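
The wheel tag change above can be checked by reading the Python tag out of a wheel filename. The following is a minimal sketch based on the standard wheel filename layout (`{distribution}-{version}(-{build})-{python tag}-{abi tag}-{platform tag}.whl`). The helper names and the example filenames are hypothetical, not taken from the project's build scripts.

```python
from typing import List


def python_tags(wheel_filename: str) -> List[str]:
    """Return the Python tags encoded in a wheel filename."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % wheel_filename)
    parts = wheel_filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag after version
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename layout: %r" % wheel_filename)
    # The Python tag is always third from the end; compound tags use "."
    return parts[-3].split(".")


def claims_python2(wheel_filename: str) -> bool:
    """True if any Python tag in the filename targets Python 2."""
    return any(tag.startswith(("py2", "cp2")) for tag in python_tags(wheel_filename))


# Hypothetical filenames, for illustration only.
old_wheel = "kaldi_active_grammar-3.1.0-py2.py3-none-win_amd64.whl"
new_wheel = "kaldi_active_grammar-3.2.0-py3-none-win_amd64.whl"
print(claims_python2(old_wheel), claims_python2(new_wheel))
```

Installers such as pip use these tags to decide whether a wheel is compatible with the running interpreter, so a `py3` tag keeps the wheel from being offered to Python 2 environments.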

Fixed

Updated CI workflows to properly handle latest runner environments
Fixed Linux build configuration and wrapper script
Cleaned up and standardized build processes across all platforms

Development

Refactored test structure for better organization and maintainability
Added test generators for creating synthetic speech using Piper TTS and Google TTS
Added helper utilities for test fixtures and audio generation
Improved test coverage for edge cases (empty audio, garbage audio, very short/long audio)
Added tests for complex grammar patterns (diamond, cascade, hub-and-spoke, etc.)
Added comprehensive alternative dictation tests with mocking
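
As an illustration of the edge cases listed above (empty, garbage, very short, and very long audio), here is a minimal sketch of how such inputs could be generated with only the standard library. It assumes 16 kHz, 16-bit, mono PCM, which the release notes do not state. The function names are hypothetical and are not the project's actual test helpers.

```python
import random
import struct

SAMPLE_RATE = 16000   # assumed sample rate (Hz)
BYTES_PER_SAMPLE = 2  # assumed 16-bit mono PCM


def silence(seconds: float) -> bytes:
    """All-zero samples of the requested duration."""
    return b"\x00" * (int(seconds * SAMPLE_RATE) * BYTES_PER_SAMPLE)


def garbage(seconds: float, seed: int = 0) -> bytes:
    """Uniform random samples; seeded so that test runs are repeatable."""
    rng = random.Random(seed)
    num_samples = int(seconds * SAMPLE_RATE)
    samples = (rng.randint(-32768, 32767) for _ in range(num_samples))
    return struct.pack("<%dh" % num_samples, *samples)


# Hypothetical table of edge-case inputs that a test could iterate over.
EDGE_CASES = {
    "empty": b"",
    "very_short": silence(0.01),
    "very_long": silence(60.0),
    "garbage": garbage(1.0),
}
```

A test would then feed each buffer to the recognizer and check that decoding finishes without raising, whatever the transcript turns out to be.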

Donations are appreciated to encourage development.

Artifacts

Models are available here and below.

If you have trouble downloading, try using wget --continue.


24 Nov 12:06

daanzu

v3.1.0

48360b9

v3.1.0

Fixed

Fix updating of SymbolTable multiple times for new words, so that there is only one instance for a single Model.

Changed

Only mark lexicon stale if it was successfully modified.
Removed deprecated CLI binaries from Windows build, reducing wheel size by ~65%.
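
The "only mark stale if successfully modified" change can be pictured as a dirty-flag pattern: the flag is set only after a modification has actually gone through, so failed or no-op updates do not trigger an expensive rebuild. This is a toy sketch under that reading. The class `ToyLexicon` and its methods are hypothetical and do not reflect the library's internals.

```python
class ToyLexicon:
    """Toy lexicon that tracks whether derived data needs rebuilding."""

    def __init__(self):
        self.entries = {}
        self.stale = False
        self.rebuilds = 0

    def add_word(self, word, phones):
        """Add or update a word; return True only if the lexicon changed."""
        if not phones:
            # Failed modification: raise before touching any state.
            raise ValueError("no pronunciation given for %r" % word)
        phones = list(phones)
        if self.entries.get(word) == phones:
            return False  # nothing changed, so nothing becomes stale
        self.entries[word] = phones
        self.stale = True  # set only after the modification succeeded
        return True

    def rebuild_if_stale(self):
        """Run the (here simulated) expensive rebuild only when needed."""
        if not self.stale:
            return False
        self.rebuilds += 1
        self.stale = False
        return True
```

With this pattern, adding many words and then calling `rebuild_if_stale()` once results in a single rebuild.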

Donations are appreciated to encourage development.

Artifacts

Models are available here and below.


31 Oct 10:42

daanzu

v3.0.0

99be2ae

v3.0.0

Changed

Pronunciation generation for lexicon now better supports local mode (using the g2p_en package), which is now also the default mode. It is also preferred over the online mode (using CMU's web service), which is now disabled by default. See the Setup section of the README for details. The new models now include the data files for g2p_en.
PlainDictation output now discards any silence words from transcript.
lattice_beam default value reduced from 6.0 to 5.0, to hopefully avoid occasional errors.
Removed deprecated CLI binaries from build for linux/mac.

Fixed

Whitespace in the model path is once again handled properly (thanks @matthewmcintire).
NativeWFST.has_path() now handles loops.
Linux/Mac binaries are now more stripped.
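
To show why loops matter for a path check: a naive recursive search can revisit the same states forever when the graph contains a cycle, whereas tracking visited states guarantees termination. The sketch below works on a plain dictionary of arcs. It is not the `NativeWFST.has_path()` implementation, and the function signature is an assumption made for illustration.

```python
from collections import deque


def has_path(arcs, start, finals):
    """Return True if some final state is reachable from start.

    arcs maps each state to an iterable of successor states.
    """
    visited = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in finals:
            return True
        for next_state in arcs.get(state, ()):
            # Skipping already-seen states is what makes loops safe.
            if next_state not in visited:
                visited.add(next_state)
                queue.append(next_state)
    return False


# A graph with a loop between states 0 and 1, and an exit to state 2.
looped = {0: [1], 1: [0, 2], 2: []}
print(has_path(looped, 0, {2}))
```

A graph for which this check returns False has no successful path at all, which is the "impossible graph" case.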

Donations are appreciated to encourage development.

Artifacts

Models are available here and below.


08 Apr 03:40

daanzu

v2.1.0

ee3ff41

v2.1.0

You can subscribe to announcements on GitHub (see Watch panel above), or on Gitter (see instructions).

Donations are appreciated to encourage development.

See major changes introduced in v2.0.0 and associated downloads.

Added

NativeWFST support for checking for impossible graphs (no successful path), which can then fail to compile.
Debugging info for NativeWFST.

Changed

lattice_beam default value reduced from 8.0 to 6.0, to hopefully avoid occasional errors.
Minor fix for OpenBLAS compilation for some architectures on linux/mac.

Fixed

Reloading grammars with NativeWFST.

Artifacts

Models are available here
kaldi-dragonfly-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
kaldi-caster-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.


21 Mar 15:20

daanzu

v2.0.0

8cbebaf

v2.0.0: Faster Grammar Compilation; Cleaner Codebase; Preparation For New Features

You can subscribe to announcements on GitHub (see Watch panel above), or on Gitter (see instructions).

Donations are appreciated to encourage development.

Added

Native FST support, via direct wrapping of OpenFST, rather than Python text-format implementation
- Eliminates grammar (G) FST compilation step
Internalized many graph construction steps, via direct use of native Kaldi/OpenFST functions, rather than invoking separate CLI processes
- Eliminates need for many temporary files (FSTs, .confs, etc) and pipes
Example usage for allowing mixing of free dictation with strict command phrases
Experimental support for "look ahead" graphs, as an alternative to full HCLG compilation
Experimental support for rescoring with CARPA LMs
Experimental support for rescoring with RNN LMs
Experimental support for "priming" RNNLM previous left context for each utterance

Changed

OpenBLAS is now the default linear algebra library (rather than Intel MKL) on Linux/MacOS
- Because it is open source and provides good performance on all hardware (including AMD)
- Windows is more difficult for this, and will be implemented soon in a later release
Default tmp_dir is now set to [model_dir]/cache.tmp
tmp_dir is now optional, and only needed if caching compiled FSTs (or for certain framework/option combinations)
File cache is now stored at [model_dir]/file_cache.json
Optimized adding many new words to the lexicon, in many different grammars, all in one loading session: only rebuild L_disambig.fst once at the end.
External interfaces: Compiler.__init__(), decoding setup, etc.
Internal interfaces: wrappers, etc.
Major refactoring of C++ components, with a new inheritance hierarchy and configuration mechanism, making it easier to use and test features with and without "activity"
Many build changes
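
The default cache locations described above are both derived from the model directory. A minimal sketch of that layout follows. The helper function and the example directory name are hypothetical, and `PurePosixPath` is used only so that the printed result is the same on every platform.

```python
from pathlib import PurePosixPath


def default_cache_paths(model_dir):
    """Derive the default cache locations from a model directory."""
    model_dir = PurePosixPath(model_dir)
    return {
        "tmp_dir": model_dir / "cache.tmp",           # [model_dir]/cache.tmp
        "file_cache": model_dir / "file_cache.json",  # [model_dir]/file_cache.json
    }


# "kaldi_model" is a placeholder directory name, for illustration only.
paths = default_cache_paths("kaldi_model")
print(paths["tmp_dir"], paths["file_cache"])
```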

Removed

Python 2.7 support: it may still work, but will not be a focus.
Google cloud speech-to-text removed, as an unneeded dependency. Alternative dictation is still supported as an option, via a callback to an external provider.

Deprecated

Separate CLI Kaldi/OpenFST executables
Indirect AGF graph compilation (framework==agf-indirect)
Non-native FSTs
parsing_framework==text

Artifacts

Models are available here
kaldi-dragonfly-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
kaldi-caster-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.


05 Sep 14:45

daanzu

v1.8.0

a6a3159

v1.8.0: New Models, Noise Resistance, Better Errors, More Documentation

You can subscribe to announcements on GitHub (see Watch panel above), or on Gitter (see instructions).

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]

Added

New speech models (should be better in general, and support new noise resistance)
Failed AGF graph compilation now automatically saves and outputs stderr
Example of complete usage with a grammar and microphone audio
Various documentation

Changed

Top FST now accepts various noise phones (if present in speech model), making it more resistant to noise
Cleaned up error handling in compiler, so that the Dragonfly backend can automatically print an excerpt of the Rule that failed

Fixed

Mysterious Windows newline bug in some environments

Artifacts

Models are available here
kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.


05 Sep 13:29

daanzu

v1.7.0

97b5441

v1.7.0: Support compiling some complex grammars (Caster text manipulation)

You can subscribe to announcements on GitHub (see Watch panel above), or on Gitter (see instructions).

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]

Added

Add automatic saving of text FST & compiled FST files with log level 5

Changed

Miscellaneous naming

Fixed

Support compiling some complex grammars (Caster text manipulation), by simplifying during compilation (remove epsilons, and determinize)

Artifacts

Models are available here
kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.


12 Jul 14:58

daanzu

v1.6.0

4d20fc5

v1.6.0: Easier Configuration; Public Automated Builds

This should be included in the next dragonfly version.

You can subscribe to announcements on Gitter: see instructions.

Added

Can now pass configuration dict to KaldiAgfNNet3Decoder, PlainDictationRecognizer (without HCLG.fst).
Continuous Integration builds run on GitHub Actions for Windows (x64), MacOS (x64), Linux (x64).

Changed

Refactor of passing configuration to initialization.
PlainDictationRecognizer.decode_utterance can take chunk_size parameter.
Smaller binaries: MacOS 11MB -> 7.6MB, Linux 21MB -> 18MB.

Fixed

Confidence measurement in the presence of multiple, redundant rules.
Python3 int division bug for cloud dictation.
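
For context on the integer division class of bug: in Python 3, `/` always returns a float, while `//` performs floor division and returns an int for int operands. Code written for Python 2 that used `/` to compute a size or count breaks when the result is passed to something that requires an integer, such as `range()`. The following is a generic sketch of chunking a byte buffer. The function name is hypothetical and this is not the library's cloud dictation code.

```python
def chunk_audio(data: bytes, chunk_size: int):
    """Split data into chunk_size pieces, with a shorter final piece if needed."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Floor division keeps this an int; using `/` here would produce a float
    # and range() below would raise TypeError on Python 3.
    num_full = len(data) // chunk_size
    chunks = [data[i * chunk_size:(i + 1) * chunk_size] for i in range(num_full)]
    remainder = data[num_full * chunk_size:]
    if remainder:
        chunks.append(remainder)
    return chunks


print(chunk_audio(b"abcdefghij", 4))
```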

Artifacts

Models are available here
kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]


06 Jun 10:49

daanzu

v1.5.0

2538058

v1.5.0: Improved Recognition Confidence Estimation

You can subscribe to announcements on Gitter: see instructions.

Notes

Improved Recognition Confidence Estimation: two new, different measures:
- confidence: basically the difference in how much "better" the returned recognition was, compared to the second best guess (>0)
- expected_error_rate: an estimate of how often similar utterances are incorrect (roughly out of 1.0, but can be greater)
Refactoring in preparation for future improvements
Various bug fixes & optimizations
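
To make the idea behind the `confidence` measure concrete: it is described above as how much better the returned recognition was than the second-best guess. A toy version of such a margin is sketched below, assuming each hypothesis has a numeric score where higher is better. This is purely illustrative. The release notes do not give the actual formula, and the function name and scores are hypothetical.

```python
def margin_confidence(scores):
    """Difference between the best and second-best hypothesis scores."""
    if len(scores) < 2:
        raise ValueError("need at least two hypotheses to compute a margin")
    best, second = sorted(scores, reverse=True)[:2]
    return best - second


# Hypothetical scores for three competing hypotheses (higher is better).
print(margin_confidence([-12.0, -15.5, -20.0]))
```

A larger margin means the top hypothesis stood out more clearly from its closest competitor.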

Artifacts

Models are available here
kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]


28 Mar 10:54

daanzu

v1.4.0

0de3e3e

v1.4.0: MacOS Support, And Faster Graph Compilation

Support is now included in dragonfly2 v0.22.0! You can try a self-contained distribution available below.

You can subscribe to announcements on Gitter: see instructions.

Notes

MacOS Support
Faster Graph Compilation
Dictation: the dictation model now does not recognize a zero-word sequence
Various bug fixes & optimizations

Artifacts

kaldi_model_daanzu*: A better acoustic model, and varying levels of language model for dictation (bigger is generally better).
kaldi_model_zamia: A compatible general English Kaldi nnet3 chain model.
kaldi-dragonfly-winpython: [stable release version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
kaldi-caster-winpython-dev: [more recent development version] A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.

Donations are appreciated to encourage development.

[GitHub is matching (only) my GitHub Sponsors donations.]
