1.0.0

atiorh released this 14 Jun 15:48

6-bit weight compression using coremltools
Improved attention implementation (SPLIT_EINSUM_V2), which yields up to 30% better Neural Engine performance
Multilingual text encoder support
New benchmarks for iPhone, iPad and Mac
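
To illustrate the idea behind the 6-bit weight compression item above, here is a minimal pure-Python sketch of palettization: each weight is replaced by a 6-bit index into a lookup table of at most 2^6 = 64 shared values. It assumes a simple 1-D k-means clustering scheme; the function names `palettize` and `reconstruct` are hypothetical, and this is not the coremltools implementation or its API.

```python
import random


def palettize(weights, nbits=6, iters=20, seed=0):
    """Illustrative 1-D k-means palettization (not the coremltools API).

    Returns (palette, indices): a lookup table with at most 2**nbits
    entries, and one table index per input weight.
    """
    distinct = sorted(set(weights))
    k = min(2 ** nbits, len(distinct))
    rng = random.Random(seed)
    # Initialise the palette from k distinct weight values.
    palette = sorted(rng.sample(distinct, k))

    def assign(table):
        # Map every weight to the index of its nearest palette entry.
        return [min(range(k), key=lambda j: abs(w - table[j])) for w in weights]

    for _ in range(iters):
        indices = assign(palette)
        sums = [0.0] * k
        counts = [0] * k
        for w, i in zip(weights, indices):
            sums[i] += w
            counts[i] += 1
        # Move each entry to the mean of its members; keep empty entries as-is.
        palette = [sums[j] / counts[j] if counts[j] else palette[j] for j in range(k)]

    return palette, assign(palette)


def reconstruct(palette, indices):
    """Expand the indices back into approximate weights."""
    return [palette[i] for i in indices]


if __name__ == "__main__":
    rng = random.Random(1)
    weights = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    palette, indices = palettize(weights)
    approx = reconstruct(palette, indices)
    mean_err = sum(abs(a - b) for a, b in zip(weights, approx)) / len(weights)
    print(len(palette), "palette entries, mean abs error:", mean_err)
```

Under this scheme each weight costs 6 bits for its index instead of 16 bits for a float16 value, plus a small shared table per tensor.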