1.0.0
- 6-bit weight compression using coremltools
- Improved attention implementation (SPLIT_EINSUM_V2) which yields up to 30% improved Neural Engine performance
- Multilingual text encoder support
- New benchmarks for iPhone, iPad and Mac