llmexport v0.0.2
Features
- Added support for Qwen2-VL.
- Introduced support for GTE and split embedding layers for BGE/GTE.
- Implemented `imitate_quant` functionality during testing.
- Enabled usage of C++ compiled MNNConvert.
Refactors
- Refactored the implementation of the VL model.
- Updated model path handling for ONNX models.
Bug Fixes
- Resolved issues with `stop_ids` and quantization.
- Fixed the bug related to `block_size = 0`.