Releases: wangzhaode/llm-export
llmexport v0.0.2
Features
- Added support for Qwen2-VL.
- Introduced support for GTE and split embedding layers for BGE/GTE.
- Implemented `imitate_quant` functionality during testing.
- Enabled usage of C++ compiled MNNConvert.
Refactors
- Refactored the implementation of the VL model.
- Updated model path handling for ONNX models.
Bug Fixes
- Resolved issues with `stop_ids` and quantization.
- Fixed the bug related to `block_size = 0`.
llmexport v0.0.1
- Support exporting ONNX/MNN from pretrained models.
- Use FakeLinear to save memory and time when exporting ONNX and MNN.
- Support `onnxslim` to optimize the ONNX graph.