Releases · wangzhaode/llm-export · GitHub

27 Sep 06:28

wangzhaode

llmexport v0.0.2

Features

Added support for Qwen2-VL.
Introduced support for GTE and split embedding layers for BGE/GTE.
Implemented imitate_quant functionality during testing.
Enabled usage of C++ compiled MNNConvert.

Refactors

Refactored the implementation of the VL model.
Updated model path handling for ONNX models.

Bug Fixes

Resolved issues with stop_ids and quantization.
Fixed the bug related to block_size = 0.

19 Aug 09:31

wangzhaode

llmexport v0.0.1 Latest

Added support for exporting ONNX and MNN models from a pretrained model.
Used FakeLinear to save memory and time when exporting to ONNX and MNN.
Added support for onnxslim to optimize the ONNX graph.