llmexport v0.0.2

wangzhaode released this 27 Sep 06:28

Features

Added support for Qwen2-VL.
Introduced support for GTE and split embedding layers for BGE/GTE.
Added imitate_quant functionality for use during testing.
Enabled usage of C++ compiled MNNConvert.

Refactors

Refactored the implementation of the VL model.
Updated model path handling for ONNX models.

Bug Fixes

Resolved issues with stop_ids and quantization.
Fixed a bug that occurred when block_size = 0.
