Releases: wangzhaode/llm-export

llmexport v0.0.2

27 Sep 06:28

Features

  • Added support for Qwen2-VL.
  • Introduced support for GTE and split embedding layers for BGE/GTE.
  • Implemented imitate_quant functionality during testing.
  • Enabled usage of C++ compiled MNNConvert.

Refactors

  • Refactored the implementation of the VL model.
  • Updated model path handling for ONNX models.

Bug Fixes

  • Resolved issues with stop_ids and quantization.
  • Fixed a bug related to block_size = 0.

llmexport v0.0.1

19 Aug 09:31
  • Added support for exporting ONNX and MNN models from a pretrained model.
  • Used FakeLinear to save memory and time when exporting to ONNX and MNN.
  • Added support for onnxslim to optimize the ONNX graph.