**Feature type?**

Algorithm request

**A proposal draft (if any)**
MLC-LLM is an LLM deployment engine with ML compilation: https://github.com/mlc-ai/mlc-llm

It has very wide environment and backend support.
Primarily:

- The different quantisation schemes should be supported OOTB.
- In the past, we tested that the outputs from nyuntam's w4a16 quant algo (AWQ) can be used directly as inputs for MLC-LLM's `q4f16_awq` quantisation scheme, and we expect this still holds. Ideally, if someone chooses `q4f16_awq` as the quantisation, nyuntam's AutoAWQ should run as the intermediary job, and its output(s) should be used to continue MLC-LLM's weight conversion and model compilation.
- For 3-bit quantisation, MLC-LLM supports OmniQuant's inputs, as per this notebook.
- All the platforms supported by MLC-LLM should be supported OOTB, though testing for them is subject to test-environment availability.
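The AutoAWQ-then-MLC-LLM flow described above could be sketched as an ordered list of commands: run nyuntam's AutoAWQ job first, then hand its output directory to MLC-LLM's weight conversion, config generation, and compilation steps. This is only a hedged sketch: the `mlc_llm` subcommand names mirror its CLI (`convert_weight`, `gen_config`, `compile`), but the exact flags may differ across MLC-LLM versions, and all paths and the `build_mlc_llm_pipeline` helper are hypothetical.

```python
def build_mlc_llm_pipeline(awq_output_dir: str, artifact_dir: str,
                           quantization: str = "q4f16_awq") -> list[list[str]]:
    """Return the MLC-LLM commands to run after nyuntam's AutoAWQ job finishes.

    Sketch only: subcommands follow the mlc_llm CLI, but flags and paths are
    illustrative assumptions, not a verified invocation.
    """
    return [
        # 1. Convert the AWQ-quantised weights into MLC-LLM's weight format.
        ["mlc_llm", "convert_weight", awq_output_dir,
         "--quantization", quantization, "-o", artifact_dir],
        # 2. Generate the runtime config for the chosen quantisation scheme.
        ["mlc_llm", "gen_config", awq_output_dir,
         "--quantization", quantization, "-o", artifact_dir],
        # 3. Compile the model library for the target backend.
        ["mlc_llm", "compile", f"{artifact_dir}/mlc-chat-config.json",
         "-o", f"{artifact_dir}/model.so"],
    ]


if __name__ == "__main__":
    # Hypothetical directories; in nyuntam this would come from the AWQ job.
    for cmd in build_mlc_llm_pipeline("out/awq", "out/mlc"):
        print(" ".join(cmd))
```

The point of returning the commands rather than executing them is that nyuntam's job runner could schedule them sequentially, with the AutoAWQ output directory as the single handoff point between the two tools.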