- AutoRound algorithms presented in vLLM Office Hours (in collaboration with LLM Compressor Team). Recording (Jan 2026)

LLM Compressor related:
- Zhihu: AutoRound x LLM Compressor: Making Low-Bit Quantized LLMs More Accurate with Better Inference (Dec 2025)
- WeChat: AutoRound x LLM Compressor: Making Low-Bit Quantized LLMs More Accurate with Better Inference (Dec 2025)
- vLLM: Advancing Low-Bit Quantization for LLMs: AutoRound x LLM Compressor (Dec 2025)
- RedHat: Advancing Low-Bit Quantization for LLMs: AutoRound x LLM Compressor (Dec 2025)
- Intel: Advancing Low-Bit Quantization for LLMs: AutoRound x LLM Compressor (Dec 2025)

vLLM related:
- Xiaohongshu: AutoRound x vLLM: Making 4-Bit LLM Quantization Truly Usable (Dec 2025)
- Medium: Accelerating vLLM and SGLang Deployment using AutoRound (Oct 2025)
- arXiv: SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs (Dec 2025)

SGLang related:
- Intel: AutoRound Meets SGLang: Enabling Quantized Model Inference with AutoRound (Nov 2025)
- LMSYS: AutoRound Meets SGLang: Enabling Quantized Model Inference with AutoRound (Nov 2025)
- HuggingFace: What is AutoRound? (April 2025)

- arXiv: TEQ: Trainable Equivalent Transformation for Quantization of LLMs (Oct 2023)

- Medium: Effective Post-Training Quantization for Large Language Models (Apr 2023)