- AutoRound algorithms presented in vLLM Office Hours (in collaboration with LLM Compressor Team). Recording (Jan 2026)

LLM Compressor related:
- Zhihu: AutoRound x LLM Compressor: Making Low-Bit Quantized LLMs More Accurate with Better Inference (Dec 2025)
- WeChat: AutoRound x LLM Compressor: Making Low-Bit Quantized LLMs More Accurate with Better Inference (Dec 2025)
- vLLM: Advancing Low-Bit Quantization for LLMs: AutoRound x LLM Compressor (Dec 2025)
- RedHat: Advancing Low-Bit Quantization for LLMs: AutoRound x LLM Compressor (Dec 2025)
- Intel: Advancing Low-Bit Quantization for LLMs: AutoRound x LLM Compressor (Dec 2025)

vLLM related:
- Xiaohongshu: AutoRound x vLLM: Making 4-Bit LLM Quantization Truly Usable (Dec 2025)
- Medium: Accelerating vLLM and SGLang Deployment using AutoRound (Oct 2025)
- arXiv: SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs (Dec 2025)

SGLang related:
- Intel: AutoRound Meets SGLang: Enabling Quantized Model Inference with AutoRound (Nov 2025)
- LMSYS: AutoRound Meets SGLang: Enabling Quantized Model Inference with AutoRound (Nov 2025)
- HuggingFace: What is AutoRound? (April 2025)

- arXiv: TEQ: Trainable Equivalent Transformation for Quantization of LLMs (Oct 2023)

- Medium: Effective Post-Training Quantization for Large Language Models (Apr 2023)