I noticed that in the examples, W4A16 quantization is provided specifically for multimodal models, while INT8 W8A8 quantization examples are only available for LLMs. These examples use SmoothQuantModifier and GPTQModifier during quantization. Therefore, I would like to know: for multimodal models such as Qwen2.5-VL, is SmoothQuantModifier necessary when performing W8A8 quantization?
Hi @weirdo2310! SmoothQuantModifier implements the SmoothQuant algorithm. This algorithm has been shown to improve accuracy recovery for W8A8 schemes, regardless of model architecture (multimodal or not). Therefore, we recommend using SmoothQuantModifier when quantizing to W8A8, but it is not required.
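For reference, a W8A8 recipe typically combines the two modifiers like this. This is only a minimal sketch: the model ID, calibration dataset, and hyperparameters below are placeholders, and on older llm-compressor versions `oneshot` is imported from `llmcompressor.transformers` instead.

```python
# Minimal sketch of an INT8 W8A8 one-shot recipe with llm-compressor.
# Model ID, dataset, and calibration sizes are placeholders -- adapt them
# to your model and data.
from llmcompressor import oneshot  # older versions: from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.modifiers.quantization import GPTQModifier

recipe = [
    # Recommended but optional: smooth activation outliers before quantization.
    SmoothQuantModifier(smoothing_strength=0.8),
    # Quantize weights and activations to INT8, skipping the output head.
    # For vision-language models you would typically also ignore the vision tower modules.
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

oneshot(
    model="Qwen/Qwen2.5-VL-7B-Instruct",  # placeholder model ID
    dataset="open_platypus",              # placeholder calibration dataset
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)
```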
Does SmoothQuantModifier improve accuracy recovery for W8A8 FP8-Dynamic?
I don't see any usage of SmoothQuantModifier in the FP8 recipes; they seem to use only a QuantizationModifier, as sketched below.
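For comparison, this is roughly what the FP8-Dynamic recipes look like: a single QuantizationModifier and no SmoothQuantModifier or calibration data. Again a minimal sketch; the model ID is a placeholder.

```python
# Minimal sketch of a data-free FP8-Dynamic recipe with llm-compressor.
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_DYNAMIC",  # FP8 weights, dynamic per-token FP8 activations
    ignore=["lm_head"],
)

# Dynamic activation quantization needs no calibration dataset.
oneshot(model="Qwen/Qwen2.5-VL-7B-Instruct", recipe=recipe)  # placeholder model ID
```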