Support FP8 Quantization and Inference Run on Intel Gaudi (HPU) using INC (Intel Neural Compressor) #12010

Merged
mgoin merged 35 commits into vllm-project:main from HabanaAI:dev/hpu_fp8
Jul 16, 2025

Commits

Commits on Jun 24, 2025

Commits on Jun 26, 2025

Commits on Jul 8, 2025

Commits on Jul 15, 2025