auto-round/docs/alg_202508.md at main · intel/auto-round

If you evaluate LLaMA models with a recent version of Transformers, remove the `@use_kernel_forward_from_hub("RMSNorm")` decorator in `modeling_llama.py` and enable `add_bos_token` in lm-eval (AutoRound enables it by default) to stabilize accuracy. These adjustments affect the quantized model but not the BF16 model on the tasks evaluated in the AutoRoundv2 paper.

All other settings follow the default configurations of AutoRound and lm-eval.
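As a rough illustration of the lm-eval setting described above, the sketch below assembles an `lm_eval` command line with `add_bos_token` enabled. The helper `build_lm_eval_command` and the model path are hypothetical, and the task names are copied from the table headers in this document; they may need to be mapped to the names registered in your lm-eval version. Removing the RMSNorm decorator in `modeling_llama.py` is a manual edit and is not covered here.

```python
import shlex

# Task names as they appear in the result tables of this document
# (assumption: adjust them to the task names your lm-eval version registers).
TASKS = [
    "arc_challenge", "hellaswag", "gsm8k", "lambada_openai",
    "mmlu", "mmlupro", "truthfulqa_mc1", "winogrande",
]


def build_lm_eval_command(model_path, tasks=TASKS, add_bos_token=True):
    """Return an lm_eval argument list for a Hugging Face checkpoint.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    # add_bos_token is forwarded to the Hugging Face model wrapper via --model_args.
    model_args = f"pretrained={model_path},add_bos_token={add_bos_token}"
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", model_args,
        "--tasks", ",".join(tasks),
    ]


if __name__ == "__main__":
    # The checkpoint path below is a placeholder, not a path from this document.
    cmd = build_lm_eval_command("./quantized-model-w2g64")
    print(shlex.join(cmd))
```

Building the argument list separately from running it makes it easy to inspect the exact flags before launching a long evaluation.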

| Qwen3-8B W2G64 | Avg. | arc_challenge | hellaswag | gsm8k | lambada_openai | mmlu | mmlupro | truthfulqa_mc1 | winogrande |
|---|---|---|---|---|---|---|---|---|---|
| AutoRound | 0.4373 | 0.4019 | 0.4437 | 0.4215 | 0.4826 | 0.5474 | 0.2630 | 0.3072 | 0.6314 |
| AutoRound+alg_ext | 0.4787 | 0.4275 | 0.4516 | 0.5944 | 0.5181 | 0.5773 | 0.2807 | 0.3305 | 0.6496 |
| AutoRoundBest+alg_ext lr 2e-3 | 0.4937 | 0.4505 | 0.474 | 0.5906 | 0.5556 | 0.6028 | 0.3127 | 0.3109 | 0.6527 |

| Llama3.1-8B-Instruct W2G64 | Avg. | arc_challenge | hellaswag | gsm8k | lambada_openai | mmlu | mmlupro | truthfulqa_mc1 | winogrande |
|---|---|---|---|---|---|---|---|---|---|
| AutoRound | 0.3820 | 0.3635 | 0.4562 | 0.1622 | 0.5069 | 0.4411 | 0.1661 | 0.3207 | 0.6393 |
| AutoRound+alg_ext | 0.4166 | 0.3712 | 0.4729 | 0.2039 | 0.5946 | 0.4981 | 0.2163 | 0.3011 | 0.6748 |
| AutoRoundBest+alg_ext lr 2e-3 | 0.4539 | 0.4138 | 0.4999 | 0.3071 | 0.6233 | 0.5279 | 0.2364 | 0.3231 | 0.6993 |
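
The source does not state how the Avg. column is computed, but the reported values are consistent with an unweighted mean of the eight task scores. The sketch below checks that assumption for the Qwen3-8B rows and derives the average gain over plain AutoRound; the dictionary layout and function names are illustrative only.

```python
from statistics import mean

# Per-task scores for Qwen3-8B W2G64, copied from the table above in column order:
# arc_challenge, hellaswag, gsm8k, lambada_openai, mmlu, mmlupro,
# truthfulqa_mc1, winogrande.
QWEN3_SCORES = {
    "AutoRound": [0.4019, 0.4437, 0.4215, 0.4826, 0.5474, 0.2630, 0.3072, 0.6314],
    "AutoRound+alg_ext": [0.4275, 0.4516, 0.5944, 0.5181, 0.5773, 0.2807, 0.3305, 0.6496],
    "AutoRoundBest+alg_ext lr 2e-3": [0.4505, 0.474, 0.5906, 0.5556, 0.6028, 0.3127, 0.3109, 0.6527],
}

# Avg. values as reported in the table.
QWEN3_REPORTED_AVG = {
    "AutoRound": 0.4373,
    "AutoRound+alg_ext": 0.4787,
    "AutoRoundBest+alg_ext lr 2e-3": 0.4937,
}


def task_average(scores):
    """Unweighted mean over tasks (assumed definition of the Avg. column)."""
    return mean(scores)


def gain_over_baseline(averages, baseline="AutoRound"):
    """Absolute difference in average score relative to the baseline row."""
    return {name: avg - averages[baseline] for name, avg in averages.items()}


if __name__ == "__main__":
    for name, scores in QWEN3_SCORES.items():
        computed = task_average(scores)
        print(f"{name}: computed {computed:.4f}, reported {QWEN3_REPORTED_AVG[name]:.4f}")
    for name, gain in gain_over_baseline(QWEN3_REPORTED_AVG).items():
        print(f"{name}: {gain:+.4f} vs. AutoRound")
```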
