Question about export_quantization_dataset for quantization #8737

BruceXuK · 2025-07-24T10:23:45Z

Reminder

I have read the above rules and searched the existing issues.

System Info

INFO 07-24 10:17:15 __init__.py:194] No platform detected, vLLM is running on UnspecifiedPlatform

llamafactory version: 0.9.2.dev0
Platform: Linux-6.8.0-59-generic-x86_64-with-glibc2.35
Python version: 3.10.12
PyTorch version: 2.5.1+cu124 (GPU)
Transformers version: 4.48.3
Datasets version: 3.2.0
Accelerate version: 1.2.1
PEFT version: 0.12.0
TRL version: 0.9.6
GPU type: NVIDIA RTX A6000
GPU number: 2
GPU memory: 47.54GB
DeepSpeed version: 0.16.2
Bitsandbytes version: 0.45.3
vLLM version: 0.7.2

Reproduction

The command I ran:
CUDA_VISIBLE_DEVICES=1 llamafactory-cli export --model_name_or_path "/data3/models/test/export/lora" --template "glm4" --export_quantization_dataset "./data/微调数据集1勿动.json" --export_dir "/data3/models/test/export1" --export_size 2 --trust-remote-code true --export_device cpu --export_legacy_format False --export_quantization_bit 4 --export_quantization_maxlen 40

The error output:
maxlen 40
Traceback (most recent call last):
  File "/usr/local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/app/src/llamafactory/cli.py", line 87, in main
    export_model()
  File "/app/src/llamafactory/train/tuner.py", line 109, in export_model
    model = load_model(tokenizer, model_args, finetuning_args) # must after fixing tokenizer to resize vocab
  File "/app/src/llamafactory/model/loader.py", line 132, in load_model
    patch_config(config, tokenizer, model_args, init_kwargs, is_trainable)
  File "/app/src/llamafactory/model/patcher.py", line 111, in patch_config
    configure_quantization(config, tokenizer, model_args, init_kwargs)
  File "/app/src/llamafactory/model/model_utils/quantization.py", line 150, in configure_quantization
    dataset=_get_quantization_dataset(tokenizer, model_args),
  File "/app/src/llamafactory/model/model_utils/quantization.py", line 88, in _get_quantization_dataset
    sample: Dict[str, "torch.Tensor"] = tokenizer(dataset[sample_idx]["text"], return_tensors="pt")
KeyError: 'text'
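
The last frame of the traceback indexes each sample with the key "text", so the error suggests that the file passed to export_quantization_dataset has records without a "text" field. Below is a minimal sketch for checking a file before running the export. It assumes the file is a JSON array of objects. The function names find_records_missing_text and check_calibration_file are hypothetical helpers written for this illustration and are not part of llamafactory.

```python
import json
from typing import Any, Dict, List


def find_records_missing_text(records: List[Dict[str, Any]]) -> List[int]:
    """Return the indices of records that have no string-valued "text" field."""
    return [
        idx
        for idx, record in enumerate(records)
        if not isinstance(record.get("text"), str)
    ]


def check_calibration_file(path: str) -> List[int]:
    """Load a JSON file (assumed to be a list of objects) and report bad records."""
    with open(path, "r", encoding="utf-8") as f:
        records = json.load(f)
    return find_records_missing_text(records)


if __name__ == "__main__":
    # Hypothetical in-memory sample: the second record mimics an
    # instruction-style fine-tuning entry that lacks a "text" key.
    sample = [
        {"text": "A plain calibration passage."},
        {"instruction": "Translate this sentence.", "output": "..."},
    ]
    print(find_records_missing_text(sample))
```

An empty list from check_calibration_file would mean every record carries a "text" string. Any index it returns points at a record that would likely trigger the same KeyError.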

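My question is what the export_quantization_dataset parameter is supposed to point to. I have already fine-tuned the model with the fine-tuning dataset "微调数据集1勿动.json", but the export fails with the error above. My "量化校准数据集.json" (quantization calibration dataset) is shown in the screenshot below: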

I am not sure whether export_quantization_dataset should point to "微调数据集1勿动.json", the dataset I used for training, or to the file "量化校准数据集.json".
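
If the fine-tuning data were to be reused for calibration, one possible approach (not confirmed by the maintainers) would be to flatten each record into a single "text" string first, since the traceback reads dataset[sample_idx]["text"]. The sketch below assumes Alpaca-style keys "instruction", "input" and "output". That layout is an assumption, because the actual structure of the fine-tuning file is not shown in this issue. The names to_text_records and convert_file are hypothetical helpers.

```python
import json
from typing import Any, Dict, List


def to_text_records(records: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Flatten instruction-style records into {"text": ...} records.

    Assumes hypothetical Alpaca-style keys: "instruction", "input", "output".
    Records with no non-empty field are skipped.
    """
    converted = []
    for record in records:
        parts = [
            record.get("instruction", ""),
            record.get("input", ""),
            record.get("output", ""),
        ]
        text = "\n".join(part for part in parts if part)
        if text:
            converted.append({"text": text})
    return converted


def convert_file(src_path: str, dst_path: str) -> int:
    """Read a JSON list from src_path, write text records to dst_path."""
    with open(src_path, "r", encoding="utf-8") as f:
        records = json.load(f)
    converted = to_text_records(records)
    with open(dst_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Chinese text readable in the output file.
        json.dump(converted, f, ensure_ascii=False, indent=2)
    return len(converted)
```

The return value of convert_file is the number of records written, which makes it easy to confirm that the output is not empty before passing it to the export command.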

I would appreciate an answer. Thank you.

Others

No response
