mindnlp 0.4 does not support saving and loading PeftModel adapter weights as safetensors #2024

Open
@xing-yiren

Description

Describe the bug (Mandatory)
A clear and concise description of what the bug is.

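mindnlp 0.4 does not support saving or loading PeftModel adapter weights as safetensors; they can only be saved as ckpt. As a result, adapter weights saved during training via save_pretrained fail to load on an Orange Pi via PeftModel.from_pretrained. The error is that _parse_ckpt_proto cannot recognize the tensor_type: it must be Float16, whereas the tensor_dtype saved on the Orange Pi is mindspore.float16.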

  • Hardware Environment (Ascend/GPU/CPU):

Please delete the backend not involved:
/device ascend

  • Software Environment (Mandatory):
    -- MindSpore version (e.g., 1.7.0.Bxxx) : 2.5.0
    -- Python version (e.g., Python 3.7.5) : 3.9
    -- OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu
    -- GCC/Compiler version (if compiled from source):

  • Execute Mode (Mandatory) (PyNative/Graph):

Please delete the mode not involved:
/mode pynative

To Reproduce (Mandatory)
Steps to reproduce the behavior:

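# Imports are not shown in the original report; these are the usual mindnlp entry points.
import mindspore
from mindnlp.transformers import AutoModelForCausalLM, GenerationConfig
from mindnlp.peft import LoraConfig, PeftModel, TaskType, get_peft_model

model_id = "MindSpore-Lab/DeepSeek-R1-Distill-Qwen-1.5B"

# Load the base model in float16 from a local copy
base_model = AutoModelForCausalLM.from_pretrained("/home/HwHiAiUser/xing-yiren/DeepSeek_half", ms_dtype=mindspore.float16)
base_model.generation_config = GenerationConfig.from_pretrained("/home/HwHiAiUser/xing-yiren/DeepSeek_half")

base_model.generation_config.pad_token_id = base_model.generation_config.eos_token_id

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    inference_mode=False,  # training mode
    r=8,  # LoRA rank
    lora_alpha=32,  # LoRA alpha; see the LoRA principles for its exact role
    lora_dropout=0.1,  # dropout ratio
)

# Instantiate the LoRA model
model = get_peft_model(base_model, config)

# Save the LoRA weights (peft_model_path, the output directory, is not defined in the report)
model.save_pretrained(peft_model_path)

# Load the LoRA weights
model = PeftModel.from_pretrained(base_model, peft_model_path)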

Expected behavior (Mandatory)
A clear and concise description of what you expected to happen.

The adapter is saved as safetensors and loads successfully.
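
For context on why safetensors would sidestep the dtype parsing problem: the safetensors container stores each tensor's dtype as a short string tag (for example `F16`) in a JSON header, followed by the raw little-endian bytes. The following is a minimal standard-library sketch of that layout for float16 tensors only. It is not mindnlp's or the safetensors library's implementation; the function names `write_safetensors_f16` and `read_safetensors_f16` and the tensor name in the usage note are hypothetical, and the official reader may apply additional validation.

```python
import json
import struct


def write_safetensors_f16(path, tensors):
    """Write float16 tensors in the safetensors layout.

    tensors: dict mapping name -> (shape tuple, flat list of floats).
    Hypothetical helper for illustration only.
    """
    header = {}
    payload = b""
    for name, (shape, values) in tensors.items():
        # "e" is the struct format code for IEEE 754 half precision.
        raw = struct.pack(f"<{len(values)}e", *values)
        header[name] = {
            "dtype": "F16",  # dtype is a plain string tag in the header
            "shape": list(shape),
            "data_offsets": [len(payload), len(payload) + len(raw)],
        }
        payload += raw
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)
        f.write(payload)


def read_safetensors_f16(path):
    """Read back float16 tensors written in the safetensors layout."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        payload = f.read()
    out = {}
    for name, meta in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        if meta["dtype"] != "F16":
            raise ValueError(f"unsupported dtype in this sketch: {meta['dtype']}")
        start, end = meta["data_offsets"]
        count = (end - start) // 2  # 2 bytes per float16 value
        values = list(struct.unpack(f"<{count}e", payload[start:end]))
        out[name] = (tuple(meta["shape"]), values)
    return out
```

Calling `write_safetensors_f16("adapter_model.safetensors", {"lora_A.weight": ((2, 2), [0.5, -1.25, 2.0, 0.0])})` and then `read_safetensors_f16` on the same path should return the same shape and values. Because the dtype is a fixed string tag rather than a framework-specific enum, the file does not depend on how a given MindSpore build names its float16 type.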

Screenshots / Logs (Mandatory)
If applicable, add screenshots to help explain your problem.

Image

Additional context (Optional)
Add any other context about the problem here.

Labels: bug