As in the title, the command is as follows: python pretrain.py --pretrained_model_path models/llama-7b.bin --dataset_path datasets/ceshi --spm_model_path /u01/wangcheng/llm/llama/tokenizer.model --config_path models/llama/7b_config.json --output_model_path models/llama_zh_7b --world_size 5 --data_processor lm --total_steps 300000 --save_checkpoint_steps 5000 --batch_size 24 --use_lora --lora_dropout 0.05
It only runs up to "Using distributed mode for training." and then exits?
I ran into the same problem as you. Have you solved it?
With the latest version of the project, LoRA training does work. What's odd is that, according to its introduction, LoRA training is a two-step process: in step one, you pass --pretrained_model_path models together with --use_lora --lora_dropout 0.05 to train only the LoRA weights; in step two, you load the trained LoRA weights via --lora_pretrained_model_path along with --pretrained_model_path models --use_lora --lora_dropout 0.05 and continue training.
But when I follow this workflow, the loss does not decrease and the accuracy does not improve during LoRA training.
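For reference, the low-rank update that --use_lora applies to a linear layer can be sketched in plain NumPy. This is a conceptual illustration of the standard LoRA formulation, not this project's actual implementation; the shapes, rank, and scaling factor below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pretrained weight of a linear layer (d_out x d_in).
d_out, d_in, r, alpha = 8, 8, 2, 16
W = rng.standard_normal((d_out, d_in))

# LoRA trains a low-rank update: W' = W + (alpha / r) * B @ A.
# A starts small and random, B starts at zero, so the adapted layer
# initially behaves exactly like the frozen base layer.
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))

def lora_forward(x, W, A, B, alpha, r):
    """Forward pass of a LoRA-adapted linear layer (bias omitted)."""
    return x @ (W + (alpha / r) * B @ A).T

x = rng.standard_normal((4, d_in))
y = lora_forward(x, W, A, B, alpha, r)

# With B = 0, the output matches the frozen base layer exactly.
assert np.allclose(y, x @ W.T)
```

If the loss is flat, one thing worth checking is whether the LoRA parameters (A and B here) are actually marked trainable while only the base weights are frozen; if everything is frozen, the loss will not move at all.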