
Question about fine-tuning large models on multiple GPUs #433

Open
yanhan111 opened this issue Aug 28, 2024 · 3 comments

@yanhan111

If I want to fine-tune the older "internlm-xcomposer-7b" model on two 3090s, how should I modify the code? I found that the latest multi-GPU code does not work with the older model.

@yuhangzang
Collaborator

Please check the code here.

@yanhan111
Author

I used accelerate to run the demo on two 3090s, but it still fails with the following error:

  File "/home/ubuntu/data/syh/C4MMD-main/C4MMDmain/CoT_module.py", line 210, in <module>
    response1 = model.generate(**inputs)
  File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer-7b/modeling_InternLM_XComposer.py", line 204, in generate
    out_embeds = self.internlm_model.generate(inputs_embeds=prompt_embeds,
  File "/home/ubuntu/anaconda3/envs/C4MMD/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/C4MMD/lib/python3.8/site-packages/transformers/generation/utils.py", line 1681, in generate
    return self.beam_search(
  File "/home/ubuntu/anaconda3/envs/C4MMD/lib/python3.8/site-packages/transformers/generation/utils.py", line 3091, in beam_search
    model_kwargs["past_key_values"] = self._reorder_cache(model_kwargs["past_key_values"], beam_idx)
  File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer-7b/modeling_InternLM.py", line 1243, in _reorder_cache
    reordered_past += (tuple(
  File "/home/ubuntu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer-7b/modeling_InternLM.py", line 1244, in <genexpr>
    past_state.index_select(0, beam_idx)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument index in method wrapper__index_select)

Process finished with exit code 1
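
For context: this device mismatch typically occurs when the model is sharded across both GPUs (e.g. loaded with accelerate's `device_map="auto"`), because during beam search `beam_idx` lives on one device while the cached key/value tensors of layers placed on the other GPU do not. A minimal sketch of the usual workaround, assuming the cached `modeling_InternLM.py` follows the standard `_reorder_cache` layout, is to move the index onto each tensor's own device before the `index_select`:

```python
def _reorder_cache(self, past_key_values, beam_idx):
    # Reorder the per-layer key/value cache to follow the selected beams.
    # Moving beam_idx onto each past_state's own device means index_select
    # no longer mixes cuda:0 and cuda:1 when the model is sharded.
    reordered_past = ()
    for layer_past in past_key_values:
        reordered_past += (
            tuple(
                past_state.index_select(0, beam_idx.to(past_state.device))
                for past_state in layer_past
            ),
        )
    return reordered_past
```

This mirrors the approach later adopted in upstream transformers model code, which calls `beam_idx.to(past_state.device)` for exactly this multi-device case.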

@yuhangzang
Collaborator

Can you provide more details about your training script?

For example, you may run finetune_lora.sh and set GPUS_PER_NODE=2.
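
For illustration, a minimal sketch of what that change could look like (the launcher layout, the script name `finetune.py`, and the flags are assumptions about how `finetune_lora.sh` is typically structured, not copied from the repository):

```bash
#!/bin/bash
# Sketch of finetune_lora.sh adjusted for a single node with two 3090s.
GPUS_PER_NODE=2   # reduced from the default multi-GPU setting (assumption)

torchrun --nproc_per_node=${GPUS_PER_NODE} \
    finetune.py \
    --model_name_or_path internlm-xcomposer-7b \
    --use_lora True
```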
