Description
Hello, when running main_finetune.py, it reaches line 238:
```python
for param in fsdp_ignored_parameters:
    dist.broadcast(param.data, src=dist.get_global_rank(fs_init.get_data_parallel_group(), 0),
                   group=fs_init.get_data_parallel_group())
```
and throws a runtime error:
Exception occurred: RuntimeError

```
Tensors must be CUDA and dense
  File "/amax/yt26/.conda/envs/accessory/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1570, in broadcast
    work = group.broadcast([tensor], opts)
  File "/amax/yt26/.conda/envs/accessory/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1451, in wrapper
    return func(*args, **kwargs)
  File "/amax/yt26/VCM/LLaMA2-Accessory/accessory/main_finetune.py", line 238, in main
    dist.broadcast(param.data, src=dist.get_global_rank(fs_init.get_data_parallel_group(), 0),
  File "/amax/yt26/VCM/LLaMA2-Accessory/accessory/main_finetune.py", line 369, in <module>
    main(args)
RuntimeError: Tensors must be CUDA and dense
```
How can I deal with this? Thank you.
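For reference, here is a minimal sketch of one possible workaround, assuming the failure comes from the NCCL backend (which can only broadcast CUDA tensors) and that some of the ignored parameters are still on the CPU when this loop runs. `fsdp_ignored_parameters` and `fs_init` are the names from main_finetune.py; the temporary `buf` tensor is my own illustrative addition, not part of the repo:

```python
import torch.distributed as dist

# Hypothetical sketch: route CPU parameters through a temporary CUDA buffer
# so the NCCL broadcast always sees a CUDA tensor, then copy the result back.
src_rank = dist.get_global_rank(fs_init.get_data_parallel_group(), 0)
group = fs_init.get_data_parallel_group()

for param in fsdp_ignored_parameters:
    if param.data.is_cuda:
        # Already on the GPU: broadcast in place as the original code does.
        dist.broadcast(param.data, src=src_rank, group=group)
    else:
        buf = param.data.cuda()          # stage a copy on this rank's GPU
        dist.broadcast(buf, src=src_rank, group=group)
        param.data.copy_(buf.cpu())      # write the synced values back to CPU
```

Alternatively, if the model is supposed to be on the GPU before this loop (e.g. moved via `.cuda()` earlier in `main`), making sure that move actually happens would avoid the staging copy entirely.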