
[BUG/Help] Dual-GPU inference fails to start; only CPU inference runs, log shows return torch.load(checkpoint_file, map_location="cpu") #1494

Open
@MentalBaka

Description


Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

On Linux, I have installed CUDA, and torch.cuda.device_count() returns 2.

When I launch api.py, the log shows return torch.load(checkpoint_file, map_location="cpu"); as far as I can tell, GPU inference is not enabled.
The api.py code is as follows:
from transformers import AutoModel, AutoTokenizer  # import added so the snippet runs stand-alone

tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()
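For reference, transformers loads checkpoint files to CPU first (that is exactly the torch.load(checkpoint_file, map_location="cpu") call in the log) and .half().cuda() moves the weights afterwards, so that log line alone does not prove the model stayed on CPU. A minimal diagnostic sketch (an addition for illustration, not part of the original api.py) to check where the parameters actually ended up:

import torch

print(torch.cuda.is_available())        # should be True on this machine
print(torch.cuda.device_count())        # reported as 2 above
print(next(model.parameters()).device)  # prints cuda:0 if .cuda() took effect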


However, when I switched to loading the model with load_model_on_gpus, following an online tutorial, an error was raised.
My web_demo.py code is as follows:

from transformers import AutoModel, AutoTokenizer
import gradio as gr
import mdtex2html
import os
from utils import load_model_on_gpus

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
tokenizer = AutoTokenizer.from_pretrained("ChatGLM-6B", trust_remote_code=True)
model = load_model_on_gpus("ChatGLM-6B", num_gpus=2)
model = model.eval()
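A hypothetical diagnostic (not in the original web_demo.py, and assuming the local "ChatGLM-6B" checkpoint path is valid) is to print the model's parameter names and compare them with the keys of the device_map that utils.load_model_on_gpus builds; every parameter's prefix must be covered by some key, or dispatch_model rejects the map:

from transformers import AutoModel

model = AutoModel.from_pretrained("ChatGLM-6B", trust_remote_code=True)
for name, _ in model.named_parameters():
    print(name)  # e.g. transformer.embedding.word_embeddings.weight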


The error raised is:
Traceback (most recent call last):
  File "openai_api.py", line 173, in <module>
    model = load_model_on_gpus("ChatGLM-6B", num_gpus=2)
  File "/usr/local/glm/ChatGLM-6B/utils.py", line 50, in load_model_on_gpus
    model = dispatch_model(model, device_map=device_map)
  File "/home/xxx/miniconda3/envs/glm/lib/python3.8/site-packages/accelerate/big_modeling.py", line 352, in dispatch_model
    check_device_map(model, device_map)
  File "/home/xxx/miniconda3/envs/glm/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 1420, in check_device_map
    raise ValueError(
ValueError: The device_map provided does not give any device for the following parameters: transformer.embedding.word_embeddings.weight, …
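The ValueError comes from accelerate's check_device_map: the device_map built inside utils.py assigns no device to transformer.embedding.word_embeddings.weight (and the parameters elided above), which suggests the map's keys were written for a different module layout than the model actually being loaded. One workaround sketch, assuming accelerate is installed, is to let from_pretrained build a complete map itself instead of hand-building one:

import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "ChatGLM-6B",
    trust_remote_code=True,
    torch_dtype=torch.float16,  # half precision at load time
    device_map="auto",          # accelerate assigns every parameter a device
).eval()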

Expected Behavior

No response

Steps To Reproduce

OS: Ubuntu 20.04
cd /usr/local/glm
conda activate glm3
python api.py

Contents of api.py:
from transformers import AutoModel, AutoTokenizer  # import added so the snippet runs stand-alone

tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

Problem encountered:
return torch.load(checkpoint_file, map_location="cpu")

Contents of web_demo.py:
from transformers import AutoModel, AutoTokenizer
import gradio as gr
import mdtex2html
import os
from utils import load_model_on_gpus

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
tokenizer = AutoTokenizer.from_pretrained("ChatGLM-6B", trust_remote_code=True)
model = load_model_on_gpus("ChatGLM-6B", num_gpus=2)
model = model.eval()

Problem encountered:
Traceback (most recent call last):
  File "openai_api.py", line 173, in <module>
    model = load_model_on_gpus("ChatGLM-6B", num_gpus=2)
  File "/usr/local/glm/ChatGLM-6B/utils.py", line 50, in load_model_on_gpus
    model = dispatch_model(model, device_map=device_map)
  File "/home/xxx/miniconda3/envs/glm/lib/python3.8/site-packages/accelerate/big_modeling.py", line 352, in dispatch_model
    check_device_map(model, device_map)
  File "/home/xxx/miniconda3/envs/glm/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 1420, in check_device_map
    raise ValueError(
ValueError: The device_map provided does not give any device for the following parameters: transformer.embedding.word_embeddings.weight, …

Environment

- OS: Ubuntu 20.04
- Python: 3.8
- Transformers:
- PyTorch:
- CUDA Support: True

Anything else?

No response
