Installation Method
Pip Install (I used latest requirements.txt)
Version
Latest
OS
Linux
Describe the bug
Background
The service is started with docker-compose. Because of network issues, the model could not be downloaded from huggingface.co, so it was downloaded from the ModelScope community instead:

```shell
modelscope download --model ZhipuAI/glm-4-9b-chat --local_dir ./models/THUDM/glm-4-9b-chat
```

1. The docker-compose file is as follows (note: the original file declared `CHATGLM_LOCAL_MODEL_PATH` and `LOCAL_MODEL_DEVICE` twice; the duplicates are removed here, keeping the values the logs show taking effect):
```yaml
version: '3'
services:
  gpt_academic_full_capability:
    image: ghcr.io/binary-husky/gpt_academic_with_all_capacity:master
    environment:
      CHATGLM_LOCAL_MODEL_PATH: '/models/THUDM/glm-4-9b-chat'
      LOCAL_MODEL_DEVICE: 'cuda'
      LOCAL_MODEL_QUANT: 'FP16'
      API_KEY: 'sk-xx'
      DASHSCOPE_API_KEY: 'sk-xx'
      USE_PROXY: 'False'
      LLM_MODEL: 'gpt-4o'
      AVAIL_LLM_MODELS: '["gpt-3.5-turbo", "gpt-4o", "qwen-max-latest", "chatglm4", "deepseek-r1", "deepseek-v3", "chatglm3-6b"]'
      ENABLE_AUDIO: 'False'
      DEFAULT_WORKER_NUM: '20'
      WEB_PORT: '18080'
      ADD_WAIFU: 'False'
      ALIYUN_APPKEY: 'RxPlZrM88DnAFkZK'
      THEME: 'Chuanhu-Small-and-Beautiful'
      API_URL_REDIRECT: >
        {
          "https://api.openai.com/v1/chat/completions": "https://api.gptsapi.net/v1/chat/completions",
          "https://api.openai.com/v1/completions": "https://api.gptsapi.net/v1/completions",
          "https://api.openai.com/v1/embeddings": "https://api.gptsapi.net/v1/embeddings"
        }
    volumes:
      - /root/models/THUDM/glm-4-9b-chat:/models/THUDM/glm-4-9b-chat:ro
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    # network_mode: "host"
    ports:
      - "18080:18080"
    command: >
      bash -c "python3 -u main.py"
```
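The `API_URL_REDIRECT` value is written as a YAML block scalar, so the whole braced text reaches the container as a single environment-variable string that must still parse as JSON. A quick sanity check of that assumption (a sketch only: the variable name matches the compose file above, but the parsing step is assumed to be a plain `json.loads`, not the project's actual loader):

```python
import json
import os

# Simulate what the container sees: the block scalar arrives as one
# environment-variable string (hypothetical stand-in for real startup code).
os.environ["API_URL_REDIRECT"] = (
    '{"https://api.openai.com/v1/chat/completions":'
    ' "https://api.gptsapi.net/v1/chat/completions"}'
)

redirect = json.loads(os.environ["API_URL_REDIRECT"])
print(redirect["https://api.openai.com/v1/chat/completions"])
```

If the block scalar is malformed (e.g. a trailing comma), `json.loads` raises immediately, which is an easy first thing to rule out.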
2. Backend service logs:

```
14:02 | ..v_variable:33 | [ENV_VAR] Trying to load CHATGLM_LOCAL_MODEL_PATH, default: THUDM/glm-4-9b-chat --> override: /models/THUDM/glm-4-9b-chat
14:02 | ..v_variable:60 | [ENV_VAR] Successfully read environment variable CHATGLM_LOCAL_MODEL_PATH
14:02 | ..v_variable:33 | [ENV_VAR] Trying to load LOCAL_MODEL_DEVICE, default: cpu --> override: cuda
14:02 | ..v_variable:60 | [ENV_VAR] Successfully read environment variable LOCAL_MODEL_DEVICE
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 0%| | 0/10 [00:00<?, ?it/s]
Loading checkpoint shards: 10%|█ | 1/10 [00:00<00:01, 4.88it/s]
Loading checkpoint shards: 20%|██ | 2/10 [00:00<00:01, 6.09it/s]
Loading checkpoint shards: 30%|███ | 3/10 [00:00<00:01, 6.68it/s]
Loading checkpoint shards: 40%|████ | 4/10 [00:00<00:00, 6.77it/s]
Loading checkpoint shards: 50%|█████ | 5/10 [00:00<00:00, 6.99it/s]
Loading checkpoint shards: 60%|██████ | 6/10 [00:00<00:00, 7.21it/s]
Loading checkpoint shards: 70%|███████ | 7/10 [00:01<00:00, 7.30it/s]
Loading checkpoint shards: 80%|████████ | 8/10 [00:01<00:00, 7.29it/s]
Loading checkpoint shards: 90%|█████████ | 9/10 [00:01<00:00, 7.33it/s]
Loading checkpoint shards: 100%|██████████| 10/10 [00:01<00:00, 7.37it/s]
Loading checkpoint shards: 100%|██████████| 10/10 [00:01<00:00, 7.03it/s]
```
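The `[ENV_VAR]` lines show the usual override pattern: a built-in default that an environment variable replaces when set, which is why the model path resolves to the mounted `/models/...` directory. A minimal sketch of that behavior (`get_conf` here is a hypothetical stand-in, not the project's real config reader):

```python
import os

def get_conf(name: str, default: str) -> str:
    # Return the environment override when present, else the default,
    # mirroring the "default --> override" pairs in the log above.
    return os.environ.get(name, default)

# Set as in the compose file; an unset variable falls back to its default.
os.environ["CHATGLM_LOCAL_MODEL_PATH"] = "/models/THUDM/glm-4-9b-chat"
path = get_conf("CHATGLM_LOCAL_MODEL_PATH", "THUDM/glm-4-9b-chat")
device = get_conf("LOCAL_MODEL_DEVICE_UNSET_EXAMPLE", "cpu")
print(path, device)
```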
3. Error shown on the frontend page:

```
Traceback (most recent call last):
File "./request_llms/local_llm_class.py", line 160, in run
for response_full in self.llm_stream_generator(**kwargs):
File "./request_llms/bridge_chatglm4.py", line 63, in llm_stream_generator
outputs = self._model.generate(**inputs, **gen_kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/transformers/generation/utils.py", line 1758, in generate
raise ImportError(
File "/usr/local/lib/python3.8/dist-packages/transformers/generation/utils.py", line 2449, in _sample
File "/root/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 929, in _update_model_kwargs_for_generation
cache_name, cache = self._extract_past_from_model_output(outputs)
ValueError: too many values to unpack (expected 2)
```
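The `ValueError` comes from `modeling_chatglm.py` unpacking the return of `self._extract_past_from_model_output(outputs)` into exactly two names; if the installed transformers version returns the per-layer past structure directly rather than a `(cache_name, cache)` pair, the unpack fails exactly like this, which suggests a version mismatch between transformers and the model's bundled remote code. A minimal sketch of that mismatch (the function below is an illustrative stand-in, not the real transformers code):

```python
def _extract_past_from_model_output(outputs):
    # Stand-in for an API that returns the past tuple itself
    # (one entry per layer) instead of a (cache_name, cache) pair.
    return outputs["past_key_values"]

outputs = {"past_key_values": (("k0", "v0"), ("k1", "v1"), ("k2", "v2"))}
err = None
try:
    cache_name, cache = _extract_past_from_model_output(outputs)
except ValueError as exc:
    err = exc
print(err)  # too many values to unpack (expected 2)
```

Pinning transformers to a release the glm-4 remote code was written against (or updating the remote code) is the usual direction to investigate for this class of error.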
Screen Shot
Shown in the logs above.
Terminal Traceback & Material to Help Reproduce Bugs
Shown in the logs above.