qwen2.5-vl-72b multi-GPU inference hangs #4021

Open
zsLin177 opened this issue Apr 27, 2025 · 2 comments

@zsLin177

Describe the bug
What the bug is, and how to reproduce it, preferably with screenshots
Single-GPU inference with the 7b model works, but multi-GPU inference with the 72b model hangs. I also tried removing the NPROC_PER_NODE=8 line.

Run script

NPROC_PER_NODE=8 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MAX_PIXELS=602112 \
swift infer \
    --model /root/Qwen2.5-VL-72B-Instruct \
    --infer_backend vllm \
    --val_dataset /root/code/ms-swift-3.3.1/data/hhhhh.jsonl \
    --gpu_memory_utilization 0.9 \
    --limit_mm_per_prompt '{"image": 1}' \
    --tensor_parallel_size 8

Screenshots

[Two screenshots attached to the original issue]

Your hardware and system info
Write your system info here, such as CUDA version, system, GPU model, and torch version
The environment is as follows:

Package                                  Version        Editable project location
---------------------------------------- -------------- -------------------------
absl-py                                  2.2.2
accelerate                               1.6.0
addict                                   2.4.0
aiofiles                                 24.1.0
aiohappyeyeballs                         2.6.1
aiohttp                                  3.11.16
aiohttp-cors                             0.8.1
aiosignal                                1.3.2
airportsdata                             20250224
aliyun-python-sdk-core                   2.16.0
aliyun-python-sdk-kms                    2.16.5
altair                                   5.5.0
anaconda-anon-usage                      0.5.0
annotated-types                          0.7.0
antlr4-python3-runtime                   4.7.2
anyio                                    4.9.0
archspec                                 0.2.3
argon2-cffi                              23.1.0
argon2-cffi-bindings                     21.2.0
arrow                                    1.3.0
arxiv                                    2.2.0
astor                                    0.8.1
asttokens                                3.0.0
async-lru                                2.0.5
attrdict                                 2.0.1
attrs                                    25.3.0
av                                       14.3.0
babel                                    2.17.0
beautifulsoup4                           4.13.3
binpacking                               1.5.2
bitsandbytes                             0.45.5
blake3                                   1.0.4
bleach                                   6.2.0
blessed                                  1.21.0
blinker                                  1.9.0
boltons                                  23.0.0
boto3                                    1.37.34
botocore                                 1.37.34
Brotli                                   1.0.9
cachetools                               5.5.2
certifi                                  2024.8.30
cffi                                     1.17.1
charset-normalizer                       3.3.2
click                                    8.1.8
cloudpickle                              3.1.1
colorama                                 0.4.6
colorful                                 0.5.6
comm                                     0.2.2
compressed-tensors                       0.9.1
conda                                    24.11.1
conda-anaconda-telemetry                 0.1.1
conda-content-trust                      0.2.0
conda-libmamba-solver                    24.9.0
conda-package-handling                   2.4.0
conda_package_streaming                  0.11.0
contourpy                                1.3.1
cpm-kernels                              1.0.11
crcmod                                   1.7
cryptography                             43.0.3
cupy-cuda12x                             13.4.1
cycler                                   0.12.1
dacite                                   1.9.2
datasets                                 3.2.0
debugpy                                  1.8.14
decorator                                5.2.1
decord                                   0.6.0
deepspeed                                0.16.5
defusedxml                               0.7.1
Deprecated                               1.2.18
depyf                                    0.18.0
dill                                     0.3.8
diskcache                                5.6.3
distlib                                  0.3.9
distro                                   1.9.0
dnspython                                2.7.0
duckduckgo_search                        5.3.1b1
einops                                   0.8.1
email_validator                          2.2.0
et_xmlfile                               2.0.0
evalscope                                0.14.0
evaluate                                 0.4.3
executing                                2.2.0
fastapi                                  0.115.12
fastapi-cli                              0.0.7
fastjsonschema                           2.21.1
fastrlock                                0.8.3
feedparser                               6.0.11
ffmpy                                    0.5.0
filelock                                 3.18.0
fire                                     0.7.0
flash_attn                               2.7.4.post1
fonttools                                4.57.0
fqdn                                     1.5.1
frozendict                               2.4.2
frozenlist                               1.5.0
fsspec                                   2024.9.0
func_timeout                             4.3.5
future                                   1.0.0
fuzzywuzzy                               0.18.0
gguf                                     0.10.0
gitdb                                    4.0.12
GitPython                                3.1.44
google-api-core                          2.24.2
google-auth                              2.39.0
googleapis-common-protos                 1.70.0
gpustat                                  1.1.1
gradio                                   5.25.1
gradio_client                            1.8.0
griffe                                   0.49.0
groovy                                   0.1.2
grpcio                                   1.71.0
h11                                      0.14.0
h2                                       4.2.0
h5py                                     3.13.0
hf-xet                                   1.0.5
hjson                                    3.1.0
hpack                                    4.1.0
httpcore                                 1.0.8
httptools                                0.6.4
httpx                                    0.28.1
huggingface-hub                          0.30.2
human-eval                               1.0.3
hyperframe                               6.1.0
idna                                     3.7
imageio                                  2.37.0
immutabledict                            4.2.1
importlib_metadata                       8.0.0
iniconfig                                2.1.0
interegular                              0.3.3
ipykernel                                6.29.5
ipython                                  9.1.0
ipython_pygments_lexers                  1.1.1
ipywidgets                               8.1.6
isoduration                              20.11.0
jedi                                     0.19.2
jieba                                    0.42.1
Jinja2                                   3.1.6
jiter                                    0.9.0
jmespath                                 0.10.0
joblib                                   1.4.2
json5                                    0.12.0
jsonlines                                4.0.0
jsonpatch                                1.33
jsonpointer                              2.1
jsonschema                               4.23.0
jsonschema-specifications                2024.10.1
jupyter                                  1.1.1
jupyter_client                           8.6.3
jupyter-console                          6.6.3
jupyter_core                             5.7.2
jupyter-events                           0.12.0
jupyter-lsp                              2.2.5
jupyter_server                           2.15.0
jupyter_server_terminals                 0.5.3
jupyterlab                               4.4.0
jupyterlab_pygments                      0.3.0
jupyterlab_server                        2.27.3
jupyterlab_widgets                       3.0.14
kiwisolver                               1.4.8
lagent                                   0.2.4
langdetect                               1.0.9
lark                                     1.2.2
latex2sympy2                             1.9.1
Levenshtein                              0.27.1
libmambapy                               1.5.11
llguidance                               0.7.19
llvmlite                                 0.43.0
lm-format-enforcer                       0.10.11
lxml                                     5.3.2
Markdown                                 3.8
markdown-it-py                           3.0.0
MarkupSafe                               3.0.2
matplotlib                               3.10.1
matplotlib-inline                        0.1.7
mdurl                                    0.1.2
menuinst                                 2.2.0
mistral_common                           1.5.4
mistune                                  3.1.3
mmengine                                 0.10.7
mmengine-lite                            0.10.7
modelscope                               1.25.0
mpmath                                   1.3.0
ms-opencompass                           0.1.6
ms_swift                                 3.3.1          /root/code/ms-swift-3.3.1
ms-vlmeval                               0.0.16
msgpack                                  1.1.0
msgspec                                  0.19.0
multidict                                6.4.3
multiprocess                             0.70.16
narwhals                                 1.35.0
nbclient                                 0.10.2
nbconvert                                7.16.6
nbformat                                 5.10.4
nest-asyncio                             1.6.0
networkx                                 3.4.2
ninja                                    1.11.1.4
nltk                                     3.9.1
notebook                                 7.4.0
notebook_shim                            0.2.4
numba                                    0.60.0
numpy                                    1.26.4
nvidia-cublas-cu12                       12.4.5.8
nvidia-cuda-cupti-cu12                   12.4.127
nvidia-cuda-nvrtc-cu12                   12.4.127
nvidia-cuda-runtime-cu12                 12.4.127
nvidia-cudnn-cu12                        9.1.0.70
nvidia-cufft-cu12                        11.2.1.3
nvidia-curand-cu12                       10.3.5.147
nvidia-cusolver-cu12                     11.6.1.9
nvidia-cusparse-cu12                     12.3.1.170
nvidia-cusparselt-cu12                   0.6.2
nvidia-ml-py                             12.570.86
nvidia-nccl-cu12                         2.21.5
nvidia-nvjitlink-cu12                    12.4.127
nvidia-nvtx-cu12                         12.4.127
omegaconf                                2.0.0
openai                                   1.74.0
OpenCC                                   1.1.9
opencensus                               0.11.4
opencensus-context                       0.1.3
opencv-python                            4.11.0.86
opencv-python-headless                   4.11.0.86
openpyxl                                 3.1.5
opentelemetry-api                        1.26.0
opentelemetry-exporter-otlp              1.26.0
opentelemetry-exporter-otlp-proto-common 1.26.0
opentelemetry-exporter-otlp-proto-grpc   1.26.0
opentelemetry-exporter-otlp-proto-http   1.26.0
opentelemetry-proto                      1.26.0
opentelemetry-sdk                        1.26.0
opentelemetry-semantic-conventions       0.47b0
opentelemetry-semantic-conventions-ai    0.4.3
orjson                                   3.10.16
oss2                                     2.19.1
outlines                                 0.1.11
outlines_core                            0.1.26
overrides                                7.7.0
packaging                                24.1
pandas                                   2.2.3
pandocfilters                            1.5.1
parso                                    0.8.4
partial-json-parser                      0.2.1.1.post5
peft                                     0.15.1
pexpect                                  4.9.0
phx-class-registry                       4.1.0
pillow                                   11.2.1
pip                                      24.2
platformdirs                             3.10.0
pluggy                                   1.5.0
portalocker                              3.1.1
prettytable                              3.16.0
prometheus_client                        0.21.1
prometheus-fastapi-instrumentator        7.1.0
prompt_toolkit                           3.0.50
propcache                                0.3.1
proto-plus                               1.26.1
protobuf                                 4.25.7
psutil                                   7.0.0
ptyprocess                               0.7.0
pure_eval                                0.2.3
py-cpuinfo                               9.0.0
py-spy                                   0.4.0
pyarrow                                  19.0.1
pyasn1                                   0.6.1
pyasn1_modules                           0.4.2
pybind11                                 2.13.6
pycosat                                  0.6.6
pycountry                                24.6.1
pycparser                                2.21
pycryptodome                             3.22.0
pydantic                                 2.11.3
pydantic_core                            2.33.1
pydeck                                   0.9.1
pydub                                    0.25.1
Pygments                                 2.19.1
pynvml                                   12.0.0
pyparsing                                3.2.3
pypinyin                                 0.54.0
PySocks                                  1.7.1
pytest                                   8.3.5
python-dateutil                          2.9.0.post0
python-dotenv                            1.1.0
python-json-logger                       3.3.0
python-Levenshtein                       0.27.1
python-multipart                         0.0.20
pytz                                     2025.2
PyYAML                                   6.0.2
pyzmq                                    26.4.0
qwen-vl-utils                            0.0.11
rank-bm25                                0.2.2
RapidFuzz                                3.13.0
ray                                      2.40.0
referencing                              0.36.2
regex                                    2024.11.6
requests                                 2.32.3
rfc3339-validator                        0.1.4
rfc3986-validator                        0.1.1
rich                                     14.0.0
rich-toolkit                             0.14.1
rouge                                    1.0.1
rouge-chinese                            1.0.3
rouge_score                              0.1.2
rpds-py                                  0.24.0
rsa                                      4.9.1
ruamel.yaml                              0.18.6
ruamel.yaml.clib                         0.2.8
ruff                                     0.11.5
s3transfer                               0.11.4
sacrebleu                                2.5.1
safehttpx                                0.1.6
safetensors                              0.5.3
scikit-learn                             1.6.1
scipy                                    1.15.2
seaborn                                  0.13.2
semantic-version                         2.10.0
Send2Trash                               1.8.3
sentence-transformers                    4.0.2
sentencepiece                            0.2.0
setuptools                               79.0.1
sgmllib3k                                1.0.0
shellingham                              1.5.4
simplejson                               3.20.1
six                                      1.17.0
smart-open                               7.1.0
smmap                                    5.0.2
sniffio                                  1.3.1
socksio                                  1.0.0
sortedcontainers                         2.4.0
soupsieve                                2.6
stack-data                               0.6.3
starlette                                0.46.2
streamlit                                1.44.1
sty                                      1.0.6
swankit                                  0.1.7
swanlab                                  0.5.5
sympy                                    1.13.1
tabulate                                 0.9.0
tenacity                                 9.1.2
tensorboard                              2.19.0
tensorboard-data-server                  0.7.2
termcolor                                3.0.1
terminado                                0.18.1
threadpoolctl                            3.6.0
tiktoken                                 0.9.0
timeout-decorator                        0.5.0
tinycss2                                 1.4.0
tokenizers                               0.21.1
toml                                     0.10.2
tomlkit                                  0.13.2
torch                                    2.5.1
torchaudio                               2.5.1
torchvision                              0.20.1
tornado                                  6.4.2
tqdm                                     4.66.5
traitlets                                5.14.3
transformers                             4.51.0
transformers-stream-generator            0.0.5
triton                                   3.1.0
trl                                      0.16.1
truststore                               0.8.0
typer                                    0.15.2
types-python-dateutil                    2.9.0.20241206
typing_extensions                        4.13.2
typing-inspection                        0.4.0
tzdata                                   2025.2
uri-template                             1.3.0
urllib3                                  2.2.3
uvicorn                                  0.34.1
uvloop                                   0.21.0
validators                               0.34.0
virtualenv                               20.30.0
vllm                                     0.7.3
watchdog                                 6.0.0
watchfiles                               1.0.5
wcwidth                                  0.2.13
webcolors                                24.11.1
webencodings                             0.5.1
websocket-client                         1.8.0
websockets                               15.0.1
Werkzeug                                 3.1.3
wheel                                    0.44.0
widgetsnbextension                       4.0.14
word2number                              1.1
wrapt                                    1.17.2
xformers                                 0.0.28.post3
xgrammar                                 0.1.11
XlsxWriter                               3.2.2
xtuner                                   0.1.11
xxhash                                   3.5.0
yapf                                     0.43.0
yarl                                     1.19.0
zipp                                     3.21.0
zstandard                                0.23.0

Additional context
Add any other context about the problem here
I also tried the approach described in #3138, but it did not help.

@Jintao-Huang
Collaborator

Remove the NPROC_PER_NODE=8 \ line.
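
For reference, here is a minimal sketch of the reporter's script with that one line removed, assuming every other argument stays unchanged. The presumed intent is to start a single process and leave the sharding across the 8 GPUs to vLLM through --tensor_parallel_size, but that reading is an assumption rather than something stated in this thread. The DRY_RUN variable is hypothetical and exists only in this sketch; it is not a swift option.

```shell
# Sketch only: the reporter's command with the NPROC_PER_NODE=8 line removed.
# Paths and values are copied from the report; nothing else is changed.
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export MAX_PIXELS=602112

# Store the command in the positional parameters so it can be printed or run.
set -- swift infer \
    --model /root/Qwen2.5-VL-72B-Instruct \
    --infer_backend vllm \
    --val_dataset /root/code/ms-swift-3.3.1/data/hhhhh.jsonl \
    --gpu_memory_utilization 0.9 \
    --limit_mm_per_prompt '{"image": 1}' \
    --tensor_parallel_size 8

# DRY_RUN is a hypothetical switch for this sketch, not a swift option.
# It defaults to 1, which only prints the command; set DRY_RUN=0 to run it.
if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
else
    "$@"
fi
```

Note that the reporter says above that removing this line was already tried and the hang persisted, so this may not be sufficient on its own.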

@mangoyuan

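For me it always hangs for a long time at 1023, or close to 1024.

[Screenshot attached to the original comment]

This is my inference script. I hit the same long hang with the 32b, 72b, and 72b-awq models.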

env MIN_PIXELS=3136 MAX_PIXELS=1003520 swift infer \
    --model Qwen/Qwen2_5-VL-32B-Instruct  \
    --val_dataset $DATA_JSONL \
    --max_length 8192 \
    --infer_backend vllm \
    --ckpt_dir $CKPT_PATH \
    --result_path xxx.jsonl \
    --tensor-parallel-size 8 \
    --attn_impl flash_attn
