Describe the bug
What the bug is, and how to reproduce, better with screenshots
Single-GPU inference with the 7B model works, but multi-GPU inference with the 72B model hangs. I also tried removing the NPROC_PER_NODE=8 line.
Launch script
NPROC_PER_NODE=8 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MAX_PIXELS=602112 \
swift infer \
    --model /root/Qwen2.5-VL-72B-Instruct \
    --infer_backend vllm \
    --val_dataset /root/code/ms-swift-3.3.1/data/hhhhh.jsonl \
    --gpu_memory_utilization 0.9 \
    --limit_mm_per_prompt '{"image": 1}' \
    --tensor_parallel_size 8
Screenshot
Your hardware and system info
Write your system info like CUDA version/system/GPU/torch version here
The environment is as follows:
Package Version Editable project location ---------------------------------------- -------------- ------------------------- absl-py 2.2.2 accelerate 1.6.0 addict 2.4.0 aiofiles 24.1.0 aiohappyeyeballs 2.6.1 aiohttp 3.11.16 aiohttp-cors 0.8.1 aiosignal 1.3.2 airportsdata 20250224 aliyun-python-sdk-core 2.16.0 aliyun-python-sdk-kms 2.16.5 altair 5.5.0 anaconda-anon-usage 0.5.0 annotated-types 0.7.0 antlr4-python3-runtime 4.7.2 anyio 4.9.0 archspec 0.2.3 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 arxiv 2.2.0 astor 0.8.1 asttokens 3.0.0 async-lru 2.0.5 attrdict 2.0.1 attrs 25.3.0 av 14.3.0 babel 2.17.0 beautifulsoup4 4.13.3 binpacking 1.5.2 bitsandbytes 0.45.5 blake3 1.0.4 bleach 6.2.0 blessed 1.21.0 blinker 1.9.0 boltons 23.0.0 boto3 1.37.34 botocore 1.37.34 Brotli 1.0.9 cachetools 5.5.2 certifi 2024.8.30 cffi 1.17.1 charset-normalizer 3.3.2 click 8.1.8 cloudpickle 3.1.1 colorama 0.4.6 colorful 0.5.6 comm 0.2.2 compressed-tensors 0.9.1 conda 24.11.1 conda-anaconda-telemetry 0.1.1 conda-content-trust 0.2.0 conda-libmamba-solver 24.9.0 conda-package-handling 2.4.0 conda_package_streaming 0.11.0 contourpy 1.3.1 cpm-kernels 1.0.11 crcmod 1.7 cryptography 43.0.3 cupy-cuda12x 13.4.1 cycler 0.12.1 dacite 1.9.2 datasets 3.2.0 debugpy 1.8.14 decorator 5.2.1 decord 0.6.0 deepspeed 0.16.5 defusedxml 0.7.1 Deprecated 1.2.18 depyf 0.18.0 dill 0.3.8 diskcache 5.6.3 distlib 0.3.9 distro 1.9.0 dnspython 2.7.0 duckduckgo_search 5.3.1b1 einops 0.8.1 email_validator 2.2.0 et_xmlfile 2.0.0 evalscope 0.14.0 evaluate 0.4.3 executing 2.2.0 fastapi 0.115.12 fastapi-cli 0.0.7 fastjsonschema 2.21.1 fastrlock 0.8.3 feedparser 6.0.11 ffmpy 0.5.0 filelock 3.18.0 fire 0.7.0 flash_attn 2.7.4.post1 fonttools 4.57.0 fqdn 1.5.1 frozendict 2.4.2 frozenlist 1.5.0 fsspec 2024.9.0 func_timeout 4.3.5 future 1.0.0 fuzzywuzzy 0.18.0 gguf 0.10.0 gitdb 4.0.12 GitPython 3.1.44 google-api-core 2.24.2 google-auth 2.39.0 googleapis-common-protos 1.70.0 gpustat 1.1.1 gradio 5.25.1 gradio_client 
1.8.0 griffe 0.49.0 groovy 0.1.2 grpcio 1.71.0 h11 0.14.0 h2 4.2.0 h5py 3.13.0 hf-xet 1.0.5 hjson 3.1.0 hpack 4.1.0 httpcore 1.0.8 httptools 0.6.4 httpx 0.28.1 huggingface-hub 0.30.2 human-eval 1.0.3 hyperframe 6.1.0 idna 3.7 imageio 2.37.0 immutabledict 4.2.1 importlib_metadata 8.0.0 iniconfig 2.1.0 interegular 0.3.3 ipykernel 6.29.5 ipython 9.1.0 ipython_pygments_lexers 1.1.1 ipywidgets 8.1.6 isoduration 20.11.0 jedi 0.19.2 jieba 0.42.1 Jinja2 3.1.6 jiter 0.9.0 jmespath 0.10.0 joblib 1.4.2 json5 0.12.0 jsonlines 4.0.0 jsonpatch 1.33 jsonpointer 2.1 jsonschema 4.23.0 jsonschema-specifications 2024.10.1 jupyter 1.1.1 jupyter_client 8.6.3 jupyter-console 6.6.3 jupyter_core 5.7.2 jupyter-events 0.12.0 jupyter-lsp 2.2.5 jupyter_server 2.15.0 jupyter_server_terminals 0.5.3 jupyterlab 4.4.0 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.3 jupyterlab_widgets 3.0.14 kiwisolver 1.4.8 lagent 0.2.4 langdetect 1.0.9 lark 1.2.2 latex2sympy2 1.9.1 Levenshtein 0.27.1 libmambapy 1.5.11 llguidance 0.7.19 llvmlite 0.43.0 lm-format-enforcer 0.10.11 lxml 5.3.2 Markdown 3.8 markdown-it-py 3.0.0 MarkupSafe 3.0.2 matplotlib 3.10.1 matplotlib-inline 0.1.7 mdurl 0.1.2 menuinst 2.2.0 mistral_common 1.5.4 mistune 3.1.3 mmengine 0.10.7 mmengine-lite 0.10.7 modelscope 1.25.0 mpmath 1.3.0 ms-opencompass 0.1.6 ms_swift 3.3.1 /root/code/ms-swift-3.3.1 ms-vlmeval 0.0.16 msgpack 1.1.0 msgspec 0.19.0 multidict 6.4.3 multiprocess 0.70.16 narwhals 1.35.0 nbclient 0.10.2 nbconvert 7.16.6 nbformat 5.10.4 nest-asyncio 1.6.0 networkx 3.4.2 ninja 1.11.1.4 nltk 3.9.1 notebook 7.4.0 notebook_shim 0.2.4 numba 0.60.0 numpy 1.26.4 nvidia-cublas-cu12 12.4.5.8 nvidia-cuda-cupti-cu12 12.4.127 nvidia-cuda-nvrtc-cu12 12.4.127 nvidia-cuda-runtime-cu12 12.4.127 nvidia-cudnn-cu12 9.1.0.70 nvidia-cufft-cu12 11.2.1.3 nvidia-curand-cu12 10.3.5.147 nvidia-cusolver-cu12 11.6.1.9 nvidia-cusparse-cu12 12.3.1.170 nvidia-cusparselt-cu12 0.6.2 nvidia-ml-py 12.570.86 nvidia-nccl-cu12 2.21.5 nvidia-nvjitlink-cu12 12.4.127 
nvidia-nvtx-cu12 12.4.127 omegaconf 2.0.0 openai 1.74.0 OpenCC 1.1.9 opencensus 0.11.4 opencensus-context 0.1.3 opencv-python 4.11.0.86 opencv-python-headless 4.11.0.86 openpyxl 3.1.5 opentelemetry-api 1.26.0 opentelemetry-exporter-otlp 1.26.0 opentelemetry-exporter-otlp-proto-common 1.26.0 opentelemetry-exporter-otlp-proto-grpc 1.26.0 opentelemetry-exporter-otlp-proto-http 1.26.0 opentelemetry-proto 1.26.0 opentelemetry-sdk 1.26.0 opentelemetry-semantic-conventions 0.47b0 opentelemetry-semantic-conventions-ai 0.4.3 orjson 3.10.16 oss2 2.19.1 outlines 0.1.11 outlines_core 0.1.26 overrides 7.7.0 packaging 24.1 pandas 2.2.3 pandocfilters 1.5.1 parso 0.8.4 partial-json-parser 0.2.1.1.post5 peft 0.15.1 pexpect 4.9.0 phx-class-registry 4.1.0 pillow 11.2.1 pip 24.2 platformdirs 3.10.0 pluggy 1.5.0 portalocker 3.1.1 prettytable 3.16.0 prometheus_client 0.21.1 prometheus-fastapi-instrumentator 7.1.0 prompt_toolkit 3.0.50 propcache 0.3.1 proto-plus 1.26.1 protobuf 4.25.7 psutil 7.0.0 ptyprocess 0.7.0 pure_eval 0.2.3 py-cpuinfo 9.0.0 py-spy 0.4.0 pyarrow 19.0.1 pyasn1 0.6.1 pyasn1_modules 0.4.2 pybind11 2.13.6 pycosat 0.6.6 pycountry 24.6.1 pycparser 2.21 pycryptodome 3.22.0 pydantic 2.11.3 pydantic_core 2.33.1 pydeck 0.9.1 pydub 0.25.1 Pygments 2.19.1 pynvml 12.0.0 pyparsing 3.2.3 pypinyin 0.54.0 PySocks 1.7.1 pytest 8.3.5 python-dateutil 2.9.0.post0 python-dotenv 1.1.0 python-json-logger 3.3.0 python-Levenshtein 0.27.1 python-multipart 0.0.20 pytz 2025.2 PyYAML 6.0.2 pyzmq 26.4.0 qwen-vl-utils 0.0.11 rank-bm25 0.2.2 RapidFuzz 3.13.0 ray 2.40.0 referencing 0.36.2 regex 2024.11.6 requests 2.32.3 rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rich 14.0.0 rich-toolkit 0.14.1 rouge 1.0.1 rouge-chinese 1.0.3 rouge_score 0.1.2 rpds-py 0.24.0 rsa 4.9.1 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.8 ruff 0.11.5 s3transfer 0.11.4 sacrebleu 2.5.1 safehttpx 0.1.6 safetensors 0.5.3 scikit-learn 1.6.1 scipy 1.15.2 seaborn 0.13.2 semantic-version 2.10.0 Send2Trash 1.8.3 sentence-transformers 
4.0.2 sentencepiece 0.2.0 setuptools 79.0.1 sgmllib3k 1.0.0 shellingham 1.5.4 simplejson 3.20.1 six 1.17.0 smart-open 7.1.0 smmap 5.0.2 sniffio 1.3.1 socksio 1.0.0 sortedcontainers 2.4.0 soupsieve 2.6 stack-data 0.6.3 starlette 0.46.2 streamlit 1.44.1 sty 1.0.6 swankit 0.1.7 swanlab 0.5.5 sympy 1.13.1 tabulate 0.9.0 tenacity 9.1.2 tensorboard 2.19.0 tensorboard-data-server 0.7.2 termcolor 3.0.1 terminado 0.18.1 threadpoolctl 3.6.0 tiktoken 0.9.0 timeout-decorator 0.5.0 tinycss2 1.4.0 tokenizers 0.21.1 toml 0.10.2 tomlkit 0.13.2 torch 2.5.1 torchaudio 2.5.1 torchvision 0.20.1 tornado 6.4.2 tqdm 4.66.5 traitlets 5.14.3 transformers 4.51.0 transformers-stream-generator 0.0.5 triton 3.1.0 trl 0.16.1 truststore 0.8.0 typer 0.15.2 types-python-dateutil 2.9.0.20241206 typing_extensions 4.13.2 typing-inspection 0.4.0 tzdata 2025.2 uri-template 1.3.0 urllib3 2.2.3 uvicorn 0.34.1 uvloop 0.21.0 validators 0.34.0 virtualenv 20.30.0 vllm 0.7.3 watchdog 6.0.0 watchfiles 1.0.5 wcwidth 0.2.13 webcolors 24.11.1 webencodings 0.5.1 websocket-client 1.8.0 websockets 15.0.1 Werkzeug 3.1.3 wheel 0.44.0 widgetsnbextension 4.0.14 word2number 1.1 wrapt 1.17.2 xformers 0.0.28.post3 xgrammar 0.1.11 XlsxWriter 3.2.2 xtuner 0.1.11 xxhash 3.5.0 yapf 0.43.0 yarl 1.19.0 zipp 3.21.0 zstandard 0.23.0
Additional context
Add any other context about the problem here
I also tried the approach described in #3138, but it did not work.
Remove the NPROC_PER_NODE=8 \ line.
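For reference, the suggestion above can be sketched as follows: the same flags as the original report, with NPROC_PER_NODE dropped and also unset in case it is inherited from the calling shell. The RUN variable is a hypothetical switch added for this sketch only; by default the script just prints the command it would run. Whether this resolves the hang is not confirmed in the thread (the reporter states they had already tried removing that line).

```shell
#!/usr/bin/env bash
# Sketch only: the reporter's command without NPROC_PER_NODE.
set -euo pipefail

# Clear any NPROC_PER_NODE inherited from the parent shell.
unset NPROC_PER_NODE

export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export MAX_PIXELS=602112

# Flags copied verbatim from the original report.
cmd=(
  swift infer
  --model /root/Qwen2.5-VL-72B-Instruct
  --infer_backend vllm
  --val_dataset /root/code/ms-swift-3.3.1/data/hhhhh.jsonl
  --gpu_memory_utilization 0.9
  --limit_mm_per_prompt '{"image": 1}'
  --tensor_parallel_size 8
)

# RUN is a hypothetical switch for this sketch: RUN=1 executes the command,
# anything else only prints it (shell-quoted) for inspection.
if [[ "${RUN:-0}" == "1" ]]; then
  "${cmd[@]}"
else
  printf '%q ' "${cmd[@]}"
  echo
fi
```

Saved under a name of your choice, it can be checked with a plain invocation first and then launched with RUN=1.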
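For me it always hangs for a long time at 1023, or close to 1024.
This is my inference script. I hit the same long hang with the 32B, 72B, and 72B-AWQ models.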
env MIN_PIXELS=3136 MAX_PIXELS=1003520 swift infer \
    --model Qwen/Qwen2_5-VL-32B-Instruct \
    --val_dataset $DATA_JSONL \
    --max_length 8192 \
    --infer_backend vllm \
    --ckpt_dir $CKPT_PATH \
    --result_path xxx.jsonl \
    --tensor-parallel-size 8 \
    --attn_impl flash_attn