[Bug]: vllm-ascend offline 1P1D disaggregated prefill fails, incompatible with vllm #1074

Open
1027866388 opened this issue Jun 5, 2025 · 0 comments
Labels
bug Something isn't working

@1027866388
Your current environment

INFO 06-05 11:21:09 [importing.py:16] Triton not installed or not compatible; certain GPU-related functions will not be available.
WARNING 06-05 11:21:09 [importing.py:28] Triton is not installed. Using dummy decorators. Install it via pip install triton to enable kernel compilation.
INFO 06-05 11:21:11 [init.py:38] Available plugins for group vllm.platform_plugins:
INFO 06-05 11:21:11 [init.py:40] - ascend -> vllm_ascend:register
INFO 06-05 11:21:11 [init.py:43] All plugins in this group will be loaded. Set VLLM_PLUGINS to control which plugins to load.
INFO 06-05 11:21:11 [init.py:234] Platform plugin ascend is activated
WARNING 06-05 11:21:14 [_custom_ops.py:21] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
Collecting environment information...
PyTorch version: 2.5.1
Is debug build: False

OS: Ubuntu Jammy Jellyfish (development branch) (aarch64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 4.0.2
Libc version: glibc-2.35

Python version: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 21:44:20) [GCC 12.3.0] (64-bit runtime)
Python platform: Linux-4.19.90-2107.6.0.0192.8.oe1.bclinux.aarch64-aarch64-with-glibc2.35

CPU:
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Vendor ID: HiSilicon
Model name: Kunpeng-920
Model: 0
Thread(s) per core: 1
Core(s) per cluster: 48
Socket(s): -
Cluster(s): 4
Stepping: 0x1
BogoMIPS: 200.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm ssbs
L1d cache: 12 MiB (192 instances)
L1i cache: 12 MiB (192 instances)
L2 cache: 96 MiB (192 instances)
L3 cache: 192 MiB (8 instances)
NUMA node(s): 8
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
NUMA node4 CPU(s): 96-119
NUMA node5 CPU(s): 120-143
NUMA node6 CPU(s): 144-167
NUMA node7 CPU(s): 168-191
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] gpytorch==1.14
[pip3] numpy==1.26.4
[pip3] pyzmq==26.2.1
[pip3] torch==2.5.1
[pip3] torch-npu==2.5.1
[pip3] torchair==0.1
[pip3] torchvision==0.20.1
[pip3] transformers==4.52.4
[pip3] transformers-stream-generator==0.0.5
[conda] gpytorch 1.14 pypi_0 pypi
[conda] numpy 1.26.4 pypi_0 pypi
[conda] pyzmq 26.2.1 pypi_0 pypi
[conda] torch 2.5.1 pypi_0 pypi
[conda] torch-npu 2.5.1 pypi_0 pypi
[conda] torchair 0.1 pypi_0 pypi
[conda] torchvision 0.20.1 pypi_0 pypi
[conda] transformers 4.52.4 pypi_0 pypi
[conda] transformers-stream-generator 0.0.5 pypi_0 pypi
vLLM Version: 0.9.1.dev129+g6d18ed2a2 (git sha: 6d18ed2a2)
vLLM Ascend Version: 0.8.5rc2.dev69+g068c3a0.d20250603 (git sha: 068c3a0, date: 20250603)

ENV Variables:
ASCEND_VISIBLE_DEVICES=4,5,7,1
ASCEND_RUNTIME_OPTIONS=
ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
ASCEND_OPP_PATH=/usr/local/Ascend/ascend-toolkit/latest/opp
LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/driver/lib64:/usr/local/Ascend/driver/lib64/common:/usr/local/Ascend/driver/lib64/driver:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64:/usr/local/Ascend/ascend-toolkit/latest/tools/aml/lib64/plugin:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/driver/lib64:/usr/local/Ascend/driver/lib64/common:/usr/local/Ascend/driver/lib64/driver:/usr/local/Ascend/ascend-toolkit/latest/lib64:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/opskernel:/usr/local/Ascend/ascend-toolkit/latest/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64:/usr/local/Ascend/driver/lib64:/usr/local/Ascend/driver/lib64/common:/usr/local/Ascend/driver/lib64/driver
ASCEND_AICPU_PATH=/usr/local/Ascend/ascend-toolkit/latest
ASCEND_HOME_PATH=/usr/local/Ascend/ascend-toolkit/latest
TORCH_DEVICE_BACKEND_AUTOLOAD=1
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1

NPU:
+------------------------------------------------------------------------------------------------+
| npu-smi 24.1.0.3 Version: 24.1.0.3 |
+---------------------------+---------------+----------------------------------------------------+
| NPU Name | Health | Power(W) Temp(C) Hugepages-Usage(page)|
| Chip | Bus-Id | AICore(%) Memory-Usage(MB) HBM-Usage(MB) |
+===========================+===============+====================================================+
| 1 910B2 | OK | 94.8 42 0 / 0 |
| 0 | 0000:01:00.0 | 0 0 / 0 3389 / 65536 |
+===========================+===============+====================================================+
| 4 910B2 | OK | 97.8 40 0 / 0 |
| 0 | 0000:81:00.0 | 0 0 / 0 3390 / 65536 |
+===========================+===============+====================================================+
| 5 910B2 | OK | 98.2 44 0 / 0 |
| 0 | 0000:41:00.0 | 0 0 / 0 3386 / 65536 |
+===========================+===============+====================================================+
| 7 910B2 | OK | 97.9 45 0 / 0 |
| 0 | 0000:42:00.0 | 0 0 / 0 3389 / 65536 |
+===========================+===============+====================================================+
+---------------------------+---------------+----------------------------------------------------+
| NPU Chip | Process id | Process name | Process memory(MB) |
+===========================+===============+====================================================+
| No running processes found in NPU 1 |
+===========================+===============+====================================================+
| No running processes found in NPU 4 |
+===========================+===============+====================================================+
| No running processes found in NPU 5 |
+===========================+===============+====================================================+
| No running processes found in NPU 7 |
+===========================+===============+====================================================+

CANN:
package_name=Ascend-cann-toolkit
version=8.0.0
innerversion=V100R001C20SPC001B251
compatible_version=[V100R001C15],[V100R001C17],[V100R001C18],[V100R001C19],[V100R001C20]
arch=aarch64
os=linux
path=/usr/local/Ascend/ascend-toolkit/8.0.0/aarch64-linux

🐛 Describe the bug

python offline_disaggregated_prefill_npu.py

INFO 06-05 11:24:52 [importing.py:16] Triton not installed or not compatible; certain GPU-related functions will not be available.
WARNING 06-05 11:24:52 [importing.py:28] Triton is not installed. Using dummy decorators. Install it via pip install triton to enable kernel compilation.
INFO 06-05 11:24:52 [importing.py:16] Triton not installed or not compatible; certain GPU-related functions will not be available.
WARNING 06-05 11:24:52 [importing.py:28] Triton is not installed. Using dummy decorators. Install it via pip install triton to enable kernel compilation.
INFO 06-05 11:24:55 [init.py:38] Available plugins for group vllm.platform_plugins:
INFO 06-05 11:24:55 [init.py:40] - ascend -> vllm_ascend:register
INFO 06-05 11:24:55 [init.py:43] All plugins in this group will be loaded. Set VLLM_PLUGINS to control which plugins to load.
INFO 06-05 11:24:55 [init.py:234] Platform plugin ascend is activated
INFO 06-05 11:24:55 [init.py:38] Available plugins for group vllm.platform_plugins:
INFO 06-05 11:24:55 [init.py:40] - ascend -> vllm_ascend:register
INFO 06-05 11:24:55 [init.py:43] All plugins in this group will be loaded. Set VLLM_PLUGINS to control which plugins to load.
INFO 06-05 11:24:55 [init.py:234] Platform plugin ascend is activated
WARNING 06-05 11:24:57 [_custom_ops.py:21] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
WARNING 06-05 11:24:57 [_custom_ops.py:21] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
Process Process-2:
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/workspace/mnt/wangyanhui/vllm-ascend/examples/offline_disaggregated_prefill_npu.py", line 89, in run_decode
ktc = KVTransferConfig.from_cli(
AttributeError: type object 'KVTransferConfig' has no attribute 'from_cli'
Process Process-1:
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/workspace/mnt/wangyanhui/vllm-ascend/examples/offline_disaggregated_prefill_npu.py", line 49, in run_prefill
ktc = KVTransferConfig.from_cli(
AttributeError: type object 'KVTransferConfig' has no attribute 'from_cli'
All process done!
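The traceback shows that the installed vLLM build (0.9.1.dev129) no longer provides the `KVTransferConfig.from_cli` classmethod that the vllm-ascend example script still calls. A likely workaround until the example is updated is to parse the JSON string yourself and pass the fields to the config constructor as keyword arguments. A minimal sketch of that pattern is below; the `KVTransferConfig` here is a dataclass stand-in for illustration only, and the connector name and field values are illustrative, not taken from the real vllm-ascend example:

```python
import json
from dataclasses import dataclass


# Stand-in for vllm.config.KVTransferConfig, for illustration only.
# The real class lives in vLLM; the assumption here is that it accepts
# these fields as keyword arguments.
@dataclass
class KVTransferConfig:
    kv_connector: str
    kv_role: str
    kv_rank: int
    kv_parallel_size: int


def kv_transfer_config_from_json(json_str: str) -> KVTransferConfig:
    # Replacement for the removed KVTransferConfig.from_cli():
    # parse the JSON string and construct the config directly.
    return KVTransferConfig(**json.loads(json_str))


# Illustrative values only; use the ones from your own 1P1D setup.
ktc = kv_transfer_config_from_json(
    '{"kv_connector": "SomeConnector", "kv_role": "kv_producer", '
    '"kv_rank": 0, "kv_parallel_size": 2}'
)
print(ktc.kv_role)
```

The same substitution would apply at both call sites in the traceback (`run_prefill` at line 49 and `run_decode` at line 89 of `offline_disaggregated_prefill_npu.py`).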

@1027866388 1027866388 added the bug Something isn't working label Jun 5, 2025