vLLM startup error #204

@dyhuachi

System Info

vllm 0.11.0 pypi_0 pypi
cuda 12.4
NVIDIA GeForce RTX 4090 48 GB, 8 GPUs
Launch command:
HF_HUB_OFFLINE=1 vllm serve "./GLM-4.1V-9B-Thinking" --trust-remote-code --dtype bfloat16 --max-model-len 8192 --tensor-parallel-size 8 --media-io-kwargs '{"video": {"num_frames": -1}}'
Error:
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] EngineCore failed to start.
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] Traceback (most recent call last):
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core.py", line 699, in run_engine_core
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core.py", line 498, in __init__
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] super().__init__(vllm_config, executor_class, log_stats,
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] executor_fail_callback)
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] ^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core.py", line 83, in __init__
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] ~~~~~~~~~~~~~~^^^^^^^^^^^^^
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/executor/executor_base.py", line 54, in __init__
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] self._init_executor()
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] ~~~~~~~~~~~~~~~~~~~^^
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/executor/multiproc_executor.py", line 106, in _init_executor
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] self.workers = WorkerProc.wait_for_ready(unready_workers)
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/executor/multiproc_executor.py", line 509, in wait_for_ready
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] raise e from None
(EngineCore_DP0 pid=37512) ERROR 10-21 02:11:42 [core.py:708] Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
(EngineCore_DP0 pid=37512) Process EngineCore_DP0:
(EngineCore_DP0 pid=37512) Traceback (most recent call last):
(EngineCore_DP0 pid=37512) File "/home/tanhe/miniconda3/lib/python3.13/multiprocessing/process.py", line 313, in _bootstrap
(EngineCore_DP0 pid=37512) self.run()
(EngineCore_DP0 pid=37512) ~~~~~~~~^^
(EngineCore_DP0 pid=37512) File "/home/tanhe/miniconda3/lib/python3.13/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=37512) self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=37512) ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=37512) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core.py", line 712, in run_engine_core
(EngineCore_DP0 pid=37512) raise e
(EngineCore_DP0 pid=37512) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core.py", line 699, in run_engine_core
(EngineCore_DP0 pid=37512) engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=37512) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core.py", line 498, in __init__
(EngineCore_DP0 pid=37512) super().__init__(vllm_config, executor_class, log_stats,
(EngineCore_DP0 pid=37512) ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=37512) executor_fail_callback)
(EngineCore_DP0 pid=37512) ^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=37512) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core.py", line 83, in __init__
(EngineCore_DP0 pid=37512) self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=37512) ~~~~~~~~~~~~~~^^^^^^^^^^^^^
(EngineCore_DP0 pid=37512) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/executor/executor_base.py", line 54, in __init__
(EngineCore_DP0 pid=37512) self._init_executor()
(EngineCore_DP0 pid=37512) ~~~~~~~~~~~~~~~~~~~^^
(EngineCore_DP0 pid=37512) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/executor/multiproc_executor.py", line 106, in _init_executor
(EngineCore_DP0 pid=37512) self.workers = WorkerProc.wait_for_ready(unready_workers)
(EngineCore_DP0 pid=37512) ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=37512) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/executor/multiproc_executor.py", line 509, in wait_for_ready
(EngineCore_DP0 pid=37512) raise e from None
(EngineCore_DP0 pid=37512) Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
(APIServer pid=37366) Traceback (most recent call last):
(APIServer pid=37366) File "/home/tanhe/miniconda3/bin/vllm", line 8, in <module>
(APIServer pid=37366) sys.exit(main())
(APIServer pid=37366) ~~~~^^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/entrypoints/cli/main.py", line 54, in main
(APIServer pid=37366) args.dispatch_function(args)
(APIServer pid=37366) ~~~~~~~~~~~~~~~~~~~~~~^^^^^^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/entrypoints/cli/serve.py", line 57, in cmd
(APIServer pid=37366) uvloop.run(run_server(args))
(APIServer pid=37366) ~~~~~~~~~~^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=37366) return __asyncio.run(
(APIServer pid=37366) ~~~~~~~~~~~~~^
(APIServer pid=37366) wrapper(),
(APIServer pid=37366) ^^^^^^^^^^
(APIServer pid=37366) ...<2 lines>...
(APIServer pid=37366) **run_kwargs
(APIServer pid=37366) ^^^^^^^^^^^^
(APIServer pid=37366) )
(APIServer pid=37366) ^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/asyncio/runners.py", line 195, in run
(APIServer pid=37366) return runner.run(main)
(APIServer pid=37366) ~~~~~~~~~~^^^^^^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/asyncio/runners.py", line 118, in run
(APIServer pid=37366) return self._loop.run_until_complete(task)
(APIServer pid=37366) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
(APIServer pid=37366) File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=37366) return await main
(APIServer pid=37366) ^^^^^^^^^^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/entrypoints/openai/api_server.py", line 1884, in run_server
(APIServer pid=37366) await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/entrypoints/openai/api_server.py", line 1902, in run_server_worker
(APIServer pid=37366) async with build_async_engine_client(
(APIServer pid=37366) ~~~~~~~~~~~~~~~~~~~~~~~~~^
(APIServer pid=37366) args,
(APIServer pid=37366) ^^^^^
(APIServer pid=37366) client_config=client_config,
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) ) as engine_client:
(APIServer pid=37366) ^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/contextlib.py", line 214, in __aenter__
(APIServer pid=37366) return await anext(self.gen)
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/entrypoints/openai/api_server.py", line 180, in build_async_engine_client
(APIServer pid=37366) async with build_async_engine_client_from_engine_args(
(APIServer pid=37366) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
(APIServer pid=37366) engine_args,
(APIServer pid=37366) ^^^^^^^^^^^^
(APIServer pid=37366) ...<2 lines>...
(APIServer pid=37366) client_config=client_config,
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) ) as engine:
(APIServer pid=37366) ^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/contextlib.py", line 214, in __aenter__
(APIServer pid=37366) return await anext(self.gen)
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/entrypoints/openai/api_server.py", line 225, in build_async_engine_client_from_engine_args
(APIServer pid=37366) async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=37366) vllm_config=vllm_config,
(APIServer pid=37366) ...<4 lines>...
(APIServer pid=37366) client_count=client_count,
(APIServer pid=37366) client_index=client_index)
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/utils/__init__.py", line 1572, in inner
(APIServer pid=37366) return fn(*args, **kwargs)
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/async_llm.py", line 207, in from_vllm_config
(APIServer pid=37366) return cls(
(APIServer pid=37366) vllm_config=vllm_config,
(APIServer pid=37366) ...<8 lines>...
(APIServer pid=37366) client_index=client_index,
(APIServer pid=37366) )
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/async_llm.py", line 134, in __init__
(APIServer pid=37366) self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=37366) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
(APIServer pid=37366) vllm_config=vllm_config,
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) ...<4 lines>...
(APIServer pid=37366) client_index=client_index,
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) )
(APIServer pid=37366) ^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core_client.py", line 102, in make_async_mp_client
(APIServer pid=37366) return AsyncMPClient(*client_args)
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core_client.py", line 769, in __init__
(APIServer pid=37366) super().__init__(
(APIServer pid=37366) ~~~~~~~~~~~~~~~~^
(APIServer pid=37366) asyncio_mode=True,
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) ...<3 lines>...
(APIServer pid=37366) client_addresses=client_addresses,
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) )
(APIServer pid=37366) ^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/core_client.py", line 448, in __init__
(APIServer pid=37366) with launch_core_engines(vllm_config, executor_class,
(APIServer pid=37366) ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) log_stats) as (engine_manager,
(APIServer pid=37366) ^^^^^^^^^^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/contextlib.py", line 148, in __exit__
(APIServer pid=37366) next(self.gen)
(APIServer pid=37366) ~~~~^^^^^^^^^^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/utils.py", line 732, in launch_core_engines
(APIServer pid=37366) wait_for_engine_startup(
(APIServer pid=37366) ~~~~~~~~~~~~~~~~~~~~~~~^
(APIServer pid=37366) handshake_socket,
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^
(APIServer pid=37366) ...<5 lines>...
(APIServer pid=37366) coordinator.proc if coordinator else None,
(APIServer pid=37366) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=37366) )
(APIServer pid=37366) ^
(APIServer pid=37366) File "/home/tanhe/miniconda3/lib/python3.13/site-packages/vllm/v1/engine/utils.py", line 785, in wait_for_engine_startup
(APIServer pid=37366) raise RuntimeError("Engine core initialization failed. "
(APIServer pid=37366) "See root cause above. "
(APIServer pid=37366) f"Failed core proc(s): {finished}")
(APIServer pid=37366) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
/home/tanhe/miniconda3/lib/python3.13/multiprocessing/resource_tracker.py:301: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown: {'/psm_080612a4'}
warnings.warn(
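
Both tracebacks above come from the `EngineCore_DP0` and `APIServer` processes and only say that the root cause is in a background process. A minimal sketch for finding it in a fuller log: group lines by the `(Name pid=N)` prefix seen above and keep every process other than those two. The function names are illustrative, and it is an assumption that the worker processes log with a prefix of the same shape.

```python
import re
from collections import defaultdict

# Matches the "(Name pid=123) rest" prefix format seen in the log above.
PREFIX = re.compile(r"^\((?P<name>[^\s()]+) pid=(?P<pid>\d+)\)\s?(?P<rest>.*)$")


def group_by_process(lines):
    """Group log lines by (process name, pid); lines without a prefix are skipped."""
    groups = defaultdict(list)
    for line in lines:
        m = PREFIX.match(line.rstrip("\n"))
        if m:
            groups[(m["name"], int(m["pid"]))].append(m["rest"])
    return dict(groups)


def other_processes(groups, known=("APIServer", "EngineCore_DP0")):
    """Keep only processes whose name is not one of the two already shown above."""
    return {key: rest for key, rest in groups.items() if key[0] not in known}
```

With the full console output saved to a file (any path works; `vllm.log` is a hypothetical name), `other_processes(group_by_process(open("vllm.log", encoding="utf-8")))` returns whatever the remaining processes logged, which is where the first exception would be expected.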

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts and tasks

Reproduction

Download the model files from the ModelScope website, install vllm in a conda environment, then run the launch command.
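
For anyone reproducing this from a script, the launch command from System Info can be assembled as an argument list so that the JSON passed to `--media-io-kwargs` needs no shell quoting. This is only a sketch: `build_serve_command` is a hypothetical helper, and all flags and values are copied from the command in the report.

```python
import json
import os
import shlex


def build_serve_command(model_path, tensor_parallel_size=8, max_model_len=8192):
    """Return (argv, env) mirroring the launch command from the report."""
    media_io_kwargs = {"video": {"num_frames": -1}}
    argv = [
        "vllm", "serve", model_path,
        "--trust-remote-code",
        "--dtype", "bfloat16",
        "--max-model-len", str(max_model_len),
        "--tensor-parallel-size", str(tensor_parallel_size),
        # json.dumps produces the same JSON text as the quoted shell argument.
        "--media-io-kwargs", json.dumps(media_io_kwargs),
    ]
    # Copy the current environment and force offline mode, as in the report.
    env = dict(os.environ, HF_HUB_OFFLINE="1")
    return argv, env


argv, env = build_serve_command("./GLM-4.1V-9B-Thinking")
# Print the equivalent shell command for comparison with the report.
print("HF_HUB_OFFLINE=1", shlex.join(argv))
# To actually launch the server: subprocess.run(argv, env=env, check=True)
```

Lowering `tensor_parallel_size` in this helper is one way to check whether the failure depends on the number of GPUs; that is a suggestion for narrowing things down, not a confirmed cause.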

Expected behavior

A fix that resolves this problem.
