Description
Describe the bug
Loading Qwen3-32B with a HETERO device configuration fails in model_server: the GPU plugin aborts during model compilation with CL_INVALID_KERNEL_ARGS.
To Reproduce
Steps to reproduce the behavior:
- Launch OVMS with a models directory produced by export_model.py.
- Model loading fails with the error shown under Logs.
Expected behavior
OVMS should load the model into VRAM.
Logs
[2025-10-12 04:26:45.371][367240][modelmanager][debug][modelmanager.cpp:490] Mediapipe graph:Qwen/Qwen3-32B was not loaded so far. Triggering load
[2025-10-12 04:26:45.371][367240][modelmanager][debug][mediapipegraphdefinition.cpp:129] Started validation of mediapipe: Qwen/Qwen3-32B
[2025-10-12 04:26:45.371][367240][modelmanager][debug][mediapipe_utils.cpp:84] setting input stream: input packet type: UNKNOWN from: HTTP_REQUEST_PAYLOAD:input
[2025-10-12 04:26:45.371][367240][modelmanager][debug][mediapipe_utils.cpp:84] setting output stream: output packet type: UNKNOWN from: HTTP_RESPONSE_PAYLOAD:output
[2025-10-12 04:26:45.371][367240][serving][info][mediapipegraphdefinition.cpp:421] MediapipeGraphDefinition initializing graph nodes
[2025-10-12 04:26:45.371][367240][modelmanager][info][servable_initializer.cpp:424] Initializing Language Model Continuous Batching servable
[2025-10-12 04:27:41.099][367240][serving][error][servable_initializer.cpp:145] Error during llm node initialization for models_path: /media/models/Qwen/Qwen3-32B/./ exception: Exception from src/inference/src/cpp/core.cpp:109:
Exception from src/inference/src/dev/plugin.cpp:53:
Exception from src/plugins/hetero/src/compiled_model.cpp:36:
Standard exception from compilation library: Exception from src/inference/src/dev/plugin.cpp:53:
Check 'false' failed at src/plugins/intel_gpu/src/plugin/program_builder.cpp:163:
[GPU] ProgramBuilder build failed!
Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_common.hpp:40:
[GPU] clEnqueueNDRangeKernel, error code: -52 CL_INVALID_KERNEL_ARGS
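For triage, the nested exception in the log above can be flattened into an ordered list of source locations plus the failing OpenCL call. The following is a minimal sketch, not part of OVMS: the helper names (parse_log_line, exception_chain, opencl_error) are hypothetical, and the bracketed line layout is an assumption inferred only from the log lines in this report.

```python
import re

# Assumed layout, inferred from the log in this report:
# [timestamp][thread id][module][level][file:line] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\[(?P<thread>\d+)\]\[(?P<module>[^\]]+)\]"
    r"\[(?P<level>[^\]]+)\]\[(?P<source>[^\]]+)\]\s*(?P<message>.*)$"
)
# Matches both "Exception from <file>:<line>:" and "... failed at <file>:<line>:"
LOCATION = re.compile(r"(?:Exception from|failed at) ([^\s:]+):(\d+):")
# Matches "[GPU] <call>, error code: <code> <name>"
CL_ERROR = re.compile(r"\[GPU\] (\w+), error code: (-?\d+) (\w+)")

# Error portion of the log, copied from this report.
SAMPLE = """\
[2025-10-12 04:27:41.099][367240][serving][error][servable_initializer.cpp:145] Error during llm node initialization for models_path: /media/models/Qwen/Qwen3-32B/./ exception: Exception from src/inference/src/cpp/core.cpp:109:
Exception from src/inference/src/dev/plugin.cpp:53:
Exception from src/plugins/hetero/src/compiled_model.cpp:36:
Standard exception from compilation library: Exception from src/inference/src/dev/plugin.cpp:53:
Check 'false' failed at src/plugins/intel_gpu/src/plugin/program_builder.cpp:163:
[GPU] ProgramBuilder build failed!
Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_common.hpp:40:
[GPU] clEnqueueNDRangeKernel, error code: -52 CL_INVALID_KERNEL_ARGS
"""


def parse_log_line(line):
    """Split one bracketed log line into named fields, or return None."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def exception_chain(text):
    """Return (file, line) pairs in the order they appear, outermost first."""
    return [(path, int(number)) for path, number in LOCATION.findall(text)]


def opencl_error(text):
    """Return (call, code, name) for the first OpenCL failure, or None."""
    match = CL_ERROR.search(text)
    if match is None:
        return None
    call, code, name = match.groups()
    return call, int(code), name


if __name__ == "__main__":
    header = parse_log_line(SAMPLE.splitlines()[0])
    print(header["level"], header["source"])
    for path, number in exception_chain(SAMPLE):
        print(f"  {path}:{number}")
    print(opencl_error(SAMPLE))
```

Read this way, the failure passes through the hetero plugin (compiled_model.cpp) before surfacing in the intel_gpu plugin's program_builder.cpp, and the innermost frame is the OpenCL kernel enqueue in ocl_common.hpp.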
Configuration
- OVMS version: built from the latest source.
- Hardware: Intel Core i5-11600KF CPU, 3 x Intel Arc A770 GPUs with the latest drivers.