
Qwen 2.5 VL inference on NPU is not producing any output. #1305

Open

Description

@nvsreerag

I have been trying to run inference of the Qwen 2.5 VL 7B model on an NPU, but it is not producing any output.

Activity

SearchSavior commented on Jun 9, 2025

How did you convert the model and does it compile?

nvsreerag (author) commented on Jun 10, 2025

I converted the model using the command below and also attempted symmetric quantization. However, after updating to the latest packages, it now throws an error.

Command:
optimum-cli export openvino --model Qwen/Qwen2.5-VL-7B-Instruct Qwen2.5-VL-7B-Instruct/FP16 --weight-format fp16

Error:

RuntimeError: Exception from src\inference\src\cpp\core.cpp:112:
Exception from src\inference\src\dev\plugin.cpp:53:
Exception from src\plugins\intel_npu\src\plugin\src\plugin.cpp:492:
Exception from src\plugins\intel_npu\src\compiler_adapter\src\ze_graph_ext_wrappers.cpp:314:
L0 pfnCreate2 result: ZE_RESULT_ERROR_INVALID_ARGUMENT, code 0x78000004 - generic error code for invalid arguments . [NPU_VCL] Compiler returned msg:
Upper bounds were not specified, got the default value - '9223372036854775807'
SearchSavior commented on Jun 11, 2025

OK, so I don't have an NPU to test with.

However, there are two places to get the code you need.

https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai/inference-with-genai-on-npu.html

details how to convert the model and what performance optimizations to apply.

To convert, I think you should try:

optimum-cli export openvino -m "model" --task image-text-to-text --weight-format int4 --ratio 1 --sym --group-size -1 "converted-model"

To run inference on vision models, the source lives here:

https://github.com/openvinotoolkit/openvino.genai/blob/675ed6c185f1d6e2145461ad5382dad45ecc5eef/src/python/openvino_genai/py_openvino_genai.pyi#L2343

There are other classes in that file that define the Python API; using them together is a bit harder, but there are example notebooks in the openvino_notebooks repo.
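The pieces above can be wired together roughly like this. This is a minimal sketch, not a verified fix: the VLMPipeline class and the MAX_PROMPT_LEN / MIN_RESPONSE_LEN pipeline properties come from the GenAI-on-NPU docs linked above, while the model directory, image path, and length limits are placeholders you'd adjust for your setup.

```python
# Hedged sketch of Qwen2.5-VL inference on NPU with openvino_genai.
# The NPU plugin compiles static shapes, so prompt/response lengths are
# bounded up front -- this is also why the "Upper bounds were not
# specified" compiler error earlier in the thread points at dynamic
# dimensions. The exact values below are placeholders.
PIPELINE_CONFIG = {
    "MAX_PROMPT_LEN": 1536,
    "MIN_RESPONSE_LEN": 256,
}


def describe_image(model_dir: str, image_path: str, prompt: str):
    """Load a converted model and run one generate() call on the NPU."""
    import numpy as np
    import openvino as ov
    import openvino_genai as ov_genai
    from PIL import Image

    # Images are passed to generate() as ov.Tensor objects.
    image = ov.Tensor(np.array(Image.open(image_path).convert("RGB")))
    pipe = ov_genai.VLMPipeline(model_dir, "NPU", **PIPELINE_CONFIG)
    return pipe.generate(prompt, images=[image], max_new_tokens=128)
```

Usage would look like `describe_image("Qwen2.5-VL-7B-Instruct-int4-sym", "cat.jpg", "Describe the image.")` with the int4-sym model produced by the export command above; if the NPU compiler still rejects the graph, double-check the property names against your installed openvino_genai version.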

Also: I convert a lot of models on HF, and along the way I launched Echo9Zulu/Optimum-CLI-Tool_tool. Making the command-building process more visual helps me, especially since converting different models can be a research-intensive process.



          Qwen 2.5 VL inference on NPU is not producing any output. · Issue #1305 · huggingface/optimum-intel