[NPU] Add dynamic host pipeline to support host compile #33249
base: master
Conversation
rkazants left a comment:
do NOT merge
```cpp
if (irGraph) {
    IRGraph::GraphArguments graphArgs;
    irGraph->getBinding(graphArgs);
    std::vector<IRGraph::MemRefType> inputPros = graphArgs._inputs;
```
Suggested change:

```diff
-std::vector<IRGraph::MemRefType> inputPros = graphArgs._inputs;
+std::vector<IRGraph::MemRefType>& inputPros = graphArgs._inputs;
```

As written, this likely creates a copy of the vector.
We need the copy: each DynamicPipeline has its own graph arguments, and the input shapes or sizes may differ among them.
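For illustration, a minimal sketch of that design, using simplified stand-ins for the plugin's real `IRGraph::GraphArguments` and `DynamicPipeline` types (everything below is an assumption for the example, not the actual sources):

```cpp
// Illustrative sketch only: each DynamicPipeline owns a copy of the graph
// arguments, because two in-flight pipelines may carry different shapes.
// The types below are simplified stand-ins for the plugin's real ones.
#include <cstddef>
#include <utility>
#include <vector>

struct MemRef {
    void* data = nullptr;
    std::vector<size_t> shape;  // per-pipeline; may differ between requests
};

struct GraphArguments {
    std::vector<MemRef> inputs;
    std::vector<MemRef> outputs;
};

class DynamicPipeline {
public:
    // Takes the arguments by value: the pipeline keeps its own copy.
    explicit DynamicPipeline(GraphArguments args) : _args(std::move(args)) {}

    // Mutating this pipeline's shapes must not affect other pipelines;
    // a reference to the shared binding would alias them instead.
    void resize_input(size_t idx, std::vector<size_t> shape) {
        _args.inputs[idx].shape = std::move(shape);
    }

private:
    GraphArguments _args;  // owned copy, not a reference
};
```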
```cpp
// irGraph->predict_output_shape(inputPros, outputPros);

bool shapeChanged = false;
for (size_t i = 0; i < outputPros.size(); i++) {
```
Is it really required?
I would try to avoid extra logic in infer_async if possible.
Most of the time, the tensor we set can be used without issue. We need predict_shape to detect whether the internal output tensor is large enough or needs to be recreated. This still needs to be checked with a complex model.
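A minimal sketch of the check being discussed, with assumed helper names (`TensorDesc`, `byte_size`, `refresh_outputs` are invented for this example; the PR's real predict_shape API may differ):

```cpp
// Illustrative sketch only: predict output shapes, then flag any output
// tensor whose current allocation is too small for the predicted shape.
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

struct TensorDesc {
    std::vector<size_t> shape;
    size_t allocated_bytes = 0;
    size_t element_size = 4;  // e.g. f32
};

// Byte size implied by the tensor's (predicted) shape.
size_t byte_size(const TensorDesc& t) {
    return t.element_size * std::accumulate(t.shape.begin(), t.shape.end(),
                                            size_t{1}, std::multiplies<size_t>{});
}

// Returns true if any output tensor must be recreated with a larger buffer.
bool refresh_outputs(std::vector<TensorDesc>& outputs,
                     const std::vector<std::vector<size_t>>& predicted) {
    bool shapeChanged = false;
    for (size_t i = 0; i < outputs.size(); i++) {
        outputs[i].shape = predicted[i];
        if (byte_size(outputs[i]) > outputs[i].allocated_bytes) {
            outputs[i].allocated_bytes = byte_size(outputs[i]);  // grow/recreate
            shapeChanged = true;
        }
    }
    return shapeChanged;
}
```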
Force-pushed 44b9061 to 6eafc74
Force-pushed 6eafc74 to af1e4a8
Force-pushed 60ac319 to d8340e3
Force-pushed 67fcad5 to 3a77880
Resolved review threads:
- src/plugins/intel_npu/src/utils/include/intel_npu/utils/zero/zero_types.hpp
- src/plugins/intel_npu/src/compiler_adapter/src/npu_mlir_runtime_api.cpp (outdated)
- src/plugins/intel_npu/src/compiler_adapter/src/driver_compiler_adapter.cpp (outdated)
- src/plugins/intel_npu/src/backend/src/zero_dynamic_pipeline.cpp (outdated)
- src/plugins/intel_npu/src/backend/src/zero_dynamic_infer_request.cpp (outdated)
- src/plugins/intel_npu/src/compiler_adapter/include/npu_mlir_runtime_api.hpp (outdated)
- src/plugins/intel_npu/src/compiler_adapter/src/npu_mlir_runtime_api.cpp (outdated)
```diff
-ov_add_api_validator_post_build_step(TARGET ${NPU_PLUGIN_TARGET})
+if(NOT ENABLE_NPU_DEBUG_CAPS)
+    ov_add_api_validator_post_build_step(TARGET ${NPU_PLUGIN_TARGET})
+endif()
```
(No newline at end of file.)
Missing newline at end of file.
Removed the changes to this file.
Squashed commit messages:
Use point instead of MemRef
Wrap deps to zero pipeline
Refactor code
Move tensor allocator to pipeline
Try to allocate tensor on pipeline
Allocate tensor in dynamic inferrequest
Clear changes on inferrequest
Clean code
add free
Not release mem buffer to solve segment issue
Use StridedMemRefType
Clear zero_dynamic_infer_request
Remove zero dynamic_infer_request
Move ir_graph to compiler_adapter
Fix copy issue
Add TODO
Reuse engine in dynamic pipeline
Support windows
Update to latest commit
Fix stream status issue
Update mlir lib
Fix windows build and link issue
Fix min func issue on win and lin dynamic shape
Fix linux build compilation
add NPU_LLVM_BACKEND
support multiple pipeline (batch)
Implement updateMutableCmdList
Fix compilation errors after rebasing the code
fix a compilation issue w/ NPU_LLVM_BACKEND disabled
Add control and refactor
Update level_zero_wrapper, mlir lib
Use different tensor for different inferrequest in benchmark
Update windows jit libs
Add set_argument_property to update data, strides, shapes
Change update_graph_arguments to pass strides and shape
Blob will not be released in IRGraph
Fix buffer issue
Update stride in MemRefType to be element based
Move debug log into logger
Update arguments if we reuse L0 tensor on dynamic pipeline
Use plugin Metadata to show blob type
Add BlobType
Use zero_dynamic_infer_request to manage buffer for dynamic pipeline
Fix metadata test and code style
Add InferRequestDynamicShapeTest
Add test and fix engine init issue
Only disable test for min size and unprocessed model
Add test for remote tensor and host tensor
Rebase and solve conflict
Fix accuracy check
Move mlir deps to npu mlir runtime
Add metadata to return shapeFromIRModel
Load lib instead of link
Remove redundant libs
Update lib and fix window
Clean mlir deps
Fix code style
Use predict_shape function to check output tensor
Use MemRef instead of ze arg properties
Fix call of mlir runtime API and metadata
Fix stride issue
Add new functions to manage runtime MemRef object
Use new API to manage MemRef
Fix memref issue
Use new api
Update API to use continuous memory
Only create handle before execute
Fix ArgumentDescriptor
Fix index issue
reference code for output shape
Fix predict shape
Fix init issue

Signed-off-by: Xin Wang <[email protected]>
Co-authored-by: Bogdan Pereanu <[email protected]>
Force-pushed 78063f4 to 69f323c
Signed-off-by: Xin Wang <[email protected]>
@rkazants please help to review again. Thanks.
Signed-off-by: Xin Wang <[email protected]>
Add new classes for handling dynamic pipelines and inference requests, update the build system to conditionally include these features, and ensure proper integration with the existing backend.
Dynamic Inference and Pipeline Support:
- zero_dynamic_infer_request.hpp implements the ZeroDynamicInferRequest class, which adds shape prediction and tensor allocation using user and local tensor info on top of the normal ZeroInferRequest.
- zero_dynamic_pipeline.hpp implements the DynamicPipeline struct, which manages the command list and graph arguments for dynamic execution.

Tickets:
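A rough, hypothetical skeleton of the two types as described above; only the names mentioned in this PR are real, and the method signatures and bodies are placeholders rather than the actual headers:

```cpp
// Hypothetical skeleton of the two new types; illustrative only.
#include <memory>

struct DynamicPipeline {
    // Manages the command list and per-pipeline graph arguments
    // (data pointers, strides, shapes) for dynamic execution.
    void update_graph_arguments() {}  // placeholder signature
    void execute() {}
};

class ZeroDynamicInferRequest /* : public ZeroInferRequest */ {
public:
    void infer_async() {
        // 1. predict output shapes from the current input shapes
        // 2. (re)allocate output tensors from user/local tensor info if needed
        // 3. update the pipeline's graph arguments and submit
        _pipeline->update_graph_arguments();
        _pipeline->execute();
    }

private:
    std::unique_ptr<DynamicPipeline> _pipeline;
};
```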