[NPU] Add dynamic host pipeline to support host compile #33249
base: master
Conversation
rkazants left a comment:
do NOT merge
```cpp
if (irGraph) {
    IRGraph::GraphArguments graphArgs;
    irGraph->getBinding(graphArgs);
    std::vector<IRGraph::MemRefType> inputPros = graphArgs._inputs;
```
Suggested change:

```diff
-std::vector<IRGraph::MemRefType> inputPros = graphArgs._inputs;
+std::vector<IRGraph::MemRefType>& inputPros = graphArgs._inputs;
```

As written, this likely creates a copy of the vector.
We need the copy: each DynamicPipeline has its own graph arguments, and the input shapes or sizes may differ among them.
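For illustration, a minimal sketch of that design, using simplified stand-ins for the plugin's real `IRGraph::GraphArguments` and `DynamicPipeline` types (everything below is an assumption for the example, not the actual sources):

```cpp
// Illustrative sketch only: each DynamicPipeline owns a copy of the graph
// arguments, because two in-flight pipelines may carry different shapes.
// The types below are simplified stand-ins for the plugin's real ones.
#include <cstddef>
#include <utility>
#include <vector>

struct MemRef {
    void* data = nullptr;
    std::vector<size_t> shape;  // per-pipeline; may differ between requests
};

struct GraphArguments {
    std::vector<MemRef> inputs;
    std::vector<MemRef> outputs;
};

class DynamicPipeline {
public:
    // Takes the arguments by value: the pipeline keeps its own copy.
    explicit DynamicPipeline(GraphArguments args) : _args(std::move(args)) {}

    // Mutating this pipeline's shapes must not affect other pipelines;
    // a reference to the shared binding would alias them instead.
    void resize_input(size_t idx, std::vector<size_t> shape) {
        _args.inputs[idx].shape = std::move(shape);
    }

private:
    GraphArguments _args;  // owned copy, not a reference
};
```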
```cpp
// irGraph->predict_output_shape(inputPros, outputPros);

bool shapeChanged = false;
for (size_t i = 0; i < outputPros.size(); i++) {
```
Is it really required?
I would try to avoid extra logic in infer_async if possible.
Most of the time, the tensor we set can be used without issue. We need predict_shape to detect whether the internal output tensor is large enough or needs to be recreated. This still needs to be checked with a complex model.
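A minimal sketch of the check being discussed, with assumed helper names (`TensorDesc`, `byte_size`, `refresh_outputs` are invented for this example; the PR's real predict_shape API may differ):

```cpp
// Illustrative sketch only: predict output shapes, then flag any output
// tensor whose current allocation is too small for the predicted shape.
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

struct TensorDesc {
    std::vector<size_t> shape;
    size_t allocated_bytes = 0;
    size_t element_size = 4;  // e.g. f32
};

// Byte size implied by the tensor's (predicted) shape.
size_t byte_size(const TensorDesc& t) {
    return t.element_size * std::accumulate(t.shape.begin(), t.shape.end(),
                                            size_t{1}, std::multiplies<size_t>{});
}

// Returns true if any output tensor must be recreated with a larger buffer.
bool refresh_outputs(std::vector<TensorDesc>& outputs,
                     const std::vector<std::vector<size_t>>& predicted) {
    bool shapeChanged = false;
    for (size_t i = 0; i < outputs.size(); i++) {
        outputs[i].shape = predicted[i];
        if (byte_size(outputs[i]) > outputs[i].allocated_bytes) {
            outputs[i].allocated_bytes = byte_size(outputs[i]);  // grow/recreate
            shapeChanged = true;
        }
    }
    return shapeChanged;
}
```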
Force-pushed 44b9061 to 6eafc74
Force-pushed 6eafc74 to af1e4a8
Force-pushed 60ac319 to d8340e3
Force-pushed 67fcad5 to 3a77880
Resolved review threads:
- src/plugins/intel_npu/src/utils/include/intel_npu/utils/zero/zero_types.hpp
- src/plugins/intel_npu/src/compiler_adapter/src/npu_mlir_runtime_api.cpp (outdated)
- src/plugins/intel_npu/src/compiler_adapter/src/driver_compiler_adapter.cpp (outdated)
- src/plugins/intel_npu/src/backend/src/zero_dynamic_pipeline.cpp (outdated)
- src/plugins/intel_npu/src/backend/src/zero_dynamic_infer_request.cpp (outdated)
- src/plugins/intel_npu/src/compiler_adapter/include/npu_mlir_runtime_api.hpp (outdated)
- src/plugins/intel_npu/src/compiler_adapter/src/npu_mlir_runtime_api.cpp (outdated)
```diff
-ov_add_api_validator_post_build_step(TARGET ${NPU_PLUGIN_TARGET})
+if(NOT ENABLE_NPU_DEBUG_CAPS)
+    ov_add_api_validator_post_build_step(TARGET ${NPU_PLUGIN_TARGET})
+endif()
```
(No newline at end of file.)
Missing newline at end of file.
Removed the changes to this file.
Squashed commit messages:
Use point instead of MemRef
Wrap deps to zero pipeline
Refactor code
Move tensor allocator to pipeline
Try to allocate tensor on pipeline
Allocate tensor in dynamic inferrequest
Clear changes on inferrequest
Clean code
add free
Not release mem buffer to solve segment issue
Use StridedMemRefType
Clear zero_dynamic_infer_request
Remove zero dynamic_infer_request
Move ir_graph to compiler_adapter
Fix copy issue
Add TODO
Reuse engine in dynamic pipeline
Support windows
Update to latest commit
Fix stream status issue
Update mlir lib
Fix windows build and link issue
Fix min func issue on win and lin dynamic shape
Fix linux build compilation
add NPU_LLVM_BACKEND
support multiple pipeline (batch)
Implement updateMutableCmdList
Fix compilation errors after rebasing the code
fix a compilation issue w/ NPU_LLVM_BACKEND disabled
Add control and refactor
Update level_zero_wrapper, mlir lib
Use different tensor for different inferrequest in benchmark
Update windows jit libs
Add set_argument_property to update data, strides, shapes
Change update_graph_arguments to pass strides and shape
Blob will not be released in IRGraph
Fix buffer issue
Update stride in MemRefType to be element based
Move debug log into logger
Update arguments if we reuse L0 tensor on dynamic pipeline
Use plugin Metadata to show blob type
Add BlobType
Use zero_dynamic_infer_request to manage buffer for dynamic pipeline
Fix metadata test and code style
Add InferRequestDynamicShapeTest
Add test and fix engine init issue
Only disable test for min size and unprocessed model
Add test for remote tensor and host tensor
Rebase and solve conflict
Fix accuracy check
Move mlir deps to npu mlir runtime
Add metadata to return shapeFromIRModel
Load lib instead of link
Remove redundant libs
Update lib and fix window
Clean mlir deps
Fix code style
Use predict_shape function to check output tensor
Use MemRef instead of ze arg properties
Fix call of mlir runtime API and metadata
Fix stride issue
Add new functions to manage runtime MemRef object
Use new API to manage MemRef
Fix memref issue
Use new api
Update API to use continuous memory
Only create handle before execute
Fix ArgumentDescriptor
Fix index issue
reference code for output shape
Fix predict shape
Fix init issue

Signed-off-by: Xin Wang <[email protected]>
Co-authored-by: Bogdan Pereanu <[email protected]>
Force-pushed 78063f4 to 69f323c
Signed-off-by: Xin Wang <[email protected]>
@rkazants please help to review again. Thanks.
Signed-off-by: Xin Wang <[email protected]>
Add new classes for handling dynamic pipelines and inference requests, update the build system to conditionally include these features, and ensure proper integration with the existing backend.
Dynamic Inference and Pipeline Support:
- zero_dynamic_infer_request.hpp implements the ZeroDynamicInferRequest class, which adds shape prediction and tensor allocation using user and local tensor info on top of the normal ZeroInferRequest.
- zero_dynamic_pipeline.hpp implements the DynamicPipeline struct, which manages the command list and graph arguments for dynamic execution.

Tickets:
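A rough, hypothetical skeleton of the two types as described above; only the names mentioned in this PR are real, and the method signatures and bodies are placeholders rather than the actual headers:

```cpp
// Hypothetical skeleton of the two new types; illustrative only.
#include <memory>

struct DynamicPipeline {
    // Manages the command list and per-pipeline graph arguments
    // (data pointers, strides, shapes) for dynamic execution.
    void update_graph_arguments() {}  // placeholder signature
    void execute() {}
};

class ZeroDynamicInferRequest /* : public ZeroInferRequest */ {
public:
    void infer_async() {
        // 1. predict output shapes from the current input shapes
        // 2. (re)allocate output tensors from user/local tensor info if needed
        // 3. update the pipeline's graph arguments and submit
        _pipeline->update_graph_arguments();
        _pipeline->execute();
    }

private:
    std::unique_ptr<DynamicPipeline> _pipeline;
};
```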