[GPU] Enable custom op with dynamic shape #30880
Merged: peterchen-intel merged 27 commits into openvinotoolkit:master from xipingyan:xp/enable_custom_op_with_dynamic_shape on Aug 4, 2025 (+447 −51)
Commits (27)
adeb32b  draft: enable gpu to support dynamic customer op. (xipingyan)
bf563f3  wrapper calc work size for dynamic shape to update gws. (xipingyan)
133c3f8  Clone a new op to make sure original model can be released.Back (xipingyan)
129baa5  update debug log, and revert useless update. (xipingyan)
9333900  is_dynamic->is_dynamic_input (xipingyan)
d13c484  wrapper get_output_shape (xipingyan)
977877a  Move update gws,lws to primitive_imple create. (xipingyan)
026a4ee  Fix gpu unit test fail issue. (xipingyan)
1a1bf9f  Add test case for dynamic shape custom op. (xipingyan)
dcf0529  Merge branch 'master' into xp/enable_custom_op_with_dynamic_shape (xipingyan)
aed6f42  Fix test case build fail issue. (xipingyan)
59f6064  Merge branch 'master' into xp/enable_custom_op_with_dynamic_shape (xipingyan)
aa974d0  Merge branch 'master' into xp/enable_custom_op_with_dynamic_shape (xipingyan)
5a8363c  fix windows build issue. (xipingyan)
7216436  fix windows build issue. (xipingyan)
afebea2  fix unit test fail: custom_gpu_primitive_f32.add_basic_in2x2x2x2 (xipingyan)
9bee3e5  Regist custom_gpu_primitive with dynamic_shape kernel. (xipingyan)
a3c26c1  1: test kernel: get index based on macro (xipingyan)
0f64fd8  Override get_shape_infer_dependencies (xipingyan)
5512c1c  Merge branch 'master' into xp/enable_custom_op_with_dynamic_shape (xipingyan)
5d0b8aa  Merge branch 'master' into xp/enable_custom_op_with_dynamic_shape (xipingyan)
dc6f17f  Merge branch 'master' into xp/enable_custom_op_with_dynamic_shape (xipingyan)
6e3dd7f  Fix ci issue. (xipingyan)
7554290  Merge branch 'xp/enable_custom_op_with_dynamic_shape' of https://gith… (xipingyan)
116f797  Merge branch 'master' into xp/enable_custom_op_with_dynamic_shape (xipingyan)
c1473b8  move generateTestFilePrefix to setup and teardown. (xipingyan)
422c3ab  Add test: custom op static model accuracy test. (xipingyan)
256 changes: 256 additions & 0 deletions
src/plugins/intel_gpu/tests/functional/custom_op/custom_op_dynamic.cpp
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,256 @@ | ||
// Copyright (C) 2025 Intel Corporation | ||
// SPDX-License-Identifier: Apache-2.0 | ||
// | ||
|
||
#include <memory> | ||
#include <sstream> | ||
#include <string> | ||
#include <tuple> | ||
#include <vector> | ||
|
||
#include "openvino/op/constant.hpp" | ||
#include "openvino/op/parameter.hpp" | ||
#include "openvino/op/result.hpp" | ||
#include "openvino/runtime/core.hpp" | ||
#include "openvino/runtime/exec_model_info.hpp" | ||
#include "openvino/runtime/properties.hpp" | ||
#include "shared_test_classes/base/ov_behavior_test_utils.hpp" | ||
|
||
using namespace ::testing; | ||
|
||
namespace ov { | ||
namespace test { | ||
namespace intel_gpu { | ||
|
||
class CustomAddOp : public ov::op::Op { | ||
private: | ||
float m_alpha; | ||
float m_beta; | ||
|
||
public: | ||
OPENVINO_OP("CustomAddOp", "gpu_opset"); | ||
|
||
CustomAddOp() = default; | ||
|
||
CustomAddOp(const ov::Output<ov::Node>& input, float alpha, float beta) : Op({input}), m_alpha(alpha), m_beta(beta) { | ||
constructor_validate_and_infer_types(); | ||
} | ||
|
||
void validate_and_infer_types() override { | ||
set_output_size(1); | ||
set_output_type(0, get_input_element_type(0), get_input_partial_shape(0)); | ||
} | ||
|
||
bool visit_attributes(ov::AttributeVisitor& visitor) override { | ||
visitor.on_attribute("alpha", m_alpha); | ||
visitor.on_attribute("beta", m_beta); | ||
return true; | ||
} | ||
|
||
std::shared_ptr<ov::Node> clone_with_new_inputs(const ov::OutputVector& new_args) const override { | ||
OPENVINO_ASSERT(new_args.size() == 1, "Incorrect number of new arguments"); | ||
return std::make_shared<CustomAddOp>(new_args[0], m_alpha, m_beta); | ||
} | ||
|
||
bool has_evaluate() const override { | ||
return true; | ||
} | ||
|
||
bool evaluate(ov::TensorVector& outputs, const ov::TensorVector& inputs) const override { | ||
auto in = inputs[0]; | ||
auto out = outputs[0]; | ||
out.set_shape(in.get_shape()); | ||
for (size_t i = 0; i < out.get_size(); i++) { | ||
out.data<float>()[i] = in.data<float>()[i] * m_alpha + m_beta; | ||
} | ||
return true; | ||
} | ||
}; | ||
|
||
using CustomOpDynamicTestParams = std::tuple<std::vector<ov::Shape>, // input shape | ||
std::vector<std::vector<float>>>; // input data | ||
class CustomOpDynamic : public ov::test::TestsCommon, public testing::WithParamInterface<CustomOpDynamicTestParams> { | ||
void SetUp() override { | ||
generate_config_files(); | ||
} | ||
|
||
void TearDown() override { | ||
ov::test::utils::removeFile(config_cl); | ||
ov::test::utils::removeFile(config_xml); | ||
} | ||
|
||
public: | ||
static std::string getTestCaseName(const testing::TestParamInfo<CustomOpDynamicTestParams>& obj) { | ||
std::vector<ov::Shape> input_shapes; | ||
std::vector<std::vector<float>> input_datas; | ||
std::tie(input_shapes, input_datas) = obj.param; | ||
|
||
std::ostringstream result; | ||
result << "input_shape="; | ||
for (const auto& shape : input_shapes) { | ||
result << shape; | ||
} | ||
return result.str(); | ||
} | ||
|
||
static const size_t dim1 = 1; | ||
void run() { | ||
std::vector<ov::Shape> input_shapes; | ||
std::vector<std::vector<float>> input_datas; | ||
std::tie(input_shapes, input_datas) = GetParam(); | ||
ASSERT_EQ(input_shapes.size(), input_datas.size()); | ||
|
||
ov::Core core; | ||
float alpha = 1.0, beta = 0.1; | ||
auto model = generate_model_with_custom_add_op(alpha, beta, ov::PartialShape{-1, dim1, -1}); | ||
|
||
ov::AnyMap config = {ov::hint::inference_precision(ov::element::f32), {"CONFIG_FILE", config_xml}}; | ||
auto compiled_model = core.compile_model(model, ov::test::utils::DEVICE_GPU, config); | ||
|
||
auto runtime_graph = compiled_model.get_runtime_model(); | ||
auto ops = runtime_graph->get_ordered_ops(); | ||
|
||
bool found_custom_op = false; | ||
for (auto op : ops) { | ||
if (op->get_rt_info()[ov::exec_model_info::LAYER_TYPE].as<std::string>() == "CustomGPUPrimitive") { | ||
found_custom_op = true; | ||
break; | ||
} | ||
} | ||
ASSERT_TRUE(found_custom_op); | ||
|
||
auto ireq = compiled_model.create_infer_request(); | ||
for (size_t i = 0; i < input_datas.size(); i++) { | ||
auto input = ov::Tensor({ov::element::f32}, input_shapes[i], input_datas[i].data()); | ||
ireq.set_input_tensor(0, input); | ||
ireq.infer(); | ||
auto output = ireq.get_output_tensor(0); | ||
std::vector<float> actual(output.data<float>(), output.data<float>() + output.get_size()); | ||
|
||
ASSERT_EQ(output.get_element_type(), element::f32); | ||
|
||
float* inp_data = input.data<float>(); | ||
for (size_t j = 0; j < output.get_size(); j++) {  // j avoids shadowing the outer loop index i | ||
ASSERT_FLOAT_EQ(actual[j], inp_data[j] * alpha + beta); | ||
} | ||
} | ||
} | ||
|
||
protected: | ||
std::string config_cl; | ||
std::string config_xml; | ||
|
||
void generate_config_files() { | ||
config_cl = ov::test::utils::generateTestFilePrefix() + "_custom_op_dynamic.cl"; | ||
config_xml = ov::test::utils::generateTestFilePrefix() + "_custom_op_dynamic.xml"; | ||
|
||
std::string content_cl = R"( | ||
__kernel void custom_add_kernel( | ||
__global const INPUT0_TYPE* inp0, | ||
__global OUTPUT0_TYPE* outp) { | ||
const uint b = (uint)get_global_id(0); | ||
const uint f = (uint)get_global_id(1); | ||
const uint y = (uint)get_global_id(2); | ||
#if INPUT0_DIMS_SIZE == 4 | ||
const uint x = 0; | ||
#endif | ||
const unsigned src_index = b*INPUT0_DIMS[1]*INPUT0_DIMS[2]*INPUT0_DIMS[3] + f*INPUT0_DIMS[2]*INPUT0_DIMS[3] + y*INPUT0_DIMS[3] + x; | ||
const unsigned dst_index = src_index; | ||
outp[dst_index] = inp0[src_index] * alpha + beta; | ||
})"; | ||
|
||
std::string content_xml = R"( | ||
<CustomLayer name="CustomAddOp" type="SimpleGPU" version="1"> | ||
<Kernel entry="custom_add_kernel"> | ||
<Source filename=")" + config_cl + R"("/> | ||
<Define name="alpha" type="float" param="alpha" default="1.0"/> | ||
<Define name="beta" type="float" param="beta" default="0.1"/> | ||
</Kernel> | ||
<Buffers> | ||
<Tensor arg-index="0" type="input" port-index="0" format="BFYX"/> | ||
<Tensor arg-index="1" type="output" port-index="0" format="BFYX"/> | ||
</Buffers> | ||
<CompilerOptions options="-cl-mad-enable"/> | ||
<WorkSizes global="B,F,Y"/> | ||
</CustomLayer>)"; | ||
|
||
ov::test::utils::createFile(config_cl, content_cl); | ||
ov::test::utils::createFile(config_xml, content_xml); | ||
} | ||
|
||
std::shared_ptr<ov::Model> generate_model_with_custom_add_op(float alpha, float beta, ov::PartialShape inp_shape) { | ||
auto input = std::make_shared<ov::op::v0::Parameter>(ov::element::f32, inp_shape); | ||
auto op = std::make_shared<CustomAddOp>(input, alpha, beta); | ||
auto result = std::make_shared<ov::op::v0::Result>(op); | ||
return std::make_shared<ov::Model>(ov::ResultVector{result}, ov::ParameterVector{input}, "model_with_custom_op_dynamic"); | ||
} | ||
}; | ||
|
||
class CustomOpStatic : public CustomOpDynamic { | ||
public: | ||
void run() { | ||
std::vector<ov::Shape> input_shapes; | ||
std::vector<std::vector<float>> input_datas; | ||
std::tie(input_shapes, input_datas) = GetParam(); | ||
ASSERT_EQ(input_shapes.size(), input_datas.size()); | ||
ASSERT_EQ(input_shapes.size(), 1u); | ||
|
||
ov::Core core; | ||
float alpha = 1.0, beta = 0.1; | ||
auto model = generate_model_with_custom_add_op(alpha, beta, ov::PartialShape(input_shapes[0])); | ||
|
||
ov::AnyMap config = {ov::hint::inference_precision(ov::element::f32), {"CONFIG_FILE", config_xml}}; | ||
auto compiled_model = core.compile_model(model, ov::test::utils::DEVICE_GPU, config); | ||
|
||
auto runtime_graph = compiled_model.get_runtime_model(); | ||
auto ops = runtime_graph->get_ordered_ops(); | ||
|
||
bool found_custom_op = false; | ||
for (auto op : ops) { | ||
if (op->get_rt_info()[ov::exec_model_info::LAYER_TYPE].as<std::string>() == "CustomGPUPrimitive") { | ||
found_custom_op = true; | ||
break; | ||
} | ||
} | ||
ASSERT_TRUE(found_custom_op); | ||
|
||
auto ireq = compiled_model.create_infer_request(); | ||
auto input = ov::Tensor({ov::element::f32}, input_shapes[0], input_datas[0].data()); | ||
ireq.set_input_tensor(0, input); | ||
ireq.infer(); | ||
auto output = ireq.get_output_tensor(0); | ||
std::vector<float> actual(output.data<float>(), output.data<float>() + output.get_size()); | ||
|
||
ASSERT_EQ(output.get_element_type(), element::f32); | ||
|
||
float* inp_data = input.data<float>(); | ||
for (size_t i = 0; i < output.get_size(); i++) { | ||
ASSERT_FLOAT_EQ(actual[i], inp_data[i] * alpha + beta); | ||
} | ||
} | ||
}; | ||
|
||
TEST_P(CustomOpDynamic, Accuracy) { | ||
run(); | ||
} | ||
|
||
TEST_P(CustomOpStatic, Accuracy) { | ||
run(); | ||
} | ||
|
||
const std::vector<ov::Shape> input_shapes{{1, CustomOpDynamic::dim1, 2}, {2, CustomOpDynamic::dim1, 3}}; | ||
const std::vector<std::vector<float>> input_datas{{0.2, 0.4}, {0.2, 0.4, 0.3, 0.5, 0.7, 0.9}}; | ||
|
||
INSTANTIATE_TEST_SUITE_P(smoke_GPU_Accuracy, | ||
CustomOpDynamic, | ||
::testing::Combine(::testing::Values(input_shapes), ::testing::Values(input_datas)), | ||
CustomOpDynamic::getTestCaseName); | ||
|
||
const std::vector<ov::Shape> input_static_shapes{{2, 2, 3}}; | ||
const std::vector<std::vector<float>> input_static_datas{{0.2, 0.4, 0.3, 0.5, 0.7, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6}}; | ||
|
||
INSTANTIATE_TEST_SUITE_P(smoke_GPU_Accuracy, | ||
CustomOpStatic, | ||
::testing::Combine(::testing::Values(input_static_shapes), ::testing::Values(input_static_datas)), | ||
CustomOpStatic::getTestCaseName); | ||
|
||
} // namespace intel_gpu | ||
} // namespace test | ||
} // namespace ov |
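The index arithmetic in `custom_add_kernel` can be checked on the host without a GPU. The sketch below is not part of the PR: `emulate_custom_add` is a hypothetical helper, and it assumes that the 3D test shape `{B, F, Y}` reaches the kernel as 4D BFYX dims `{B, F, Y, 1}`, which is what would make the hard-coded `x = 0` sufficient. It walks the same `B,F,Y` global work size and applies `in * alpha + beta` at the index the kernel would compute.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side emulation of custom_add_kernel (illustration only).
// Assumption: dims holds the BFYX extents {B, F, Y, X}, with X == 1 for the 3D test shapes.
std::vector<float> emulate_custom_add(const std::vector<float>& in,
                                      const std::array<std::size_t, 4>& dims,
                                      float alpha,
                                      float beta) {
    assert(in.size() == dims[0] * dims[1] * dims[2] * dims[3]);
    std::vector<float> out(in.size(), 0.0f);
    // Global work size "B,F,Y": one iteration per work item (b, f, y).
    for (std::size_t b = 0; b < dims[0]; ++b) {
        for (std::size_t f = 0; f < dims[1]; ++f) {
            for (std::size_t y = 0; y < dims[2]; ++y) {
                const std::size_t x = 0;  // mirrors the kernel, which fixes x to 0
                // Same row-major BFYX offset as src_index in the kernel.
                const std::size_t idx = b * dims[1] * dims[2] * dims[3] +
                                        f * dims[2] * dims[3] +
                                        y * dims[3] + x;
                out[idx] = in[idx] * alpha + beta;
            }
        }
    }
    return out;
}
```

Calling `emulate_custom_add(data, {2, 1, 3, 1}, 1.0f, 0.1f)` with the six-element input from the dynamic test case touches every element exactly once under that assumption. If X were greater than 1, the elements with `x > 0` would be left unwritten.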