@CuriousPanCake
Contributor

@CuriousPanCake CuriousPanCake commented Aug 21, 2025

Details:

Add the SDPAFusionSinks transformation. The pattern is hard-coded and added as a separate transformation for the gpt-oss model, both because of high demand and because of issues with integrating it into the existing SDPAFusion pattern.

gpt-oss SDPA pattern to fuse:
[image: gpt-oss SDPA pattern]
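
As a rough numerical illustration of what attention "with sinks" computes, here is a minimal C++ sketch for a single row of attention scores. It rests on an assumption that this conversation does not confirm: that the sink is one extra logit appended to the scores before softmax, and that its probability is dropped afterwards. The function name `softmax_with_sink` is hypothetical and is not part of OpenVINO.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax over one row of attention scores with a single "sink" logit.
// Assumption: the sink is appended to the row before softmax and its
// probability is discarded afterwards, so the returned weights sum to
// less than 1 whenever the sink receives non-zero probability.
std::vector<double> softmax_with_sink(const std::vector<double>& scores, double sink) {
    // Subtract the row maximum (sink included) for numerical stability.
    double max_logit = sink;
    for (double s : scores) {
        max_logit = std::max(max_logit, s);
    }

    // The sink contributes to the denominator only.
    double denom = std::exp(sink - max_logit);
    std::vector<double> probs(scores.size());
    for (std::size_t i = 0; i < scores.size(); ++i) {
        probs[i] = std::exp(scores[i] - max_logit);
        denom += probs[i];
    }
    for (double& p : probs) {
        p /= denom;
    }
    return probs;  // the sink column itself is not returned
}
```

Under this assumption the sink only enlarges the softmax denominator, so a very negative sink value reduces the result to an ordinary softmax.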

Tickets:

@CuriousPanCake CuriousPanCake requested review from a team as code owners August 21, 2025 15:36
@CuriousPanCake CuriousPanCake requested review from mryzhov and removed request for a team August 21, 2025 15:36
@github-actions github-actions bot added category: Core OpenVINO Core (aka ngraph) category: transformations OpenVINO Runtime library - Transformations category: CPP API OpenVINO CPP API bindings labels Aug 21, 2025
@CuriousPanCake CuriousPanCake changed the title Sdpa fusion gpt oss [DRAFT][DO NOT REVIEW]Sdpa fusion gpt oss Aug 21, 2025
@CuriousPanCake CuriousPanCake changed the title [DRAFT][DO NOT REVIEW]Sdpa fusion gpt oss SDPAFusion with sinks Aug 29, 2025
@yeonbok
Contributor

yeonbok commented Aug 30, 2025

Could you add a decomposition pass too?

SDPAReshapeFusion();
};

class TRANSFORMATIONS_API SDPAFusionMatcherSinks : public ov::pass::MatcherPass {
Collaborator

Can you please provide a picture of the pattern to be fused?

Contributor Author

There are a lot of SDPA variations to fuse, so listing all of them is not feasible. I've added the pattern as a screenshot to the PR.

Collaborator

Try to share one particular case and add a note on what other variations there can be. That would be very helpful.

Contributor Author

Added an ASCII graph.

@github-actions github-actions bot removed category: Core OpenVINO Core (aka ngraph) category: CPP API OpenVINO CPP API bindings labels Sep 8, 2025
…scaled_dot_product_attention_decomposition.cpp
Contributor

@mitruska mitruska left a comment

The patterns LGTM. It would be good to refactor so that the common code is shared with the existing SDPAFusion (not a blocker, possible as a follow-up improvement).
Minor comment on graph visualization in the tests.

// Pattern 1: axis = -1 (last axis)
// Pattern 2: axis = rank size - 1 (also means last axis for static rank inputs)
auto axis_predicate = ([](const ov::Output<ov::Node>& node) {
auto softmax = std::dynamic_pointer_cast<ov::op::v8::Softmax>(node.get_node_shared_ptr());
Collaborator

@rkazants rkazants Sep 10, 2025

AFAIK, we need to use ov::as_type_ptr here. This is important for OV.

Contributor Author

From what I heard, there were new findings that using std::dynamic_pointer_cast is also safe now, but I'll switch to ov::as_type_ptr, no problem.

Collaborator

Who said it?

Contributor Author

To be honest, I don't remember right now :)
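
For reference, the last-axis check described by the "Pattern 1" and "Pattern 2" comments in the snippet above can be sketched in plain C++, without OpenVINO types. This is only an illustration under stated assumptions: `is_last_axis` is a hypothetical helper, and `std::optional` stands in for a rank that may be dynamic. It is not the predicate from the PR.

```cpp
#include <cstdint>
#include <optional>

// Returns true when `axis` addresses the last dimension.
// `rank` is std::nullopt for a dynamic rank (a hypothetical stand-in for a
// partial shape whose rank is unknown at transformation time).
bool is_last_axis(std::int64_t axis, std::optional<std::int64_t> rank) {
    if (axis == -1) {
        return true;  // Pattern 1: axis = -1 is the last axis for any rank
    }
    if (!rank.has_value()) {
        return false;  // a non-negative axis cannot be resolved without a static rank
    }
    return axis == *rank - 1;  // Pattern 2: axis = rank - 1
}
```

For example, `is_last_axis(3, 4)` holds for a rank-4 input, while `is_last_axis(3, std::nullopt)` does not, because the second pattern only makes sense for static-rank inputs.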

auto k_node = pm.at(k);
auto v_node = pm.at(v);

if (pm.at(mask).get_partial_shape().size() > 4) {
Collaborator

This is not clear. Is it the rank you want to compare? Is it static?

Contributor Author

@CuriousPanCake CuriousPanCake Sep 11, 2025

We're comparing the rank here; I've changed the code to make that explicit.
Yes, it's static. We check this above using the pattern predicate: auto mask = any_input(has_static_rank());

@CuriousPanCake CuriousPanCake added this pull request to the merge queue Sep 11, 2025
Merged via the queue into openvinotoolkit:master with commit 5d822fd Sep 11, 2025
218 of 220 checks passed
@CuriousPanCake CuriousPanCake deleted the sdpa_fusion_gpt_oss branch September 11, 2025 16:42
@praasz praasz added this to the 2025.4 milestone Sep 12, 2025

Labels

category: transformations OpenVINO Runtime library - Transformations

5 participants