[TRANSFORMATIONS] Support new GPU RoPE pattern of glm-4-9b-chat-hf for RoPEFusion #33902
Pull request overview
This PR extends RoPE fusion support for the glm-4-9b-chat-hf model to handle GPU graph patterns, which differ from CPU patterns due to the use of Broadcast operations instead of Multiply operations in certain parts of the computation graph.
Changes:
- Updated pattern matching to accept both Multiply and Broadcast operations in the ChatGLM interleave pattern builder
- Added comprehensive GPU-specific test case for the glm-4-9b-chat-hf RoPE pattern
- Renamed existing CPU test to clarify platform-specific testing
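The first change above, accepting either of two operation types at one position in the pattern, can be illustrated with a small standalone sketch. It is a toy matcher, not the OpenVINO pattern API: `Node`, `Pattern`, `one_of`, `matches`, `build_graph`, and the `Consumer` op name are all hypothetical, and the three-node topology is invented for illustration rather than taken from the glm-4-9b-chat-hf graph. In OpenVINO itself, an alternative like this is commonly written by passing several op types to `ov::pass::pattern::wrap_type`; whether this PR does it exactly that way is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy graph node (hypothetical): an op type name plus its input nodes.
struct Node {
    std::string type;
    std::vector<std::shared_ptr<Node>> inputs;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make_node(std::string type, std::vector<NodePtr> inputs = {}) {
    return std::make_shared<Node>(Node{std::move(type), std::move(inputs)});
}

// Toy pattern node (hypothetical): a set of accepted op types plus
// sub-patterns for the inputs. An empty `inputs` list means "do not check inputs".
struct Pattern {
    std::vector<std::string> accepted_types;
    std::vector<Pattern> inputs;
};

// Builds a pattern that accepts any of the listed op types.
Pattern one_of(std::vector<std::string> types, std::vector<Pattern> inputs = {}) {
    return Pattern{std::move(types), std::move(inputs)};
}

// Returns true if the subgraph rooted at `n` matches pattern `p`.
bool matches(const Pattern& p, const NodePtr& n) {
    assert(n != nullptr);
    const auto& types = p.accepted_types;
    if (std::find(types.begin(), types.end(), n->type) == types.end()) {
        return false;  // op type is not among the accepted alternatives
    }
    if (p.inputs.empty()) {
        return true;  // inputs are not constrained by this pattern node
    }
    if (p.inputs.size() != n->inputs.size()) {
        return false;
    }
    for (std::size_t i = 0; i < p.inputs.size(); ++i) {
        if (!matches(p.inputs[i], n->inputs[i])) {
            return false;
        }
    }
    return true;
}

// Invented three-level graph: Consumer <- middle_op <- (Parameter, Constant).
NodePtr build_graph(const std::string& middle_op) {
    auto data = make_node("Parameter");
    auto aux = make_node("Constant");
    auto middle = make_node(middle_op, {data, aux});
    return make_node("Consumer", {middle});
}

// Before: only Multiply is accepted at the middle position.
Pattern strict_pattern() {
    return one_of({"Consumer"}, {one_of({"Multiply"})});
}

// After: Multiply or Broadcast is accepted at the same position.
Pattern relaxed_pattern() {
    return one_of({"Consumer"}, {one_of({"Multiply", "Broadcast"})});
}
```

With these definitions, `matches(strict_pattern(), build_graph("Broadcast"))` is false while `matches(relaxed_pattern(), build_graph("Broadcast"))` is true, which mirrors why a pattern written against the CPU graph would miss the GPU variant until the alternative is added. The relaxed pattern still rejects unrelated ops at that position, so the fusion does not become more permissive than intended.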
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| fuse_rotary_positional_embeddings.cpp (transformations) | Updated pattern matching to support Broadcast operation alongside Multiply for GPU compatibility |
| fuse_rotary_positional_embeddings.cpp (tests) | Added GPU-specific test case and renamed CPU test for clarity |
mryzhov left a comment:
The tests became a little more complicated. Perhaps we should split the model generation into separate blocks.
Details:
The previous fix for this model (55eda00) did not account for GPU, where the model graph is different, so the GPU pattern needs additional support as well.
Tickets:
Signed-off-by: Andrii Staikov [email protected]