@mdvoretc-intel
Contributor

Details:

  • Add a new OpenCL (OCL) implementation of group normalization for the fsv16 layout
  • The new implementation is selected when each group contains fewer than fsv=16 features
  • A single fused kernel handles all stages of the reduction, which avoids redundant loads of shared values and lets small inputs be served from cache
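For illustration only, the selection rule and the fused reduction described above can be sketched on the host side in C++. This is not the kernel from this PR: the names `kFsv`, `use_fused_kernel` and `group_norm` are hypothetical, the data is assumed to be one batch item in plain [feature][spatial] order rather than the blocked fsv16 layout, and the mean and variance are accumulated in one pass (sum and sum of squares) as one possible way to fuse the reduction stages.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed block width of the fsv16 layout: 16 features per slice.
constexpr std::size_t kFsv = 16;

// Hypothetical selection predicate: take the fused path only when each
// group holds fewer than kFsv features, as the PR description states.
bool use_fused_kernel(std::size_t num_features, std::size_t num_groups) {
    assert(num_groups > 0 && num_features % num_groups == 0);
    return num_features / num_groups < kFsv;
}

// Reference group normalization for one batch item stored as
// [feature][spatial]. Each group is reduced in a single pass
// (sum and sum of squares), then normalized, scaled and shifted.
std::vector<float> group_norm(const std::vector<float>& x,
                              std::size_t num_features,
                              std::size_t spatial,
                              std::size_t num_groups,
                              const std::vector<float>& scale,
                              const std::vector<float>& bias,
                              float eps) {
    assert(x.size() == num_features * spatial);
    assert(scale.size() == num_features && bias.size() == num_features);
    assert(num_groups > 0 && num_features % num_groups == 0);

    const std::size_t features_per_group = num_features / num_groups;
    const std::size_t count = features_per_group * spatial;
    std::vector<float> y(x.size());

    for (std::size_t g = 0; g < num_groups; ++g) {
        const std::size_t begin = g * count;

        // Single pass: accumulate both statistics together.
        double sum = 0.0;
        double sum_sq = 0.0;
        for (std::size_t i = 0; i < count; ++i) {
            const double v = x[begin + i];
            sum += v;
            sum_sq += v * v;
        }
        const double mean = sum / static_cast<double>(count);
        // Clamp to zero to guard against rounding in E[x^2] - mean^2.
        const double var =
            std::max(0.0, sum_sq / static_cast<double>(count) - mean * mean);
        const double inv_std = 1.0 / std::sqrt(var + eps);

        // Normalize with per-feature scale and bias.
        for (std::size_t i = 0; i < count; ++i) {
            const std::size_t f = g * features_per_group + i / spatial;
            y[begin + i] = static_cast<float>(
                (x[begin + i] - mean) * inv_std * scale[f] + bias[f]);
        }
    }
    return y;
}
```

Under these assumptions, 32 features in 8 groups gives 4 features per group and would take the fused path, while 32 features in 2 groups gives exactly 16 per group and would not.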

Tickets:

@mdvoretc-intel mdvoretc-intel requested review from a team as code owners January 21, 2026 13:38
@github-actions github-actions bot added the category: GPU OpenVINO GPU plugin label Jan 21, 2026
@sys-openvino-ci sys-openvino-ci added the ExternalIntelPR External contributor from Intel label Jan 21, 2026
@mdvoretc-intel mdvoretc-intel force-pushed the groupnorm_fusion branch 2 times, most recently from b8e344f to 04bbc6a Compare January 23, 2026 10:38
@e-ddykim
Contributor

Please add tests.

@e-ddykim e-ddykim added the pr: needs tests PR needs tests updating label Jan 29, 2026
@mdvoretc-intel
Contributor Author

The group_normalization.basic_b_fs_yx_fsv16 test currently selects the new implementation. This is not the intended behavior, and it interferes with keeping a separate, dedicated test for the existing fsv16 implementation. Other tests do exercise the selection between the two implementations, but this still needs to be resolved as part of adding tests.
