Skip to content

Conversation

@yingluAMD
Copy link
Contributor

Proposed changes

PR #3248 introduced a GPU architecture related template config in conv_bwd_data instances. It lead to MIOpen CI fail because this filed is used as host code in MIOpen which doesn't have __gfx950__ macro.
This PR defined a new macro CK_USE_GFX95 to fix this issue. MIOpen will also enable this macro on gfx950 platform.

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

  • I have added tests relevant to the introduced functionality, and the unit tests are passing locally
  • I have added the test to REGRESSION_TESTS list defined at the top of CMakeLists.txt in tests/CMakeLists.txt, IF the test takes more than 30 seconds to run.
  • I have added inline documentation which enables the maintainers with understanding the motivation
  • I have removed the stale documentation which is no longer relevant after this pull request
  • (If this change is user-facing) I have added release notes which provide the end users with a brief summary of the improvement from this pull request
  • I have run clang-format on all changed files
  • Any dependent changes have been merged

Discussion

If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered

@yingluAMD yingluAMD self-assigned this Jan 23, 2026
@afagaj afagaj requested a review from Copilot January 23, 2026 16:38
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces the CK_USE_GFX950 macro to replace direct checks for __gfx950__ in template configurations. This change addresses a CI failure in MIOpen where GPU architecture-specific macros used in host code were causing compilation issues.

Changes:

  • Added CK_USE_GFX950 macro definition in CMakeLists.txt for gfx950 targets
  • Replaced __gfx950__ checks with CK_USE_GFX950 in conv forward and backward data instances
  • Removed unused CK_ENABLE_TF32 macro definitions from config.h.in

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
CMakeLists.txt Adds conditional definition of CK_USE_GFX950 macro for gfx950 GPU targets
device_grouped_conv_fwd_xdl_merged_groups_instance.hpp Updates architecture check from __gfx950__ to CK_USE_GFX950
device_grouped_conv_bwd_data_xdl_instance.hpp Updates architecture check from __gfx950__ to CK_USE_GFX950
config.h.in Removes unused CK_ENABLE_TF32 macro definitions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

>;

#if defined(__gfx950__)
#if defined(CK_USE_GFX950)
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The macro name CK_USE_GFX950 doesn't match the pattern established in CMakeLists.txt where it's defined as CK_USE_GFX95 (without the trailing '0'). This inconsistency will cause the conditional compilation to fail.

Suggested change
#if defined(CK_USE_GFX950)
#if defined(CK_USE_GFX95)

Copilot uses AI. Check for mistakes.
>;

#if defined(__gfx950__)
#if defined(CK_USE_GFX950)
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The macro name CK_USE_GFX950 doesn't match the pattern established in CMakeLists.txt where it's defined as CK_USE_GFX95 (without the trailing '0'). This inconsistency will cause the conditional compilation to fail.

Suggested change
#if defined(CK_USE_GFX950)
#if defined(CK_USE_GFX95)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant