@justinrosner justinrosner commented Oct 7, 2025

Motivation

While investigating https://github.com/ROCm/rocMLIR-internal/issues/2000, we discovered a bug in the backend scheduler. This PR adds a LIT test that reproduces the failure and serves as a reminder: once the test starts to pass, we can remove the gfx1201 workarounds that were introduced in #1990.

Technical Details

This test was generated by removing the gfx1200 workaround in GridwiseGemmToBlockwise (introduced here: #1990) and taking the IR directly after this pass runs. The test is marked XFAIL, so when the backend bug is fixed it will start to fail with Unexpectedly Passed.
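To illustrate why a fixed backend makes this test fail, here is a minimal sketch of how an XFAIL marker inverts the reported outcome. It is a simplified model for illustration only, not lit's actual implementation; the names `Outcome`, `classify`, and `breaks_suite` are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    """Simplified result categories (hypothetical model, not lit's own classes)."""
    PASS = "Passed"
    FAIL = "Failed"
    XFAIL = "Expectedly Failed"
    XPASS = "Unexpectedly Passed"


def classify(run_succeeded: bool, marked_xfail: bool) -> Outcome:
    """Map the raw result of the RUN line to the reported outcome."""
    if marked_xfail:
        # An XFAIL test is expected to fail, so a success is the surprising case.
        return Outcome.XPASS if run_succeeded else Outcome.XFAIL
    return Outcome.PASS if run_succeeded else Outcome.FAIL


def breaks_suite(outcome: Outcome) -> bool:
    """Only plain failures and unexpected passes count against the test run."""
    return outcome in (Outcome.FAIL, Outcome.XPASS)


if __name__ == "__main__":
    # While the backend bug exists, the RUN line fails and the suite stays green.
    print(classify(run_succeeded=False, marked_xfail=True).value)
    # After the backend fix, the RUN line succeeds and the suite flags it.
    print(classify(run_succeeded=True, marked_xfail=True).value)
```

Under this model, the unexpected pass is the signal to remove both the `XFAIL` marker and the workaround from #1990.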

Test Plan

  • Nightly CI

Test Result

  • Nightly CI

Submission Checklist

// (see the PR here https://github.com/ROCm/rocMLIR/pull/1990)

// XFAIL: *
// RUN: rocmlir-driver -c %s | mlir-runner -O2 --shared-libs=%linalg_test_lib_dir/libmlir_rocm_runtime%shlibext,%conv_validation_wrapper_library_dir/libconv-validation-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext,%linalg_test_lib_dir/libmlir_float16_utils%shlibext --entry-point-result=void | FileCheck %s
Contributor

nit: you don't need to run the full pipeline with -c. I guess this only needs "--host-pipeline=runner -kernel-pipeline=binary"?

Contributor Author

I grabbed the IR fairly early in the GPU pipeline to minimize the size of the LIT test while still keeping the bad if statement that is introduced without the gfx12 workaround in GridwiseGemmToBlockwise. Because of this, the test still needs some of the passes in the GPU pipeline.

@justinrosner
Contributor Author

This is passing the nightly CI here: https://ml-ci-internal.amd.com/job/MLIR/job/mlir/job/PR-2018/6/pipeline-overview/.

The job isn't finishing because there are no gfx942 nodes on the CI. I'm triggering an empty CI job to get this in, since the passing runs on other archs already show that the non-gfx12 path works.

@justinrosner justinrosner merged commit 0776656 into develop Oct 10, 2025
12 of 14 checks passed
@justinrosner justinrosner deleted the justinr-add-gfx1201-test branch October 10, 2025 14:29