Annotate liveness pass (+ reuseLDS pass changes) #2015

dhernandez0 · 2025-10-07T08:50:50Z

Motivation

In order to allow attention pipelining of the outer loop, we need to be able to annotate the liveness of LDS buffers. Currently, we do:

%alloc1 = rock.alloc()
%alloc2 = rock.alloc()

gemm(%alloc1, %alloc2)
rock.dealloc %alloc1
rock.dealloc %alloc2

%alloc3 = rock.alloc()
output_swizzle(%alloc3)
rock.dealloc %alloc3

Note that we currently use rock.dealloc manually. However, for attention, we need to do:

%alloc = rock.alloc()
for ... {
  stage {
    store_lds(%alloc)
    load_lds(%alloc)
    rock.dealloc %alloc
  }

  %alloc1 = rock.alloc()
  stage {
    store_lds(%alloc1)
  }
  stage {
    load_lds(%alloc1)
    compute(...)
  }
  rock.dealloc %alloc1
...
}

So, after pipelining, we would end up with multiple calls to rock.dealloc. The underlying problem is that we currently have a single liveness range for each alloc: it starts when we call rock.alloc() and ends when we call rock.dealloc.

Technical Details

This PR introduces rock.live_in and rock.live_out (replacing rock.dealloc) and decouples rock.alloc from liveness analysis. A buffer can then have multiple liveness ranges: it can be used, then unused, then used again.

We also introduce a new pass, AnnotateLiveness, that automatically marks the liveness of the buffers. It assumes that accesses come in blocks of write() followed by load() (or write(), write(), ... load(), load()). Each such block is one liveness range: we add rock.live_in before the first write() and rock.live_out after the last load().

See annotateLiveness() comment for more details about other assumptions.
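
The write-then-read assumption can be illustrated with a small standalone sketch. This is not the code in AnnotateLiveness.cpp: the `Access` enum, the `Range` alias, and the flat access list are hypothetical simplifications (the real pass walks MLIR operations and has to deal with loops and regions). Only the two error strings are taken from the PR. The sketch shows how one buffer's accesses, in program order, split into liveness ranges.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for the accesses seen on one LDS buffer,
// listed in program order.
enum class Access { Write, Read };

// A liveness range as indices into the access list: live_in would go
// before `first` (the first write), live_out after `second` (the last read).
using Range = std::pair<std::size_t, std::size_t>;

std::vector<Range> annotateLiveness(const std::vector<Access> &accesses) {
  std::vector<Range> ranges;
  std::optional<std::size_t> firstWrite; // start of the open range, if any
  std::optional<std::size_t> lastRead;   // last read seen in the open range

  for (std::size_t i = 0; i < accesses.size(); ++i) {
    if (accesses[i] == Access::Write) {
      // A write after a read closes the current range and opens a new one.
      if (lastRead) {
        ranges.emplace_back(*firstWrite, *lastRead);
        firstWrite.reset();
        lastRead.reset();
      }
      if (!firstWrite)
        firstWrite = i;
    } else {
      if (!firstWrite)
        throw std::runtime_error("Read before write");
      lastRead = i; // could be a write, read, read, ... pattern
    }
  }

  // Writes that are never followed by a read leave the range open.
  if (firstWrite.has_value() != lastRead.has_value())
    throw std::runtime_error("Found a non closed read-write pattern");
  if (firstWrite)
    ranges.emplace_back(*firstWrite, *lastRead);
  return ranges;
}
```

With this model, write, write, read, read yields a single range, while write, read, write, read yields two, which is the case that a single alloc/dealloc pair could not express.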

Test Plan

Tests pass.

Test Result

All tests pass.

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

pabloantoniom · 2025-10-07T09:28:26Z

What quickly draws my attention is that rock.dealloc is removed in this PR, which seems very wrong. But the intention is to actually rename rock.dealloc to rock.live_out.

Initially, that also struck me as a bad idea. I'd rather have rock.alloc/rock.dealloc and rock.live_in/rock.live_out; the former ops are for memory management, and the latter serve as metadata. However, the point here is that rock.dealloc does not actually deallocate anything, since it only works on LDS, which cannot be deallocated.

To me it looks like it was a mistake to call it rock.dealloc, so renaming rock.dealloc to rock.live_out makes a lot of sense.

Copilot

Pull Request Overview

This PR introduces a new liveness annotation system to enable attention pipelining of the outer loop by replacing the previous rock.dealloc approach with rock.live_in and rock.live_out annotations, allowing for multiple liveness ranges per buffer allocation.

Replaces rock.dealloc with rock.live_in/rock.live_out for more flexible LDS memory management
Adds a new AnnotateLiveness pass that automatically detects and marks buffer liveness based on write/read patterns
Updates the ReuseLDS pass to work with the new liveness annotations and handle multiple liveness ranges per buffer
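
To make "multiple liveness ranges per buffer" concrete, here is a minimal sketch of an interference check. It is an illustration under assumptions, not the ReuseLDS implementation: `Range`, `overlaps`, `interfere`, and `buildInterferenceGraph` are hypothetical names, and liveness ranges are modelled as closed intervals of program points. Two buffers interfere, and so cannot share LDS memory, if any range of one overlaps any range of the other.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Closed interval [first, second] of program points (hypothetical model).
using Range = std::pair<std::size_t, std::size_t>;

// Two closed intervals overlap if each one starts before the other ends.
bool overlaps(const Range &a, const Range &b) {
  return a.first <= b.second && b.first <= a.second;
}

// Buffers interfere if any pair of their liveness ranges overlaps.
bool interfere(const std::vector<Range> &a, const std::vector<Range> &b) {
  for (const Range &ra : a)
    for (const Range &rb : b)
      if (overlaps(ra, rb))
        return true;
  return false;
}

// Build a symmetric adjacency matrix: graph[i][j] is true when buffers
// i and j are live at the same time somewhere in the program.
std::vector<std::vector<bool>>
buildInterferenceGraph(const std::vector<std::vector<Range>> &buffers) {
  const std::size_t n = buffers.size();
  std::vector<std::vector<bool>> graph(n, std::vector<bool>(n, false));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = i + 1; j < n; ++j)
      if (interfere(buffers[i], buffers[j]))
        graph[i][j] = graph[j][i] = true;
  return graph;
}
```

A buffer that is live over [0, 3] and again over [8, 10] does not interfere with one that is live only over [4, 7], so in this model the two could reuse the same memory. With a single alloc-to-dealloc range covering [0, 10] they would conflict.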

Reviewed Changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| mlir/lib/Dialect/Rock/Transforms/AnnotateLiveness.cpp | New pass implementation for automatic liveness annotation |
| mlir/lib/Dialect/Rock/Transforms/ReuseLDS.cpp | Major refactor to use new liveness annotations and interference graph analysis |
| mlir/include/mlir/Dialect/Rock/IR/RockOps.td | Defines new `rock.live_in` and `rock.live_out` operations |
| mlir/test/Dialect/Rock/lowering_reuse_lds.mlir | Updated test cases to use new liveness annotations |
| mlir/test/Dialect/Rock/lowering_annotate_liveness.mlir | New test file for the liveness annotation pass |
| Multiple test files | Removal of `rock.dealloc` calls throughout existing tests |

justinrosner · 2025-10-07T13:41:52Z

mlir/include/mlir/Dialect/Rock/IR/RockOps.td


-// Annotate lifetime of memory allocation on GPU memory hierachy.
-def Rock_GpuDeallocOp:
-    Rock_Op<"dealloc", [MemoryEffects<[MemFree<DefaultResource>]>]>,


It looks like rock.dealloc annotated the memref input arg with [MemoryEffects<[MemFree<DefaultResource>]>]. Pablo mentioned in his earlier comment that we were never using this for actual deallocations, but is there a chance that the presence of this annotation led some community passes to treat the op as if it were doing deallocations?

I don't think so, because we got rid of rock.dealloc in RockToGPU.cpp (see MIGPUDeallocRewritePattern), which is the last pass of buildKernelPipeline().

justinrosner · 2025-10-07T13:44:22Z

mlir/lib/Dialect/Rock/IR/RockDialect.cpp

  return emitError("The size of rock.alloc should be greather than zero.");
 }

-//===-----------------------------------------------------===//


Do we want verifiers for the new LiveIn and LiveOut ops?

justinrosner · 2025-10-07T13:46:13Z

mlir/lib/Dialect/Rock/Pipelines/Pipelines.cpp

+  funcPm.addPass(rock::createRockAnnotateLivenessPass());
  funcPm.addPass(rock::createRockReuseLDSPass());
  funcPm.addPass(rock::createRockOutputSwizzlePass());
+  funcPm.addPass(rock::createRockAnnotateLivenessPass());


Can you add a comment explaining why we need to call RockAnnotateLiveness and RockReuseLDS again after running RockOutputSwizzle?

Sure, I'll add the following comments:

// We run reuse LDS before the output swizzle pass because it uses a heuristic
// to determine whether to swizzle or not, and that heuristic needs the actual
// LDS usage.
// After running output swizzle, we'll create a new LDS buffer and we need to
// run reuse LDS again to be able to reuse LDS memory.

justinrosner · 2025-10-07T14:00:47Z

mlir/lib/Dialect/Rock/Transforms/AnnotateLiveness.cpp

+// (outside the loop). This would be incorrect because the buffer is alive for
+// the whole loop. However, in practise, this is not a problem because if there
+// are any interferences they will also happen in the epilogue and prologue.
+// This might need to get improved if changes to pipelining are made.


Is this worth filing a case for?

I'm not sure, I'd say if it's not a lot of work, we could fix it. Because this might happen in the future if we change the pipeline we currently use. But if it's a lot of work it might not be worth it...

justinrosner · 2025-10-07T14:05:25Z

mlir/lib/Dialect/Rock/Transforms/AnnotateLiveness.cpp

+}
+
+// Annotate LDS buffer usage based on the following assumptions:
+// 1. Liveness range is determined by a pattern of write(), then read()


What will happen right now if the number of writes and reads does not match? Should we catch that error in this pass?

it's ok if they don't match. You can have write(buffer), write(buffer), read(buffer). I guess what you mean is something like: write(buffer), write(buffer), read(buffer), write(buffer)?

I think we should fail for those cases, see line 206: "Found a non closed read-write pattern"

Yeah, that was the case that I was thinking of. Can you add a LIT test for the failing case as well?

pabloantoniom

Great work!

pabloantoniom · 2025-10-13T13:14:39Z

mlir/lib/Dialect/Rock/Transforms/AnnotateLiveness.cpp

@@ -0,0 +1,305 @@
+//===- AnnotateLiveness - MLIR Rock ops lowering passes -----===//
+//
+// Copyright 2025 The MLIR Authors.


nit: The MLIR authors?

pabloantoniom · 2025-10-13T13:22:39Z

mlir/lib/Dialect/Rock/Transforms/AnnotateLiveness.cpp

+}
+
+// Annotate LDS buffer usage based on the following assumptions:
+// 1. Liveness range is determined by a pattern of write(), then read()


"1. Liveness range is determined by a pattern of write(), then read()" is difficult to understand. What about:

1. Liveness range is determined by a pattern of one or more write() ops, and then one or more read() ops. In other words, there cannot be a write() after a read().

pabloantoniom · 2025-10-13T13:24:07Z

mlir/lib/Dialect/Rock/Transforms/AnnotateLiveness.cpp

+// read([0, 1, 2]).
+// clang-format on
+//
+// Where write(buffer, indices, data), read(indices), alloc(size). We would be


A bit pedantic but I would prefer read(buffer, indices) instead of read(indices)

pabloantoniom · 2025-10-13T13:27:33Z

mlir/lib/Dialect/Rock/Transforms/AnnotateLiveness.cpp

+  func::FuncOp func = getOperation();
+
+  // Only run this pass on GPU kernel functions.
+  if (!func->hasAttr("kernel"))


if (!func->hasAttr("kernel")) {
  LLVM_DEBUG(llvm::dbgs() << "Skipping RockAnnotateLivenessPass on func with no kernel attribute\n");
  return;
}

pabloantoniom · 2025-10-13T13:40:20Z

mlir/test/Dialect/Rock/lowering_annotate_liveness.mlir

@@ -0,0 +1,142 @@
+// RUN: sed s/##TOKEN_ARCH##/%arch/g %s | rocmlir-opt -rock-annotate-liveness | FileCheck %s
+
+#wg = #gpu.address_space<workgroup>


nit: Either use #wg everywhere or use #gpu.address_space<workgroup> everywhere

pabloantoniom · 2025-10-13T14:37:45Z

mlir/lib/Dialect/Rock/IR/RockDialect.cpp

 //===-----------------------------------------------------===//

-LogicalResult GpuDeallocOp::verify() {
+LogicalResult LiveInOp::verify() {


If live_in/live_out targets only LDS memory, this is a good moment to check if the GpuAllocOp is LDS or not.

pabloantoniom · 2025-10-13T14:45:32Z

mlir/test/Dialect/Rock/lowering_annotate_liveness_errors.mlir

@@ -0,0 +1,47 @@
+// RUN: sed s/##TOKEN_ARCH##/%arch/g %s | rocmlir-opt -rock-annotate-liveness -verify-diagnostics
+
+#wg = #gpu.address_space<workgroup>


nit: same here, let's use either #wg or #gpu.address_space<workgroup> below

pabloantoniom · 2025-10-13T15:02:08Z

mlir/test/Dialect/Rock/lowering_annotate_liveness_errors.mlir

+// RUN: sed s/##TOKEN_ARCH##/%arch/g %s | rocmlir-opt -rock-annotate-liveness -verify-diagnostics
+
+#wg = #gpu.address_space<workgroup>
+#priv = #gpu.address_space<private>


nit: #priv is not used

pabloantoniom · 2025-10-13T15:06:35Z

mlir/lib/Dialect/Rock/Transforms/AnnotateLiveness.cpp

+  bool hasRead = lastRead != nullptr;
+  bool hasWrite = currentWrite != nullptr;
+  if (hasRead != hasWrite) {
+    return buffer->emitError("Found a non closed read-write pattern");


I'm not sure I understand this. Let's take non_closed_read_write_pattern as an example.

Yes, there is a write on a buffer that is not read later, but we could have a valid program that does that, right? It would be cleaned up by DCE at some point.

I guess supporting that complicates the logic, but can't we treat the return op as an implicit rock.live_out for all buffers?

pabloantoniom · 2025-10-13T15:31:21Z

mlir/lib/Dialect/Rock/Transforms/AnnotateLiveness.cpp

+      // Update the last read (could be write, read, read, ... pattern)
+      lastRead = op;
+      if (!currentWrite) {
+        return buffer->emitError("Read before write");


I think this error makes sense: we are reading from LDS before writing to it, and it makes no sense to read from an uninitialized buffer (i.e., one we have not written anything to yet). But I wonder if this error string is appropriate. The error is not just any "Read before write", but a read from a position of LDS that has never been written (e.g., we care about the first read before write). Not sure how to put it cleanly.

dhernandez0 requested a review from causten as a code owner October 7, 2025 08:50

dhernandez0 changed the title ~~Annotate liveness pass (+ reuseLDS pass changes)~~ [DRAFT] Annotate liveness pass (+ reuseLDS pass changes) Oct 7, 2025

dhernandez0 changed the title ~~[DRAFT] Annotate liveness pass (+ reuseLDS pass changes)~~ Annotate liveness pass (+ reuseLDS pass changes) Oct 7, 2025

dhernandez0 requested review from Copilot, justinrosner, pabloantoniom and umangyadav October 7, 2025 11:56

Copilot AI reviewed Oct 7, 2025

justinrosner reviewed Oct 7, 2025

dhernandez0 self-assigned this Oct 7, 2025

dhernandez0 mentioned this pull request Oct 7, 2025

[DRAFT] Attention: pipeline outer loop #2017

Open

1 task

dhernandez0 force-pushed the annotate_liveness branch from acd8802 to f2299cc Compare October 13, 2025 08:49

dhernandez0 added 2 commits October 13, 2025 11:31

Annotate liveness making some assumptions (write + read pattern for L…

f1d1875

…DS). Also update ReuseLDS pass accordingly.

Addressing PR comments

4b9228f

dhernandez0 force-pushed the annotate_liveness branch from f7e3311 to 4b9228f Compare October 13, 2025 09:31

pabloantoniom reviewed Oct 13, 2025