(dialects and transforms): csl stencil lowering #2747

Draft: wants to merge 12 commits into base: main

@n-io (Collaborator) commented Jun 18, 2024

This PR proposes a csl_stencil dialect that manages stencil accesses across the stencil pattern, including communication between PEs, with the goal of preparing the stencil dialect for lowering to CSL.

@n-io n-io added the documentation, dialects, and transformations labels Jun 18, 2024
@n-io n-io self-assigned this Jun 18, 2024

codecov bot commented Jun 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.85%. Comparing base (1825c02) to head (73130e1).
Report is 137 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2747      +/-   ##
==========================================
+ Coverage   89.79%   89.85%   +0.05%     
==========================================
  Files         394      408      +14     
  Lines       48747    51021    +2274     
  Branches     7471     7912     +441     
==========================================
+ Hits        43774    45843    +2069     
- Misses       3793     3928     +135     
- Partials     1180     1250      +70     


Comment on lines 19 to 22
func.func @gauss_seidel(%a : memref<1024x512xtensor<512xf32>>, %b : memref<1024x512xtensor<512xf32>>) {
%0 = stencil.external_load %a : memref<1024x512xtensor<512xf32>> -> !stencil.field<[-1,1023]x[-1,511]xtensor<512xf32>>
%1 = stencil.load %0 : !stencil.field<[-1,1023]x[-1,511]xtensor<512xf32>> -> !stencil.temp<[-1,1023]x[-1,511]xtensor<512xf32>>
%2 = stencil.external_load %b : memref<1024x512xtensor<512xf32>> -> !stencil.field<[-1,1023]x[-1,511]xtensor<512xf32>>
Collaborator

Unless I missed something specific to the tensorized aspect of it, you could write the function as taking the `!stencil.field`s directly here; they would get lowered to the memrefs you are converting from anyway 🙂

Collaborator Author

@n-io n-io Jun 20, 2024

Always happy to work with as many kernels as I can get my hands on and any variety of representations we might want to support. IIRC, this particular one was provided by @mesham; the tensorisation pass applied to it should only rewrite types and accesses.

n-io added a commit that referenced this pull request Jun 24, 2024
…#2766)

Introduces an intermediate dialect with ops:
* `csl_stencil.prefetch` to indicate prefetched buffer transfers
* `csl_stencil.access` performs a `stencil.access` to a prefetched
buffer

The `stencil-to-csl-stencil` transform:
* lowers `dmp.swap` to `csl_stencil.prefetch`
* adds prefetched buffers to signature of `stencil.apply`
* lowers `stencil.access` to `csl_stencil.access` iff they are accesses
to prefetched buffers
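
The three rewrite rules above can be pictured with a toy model. This is only a sketch in plain Python, not the xDSL pass: the `Op` record, the `lower_to_csl_stencil` helper, and the `_prefetched` naming scheme are all hypothetical, and the criterion for "access to a prefetched buffer" (the buffer was swapped and the offset is non-zero) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str                          # op name, e.g. "stencil.access"
    buffer: str                        # buffer the op swaps or reads
    offset: tuple[int, int] = (0, 0)   # (x, y) offset, used by accesses only


def lower_to_csl_stencil(swaps, apply_args, apply_body):
    """Toy version of the three rewrite rules; not the xDSL pass itself."""
    # Rule 1: every dmp.swap becomes a csl_stencil.prefetch.
    prefetches = [Op("csl_stencil.prefetch", s.buffer) for s in swaps]
    # Hypothetical naming scheme for the buffer each prefetch produces.
    prefetched = {s.buffer: s.buffer + "_prefetched" for s in swaps}

    # Rule 2: the prefetched buffers are appended to the apply signature.
    new_args = list(apply_args) + list(prefetched.values())

    # Rule 3: only accesses that read a prefetched buffer are rewritten.
    # Assumption: an access reads prefetched data iff its buffer was swapped
    # and its offset is non-zero (i.e. it reads a neighbour's value).
    new_body = []
    for op in apply_body:
        remote = op.offset != (0, 0)
        if op.name == "stencil.access" and op.buffer in prefetched and remote:
            op = Op("csl_stencil.access", prefetched[op.buffer], op.offset)
        new_body.append(op)
    return prefetches, new_args, new_body


swaps = [Op("dmp.swap", "a")]
body = [
    Op("stencil.access", "a", (1, 0)),  # neighbour value of a swapped buffer
    Op("stencil.access", "a", (0, 0)),  # local value of the same buffer
    Op("stencil.access", "b", (0, 1)),  # buffer that was never swapped
]
prefetches, args, new_body = lower_to_csl_stencil(swaps, ["a", "b"], body)
```

In this model only the first access is renamed to `csl_stencil.access`; the other two stay as `stencil.access`, which is the "iff" in the third rule.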

For a more detailed description see the document in #2747. This PR
implements Step 1 outlined there.

---------

Co-authored-by: n-io <[email protected]>
n-io added a commit that referenced this pull request Jun 28, 2024
This PR implements the `csl_stencil.apply` op as outlined in Step 2 of
#2747

This operation combines a `csl_stencil.prefetch` (symmetric buffer
communication across a given stencil shape) with a `stencil.apply`.
Please see the doc string of the op for a detailed description.

---------

Co-authored-by: n-io <[email protected]>
n-io added a commit that referenced this pull request Jul 5, 2024
This PR implements the conversion to the `csl_stencil.apply` op as outlined in
Step 2 of #2747

The `csl_stencil.apply` op combines a `csl_stencil.prefetch` (symmetric
buffer communication across a given stencil shape) with a `stencil.apply`.

The transformation consists of several steps:
* When rewriting a `stencil.apply`, select the `csl_stencil.prefetch`
with the biggest memory overhead (if several are present) to be fused
with the apply op into a `csl_stencil.apply`
* Find a suitable split of ops to be divided across the two code blocks
* Re-order arithmetic e.g. `(a+b)+c -> (c+a)+b` to access, consume, and
reduce data of neighbours (for input stencil only)
* Move this into first code block, move all other ops into second code
block
  * Fallback strategy: Move everything into first code block
* Add `tensor.InsertSliceOp` to insert computed chunk into returned
z-value tensor
* Set up code block args
* Move ops into new regions according to determined split
* Translate arg usage to match new block args
* Set up yield ops
* Run tensor update shape on chunk region
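
The re-ordering and splitting steps above can be sketched on a toy expression tree. This is a minimal illustration under stated assumptions, not the transformation in the PR: expressions are nested `("add", lhs, rhs)` tuples, only additions are handled, the set of neighbour terms is given by the caller, and the `%partial` placeholder (standing for the value the first block hands to the second) is a hypothetical name.

```python
def flatten_sum(expr):
    """Collect the leaves of a tree of binary additions, left to right."""
    if isinstance(expr, tuple):
        _, lhs, rhs = expr
        return flatten_sum(lhs) + flatten_sum(rhs)
    return [expr]


def chain(terms):
    """Rebuild a left-associated sum: [x, y, z] -> ((x + y) + z)."""
    result = terms[0]
    for term in terms[1:]:
        result = ("add", result, term)
    return result


def split_sum(expr, neighbour_terms):
    """Return (first_block, second_block) expressions for a sum.

    The first block reduces all neighbour terms; the second block adds the
    remaining terms to the partial result. If no clean split exists, fall
    back to placing the whole expression in the first block.
    """
    terms = flatten_sum(expr)
    remote = [t for t in terms if t in neighbour_terms]
    local = [t for t in terms if t not in neighbour_terms]
    if not remote or not local:
        return chain(terms), None          # fallback: everything in block one
    first = chain(remote)                  # consumes neighbour data only
    second = chain(["%partial"] + local)   # "%partial" = result of block one
    return first, second


# (a + b) + c, where a and c are assumed to come from neighbouring PEs.
first, second = split_sum(("add", ("add", "a", "b"), "c"), {"a", "c"})
fallback = split_sum(("add", "a", "b"), set())
```

Here `first` is `a + c` and `second` is `%partial + b`, so the overall sum is `(a + c) + b`. That matches the `(c+a)+b` example above up to the order of the two neighbour terms, which this sketch keeps as encountered.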

---------

Co-authored-by: n-io <[email protected]>
@superlopuh
Member

Is this branch still useful?

Labels
dialects, documentation, transformations
4 participants