Add new pack-peel pipeline with 4 tiling levels #1022
As discussed in the meeting, this temporary pipeline LGTM. We'll leave the pipeline cleanup work as follow-ups.
However, I have another idea for your consideration. If all the matmul shapes and tests work with this new pipeline, we can name the new pipeline "pack-peel" and rename the old one "pack-peel-basic" or something similar. That way, we can run all the ObjectFifo tests in CI without changing the default pipeline, and it will be easier to deprecate the old pipeline when we are ready.
One more thing: we need a lowering-strategy test for the new pipeline.
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Passes.cpp
Not all the tests are functional. I have fixes for matmul + truncf and batch_matmul, which I can land in follow-ups. I am still working on an issue with matmul_transpose_b.
abf2de4 to 70aac75
Added the lowering-strategy lit tests.
LGTM, thanks!
Adds a new pipeline with an additional level of tiling on the outer dimension for matmul-like operations. This should reduce the DDR bandwidth requirements by caching more data slices in L2/MemTile.
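
To illustrate why a larger outer tile held in L2 might lower DDR traffic, here is a rough back-of-the-envelope sketch. It is a simplified cost model, not the pipeline's actual behavior: the function name `ddr_read_elements`, the tile sizes, and the assumption that each output tile loads its A and B slices from DDR exactly once are all hypothetical.

```python
def ddr_read_elements(m, n, k, tile_m, tile_n):
    """Estimate elements read from DDR for C[m, n] = A[m, k] @ B[k, n].

    Assumed model (hypothetical): each (tile_m x tile_n) output tile is
    computed from a (tile_m x k) slice of A and a (k x tile_n) slice of B,
    and both slices are fetched from DDR once per output tile.
    """
    if m % tile_m != 0 or n % tile_n != 0:
        raise ValueError("tile sizes must evenly divide m and n in this sketch")
    num_tiles = (m // tile_m) * (n // tile_n)
    reads_per_tile = tile_m * k + k * tile_n
    return num_tiles * reads_per_tile


# Doubling the outer tile in both dimensions halves the estimated DDR reads.
small_tiles = ddr_read_elements(512, 512, 512, 64, 64)
large_tiles = ddr_read_elements(512, 512, 512, 128, 128)
print(small_tiles, large_tiles)
```

Under this model the total reads scale as m * n * k * (1/tile_n + 1/tile_m), so a bigger outer tile cached in L2 reuses each fetched slice more times before going back to DDR.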