[KernelDispatch] Add a temporary hack for 1x1 shapes with tile sizes M,N #1276
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/KernelDispatch.cpp
// TODO(avarma): This is currently a workaround for 1x1 AIE array to make
// those 2D matmul shapes work for which all of the operands get pulled in
// to L2 buffer. Once reprogramming of DMA ops is supported, we can get rid
I didn't follow what's the problem here. What's the error message?
Without this we get "'aie.memtile_dma' op could not find and assign a valid BD id" because of the exploding memtile_dma issue, which should be solved by having the DMA ops reconfigured. Since that support is not yet available, this is a temporary workaround.
9a9672a to 7aa28d1
-- This commit adds a temporary hack for the 1x1 AIE array so that 2D matmul shapes whose operands all get pulled into the L2 buffer work. Once reprogramming of DMA ops is supported, we can get rid of this workaround. It is needed only for pack-peel-4-level-tiling, NOT pack-peel. The workaround just ensures that the first-level tile size is NOT equal to M,N by halving the n0Tile.
-- Also adds an e2e 32x512x64 test for 1x1 npu4.
Signed-off-by: Abhishek Varma <[email protected]>
7aa28d1 to 2d0a1de
// of this workaround. We need to add this only for pack-peel-4-level-tiling
// NOT pack-peel. The workaround just ensures that the tile size of first
// level is NOT equal to M, N by halving the N0 tile.
if (numRows == 1 && numCols == 1) {
Do you only see this issue for 1x1? I would expect it for any number of cores?
-- This commit adds a temporary hack for the 1x1 AIE array so that 2D matmul
shapes whose operands all get pulled into the L2 buffer work. Once
reprogramming of DMA ops is supported, we can get rid of this workaround. It
is needed only for pack-peel-4-level-tiling, NOT pack-peel. The workaround
just ensures that the first-level tile size is NOT equal to M,N by halving
the n0Tile, and by also halving the corresponding packing size in case
n0Tile becomes smaller than the packing size.
-- Also adds an e2e 32x512x64 test for 1x1 npu4.
Signed-off-by: Abhishek Varma [email protected]
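
For readers skimming the thread, the rule described in the commit message can be sketched roughly as below. This is only an illustration, not the code in KernelDispatch.cpp: the TileConfig struct, the applyOneByOneWorkaround helper and the n0Pack field are hypothetical names. Treating "equal to M,N" as "both first-level tile sizes match the full M and N extents" is also an assumption.

```cpp
#include <cstdint>

// Hypothetical container for the values the workaround touches; the actual
// pass keeps its tile and packing sizes in its own data structures.
struct TileConfig {
  int64_t m0Tile;  // first-level tile size along M
  int64_t n0Tile;  // first-level tile size along N
  int64_t n0Pack;  // packing size along N
};

// Sketch of the workaround: on a 1x1 array, if the first-level tile would
// cover the whole M x N output, halve the N tile. If the halved tile is then
// smaller than the packing size along N, halve that packing size as well.
// (Assumption: "equal to M,N" means both dimensions match.)
inline TileConfig applyOneByOneWorkaround(TileConfig cfg, int64_t M, int64_t N,
                                          int64_t numRows, int64_t numCols) {
  const bool isOneByOne = (numRows == 1 && numCols == 1);
  const bool coversWholeOutput = (cfg.m0Tile == M && cfg.n0Tile == N);
  if (isOneByOne && coversWholeOutput && cfg.n0Tile > 1) {
    cfg.n0Tile /= 2;
    if (cfg.n0Tile < cfg.n0Pack) cfg.n0Pack /= 2;
  }
  return cfg;
}
```

With a hypothetical config of m0Tile = 32, n0Tile = 64, n0Pack = 64 and M = 32, N = 64 on a 1x1 array, the sketch returns n0Tile = 32 and n0Pack = 32; the same input on a larger array is left unchanged.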