@Abhishek-Varma Abhishek-Varma commented May 22, 2025

-- This commit adds a temporary hack for the 1x1 AIE array so that 2D matmul
shapes whose operands all get pulled into the L2 buffer work. Once
reprogramming of DMA ops is supported, this workaround can be removed. It is
needed only for pack-peel-4-level-tiling, NOT pack-peel. The workaround
ensures that the first-level tile size is NOT equal to M, N by halving
n0Tile, and by also halving the corresponding packing size if n0Tile becomes
smaller than that packing size.
-- Also adds an e2e 32x512x64 test for 1x1 npu4.
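
The adjustment described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the struct `FirstLevelTiling`, the function `applyOneByOneWorkaround`, and the names `m0Tile` and `n0Pack` are hypothetical. The guard that fires only when the first-level tile already equals M and N, and the check that `n0Tile` is larger than 1 before halving, are assumptions made for the sketch.

```cpp
#include <cstdint>

// Hypothetical holder for the two first-level values the workaround touches.
struct FirstLevelTiling {
  std::int64_t n0Tile;  // first-level tile size along N
  std::int64_t n0Pack;  // first-level packing size along N
};

// Sketch of the 1x1 workaround. Assumption: it applies only when the
// first-level tile would otherwise cover the whole M x N output.
inline FirstLevelTiling applyOneByOneWorkaround(std::int64_t M, std::int64_t N,
                                                std::int64_t m0Tile,
                                                FirstLevelTiling tiling,
                                                std::int64_t numRows,
                                                std::int64_t numCols) {
  const bool isOneByOne = numRows == 1 && numCols == 1;
  const bool coversWholeOutput = m0Tile == M && tiling.n0Tile == N;
  if (!isOneByOne || !coversWholeOutput || tiling.n0Tile <= 1) {
    return tiling;  // nothing to adjust
  }
  // Halve the N tile so the first-level tile no longer equals (M, N).
  tiling.n0Tile /= 2;
  // Keep the packing size from exceeding the tile it packs.
  if (tiling.n0Tile < tiling.n0Pack) {
    tiling.n0Pack /= 2;
  }
  return tiling;
}
```

For example, with M = 32, N = 64, `m0Tile` = 32 and a starting `{n0Tile, n0Pack}` of `{64, 64}` on a 1x1 array, the sketch returns `{32, 32}`; on a 4x4 array it returns the input unchanged.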

Signed-off-by: Abhishek Varma [email protected]

@Abhishek-Varma Abhishek-Varma marked this pull request as ready for review May 22, 2025 19:01
Comment on lines 453 to 455
// TODO(avarma): This is currently a workaround for 1x1 AIE array to make
// those 2D matmul shapes work for which all of the operands get pulled in
// to L2 buffer. Once reprogramming of DMA ops is supported, we can get rid
Contributor

I didn't follow what's the problem here. What's the error message?

Contributor Author

Without this we get the error "'aie.memtile_dma' op could not find and assign a valid BD id". It is caused by the exploding memtile_dma issue, which should be solved by reconfiguring the DMA ops. Since that support is not yet available, this is a temporary workaround.

@Abhishek-Varma Abhishek-Varma force-pushed the avarma_hack_32x512x64_1x1 branch from 9a9672a to 7aa28d1 Compare May 26, 2025 06:10
-- This commit adds a temporary hack for 1x1 AIE array to make those
   2D matmul shapes work for which all of the operands get pulled in to L2
   buffer. Once reprogramming of DMA ops is supported, we can get rid of this
   workaround. We need to add this only for pack-peel-4-level-tiling NOT
   pack-peel. The workaround just ensures that the tile size of first level is
   NOT equal to M,N by halving the n0Tile.
-- Also adds e2e 32x512x64 for 1x1 npu4 test.

Signed-off-by: Abhishek Varma <[email protected]>
@Abhishek-Varma Abhishek-Varma force-pushed the avarma_hack_32x512x64_1x1 branch from 7aa28d1 to 2d0a1de Compare May 26, 2025 06:11
@Abhishek-Varma Abhishek-Varma requested a review from yzhang93 May 26, 2025 06:11
// of this workaround. We need to add this only for pack-peel-4-level-tiling
// NOT pack-peel. The workaround just ensures that the tile size of first
// level is NOT equal to M, N by halving the N0 tile.
if (numRows == 1 && numCols == 1) {
Collaborator

Do you only see this issue for 1x1? I would expect it for any number of cores?

3 participants