
HSTU backward error #202

Open
FindHao opened this issue Mar 3, 2025 · 1 comment

Comments


FindHao commented Mar 3, 2025

There is an error when I run the following command to test the HSTU attention op's backward kernel.

% python hstu_attention_bench.py --bench-forward=False
python: ../../../lib/Tools/LinearLayout.cpp:565: LinearLayout mlir::triton::LinearLayout::reshapeOuts(ArrayRef<std::pair<StringAttr, int32_t>>) const: Assertion `getTotalOutDimSize() == std::accumulate( newOutDims.begin(), newOutDims.end(), 1, [&](int32_t acc, auto &outDim) { return acc * outDim.second; })' failed.
[1]    1368375 IOT instruction (core dumped)  python hstu_attention_bench.py --bench-forward=False

The tested PyTorch is the latest nightly, version 2.7.0.dev20250303+cu126, installed via pip.
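For reference, the installed versions can be confirmed from Python (a minimal check, assuming both torch and triton import cleanly in the same environment):

% python -c "import torch, triton; print(torch.__version__); print(triton.__version__)"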

After I manually installed the newest Triton by downloading its main branch and running pip install -e python, the error changed to the following.

% python hstu_attention_bench.py --bench-forward=False
python: /home/yhao/ptd/triton/lib/Dialect/TritonGPU/IR/LinearLayoutConversions.cpp:1154: mlir::triton::LinearLayout mlir::triton::gpu::{anonymous}::chooseStMatrixLayoutLeadingOffset(mlir::MLIRContext*, mlir::RankedTensorType, int): Assertion `instrN >= numColsPerChunk && "Each chunk is filled in with a single warp"' failed.
...
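For reference, the source install I used is roughly the following (a sketch, assuming a fresh checkout of the Triton main branch, where the python subdirectory contains the pip package):

% git clone https://github.com/triton-lang/triton.git
% cd triton
% pip install -e python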

It is exactly the same as issue 5609 reported in the Triton repo.
