
HSTU backward error #202

Open
FindHao opened this issue Mar 3, 2025 · 1 comment

Comments


FindHao commented Mar 3, 2025

There is an error when I run the following command to test the HSTU attention op's backward kernel.

% python hstu_attention_bench.py --bench-forward=False
python: ../../../lib/Tools/LinearLayout.cpp:565: LinearLayout mlir::triton::LinearLayout::reshapeOuts(ArrayRef<std::pair<StringAttr, int32_t>>) const: Assertion `getTotalOutDimSize() == std::accumulate( newOutDims.begin(), newOutDims.end(), 1, [&](int32_t acc, auto &outDim) { return acc * outDim.second; })' failed.
[1]    1368375 IOT instruction (core dumped)  python hstu_attention_bench.py --bench-forward=False

The tested PyTorch is the latest nightly, version 2.7.0.dev20250303+cu126, installed via pip.
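For reference, the installed versions can be confirmed from Python (a minimal check, assuming both torch and triton import cleanly in the same environment):

% python -c "import torch, triton; print(torch.__version__); print(triton.__version__)"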

After I manually installed the newest Triton by downloading its main branch and running pip install -e python, the error changed to the following.

% python hstu_attention_bench.py --bench-forward=False
python: /home/yhao/ptd/triton/lib/Dialect/TritonGPU/IR/LinearLayoutConversions.cpp:1154: mlir::triton::LinearLayout mlir::triton::gpu::{anonymous}::chooseStMatrixLayoutLeadingOffset(mlir::MLIRContext*, mlir::RankedTensorType, int): Assertion `instrN >= numColsPerChunk && "Each chunk is filled in with a single warp"' failed.
...
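For reference, the source install I used is roughly the following (a sketch, assuming a fresh checkout of the Triton main branch, where the python subdirectory contains the pip package):

% git clone https://github.com/triton-lang/triton.git
% cd triton
% pip install -e python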

It is exactly the same as issue 5609 reported in the Triton repo.
