etiotto commented Oct 24, 2025

This PR extends the RemoveMask pass to handle masks on load and select operations that evaluate to true (or false) over the entire loop iteration space. This allows masked loads to be transformed into unmasked ones, and the mask condition may become dead if no other operation uses it, which can reduce arithmetic complexity.
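
To illustrate the idea only, here is a minimal Python sketch of the kind of reasoning involved. It is not the pass implementation: the function names, the mask shape `iv + lane < limit`, and the integer loop model are all assumptions made for the example.

```python
from typing import Optional


def uniform_mask_value(lower: int, upper: int, step: int,
                       block: int, limit: int) -> Optional[bool]:
    """Hypothetical helper, not part of the RemoveMask pass.

    Models the mask `iv + lane < limit` for lane in [0, block) and
    iv in range(lower, upper, step). Returns True if the mask is true
    for every lane in every iteration, False if it is false everywhere,
    and None if it varies (or the loop is empty or unsupported).
    """
    if step <= 0 or block <= 0 or lower >= upper:
        return None  # nothing can be concluded for this loop

    # Last value the induction variable actually takes.
    last_iv = lower + ((upper - 1 - lower) // step) * step

    # Largest index touched is last_iv + block - 1.
    if last_iv + block - 1 < limit:
        return True
    # Smallest index touched is lower + 0.
    if lower >= limit:
        return False
    return None


def simplify_load(mask_value: Optional[bool]) -> str:
    """Describe what a masked load could become (illustrative only)."""
    if mask_value is True:
        return "unmasked load"
    if mask_value is False:
        return "fallback value"  # the load never reads memory
    return "masked load"


if __name__ == "__main__":
    # Loop over [0, 1024) in steps of 64, 64 lanes, bound 1024.
    print(simplify_load(uniform_mask_value(0, 1024, 64, 64, 1024)))
    # Same loop, but the bound cuts through the last block.
    print(simplify_load(uniform_mask_value(0, 1024, 64, 64, 1000)))
```

In this model, once the mask is known to be uniformly true, the comparison that produced it has no remaining use in the load, which is how the mask condition can become dead.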

@etiotto etiotto self-assigned this Oct 24, 2025
Signed-off-by: Ettore Tiotto <[email protected]>
@etiotto etiotto requested a review from wdziurdz October 30, 2025 13:46
@etiotto etiotto linked an issue Oct 30, 2025 that may be closed by this pull request

etiotto commented Nov 3, 2025

Ran the micro-benchmarks on B580; this PR did not regress any of them.

Development

Successfully merging this pull request may close these issues.

[pytorch upstream] softmax on BMG is slow

3 participants