[XLA:GPU] Do not multi-output fuse sibling transposes with reductions. #28786

copybara-service · 2025-07-11T08:04:47Z

An elemental (aka nested) transpose has a different read pattern than an unnested transpose or a reduction, because the emitters for tiled transposes and parallel reductions ensure uniform read patterns.
A multi-output fusion whose roots have different read patterns is generally not profitable: different threads read the same input data multiple times, which defeats the purpose of multi-output fusion. Worse, the increased register pressure can hurt performance.
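
To illustrate the difference in read patterns, here is a toy Python model. It is not XLA code and does not reflect the actual emitters: the shape `ROWS x COLS`, the row-major layout, and both function names are assumptions made up for this sketch. It only lists which input offsets the first few threads would touch when work is assigned in input order versus in the output order of a transpose.

```python
# Toy model only (hypothetical names, not XLA's emitters).
# Assumes a row-major input of shape [ROWS, COLS].
ROWS, COLS = 4, 8


def input_order_reads(num_threads):
    """Thread t reads input offset t, so neighbouring threads touch
    neighbouring addresses (a uniform, contiguous read pattern)."""
    return list(range(num_threads))


def elemental_transpose_reads(num_threads):
    """Thread t produces element t of the [COLS, ROWS] transposed output
    and reads the matching input element, so neighbouring threads are
    COLS elements apart in the input."""
    reads = []
    for t in range(num_threads):
        out_row, out_col = divmod(t, ROWS)      # index into the [COLS, ROWS] output
        reads.append(out_col * COLS + out_row)  # offset into the [ROWS, COLS] input
    return reads


print(input_order_reads(4))          # contiguous offsets
print(elemental_transpose_reads(4))  # offsets strided by COLS
```

In this model, fusing both access patterns over the same input means each thread would need input elements that another thread also loads, which is the duplicated read the description refers to.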

PiperOrigin-RevId: 783214366

copybara-service bot assigned thomasjoerg Jul 11, 2025

copybara-service bot force-pushed the test_781870005 branch 2 times, most recently from 0692bc2 to 7048b9e on July 15, 2025 07:07

copybara-service bot force-pushed the test_781870005 branch from 7048b9e to c7f65c9 on July 15, 2025 07:33

copybara-service bot merged commit c7f65c9 into main Jul 15, 2025

copybara-service bot deleted the test_781870005 branch July 15, 2025 07:33
