
Conversation

@leonling-ll

The kpack option is not passed to the Triton compiler by Torch's CachingAutotuner, so the compiler falls back to its default value of 1.
Triton GEMM kernels that prefer kpack > 1 therefore cannot reach their expected performance.
This change passes kpack through to the Triton compile options again.
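
For context, kpack is a ROCm-specific Triton kernel option that GEMM autotune configs can request. A minimal illustration is below; the block sizes and values are purely illustrative and not taken from this PR:

```python
import triton

# Illustrative ROCm GEMM autotune config. kpack > 1 only takes effect if the
# value actually reaches the Triton compiler's options; otherwise the compiler
# silently compiles with kpack = 1.
configs = [
    triton.Config(
        {"BLOCK_M": 128, "BLOCK_N": 128, "BLOCK_K": 64,
         "waves_per_eu": 2, "matrix_instr_nonkdim": 16, "kpack": 2},
        num_warps=8,
        num_stages=2,
    ),
]
```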

@leonling-ll leonling-ll requested a review from jataylo January 23, 2026 14:48
@leonling-ll leonling-ll self-assigned this Jan 23, 2026
@leonling-ll leonling-ll marked this pull request as ready for review January 23, 2026 14:48
@leonling-ll leonling-ll changed the title from "[Draft] Add missing kpack triton compile options" to "Add missing kpack triton compile options" Jan 23, 2026

Copilot AI left a comment


Pull request overview

This PR fixes a missing parameter issue in PyTorch's Triton compiler integration. The kpack parameter, which controls packing behavior for Triton GEMM kernels, was not being passed through the compilation pipeline, causing kernels that require kpack > 1 to fall back to the default value of 1 and miss expected performance optimizations.

Changes:

  • Added kpack parameter handling in the _create_compile_options method to extract and pass the value from compile metadata to Triton compiler options
  • Added kpack as an optional parameter to the triton_config function and ensured it's properly propagated to the config's kwargs
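
A minimal sketch of the two changes summarized above, based only on this summary; the actual code in torch/_inductor is more involved, and names such as `compile_meta`, `cfg`, and the simplified signatures are placeholders, not the exact diff:

```python
from triton import Config


class CachingAutotunerSketch:
    """Simplified stand-in for torch._inductor's CachingAutotuner."""

    def _create_compile_options(self, compile_meta):
        options = {
            "num_warps": compile_meta["num_warps"],
            "num_stages": compile_meta["num_stages"],
        }
        # New: forward kpack to the Triton compiler options instead of
        # letting it silently default to 1.
        if "kpack" in compile_meta:
            options["kpack"] = compile_meta["kpack"]
        return options


def triton_config(size_hints, x, num_stages=1, num_warps=4, kpack=None, **extra):
    """Simplified stand-in for the inductor heuristic that builds a Config."""
    cfg = {"XBLOCK": x, **extra}
    # New: propagate kpack into the config's kwargs so it ends up in the
    # compile metadata consumed by _create_compile_options above.
    if kpack is not None:
        cfg["kpack"] = kpack
    return Config(cfg, num_stages=num_stages, num_warps=num_warps)
```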

@leonling-ll
Author

Same as pytorch#173179


rocm-repo-management-api bot commented Jan 23, 2026

Jenkins build for 8bf933dcb1257b73d0e1cf7922e8e8861db331cf commit finished as FAILURE
Links: Pipeline Overview / Build artifacts / Test Results
