[gfx950][mxfp4] Verify the state of current heuristics #22766

@Muzammiluddin-Syed-ECE

Description

Use amdsharktuner to collect performance data on how knobs such as workgroup thread count, subgroup count, and tile size affect the best achievable performance at the shapes of interest listed below. This will help us verify the reliability of our existing heuristics. The intention is to compare the tuned results with the performance obtained by copying the configs of a handwritten assembly kernel, and to note whether we can do better.

M, N, K/2, K/32
512,1024,8192,512
512,16384,8192,512
512,53248,8192,512
1024,16384,8192,512
1024,1024,8192,512
1024,53248,8192,512
2048,1024,8192,512
2048,16384,8192,512
2048,53248,8192,512
512,16384,26624,1664
1024,16384,26624,1664
2048,16384,26624,1664
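
For bookkeeping while collecting the data, a minimal Python sketch along the following lines could turn the shape list into full (M, N, K) problems and normalize timings. It rests on assumptions the issue does not state: that `K/2` is the packed byte length of the reduction dimension (two 4-bit values per byte) and that `K/32` is the number of scales per row (one per 32-element MXFP4 block). The names `Shape`, `parse_shapes`, `tflops`, and `speedup` are hypothetical helpers, not part of amdsharktuner.

```python
import csv
import io
from dataclasses import dataclass

# Shape list copied from the issue (header spacing normalized).
SHAPES_CSV = """\
M,N,K/2,K/32
512,1024,8192,512
512,16384,8192,512
512,53248,8192,512
1024,16384,8192,512
1024,1024,8192,512
1024,53248,8192,512
2048,1024,8192,512
2048,16384,8192,512
2048,53248,8192,512
512,16384,26624,1664
1024,16384,26624,1664
2048,16384,26624,1664
"""


@dataclass(frozen=True)
class Shape:
    m: int
    n: int
    k: int  # logical reduction size in elements (assumed, see lead-in)

    @property
    def flops(self) -> int:
        # One multiply and one add per (m, n, k) triple.
        return 2 * self.m * self.n * self.k


def parse_shapes(text: str) -> list[Shape]:
    """Recover K from both columns and check that they agree."""
    shapes = []
    for row in csv.DictReader(io.StringIO(text)):
        k_from_packed = 2 * int(row["K/2"])    # assumption: 2 fp4 values per byte
        k_from_scales = 32 * int(row["K/32"])  # assumption: 1 scale per 32 elements
        if k_from_packed != k_from_scales:
            raise ValueError(f"inconsistent K in row {row}")
        shapes.append(Shape(int(row["M"]), int(row["N"]), k_from_packed))
    return shapes


def tflops(shape: Shape, seconds: float) -> float:
    """Throughput for one measured run of the given shape."""
    return shape.flops / seconds / 1e12


def speedup(baseline_seconds: float, tuned_seconds: float) -> float:
    """Values above 1.0 mean the tuned config beats the baseline."""
    return baseline_seconds / tuned_seconds
```

Under these assumptions, every row gives the same K from both columns (16384 for the first nine shapes, 53248 for the last three). `speedup` can then be applied per shape, with the timing of the config copied from the handwritten assembly kernel as the baseline.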

Metadata

Assignees

No one assigned

    Labels

    codegen: Shared code generation infrastructure and dialects
    yolo: ( ͡° ͜ʖ ͡°)
