Open
Labels
codegen (Shared code generation infrastructure and dialects), yolo ( ͡° ͜ʖ ͡°)
Description
Use amdsharktuner to collect performance data on how tuning knobs (workgroup thread count, subgroup count, tile sizes, etc.) affect the best achievable performance at the shapes of interest listed below. This will help us verify how reliable our existing heuristics are. The intent is to compare the tuned performance against the performance obtained by copying the configuration of a handwritten assembly kernel, and to note whether we can do better.
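| M | N | K/2 | K/32 |
|------|-------|-------|------|
| 512 | 1024 | 8192 | 512 |
| 512 | 16384 | 8192 | 512 |
| 512 | 53248 | 8192 | 512 |
| 1024 | 16384 | 8192 | 512 |
| 1024 | 1024 | 8192 | 512 |
| 1024 | 53248 | 8192 | 512 |
| 2048 | 1024 | 8192 | 512 |
| 2048 | 16384 | 8192 | 512 |
| 2048 | 53248 | 8192 | 512 |
| 512 | 16384 | 26624 | 1664 |
| 1024 | 16384 | 26624 | 1664 |
| 2048 | 16384 | 26624 | 1664 |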
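
For bookkeeping, the shape list above could be loaded into a small script along the following lines. This is only a sketch and not part of amdsharktuner: the `Shape` class, `parse_shapes`, and `speedup` are hypothetical helper names, and the assumption that the `K/2` and `K/32` columns describe the same underlying `K` is inferred from the column headers. The script recovers `K` for each shape, checks that the two columns agree, and provides a helper for expressing a tuned result relative to the assembly-kernel-config baseline.

```python
from dataclasses import dataclass

# Shape list copied from the issue: M, N, K/2, K/32
SHAPES_CSV = """\
512,1024,8192,512
512,16384,8192,512
512,53248,8192,512
1024,16384,8192,512
1024,1024,8192,512
1024,53248,8192,512
2048,1024,8192,512
2048,16384,8192,512
2048,53248,8192,512
512,16384,26624,1664
1024,16384,26624,1664
2048,16384,26624,1664
"""


@dataclass(frozen=True)
class Shape:
    """One problem shape (hypothetical helper, not an amdsharktuner type)."""
    m: int
    n: int
    k_half: int      # the "K/2" column
    k_over_32: int   # the "K/32" column

    @property
    def k(self) -> int:
        # Assumption: K/2 and K/32 refer to the same K.
        return 2 * self.k_half


def parse_shapes(text: str) -> list[Shape]:
    """Parse comma-separated rows and check the K/2 and K/32 columns agree."""
    shapes = []
    for line in text.strip().splitlines():
        m, n, k_half, k_over_32 = (int(field) for field in line.split(","))
        shape = Shape(m, n, k_half, k_over_32)
        if shape.k != 32 * shape.k_over_32:
            raise ValueError(f"inconsistent K columns in row: {line!r}")
        shapes.append(shape)
    return shapes


def speedup(baseline_us: float, tuned_us: float) -> float:
    """Return baseline time / tuned time; values above 1.0 mean tuning won."""
    if baseline_us <= 0 or tuned_us <= 0:
        raise ValueError("timings must be positive")
    return baseline_us / tuned_us


SHAPES = parse_shapes(SHAPES_CSV)

if __name__ == "__main__":
    for s in SHAPES:
        print(f"M={s.m} N={s.n} K={s.k}")
```

Measured timings would be supplied by whoever runs the tuner; `speedup` only fixes the convention that the baseline is the configuration copied from the handwritten assembly kernel.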