Could one of the reviewers please take a look? |
TorreZuk
left a comment
@mkuron can you tell me the gfx target of the hardware you are building for? Something like this will likely be acceptable in the very near future. However, it currently doesn't work with our default build options of compressed offload and the released stack. We want to see all tests pass before making these changes, so that this feature gets a standardized rollout and works by default.
|
Indeed you need to override a few build options ( |
|
@mkuron I would still like to know the gfx target of the hardware you will run on, as I want to clarify what this generic gfx target adds support for over the existing gfx942. |
|
When I build for gfx11-generic, the resulting binary will be compatible with gfx1100, gfx1101, and gfx1151. So this isn't about adding support for new hardware, but about supporting existing hardware without building for every single model. In the case of gfx9-4-generic, the resulting binary is supposed to also run on the upcoming gfx950. |
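To make the coverage described above concrete, here is a minimal sketch of a lookup from a generic target to the specific targets it is meant to run on. It only lists the pairs mentioned in this thread and is not an exhaustive or authoritative table; the names `Coverage`, `kCoverage`, and `covered_by` are hypothetical and exist only for this illustration.

```cpp
#include <string_view>

// Hypothetical helper type: one (generic target, specific target) pair.
struct Coverage {
    std::string_view generic;
    std::string_view specific;
};

// Only the pairs named in this conversation; the real lists are longer
// (see the linked ROCm documentation on generic code objects).
constexpr Coverage kCoverage[] = {
    {"gfx11-generic", "gfx1100"},
    {"gfx11-generic", "gfx1101"},
    {"gfx11-generic", "gfx1151"},
    {"gfx9-4-generic", "gfx942"},
    {"gfx9-4-generic", "gfx950"},
};

// Returns true if a binary built for `generic` is expected to run on `specific`,
// according to the table above.
constexpr bool covered_by(std::string_view generic, std::string_view specific) {
    for (const auto& entry : kCoverage) {
        if (entry.generic == generic && entry.specific == specific) {
            return true;
        }
    }
    return false;
}
```

For example, `covered_by("gfx11-generic", "gfx1151")` evaluates to true, while `covered_by("gfx11-generic", "gfx942")` evaluates to false.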
|
This is no longer relevant after #1639 was merged, which switched gfx950 to use the default kernels instead of the gfx942 specializations. When building for gfx9-4-generic, you will now get the default kernels on both architectures. |
I was thinking about building with e.g. GPU_TARGETS="gfx9-4-generic;gfx11-generic" (see https://rocm.docs.amd.com/projects/llvm-project/en/develop/conceptual/code-portability.html#generic-code-objects) to reduce compile time and binary size while still being able to run on all architectures relevant to me. This led me to a few #ifdefs that explicitly checked for specific architectures, which this pull request adapts to also support generic architectures.
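As a rough illustration of the kind of change involved, the sketch below extends an architecture check so that it also matches a generic target. It is not the code from this pull request. The macro names are assumptions: that the compiler defines `__gfx942__`, `__gfx1100__`, and so on for specific targets, and `__gfx9_4_generic__` and `__gfx11_generic__` for generic ones (dashes mapped to underscores). Please verify them against your compiler before relying on this. The `EXAMPLE_*` macros and `example_target_family` are hypothetical names.

```cpp
#include <string_view>

// Assumed compiler-defined macros (verify for your toolchain):
// specific targets such as __gfx942__, generic ones such as __gfx9_4_generic__.
#if defined(__gfx942__) || defined(__gfx950__) || defined(__gfx9_4_generic__)
#    define EXAMPLE_TARGET_GFX9_4 1
#else
#    define EXAMPLE_TARGET_GFX9_4 0
#endif

#if defined(__gfx1100__) || defined(__gfx1101__) || defined(__gfx1151__) \
    || defined(__gfx11_generic__)
#    define EXAMPLE_TARGET_GFX11 1
#else
#    define EXAMPLE_TARGET_GFX11 0
#endif

// Hypothetical dispatch point: picks a code path per architecture family and
// falls back to a default path when no known macro is defined (e.g. on the host).
constexpr std::string_view example_target_family() {
#if EXAMPLE_TARGET_GFX9_4
    return "gfx9-4";
#elif EXAMPLE_TARGET_GFX11
    return "gfx11";
#else
    return "default";
#endif
}
```

Collecting the checks in one place means a new generic target only has to be added to a single condition, rather than to every #ifdef that tests for a specific architecture.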