-
Notifications
You must be signed in to change notification settings - Fork 293
Fix use of CUDA CCs in LLVM easyblock and cleanup #3755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Fix use of CUDA CCs in LLVM easyblock and cleanup #3755
Conversation
easybuild/easyblocks/l/llvm.py
Outdated
if gpu_archs: | ||
self.runtimes_cmake_args['LIBOMPTARGET_DEVICE_ARCHITECTURES'] = '%s' % '|'.join(gpu_archs) | ||
self._cmakeopts['LIBOMPTARGET_DEVICE_ARCHITECTURES'] = ';'.join(gpu_archs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is going to work, and I would be extremely hesitant towards reverting this.
The sanity check may pass, simply because all architectures are being built, but this is not what we want.
See e.g. #3675 (comment) for LLVM 19. I think this is also something where @bedroge had issues with, which we resolved with the currently implemented approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually see the opposite: I did a test with the CLang and the LLVM easyblock of the "same" easyconfig and did a diff:
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_35.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_37.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_50.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_52.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_53.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_60.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_61.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_62.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_70.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_72.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_75.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_86.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_87.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_89.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_90.bc.
I inspected the CMake files and every CMake value with prefix LIBOMPTARGET
get automatically passed through to the sub-builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were significant changes with LLVM 19 & LLVM 20, which might influence the results a bit.
I just checked my LLVM 19 & 20 installs on my machines, and with the currently implemented approach, only the passed architectures are built.
I don't have an LLVM 18.1.8 installation with the new EasyBlock available to compare to, and unfortunately jsc-zen3
hasn't either. Maybe we missed something for this version in particular.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a companion PR for testing with some easyconfigs of various versions. Build is running, will check
IIRC LLVM 20 doesn't use those anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking a Clang installation done with the old EasyBlock just before EB 5.0.0 (4.9.5dev), I can see that all architectures are built:
jreuter@ZAM054 /opt/EasyBuild/apps/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0 find . -name "*.bc"
./lib/libomptarget-amdgpu-gfx1150.bc
./lib/libomptarget-amdgpu-gfx900.bc
./lib/libomptarget-amdgpu-gfx941.bc
./lib/libomptarget-amdgpu-gfx1032.bc
./lib/libomptarget-amdgpu-gfx902.bc
./lib/libomptarget-nvptx-sm_89.bc
./lib/libomptarget-amdgpu-gfx801.bc
./lib/libomptarget-amdgpu-gfx1103.bc
./lib/libomptarget-amdgpu-gfx1010.bc
./lib/libomptarget-nvptx-sm_37.bc
./lib/libomptarget-nvptx-sm_80.bc
./lib/libomptarget-nvptx-sm_86.bc
./lib/libomptarget-nvptx-sm_61.bc
./lib/libomptarget-nvptx-sm_70.bc
./lib/libomptarget-nvptx-sm_50.bc
./lib/libomptarget-amdgpu-gfx1030.bc
./lib/libomptarget-nvptx-sm_62.bc
./lib/libomptarget-nvptx-sm_75.bc
./lib/libomptarget-nvptx-sm_72.bc
./lib/libomptarget-amdgpu-gfx1031.bc
./lib/libomptarget-amdgpu-gfx90c.bc
./lib/libomptarget-nvptx-sm_87.bc
./lib/libomptarget-nvptx-sm_60.bc
./lib/libomptarget-amdgpu-gfx1035.bc
./lib/libomptarget-amdgpu-gfx803.bc
./lib/libomptarget-nvptx-sm_35.bc
./lib/libomptarget-nvptx-sm_53.bc
./lib/libomptarget-amdgpu-gfx942.bc
./lib/libomptarget-amdgpu-gfx940.bc
./lib/libomptarget-amdgpu-gfx1101.bc
./lib/libomptarget-amdgpu-gfx1100.bc
./lib/libomptarget-amdgpu-gfx90a.bc
./lib/libomptarget-amdgpu-gfx906.bc
./lib/libomptarget-amdgpu-gfx701.bc
./lib/libomptarget-amdgpu-gfx700.bc
./lib/libomptarget-amdgpu-gfx908.bc
./lib/libomptarget-amdgpu-gfx1033.bc
./lib/libomptarget-amdgpu-gfx1151.bc
./lib/libomptarget-amdgpu-gfx1034.bc
./lib/libomptarget-amdgpu-gfx1036.bc
./lib/libomptarget-nvptx-sm_90.bc
./lib/libomptarget-amdgpu-gfx1102.bc
./lib/libomptarget-nvptx-sm_52.bc
Compared to LLVM 19.1.1 with the latest EasyBlock:
jreuter@ZAM054 /opt/EasyBuild/apps/software/LLVM/19.1.1-GCCcore-13.3.0 find . -name "*.bc"
./lib/x86_64-unknown-linux-gnu/libomptarget-nvptx-sm_80.bc
./lib/x86_64-unknown-linux-gnu/libomptarget-nvptx-sm_75.bc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I might have made a mistake during the comparison. I'm doing a rebuild of both versions now.
The idea was to simplify this and make it easier to understand in the future. To me this runtimes_cmake_args
thing was highly suspicious/confusing especially with the pipe separator and it isn't clear either if and why this argument isn't required for the "main" build.
In the source you can see that there are some variables
and the LIBOMP*
prefix passed through by default. And according to git blame
this has been virtually always the case.
So I'm confident this is the best way, especially as I couldn't find any examples doing it the current way which makes me thing this is "safer".
I think building a few versions (see the EC PR) with this should be enough to check that this works for all 3 major versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked LLVM 18.1.8 with your changes, and there it seems to work with your solution as well.
$ find . -name "*.bc"
./lib/libomptarget-nvptx-sm_75.bc
./lib/libomptarget-amdgpu-gfx1101.bc
./lib/libomptarget-nvptx-sm_80.bc
Lets check the other ones for safety as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've redone the tests with Clang 18.1.8 once with the clang easyblock and once with the updated llvm easyblock, both with CCC=7.0,8.0, and can still see that the clang easyblock builds all architectures. Diff of the install folder restricted to added files and ignoring the lib64 symlink:
Nur in Clang-LLVM/bin: count.
Nur in Clang-LLVM/bin: FileCheck.
Nur in Clang-LLVM/bin: lli-child-target.
Nur in Clang-LLVM/bin: llvm-jitlink-executor.
Nur in Clang-LLVM/bin: llvm-PerfectShuffle.
Nur in Clang-LLVM/bin: not.
Nur in Clang-LLVM/bin: obj2yaml.
Nur in Clang-LLVM/bin: split-file.
Nur in Clang-LLVM/bin: UnicodeNameMappingGenerator.
Nur in Clang-LLVM/bin: yaml2obj.
Nur in Clang-LLVM/bin: yaml-bench.
Nur in Clang-Clang/easybuild/reprod/easyblocks: clang.py.
Nur in Clang-LLVM/easybuild/reprod/easyblocks: llvm.py.
Nur in Clang-LLVM/include: x86_64-pc-linux.
Nur in Clang-LLVM/include: x86_64-pc-linux-gnu.
Nur in Clang-LLVM/lib/clang/18/lib: x86_64-pc-linux.
Nur in Clang-LLVM/lib/clang/18/lib: x86_64-pc-linux-gnu.
Nur in Clang-Clang/lib: libgomp.so.
Nur in Clang-Clang/lib: libiomp5.so.
Nur in Clang-LLVM/lib: libLLVM-18.so.
Nur in Clang-LLVM/lib: libLLVM.so.
Nur in Clang-LLVM/lib: libLLVM.so.18.1.
Nur in Clang-LLVM/lib: libMLIR.so.
Nur in Clang-LLVM/lib: libMLIR.so.18.1.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1010.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1030.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1031.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1032.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1033.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1034.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1035.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1036.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1100.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1101.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1102.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1103.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1150.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx1151.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx700.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx701.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx801.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx803.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx900.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx902.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx906.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx908.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx90a.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx90c.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx940.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx941.bc.
Nur in Clang-Clang/lib: libomptarget-amdgpu-gfx942.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_35.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_37.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_50.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_52.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_53.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_60.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_61.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_62.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_72.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_75.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_86.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_87.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_89.bc.
Nur in Clang-Clang/lib: libomptarget-nvptx-sm_90.bc.
Nur in Clang-LLVM/lib: x86_64-pc-linux.
Nur in Clang-LLVM/lib: x86_64-pc-linux-gnu.
libgomp
and libiomp
are symlinks to libomp
. Not sure why LLVM doesn't create them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might worth checking if this has some influence:
Nur in Clang-Clang/lib: libgomp.so.
Nur in Clang-Clang/lib: libiomp5.so.
One can select the OpenMP library with -fopenmp=[...]
IIRC.
If choosing libgomp
still works, I wouldn't care that the files are missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That option seemed to have been considered in an ancient Clang version but dropped. Given that those are symlinks I don't see a reason to not install them just in case.
easybuild/easyblocks/l/llvm.py
Outdated
self.runtimes_cmake_args['LIBOMPTARGET_PLUGINS_TO_BUILD'] = '%s' % '|'.join(self.offload_targets) | ||
self._cmakeopts['LIBOMPTARGET_PLUGINS_TO_BUILD'] = ';'.join(self.offload_targets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See other comment.
I don't see why we should revert this when the current approach is proven to work just fine.
We've tested this very extensively already, both with CUDA & ROCm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding offload targets, see above.
The remainder looks good to me at a quick first glance ( except the failing CI of course 😄 )
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded for 0 out of 1 (1 easyconfigs in total) The CMake option changes break builds where offload architectures need to be passed. |
I had messed up quoting one argument. Shall we introduce a function that does |
As for the CI failure: I'd rather make the function more flexible and use that: easybuilders/easybuild-framework#4913 IMO this is better than having to repeat our logic of determining the CCCs in many places. I had set the build option locally which is why I didn't notice it. |
Both sound like a good idea 👍 |
(created using
eb --new-pr
)usepolly
touse_polly
consistent with other optionsget_cuda_cc_template_value
andlist_to_cmake_arg
to avoid comments and simplify coderuntimes_cmake_args
(Variables prefixed with e.g.LIBOMPTARGET
get passed directly and should rather be passed to the main configure)nvptx_target_cond
block)super()
call, f-Strings and list changesTest easyconfigs: easybuilders/easybuild-easyconfigs#23028
Requires:
required
argument toget_cuda_cc_template_value
easybuild-framework#4913