[HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes #134753
@llvm/pr-subscribers-clang-driver @llvm/pr-subscribers-clang
Author: Alex Voicu (AlexVlx)
Changes
The …
Full diff: https://github.com/llvm/llvm-project/pull/134753.diff
3 Files Affected:
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 7557cb8408921..fa5e12d4033a5 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -1115,6 +1115,10 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
if (CodeGenOpts.LinkBitcodePostopt)
MPM.addPass(LinkInModulesPass(BC));
+ if (LangOpts.HIPStdPar && !LangOpts.CUDAIsDevice &&
+ LangOpts.HIPStdParInterposeAlloc)
+ MPM.addPass(HipStdParAllocationInterpositionPass());
+
// Add a verifier pass if requested. We don't have to do this if the action
// requires code generation because there will already be a verifier pass in
// the code-generation pipeline.
@@ -1178,10 +1182,6 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
return;
}
- if (LangOpts.HIPStdPar && !LangOpts.CUDAIsDevice &&
- LangOpts.HIPStdParInterposeAlloc)
- MPM.addPass(HipStdParAllocationInterpositionPass());
-
// Now that we have all of the passes ready, run them.
{
PrettyStackTraceString CrashInfo("Optimizer");
diff --git a/clang/lib/Driver/ToolChains/HIPAMD.cpp b/clang/lib/Driver/ToolChains/HIPAMD.cpp
index abb83701759ce..52e35a01be58d 100644
--- a/clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ b/clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -231,10 +231,11 @@ void HIPAMDToolChain::addClangTargetOptions(
CC1Args.append({"-fcuda-is-device", "-fno-threadsafe-statics"});
if (!DriverArgs.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc,
- false))
+ false)) {
CC1Args.append({"-mllvm", "-amdgpu-internalize-symbols"});
- if (DriverArgs.hasArgNoClaim(options::OPT_hipstdpar))
- CC1Args.append({"-mllvm", "-amdgpu-enable-hipstdpar"});
+ if (DriverArgs.hasArgNoClaim(options::OPT_hipstdpar))
+ CC1Args.append({"-mllvm", "-amdgpu-enable-hipstdpar"});
+ }
StringRef MaxThreadsPerBlock =
DriverArgs.getLastArgValue(options::OPT_gpu_max_threads_per_block_EQ);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 4b5c70f09155f..03b1693244879 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -802,17 +802,17 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
#define GET_PASS_REGISTRY "AMDGPUPassRegistry.def"
#include "llvm/Passes/TargetPassRegistry.inc"
- PB.registerPipelineStartEPCallback(
- [](ModulePassManager &PM, OptimizationLevel Level) {
- if (EnableHipStdPar)
- PM.addPass(HipStdParAcceleratorCodeSelectionPass());
- });
-
PB.registerPipelineEarlySimplificationEPCallback(
[](ModulePassManager &PM, OptimizationLevel Level,
ThinOrFullLTOPhase Phase) {
- if (!isLTOPreLink(Phase))
+ if (!isLTOPreLink(Phase)) {
+ // When we are not using -fgpu-rdc, we can run accelerator code
+ // selection relatively early, but still after linking to prevent
+ // eager removal of potentially reachable symbols.
+ if (EnableHipStdPar)
+ PM.addPass(HipStdParAcceleratorCodeSelectionPass());
PM.addPass(AMDGPUPrintfRuntimeBindingPass());
+ }
if (Level == OptimizationLevel::O0)
return;
@@ -883,6 +883,12 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
PB.registerFullLinkTimeOptimizationLastEPCallback(
[this](ModulePassManager &PM, OptimizationLevel Level) {
+ // When we are using -fgpu-rdc, we can only run accelerator code
+ // selection after linking, otherwise we end up removing
+ // potentially reachable symbols that were exported as external in other
+ // modules.
+ if (EnableHipStdPar)
+ PM.addPass(HipStdParAcceleratorCodeSelectionPass());
// We want to support the -lto-partitions=N option as "best effort".
// For that, we need to lower LDS earlier in the pipeline before the
// module is partitioned for codegen.
✅ With the latest revision this PR passed the C/C++ code formatter.
Needs test
Done.
// RUN: %clang -### --hipstdpar --offload-arch=gfx906 %s -nogpulib -nogpuinc \
// RUN: 2>&1 | FileCheck -check-prefix=NORDC %s
// NORDC: {{".*clang.*".* "-triple" "amdgcn-amd-amdhsa".* "-mllvm" "-amdgpu-enable-hipstdpar".*}}
I think for both check lines you don't need the regex (`{{ }}`), you can just `CHECK: "-mllvm" "-amdgpu-enable-hipstdpar"`, no?
You can just regex the .* parts. Alternatively you can use CHECK-SAME
…tdpar_passes_reorder
// RUN: %clang -### --hipstdpar --offload-arch=gfx906 %s -nogpulib -nogpuinc -fgpu-rdc \
// RUN: 2>&1 | FileCheck -check-prefix=RDC %s
// RDC-NOT: {{.*}}"-mllvm" "-amdgpu-enable-hipstdpar"
Suggested change:
- // RDC-NOT: {{.*}}"-mllvm" "-amdgpu-enable-hipstdpar"
+ // RDC-NOT: -amdgpu-enable-hipstdpar
`-NOT` checks are hazardous and should be as permissive as possible
// RDC-NOT: {{.*}}"-mllvm" "-amdgpu-enable-hipstdpar"
In this case it actually has to be `"-mllvm"`, because we only care about the argument not being passed to the initial, from-source, per-TU compilation; forming the check as you suggest would (erroneously) match the (intentional) passing of the argument via `-plugin-opt` when we do the final lowering from bitcode. This merely tests/validates the change we made in `HIPAMDToolChain::addClangTargetOptions`.
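The point about the `-NOT` check can be illustrated with a small, self-contained sketch. It models a regex-free FileCheck pattern as a plain substring search over the whole driver output, which is a simplification. The sample output string and the exact `-plugin-opt=...` spelling are assumptions made for illustration, not copied from a real `-###` run, and `notCheckPasses` is a hypothetical helper.

```cpp
#include <string>

// Hypothetical, heavily abbreviated `-###` output for an -fgpu-rdc build.
// Assumption: the per-TU device compile line carries no -mllvm flag, while the
// final link step receives the option through -plugin-opt.
const std::string kRdcDriverOutput =
    "\"clang\" \"-cc1\" \"-triple\" \"amdgcn-amd-amdhsa\" \"-fcuda-is-device\"\n"
    "\"lld\" \"-plugin-opt=-amdgpu-enable-hipstdpar\"\n";

// Simplified model of a NOT-style check: it passes only when the pattern does
// not occur anywhere in the examined output.
bool notCheckPasses(const std::string &output, const std::string &pattern) {
  return output.find(pattern) == std::string::npos;
}
```

Under these assumptions, `notCheckPasses(kRdcDriverOutput, "-amdgpu-enable-hipstdpar")` is false because the pattern also matches the `-plugin-opt` argument, whereas the stricter pattern `"-mllvm" "-amdgpu-enable-hipstdpar"` (quotes included) does not occur in the output, so that check passes.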
clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp
// Ensure Pass HipStdParAcceleratorCodeSelectionPass is not invoked in PreLink.
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager \
// RUN: %s -o - 2>&1 | FileCheck --check-prefix=HIPSTDPAR-PRE %s
// HIPSTDPAR-PRE-NOT: Running pass: HipStdParAcceleratorCodeSelectionPass
Better to use -NEXT checks with the passes before and after it
// Ensure Pass HipStdParAcceleratorCodeSelectionPass is not invoked in PreLink.
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager \
// RUN: %s -o /dev/null 2>&1 | FileCheck --check-prefix=HIPSTDPAR-PRE %s
// HIPSTDPAR-PRE-NOT: Running pass: HipStdParAcceleratorCodeSelectionPass
This still should use -next checks around where it should run
…tdpar_passes_reorder
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/144/builds/22732
This looks somewhat odd and I'm not privy to the workings of the SIE bots (others seem to pass). Should this test require AMDGPU (guessing that the …
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/14539
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/18291
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/133/builds/14505
…lvm#134753) The `hipstdpar` specific passes were not ordered ideally, especially for `fgpu-rdc` compilations, which meant that we'd eagerly run accelerator code selection and remove symbols that might end up used. This change corrects that aspect by ensuring that accelerator code selection is only done after linking (this will have to be revisited in the future once the closed-world assumption no longer holds). Furthermore, we take the opportunity to move allocation interposition so that it properly gets printed when print-pipeline-passes is requested. NFC.
This test needs the amdgpu target, and its absence wreaked havoc with some of the bots, therefore we now require it.
The `hipstdpar` specific passes were not ordered ideally, especially for `fgpu-rdc` compilations, which meant that we'd eagerly run accelerator code selection and remove symbols that might end up used. This change corrects that aspect by ensuring that accelerator code selection is only done after linking (this will have to be revisited in the future once the closed-world assumption no longer holds). Furthermore, we take the opportunity to move allocation interposition so that it properly gets printed when print-pipeline-passes is requested. NFC.
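The ordering described above can be summarised with a toy model that mirrors the conditions visible in the diff. This is a sketch, not Clang or LLVM code: `deviceCC1Args`, `contains`, `ExtensionPoint`, and `schedulesCodeSelection` are hypothetical names, and the model only encodes the boolean conditions from the `HIPAMD.cpp` and `AMDGPUTargetMachine.cpp` hunks.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Driver side: models the device-side cc1 arguments from the HIPAMD.cpp hunk.
// The -mllvm enable flag is only forwarded when -fgpu-rdc is off.
std::vector<std::string> deviceCC1Args(bool gpuRDC, bool hipStdPar) {
  std::vector<std::string> args = {"-fcuda-is-device", "-fno-threadsafe-statics"};
  if (!gpuRDC) {
    args.push_back("-mllvm");
    args.push_back("-amdgpu-internalize-symbols");
    if (hipStdPar) {
      args.push_back("-mllvm");
      args.push_back("-amdgpu-enable-hipstdpar");
    }
  }
  return args;
}

// Small helper: true when `flag` is one of `args`.
bool contains(const std::vector<std::string> &args, const std::string &flag) {
  return std::find(args.begin(), args.end(), flag) != args.end();
}

// Backend side: the three extension points touched by the change.
enum class ExtensionPoint { PipelineStart, EarlySimplification, FullLTOLast };

// Models whether accelerator code selection is added at a given extension
// point after the change, given the backend flag and the LTO phase.
bool schedulesCodeSelection(ExtensionPoint ep, bool enableHipStdPar,
                            bool isLTOPreLink) {
  if (!enableHipStdPar)
    return false;
  switch (ep) {
  case ExtensionPoint::PipelineStart:
    return false; // this callback was removed by the change
  case ExtensionPoint::EarlySimplification:
    return !isLTOPreLink; // never during the pre-link phase
  case ExtensionPoint::FullLTOLast:
    return true; // full-LTO link-time pipeline
  }
  return false;
}
```

Read together, the two halves suggest the intended effect: with `-fgpu-rdc` the per-TU device compile does not receive `-amdgpu-enable-hipstdpar` through `-mllvm`, so selection is not scheduled there, and it can only run once the flag reaches the backend at link time.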