[HIP][HIPSTDPAR][NFC] Re-order & adapt hipstdpar specific passes #134753

Merged
merged 10 commits into from
Apr 14, 2025

Conversation

AlexVlx
Contributor

@AlexVlx AlexVlx commented Apr 7, 2025

The hipstdpar-specific passes were not ordered ideally, especially for -fgpu-rdc compilations, which meant that we'd eagerly run accelerator code selection and remove symbols that might end up being used. This change corrects that by ensuring that accelerator code selection is only done after linking (this will have to be revisited in the future once the closed-world assumption no longer holds). Furthermore, we take the opportunity to move allocation interposition so that it is properly printed when -print-pipeline-passes is requested. NFC.
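
To make the ordering problem concrete, here is a minimal, self-contained sketch of why a closed-world selection step can drop a needed symbol when it runs per translation unit before linking. It is a toy model only: `Function`, `Module`, `selectAcceleratorCode`, `linkModules`, and `hasDanglingCall` are hypothetical names invented for illustration and are not LLVM APIs, and the real pass works on LLVM IR rather than on name maps.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a function in an IR module (hypothetical, not an LLVM type).
struct Function {
  bool IsKernel;                    // entry point that must be kept
  std::vector<std::string> Callees; // names of directly called functions
};

// Toy stand-in for a module: function name -> definition.
using Module = std::map<std::string, Function>;

// Closed-world selection: keep only what is reachable from a kernel that is
// defined in the module the pass can see.
Module selectAcceleratorCode(const Module &M) {
  std::set<std::string> Live;
  std::vector<std::string> Worklist;
  for (const auto &Entry : M)
    if (Entry.second.IsKernel)
      Worklist.push_back(Entry.first);

  while (!Worklist.empty()) {
    std::string Name = Worklist.back();
    Worklist.pop_back();
    auto It = M.find(Name);
    // Skip callees with no definition here and names already visited.
    if (It == M.end() || !Live.insert(Name).second)
      continue;
    for (const std::string &Callee : It->second.Callees)
      Worklist.push_back(Callee);
  }

  Module Selected;
  for (const std::string &Name : Live)
    Selected.emplace(Name, M.at(Name));
  return Selected;
}

// Merge two modules, as a heavily simplified device link step.
Module linkModules(const Module &A, const Module &B) {
  Module Linked = A;
  Linked.insert(B.begin(), B.end());
  return Linked;
}

// True if some function calls a name that has no definition in M.
bool hasDanglingCall(const Module &M) {
  for (const auto &Entry : M)
    for (const std::string &Callee : Entry.second.Callees)
      if (M.count(Callee) == 0)
        return true;
  return false;
}

// Two translation units: the kernel in TU2 calls a helper defined in TU1.
void runDemo() {
  Module TU1 = {{"helper", Function{false, {}}}};
  Module TU2 = {{"kernel", Function{true, {"helper"}}}};

  // Selection per TU, before linking: TU1 has no kernel, so helper is dropped.
  Module Early =
      linkModules(selectAcceleratorCode(TU1), selectAcceleratorCode(TU2));
  assert(hasDanglingCall(Early));

  // Selection after linking: helper is reachable from kernel and survives.
  Module Late = selectAcceleratorCode(linkModules(TU1, TU2));
  assert(!hasDanglingCall(Late));
  assert(Late.count("helper") == 1);
}
```

Calling `runDemo()` exercises both orderings. In this model, only the "select after link" ordering keeps `helper`, which mirrors the motivation stated above for deferring selection in -fgpu-rdc builds.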

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:codegen IR generation bugs: mangling, exceptions, etc. labels Apr 7, 2025
@AlexVlx AlexVlx requested review from jhuber6 and yxsamliu April 7, 2025 23:24
@llvmbot
Member

llvmbot commented Apr 7, 2025

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-clang

Author: Alex Voicu (AlexVlx)

Changes

The hipstdpar-specific passes were not ordered ideally, especially for -fgpu-rdc compilations, which meant that we'd eagerly run accelerator code selection and remove symbols that might end up being used. This change corrects that by ensuring that accelerator code selection is only done after linking (this will have to be revisited in the future once the closed-world assumption no longer holds). Furthermore, we take the opportunity to move allocation interposition so that it is properly printed when -print-pipeline-passes is requested. NFC.


Full diff: https://github.com/llvm/llvm-project/pull/134753.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+4-4)
  • (modified) clang/lib/Driver/ToolChains/HIPAMD.cpp (+4-3)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (+13-7)
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 7557cb8408921..fa5e12d4033a5 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -1115,6 +1115,10 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
   if (CodeGenOpts.LinkBitcodePostopt)
     MPM.addPass(LinkInModulesPass(BC));
 
+  if (LangOpts.HIPStdPar && !LangOpts.CUDAIsDevice &&
+      LangOpts.HIPStdParInterposeAlloc)
+    MPM.addPass(HipStdParAllocationInterpositionPass());
+
   // Add a verifier pass if requested. We don't have to do this if the action
   // requires code generation because there will already be a verifier pass in
   // the code-generation pipeline.
@@ -1178,10 +1182,6 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
     return;
   }
 
-  if (LangOpts.HIPStdPar && !LangOpts.CUDAIsDevice &&
-      LangOpts.HIPStdParInterposeAlloc)
-    MPM.addPass(HipStdParAllocationInterpositionPass());
-
   // Now that we have all of the passes ready, run them.
   {
     PrettyStackTraceString CrashInfo("Optimizer");
diff --git a/clang/lib/Driver/ToolChains/HIPAMD.cpp b/clang/lib/Driver/ToolChains/HIPAMD.cpp
index abb83701759ce..52e35a01be58d 100644
--- a/clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ b/clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -231,10 +231,11 @@ void HIPAMDToolChain::addClangTargetOptions(
   CC1Args.append({"-fcuda-is-device", "-fno-threadsafe-statics"});
 
   if (!DriverArgs.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc,
-                          false))
+                          false)) {
     CC1Args.append({"-mllvm", "-amdgpu-internalize-symbols"});
-  if (DriverArgs.hasArgNoClaim(options::OPT_hipstdpar))
-    CC1Args.append({"-mllvm", "-amdgpu-enable-hipstdpar"});
+    if (DriverArgs.hasArgNoClaim(options::OPT_hipstdpar))
+      CC1Args.append({"-mllvm", "-amdgpu-enable-hipstdpar"});
+  }
 
   StringRef MaxThreadsPerBlock =
       DriverArgs.getLastArgValue(options::OPT_gpu_max_threads_per_block_EQ);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 4b5c70f09155f..03b1693244879 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -802,17 +802,17 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
 #define GET_PASS_REGISTRY "AMDGPUPassRegistry.def"
 #include "llvm/Passes/TargetPassRegistry.inc"
 
-  PB.registerPipelineStartEPCallback(
-      [](ModulePassManager &PM, OptimizationLevel Level) {
-        if (EnableHipStdPar)
-          PM.addPass(HipStdParAcceleratorCodeSelectionPass());
-      });
-
   PB.registerPipelineEarlySimplificationEPCallback(
       [](ModulePassManager &PM, OptimizationLevel Level,
          ThinOrFullLTOPhase Phase) {
-        if (!isLTOPreLink(Phase))
+        if (!isLTOPreLink(Phase)) {
+          // When we are not using -fgpu-rdc, we can run accelerator code
+          // selection relatively early, but still after linking to prevent
+          // eager removal of potentially reachable symbols.
+          if (EnableHipStdPar)
+            PM.addPass(HipStdParAcceleratorCodeSelectionPass());
           PM.addPass(AMDGPUPrintfRuntimeBindingPass());
+        }
 
         if (Level == OptimizationLevel::O0)
           return;
@@ -883,6 +883,12 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
 
   PB.registerFullLinkTimeOptimizationLastEPCallback(
       [this](ModulePassManager &PM, OptimizationLevel Level) {
+        // When we are using -fgpu-rdc, we can only run accelerator code
+        // selection after linking; otherwise we end up removing potentially
+        // reachable symbols that were exported as external in other
+        // modules.
+        if (EnableHipStdPar)
+          PM.addPass(HipStdParAcceleratorCodeSelectionPass());
         // We want to support the -lto-partitions=N option as "best effort".
         // For that, we need to lower LDS earlier in the pipeline before the
         // module is partitioned for codegen.

@AlexVlx AlexVlx added llvm:transforms and removed clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Apr 7, 2025
@AlexVlx AlexVlx requested a review from Pierre-vh April 7, 2025 23:24

github-actions bot commented Apr 7, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Apr 7, 2025
Contributor

@arsenm arsenm left a comment

Needs test

Contributor Author

@AlexVlx AlexVlx left a comment

Needs test

Done.


// RUN: %clang -### --hipstdpar --offload-arch=gfx906 %s -nogpulib -nogpuinc \
// RUN: 2>&1 | FileCheck -check-prefix=NORDC %s
// NORDC: {{".*clang.*".* "-triple" "amdgcn-amd-amdhsa".* "-mllvm" "-amdgpu-enable-hipstdpar".*}}

Contributor

I think for both check lines you don't need the regex ({{ }}); you can just use CHECK: "-mllvm" "-amdgpu-enable-hipstdpar", no?

Contributor

You can just regex the .* parts. Alternatively you can use CHECK-SAME


// RUN: %clang -### --hipstdpar --offload-arch=gfx906 %s -nogpulib -nogpuinc -fgpu-rdc \
// RUN: 2>&1 | FileCheck -check-prefix=RDC %s
// RDC-NOT: {{.*}}"-mllvm" "-amdgpu-enable-hipstdpar"

Contributor

Suggested change
// RDC-NOT: {{.*}}"-mllvm" "-amdgpu-enable-hipstdpar"
// RDC-NOT: -amdgpu-enable-hipstdpar

-NOT checks are hazardous and should be as permissive as possible

Suggested change
// RDC-NOT: {{.*}}"-mllvm" "-amdgpu-enable-hipstdpar"
// RDC-NOT: {{.*}}"-mllvm" "-amdgpu-enable-hipstdpar"

Contributor Author

In this case it actually has to be "-mllvm", because we only care about the flag not being passed to the initial, per-TU compilation from source; forming the check as you suggest would (erroneously) match the (intentional) passing of the argument via -plugin-opt when we do the final lowering from bitcode. This merely tests/validates the change we made in HIPAMDToolChain::addClangTargetOptions.
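
The difference between the two patterns can be illustrated with a small sketch that treats a plain FileCheck pattern as a substring search. This is a simplification (real FileCheck also canonicalizes whitespace and scopes -NOT checks between neighbouring matches), and the driver output below is hypothetical: the exact `-plugin-opt=` spelling and the job lines are assumptions made for illustration, not captured from a real `-###` run.

```cpp
#include <cassert>
#include <string>

// A plain FileCheck pattern (no regex) behaves roughly like a substring search.
bool matches(const std::string &Input, const std::string &Pattern) {
  return Input.find(Pattern) != std::string::npos;
}

// A -NOT check passes only if the pattern occurs nowhere in the input.
bool notCheckPasses(const std::string &Input, const std::string &Pattern) {
  return !matches(Input, Pattern);
}

// Hypothetical -### output for an -fgpu-rdc build: the per-TU compile job does
// not carry the flag, while the final link job forwards it via -plugin-opt.
std::string hypotheticalRdcOutput() {
  return R"("clang" "-cc1" "-triple" "amdgcn-amd-amdhsa" "-emit-llvm-bc")"
         "\n"
         R"("lld" "-plugin-opt=-amdgpu-enable-hipstdpar")"
         "\n";
}

void runCheckDemo() {
  const std::string Output = hypotheticalRdcOutput();

  // Permissive pattern: also hits the -plugin-opt spelling, so the -NOT check
  // fails even though the per-TU compile job is correct.
  assert(!notCheckPasses(Output, "-amdgpu-enable-hipstdpar"));

  // Pattern tied to "-mllvm": only the per-TU spelling would trip it.
  assert(notCheckPasses(Output, R"("-mllvm" "-amdgpu-enable-hipstdpar")"));
}
```

Under these assumptions, the narrower pattern is the one that isolates the per-TU cc1 invocation, which is the behaviour the comment above argues for.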

// Ensure Pass HipStdParAcceleratorCodeSelectionPass is not invoked in PreLink.
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager \
// RUN: %s -o - 2>&1 | FileCheck --check-prefix=HIPSTDPAR-PRE %s
// HIPSTDPAR-PRE-NOT: Running pass: HipStdParAcceleratorCodeSelectionPass

Contributor

Better to use -NEXT checks with the passes before and after it

// Ensure Pass HipStdParAcceleratorCodeSelectionPass is not invoked in PreLink.
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager \
// RUN: %s -o /dev/null 2>&1 | FileCheck --check-prefix=HIPSTDPAR-PRE %s
// HIPSTDPAR-PRE-NOT: Running pass: HipStdParAcceleratorCodeSelectionPass

Contributor

This should still use -NEXT checks around where it should run

@AlexVlx AlexVlx merged commit 1bcec03 into llvm:main Apr 14, 2025
11 checks passed
@llvm-ci
Collaborator

llvm-ci commented Apr 14, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-sie-ubuntu-fast running on sie-linux-worker while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/144/builds/22732

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/clang -cc1 -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager   /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null 2>&1 | /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp # RUN: at line 4
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/clang -cc1 -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp:6:19: error: HIPSTDPAR-PRE: expected string not found in input
// HIPSTDPAR-PRE: Running pass: EntryExitInstrumenterPass
                  ^
<stdin>:1:1: note: scanning from here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
^
<stdin>:1:26: note: possible intended match here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
                         ^

Input file: <stdin>
Check file: /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
check:6'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
check:6'1                              ?                                                                                                        possible intended match
           2: clang (LLVM option parsing): Did you mean '--enable-ipra'?
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

--

********************


@AlexVlx
Contributor Author

AlexVlx commented Apr 14, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-sie-ubuntu-fast running on sie-linux-worker while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/144/builds/22732

Here is the relevant piece of the build log for the reference

Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/clang -cc1 -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager   /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null 2>&1 | /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp # RUN: at line 4
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/clang -cc1 -internal-isystem /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp:6:19: error: HIPSTDPAR-PRE: expected string not found in input
// HIPSTDPAR-PRE: Running pass: EntryExitInstrumenterPass
                  ^
<stdin>:1:1: note: scanning from here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
^
<stdin>:1:26: note: possible intended match here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
                         ^

Input file: <stdin>
Check file: /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help' 
check:6'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
check:6'1                              ?                                                                                                        possible intended match
           2: clang (LLVM option parsing): Did you mean '--enable-ipra'? 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

--

********************

This looks somewhat odd and I'm not privy to the workings of the SIE bots (others seem to pass). Should this test require AMDGPU (guessing that the fast bot doesn't enable it)?

@llvm-ci
Collaborator

llvm-ci commented Apr 14, 2025

LLVM Buildbot has detected a new failure on builder arc-builder running on arc-worker while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/14539

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/buildbot/worker/arc-folder/build/bin/clang -cc1 -internal-isystem /buildbot/worker/arc-folder/build/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager   /buildbot/worker/arc-folder/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null 2>&1 | /buildbot/worker/arc-folder/build/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /buildbot/worker/arc-folder/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp # RUN: at line 4
+ /buildbot/worker/arc-folder/build/bin/clang -cc1 -internal-isystem /buildbot/worker/arc-folder/build/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager /buildbot/worker/arc-folder/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null
+ /buildbot/worker/arc-folder/build/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /buildbot/worker/arc-folder/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp
/buildbot/worker/arc-folder/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp:6:19: error: HIPSTDPAR-PRE: expected string not found in input
// HIPSTDPAR-PRE: Running pass: EntryExitInstrumenterPass
                  ^
<stdin>:1:1: note: scanning from here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
^
<stdin>:1:26: note: possible intended match here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
                         ^

Input file: <stdin>
Check file: /buildbot/worker/arc-folder/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help' 
check:6'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
check:6'1                              ?                                                                                                        possible intended match
           2: clang (LLVM option parsing): Did you mean '--enable-ipra'? 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

--

********************


@llvm-ci
Collaborator

llvm-ci commented Apr 14, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-5 while building clang,llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/18291

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/Users/buildbot/buildbot-root/aarch64-darwin/build/bin/clang -cc1 -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/build/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager   /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null 2>&1 | /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp # RUN: at line 4
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/clang -cc1 -internal-isystem /Users/buildbot/buildbot-root/aarch64-darwin/build/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null
+ /Users/buildbot/buildbot-root/aarch64-darwin/build/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp
/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp:6:19: error: HIPSTDPAR-PRE: expected string not found in input
// HIPSTDPAR-PRE: Running pass: EntryExitInstrumenterPass
                  ^
<stdin>:1:1: note: scanning from here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
^
<stdin>:1:26: note: possible intended match here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
                         ^

Input file: <stdin>
Check file: /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help' 
check:6'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
check:6'1                              ?                                                                                                        possible intended match
           2: clang (LLVM option parsing): Did you mean '--aarch64-enable-pipeliner'? 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

--

********************


@llvm-ci
Collaborator

llvm-ci commented Apr 14, 2025

LLVM Buildbot has detected a new failure on builder clang-cmake-x86_64-avx512-linux running on avx512-intel64 while building clang,llvm at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/133/builds/14505

Here is the relevant piece of the build log for the reference
Step 7 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clang :: CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/clang -cc1 -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager   /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null 2>&1 | /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp # RUN: at line 4
+ /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/clang -cc1 -internal-isystem /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/lib/clang/21/include -nostdsysteminc -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp -o /dev/null
+ /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/FileCheck --check-prefix=HIPSTDPAR-PRE /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp
/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp:6:19: error: HIPSTDPAR-PRE: expected string not found in input
// HIPSTDPAR-PRE: Running pass: EntryExitInstrumenterPass
                  ^
<stdin>:1:1: note: scanning from here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
^
<stdin>:1:26: note: possible intended match here
clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help'
                         ^

Input file: <stdin>
Check file: /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/llvm/clang/test/CodeGenHipStdPar/select-accelerator-code-pass-ordering.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: clang (LLVM option parsing): Unknown command line argument '-amdgpu-enable-hipstdpar'. Try: 'clang (LLVM option parsing) --help' 
check:6'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
check:6'1                              ?                                                                                                        possible intended match
           2: clang (LLVM option parsing): Did you mean '--enable-ipra'? 
check:6'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

--

********************


AlexVlx added a commit that referenced this pull request Apr 14, 2025
This test needs the amdgpu target, and its absence wreaked havoc with
some of the bots, therefore we now require it.
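
For reference, a lit test is usually gated on a backend with a `REQUIRES:` line. The following is only a sketch of what such a guard might look like for this test, not the actual follow-up commit. The feature name `amdgpu-registered-target` is assumed to be the one Clang's lit configuration exposes for builds that include the AMDGPU backend, the RUN and CHECK lines are reconstructed from the logs above using the usual `%clang_cc1` and `%s` substitutions, and `placeholder` is a hypothetical function added so the file is not empty.

```cpp
// Sketch only: skip this test on builds without the AMDGPU backend, where
// '-mllvm -amdgpu-enable-hipstdpar' is rejected as an unknown argument.
// REQUIRES: amdgpu-registered-target

// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar \
// RUN:   -flto -emit-llvm-bc -fcuda-is-device -fdebug-pass-manager \
// RUN:   %s -o /dev/null 2>&1 | FileCheck --check-prefix=HIPSTDPAR-PRE %s

// HIPSTDPAR-PRE: Running pass: EntryExitInstrumenterPass

// Hypothetical placeholder so the translation unit is not empty.
void placeholder() {}
```

With a guard of this kind, bots built without the AMDGPU target would be expected to report the test as unsupported rather than fail on the unknown `-amdgpu-enable-hipstdpar` option.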
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…lvm#134753)

The `hipstdpar` specific passes were not ordered ideally, especially for
`fgpu-rdc` compilations, which meant that we'd eagerly run accelerator
code selection and remove symbols that might end up used. This change
corrects that aspect by ensuring that accelerator code selection is only
done after linking (this will have to be revisited in the future once
the closed-world assumption no longer holds). Furthermore, we take the
opportunity to move allocation interposition so that it properly gets
printed when print-pipeline-passes is requested. NFC.
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
This test needs the amdgpu target, and its absence wreaked havoc with
some of the bots, therefore we now require it.
Labels
backend:AMDGPU clang:codegen IR generation bugs: mangling, exceptions, etc. clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category llvm:transforms

5 participants