Skip to content

Add kernel property to propagate compile options to backend #7373

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

asudarsa
Copy link
Contributor

@asudarsa asudarsa commented Nov 11, 2022

There is a user requirement to provide ease of debuggability. Consider compilation flow:
clang++ -O0 -fsycl -c -g test.cpp -o test.o
clang++ -fsycl test.o
The resulting a.out fat binary is sent to the runtime without any associated flag. The -O0 flag used using first stage of compilation is lost. This causes user pain during debug sessions and has been reported as such.
Change in this PR does the following:

  1. Introduces a new SYCL kernel-level attribute sycl-device-compile-optlevel with the optlevel as the associated value.
  2. This option is added during front-end code generation.
  3. During sycl-post-link stage, the modules are split based on opt-level.
  4. A new entry in the props file to record the opt-level is provided.
  5. SYCL runtime then interprets this opt-level and passes the appropriate flag to the backend.
    Signed-off-by: Arvind Sudarsanam [email protected]

@asudarsa asudarsa requested review from a team as code owners November 11, 2022 13:31
@asudarsa asudarsa marked this pull request as draft November 11, 2022 13:31
std::string getOptString(module_split::ModuleDesc &MD) {
auto &M = MD.getModule();
// Process all properties on kernels.
for (Function &F : M) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we have functions with different options metadata in the same module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module splitting code based on opt level has been added.

@asudarsa asudarsa force-pushed the asudarsa/propagate_compiler_opt_flags branch from 779ac48 to 54d0eb4 Compare December 21, 2022 19:38
@asudarsa asudarsa temporarily deployed to aws December 21, 2022 19:42 — with GitHub Actions Inactive
Signed-off-by: Arvind Sudarsanam <[email protected]>
@asudarsa
Copy link
Contributor Author

I will be adding tests and updating the documentation file before EOD tomorrow, if not sooner. Thanks

@asudarsa asudarsa temporarily deployed to aws December 21, 2022 20:58 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws December 21, 2022 22:52 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws December 21, 2022 23:25 — with GitHub Actions Inactive
Signed-off-by: Arvind Sudarsanam <[email protected]>
@asudarsa asudarsa temporarily deployed to aws December 22, 2022 02:07 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws December 22, 2022 02:42 — with GitHub Actions Inactive
Signed-off-by: Arvind Sudarsanam <[email protected]>
@asudarsa asudarsa temporarily deployed to aws December 22, 2022 03:12 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws December 22, 2022 03:43 — with GitHub Actions Inactive
Signed-off-by: Arvind Sudarsanam <[email protected]>
@asudarsa asudarsa marked this pull request as ready for review December 22, 2022 03:59
@asudarsa asudarsa requested review from a team as code owners December 22, 2022 03:59
@asudarsa asudarsa temporarily deployed to aws December 22, 2022 04:21 — with GitHub Actions Inactive
@asudarsa asudarsa temporarily deployed to aws December 22, 2022 04:51 — with GitHub Actions Inactive
@asudarsa
Copy link
Contributor Author

The test failure seems sporadic. Will check.

Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest that we finish and agree on the design doc first to avoid doing implementation steps which may be reverted/revisited later.

I agree that this feature needs a design doc, but the proposed one does not seem to be detailed enough

@@ -9,7 +9,6 @@
// This coordinates the per-function state used while generating code.
//
//===----------------------------------------------------------------------===//

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid doing unrelated changes


## Background

When building an application with several source and object files, it should be possible to specify the optimization parameters individually for each source file/object file (for each invocation of the DPCPP compiler). The linker should pass the original optimization options (e.g. -O0 or -O2) used when building an object file to the device backend compiler (IGC compiler). This will improve the debugging experience by selectively disabling/enabling optimizations for each source file, and therefore achieving better debuggability and better performance as needed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we should mention IGC here, because we have several device compilers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I will remove it.

Here is an example that demonstrates this pain point:

```
icx -c -fsycl test1.c -o test1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither icx nor icpx exist in this repo, I suggest that we use clang++

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops..That's correct. I will change it. Good catch. Thanks

In order to support module-level debuggability, the user will compile different module files with different levels of optimization. These optimization levels must be preserved and made use of during every stage of compilation. Following are the requirements for this feature.
- If the user specifies '-Ox' as a front-end compile option for a particular module, this option must be preserved during compilation, linking, AOT compilation as well as JIT compilation.
- If the user specifies '-Ox' option as a front-end linker option, this option will override any front-end compile options and the linker option will be preserved during AOT and JIT compilation.
- If the user specifies '-O0' option, we need to pass '-cl-opt-disable' to AOT and JIT compilation stages.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should either remove this or rewrite this. Level Zero does not support -cl-opt-disable flag. I think our aim is to achieve a mapping between front-end optimizations options and backend optimizations options, which is as close as possible and respects original options passed to compilation of each translation unit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cl-opt-disable is an option being passed to the backend compiler. I don't think we need to consider whether it's L0 or OpenCL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Level Zero spec does not list -cl-opt-disable as a supported option. It is not clear to me if zeModuleCreate must return an error for unsupported options and which one it would be, but I would prefer if we don't pass there options which are not considered supported.


Following are changes required in various stages of the compilation pipeline:
- Front-end code generation: For each SYCL kernel, identify the compilation option. Add an appropriate attribute to that kernel. Name of that attribute is 'sycl-device-compile-optlevel'.
- During the llvm-link stage, all modules are linked into a single module. This is an existing behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we only document changes to the existing implementation

SYCLDeviceCompileOptLevel = CGM.getCodeGenOpts().OptimizationLevel;
}
if (SYCLDeviceCompileOptLevel >= 0)
Fn->addFnAttr("sycl-device-compile-optlevel", std::to_string(SYCLDeviceCompileOptLevel));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIT test is missing for this change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the check inside the sycl test to check if he attribute is being added. Do you want me to split it up? Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually, we have unit-tests for each component. The test in sycl/ subdirectory is ok, but that is an integration test and if it fails, it won't be immediately clear why, because the failure could be in any component throughout the stack.

Unit-tests which are limited to a single component will allow to highlight issues more precisely


auto OptLevel = MD.getOptLevel();
if (OptLevel >= 0)
PropSet[PropSetRegTy::SYCL_MISC_PROP].insert({"OptLevel", OptLevel});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIT test is missing for this change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I am covering this in sycl-opt-level.cpp. Can you please take a look? Thanks

EmitOnlyKernelsAsEntryPoints);
SplitByOptLevel |= OptLevelSplitter->remainingSplits() > 1;
while (OptLevelSplitter->hasMoreSplits()) {
TopLevelModules.emplace_back(OptLevelSplitter->nextSplit());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIT tests for new split dimension are missing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am currently exploring the way to add LIT tests that include multiple source files. I will add this test soon. is that going to be a blocker o get this checked in? Thanks

int OptLevel = Prop ? DeviceBinaryProperty(Prop).asUint32() : -1;
std::string OptLevelStr = "";
// Currently, we do not do anything for other opt levels
// TODO: Figure out a way to send some info across for other opt levels.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: optimizations flags are different for each backend.

FYI:

if (!OptLevelStr.empty()) {
if (!CompileOpts.empty())
CompileOpts += " ";
CompileOpts += OptLevelStr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit-tests are missing for RT changes

Comment on lines +590 to +599
int SYCLDeviceCompileOptLevel = -1;
switch (CGM.getCodeGenOpts().OptimizationLevel) {
default:
llvm_unreachable("Invalid optimization level!");
case 0:
case 1:
case 2:
case 3:
SYCLDeviceCompileOptLevel = CGM.getCodeGenOpts().OptimizationLevel;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int SYCLDeviceCompileOptLevel = -1;
switch (CGM.getCodeGenOpts().OptimizationLevel) {
default:
llvm_unreachable("Invalid optimization level!");
case 0:
case 1:
case 2:
case 3:
SYCLDeviceCompileOptLevel = CGM.getCodeGenOpts().OptimizationLevel;
}
int SYCLDeviceCompileOptLevel = CGM.getCodeGenOpts().OptimizationLevel;
if (SYCLDeviceCompileOptLevel < 0) ...;

Why not (something like) this instead? I am trying to understand the benefit of checking the 0,1,2,3 values in the switch.

@bader
Copy link
Contributor

bader commented Jan 27, 2023

@asudarsa, what is status of PR?

@bader
Copy link
Contributor

bader commented Feb 24, 2023

@asudarsa, what is status of PR?

Ping.

Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, resolve merge conflicts.

bader pushed a commit that referenced this pull request Apr 13, 2023
…ler (#8763)

This change is a second attempt to add this support. An earlier attempt
was here: #7373

In this change, following changes have been made:
1. clang changes to add a new function attribute: sycl-optlevel
2. sycl-post-link changes to process this attribute, split modules based
on optimization level, and add a new property named 'optLevel' to
SYCL/misc properties' property set.
3. SYCL runtime and plugin changes to access this device image property
and propagate a backend specific optimization flag to the backend
compiler.
4. Documentation 
5. 2 unit tests and 1 e2e test

---------

Signed-off-by: Arvind Sudarsanam <[email protected]>
@AlexeySachkov
Copy link
Contributor

Superseded by #8763

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants