[SYCL-MLIR] Merge from intel/llvm sycl branch #8600

whitneywhtsang · 2023-03-09T23:34:30Z

Please only review 1e90026, and

CONFLICT (content): Merge conflict in mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
CONFLICT (content): Merge conflict in mlir/lib/Conversion/FuncToLLVM/FuncToLLVM.cpp
CONFLICT (content): Merge conflict in mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

Reverted 02f9cf9, as it forces to use opaque pointers.

Will remove Plugin/interop-opencl-make-kernel-bundle.cpp and Basic/gpu_max_wgs_error.cpp from xfail.

Please do not squash and merge this PR.

We currently don't handle tail calls in fast-isel but we continue with the lowering when -mlongcall is specified and lower the calls normally. We should defer to SDISel for this so that it is lowered correctly. Differential revision: https://reviews.llvm.org/D123997

Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D139452

This came out of from https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902 With big binaries we can have .dwp files where .debug_info.dwo section can grow beyond 4GB. We would like to support this in LLVM and in LLDB. The plan is to enable manual parsing of cu/tu index in DWARF library (https://reviews.llvm.org/D137882), and then switch internal index data structure to 64 bit. For the second part is to enable 64bit offset support in LLDB with this patch. Depends on D139955 Reviewed By: labath Differential Revision: https://reviews.llvm.org/D138618

…on non-annotated tensor. No need for a temp COO and sort even when converting dense -> CSC, we can instead rotate the loop to yield a ordered coordinates at beginning. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144213

This pass isn't a simplification, it's a non-canonical optimization. This makes it only run once in a (Thin)LTO pipeline during postlink, just like all the other optimization pipeline passes. Reviewed By: xur Differential Revision: https://reviews.llvm.org/D143424

Reviewed By: TIFitis Differential Revision: https://reviews.llvm.org/D143964

Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144231

…indexed load/store with constant offset. Unlike ARM target, current AArch64 target doesn't have facility to encode the operation bit: whether to add an offset to base pointer for pre-inc/post-inc addressing mode, or to subtract an offset from base pointer for pre-dec/post-dec addressing mode. A mis-compile (llvm/llvm-project#60645) was noticed due to this limitation. Therefore, for AArch64 auto-indexed load/store with constant offset, always use pre-inc/post-inc addressing mode. The constant offset is negated for pre-dec/post-dec addressing mode. An auto-indexed address with non-constant offset is currently not split into base and offset parts. If we are to handle non-constant offset in the future, offset node will need to take a negate. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D143796

If the value from constant-pool is a splat value of vector type, do not need swap after load from constant-pool. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D139491

…l edit descriptor My earlier misreading of the Fortran standards had convinced me that child I/O -- meaning the use of user-defined subroutines via generic interfaces to implement data transfer statements -- was not allowed to advance the current record in the ultimate unit of the original (non-child parent) data transfer statement. This turns out to be wrong, so forward AdvanceRecord() from ChildFormattedIoStatement<> to its parent I/O statement rather than implementing it as a no-op. Differential Revision: https://reviews.llvm.org/D144205

…iar behavior. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144236

According to github.com/WebAssembly/simd/blob/main/proposals/simd/SIMD.md, the shift count of bit shift instructions is taken modulo lane width. This patch adds such operation. Fixes PR#60655 Differential Revision: https://reviews.llvm.org/D144169

This reverts commit 2062e90.

…patch; NFC

…(NFC) /data/llvm-project/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionParser.cpp:398:34: error: variable 'lang_rt' set but not used [-Werror,-Wunused-but-set-variable] lldb_private::LanguageRuntime *lang_rt = nullptr; ^ 1 error generated.

When we simplify loads we need to adjust types (esp. null-values) properly to avoid inconsinstencies down the line. Add a cast and an error message. Fixes: llvm/llvm-project#60788

Before this patch ``` $ echo "leal (,%r15), %eax" | llvm-mc --show-encoding --output-asm-variant=1 lea eax, [r15] # encoding: [0x42,0x8d,0x04,0x3d,0x00,0x00,0x00,0x00] $ echo "lea eax, [r15]" | llvm-mc --show-encoding -x86-asm-syntax=intel --output-asm-variant=1 lea eax, [r15] # encoding: [0x41,0x8d,0x07] ``` MC printed the register r15 as a base in intel syntax even when it's an index. Then we got a different encoding by using the assembly from the output of the first command. I believe the behavior is too weird to be called a feature. After this patch, we get ``` $ echo "leal (,%r15), %eax" | llvm-mc --show-encoding --output-asm-variant=1 lea eax, [1*r15] # encoding: [0x42,0x8d,0x04,0x3d,0x00,0x00,0x00,0x00] ``` Reviewed By: RKSimon, pengfei, MaskRay Differential Revision: https://reviews.llvm.org/D144183

If we have a CallBrInst with output that's used, we need to split critical edges so that we have some place to insert COPYs for physregs to virtregs. Part 2a of https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8. Test cases and logic re-purposed from D138078. Reviewed By: efriedma, void, jyknight Differential Revision: https://reviews.llvm.org/D139872

This will be necessary to support outputs from asm goto along indirect edges. Test via: $ pushd llvm/build; ninja IRTests; popd $ ./llvm/build/unittests/IR/IRTests \ --gtest_filter=DominatorTree.CallBrDomination Also, return nullptr in Instruction::getInsertionPointAfterDef for CallBrInst as was recommened in https://reviews.llvm.org/D135997#3991427. The following phab review was folded into this commit: https://reviews.llvm.org/D140166 Link: Link: https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Reviewed By: void, efriedma, ChuanqiXu, MaskRay Differential Revision: https://reviews.llvm.org/D135997

Insert a new intrinsic call after splitting critical edges, and verify it. Later commits will update the SSA values to use this new value along indirect branches rather than the callbr's value, and have SelectionDAG consume this new value. Part 2b of https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8. Reviewed By: efriedma, jyknight Differential Revision: https://reviews.llvm.org/D139883

Because this pass is to be a codegen pass, it must use the legacy pass manager. Link: https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Reviewed By: aeubanks, void Differential Revision: https://reviews.llvm.org/D139861

Now that we've inserted a call to an intrinsic, we need to update certain previous uses of CallBrInst values to use the value of this intrinsic instead. There are 3 cases to handle: 1. The @llvm.callbr.landingpad.<type>() intrinsic call is in the same BasicBlock as the use of the callbr we're replacing. 2. The use is dominated by the direct destination. 3. The use is not dominated by the direct destination, and may or may not be dominated by the indirect destination. Part 2c of https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8. Reviewed By: efriedma, void, jyknight Differential Revision: https://reviews.llvm.org/D139970

Given a CallBrInst, retain its first virtual register in SelectionDagBuilder's FunctionLoweringInfo if there's corresponding landingpad. Walk the list of COPY MachineInstr to find the original virtual and physical registers defined by the INLINEASM_BR MachineInst. Test cases from https://reviews.llvm.org/D139565. Link: llvm/llvm-project#59538 Part 3 from https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Follow up patches still need to wire up CallBrPrepare into the pass pipelines. Reviewed By: efriedma, void Differential Revision: https://reviews.llvm.org/D140160

Capstone of https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Clang changes are still necessary to enable the use of outputs along indirect edges of asm goto statements. Link: llvm/llvm-project#53562 Reviewed By: void Differential Revision: https://reviews.llvm.org/D140180

Prerequisite to further modifications in D136497. Basically, there is a large body of code in CodeGenFunction::EmitAsmStmt for emitting stores of outputs. We want to be able to repeat this logic, for each destination of a callbr (rather than just the default destination which is what the code currently does). Also does some smaller cleanups like whitespace cleanups, and removing pointless casts. Reviewed By: void, jyknight Differential Revision: https://reviews.llvm.org/D137113

Initial support for asm goto w/ outputs (D69876) only supported outputs along the "default" (aka "fallthrough") edge. We can support outputs along all edges by repeating the same pattern of stores along the indirect edges that we allready do for the default edge. One complication is that these indirect edges may be critical edges which would need to be split. Another issue is that mid-codgen of LLVM IR, the control flow graph might not reflect the control flow of the final function. To avoid this "chicken and the egg" problem assume that any given indirect edge may become a critical edge, and pro-actively split it. This is unnecessary if the edge does not become critical, but LLVM will optimize such cases via tail duplication. Fixes: llvm/llvm-project#53562 Reviewed By: void Differential Revision: https://reviews.llvm.org/D136497

Now that we support outputs from asm goto along indirect edges, we can remove/revert some code that was added to help warn about the previous limitation that outputs were not supported along indirect edges. Reverts some code added in: commit 72aa619 ("Warn of uninitialized variables on asm goto's indirect branch") commit 3a604fd ("[Clang][CFG] check children statements of asm goto") But keeps+updates the tests. Link: llvm/llvm-project#53562 Reviewed By: void Differential Revision: https://reviews.llvm.org/D140508

Also move the line about __has_extension(gnu_asm_goto_with_outputs) so that it is more generally about asm goto, not the paragraph on symbolic references. Reviewed By: efriedma, void Differential Revision: https://reviews.llvm.org/D143205

…#8519) Fixes intel#8326. Signed-off-by: Maronas, Marcos <[email protected]>

)

…n_ww8

When it is failed to build a program, L0 throws an exception, which causes sycl::program to fail to create. Because of this, RT won't call piProgramRelease(), which leads to memory leaks. The solution is to release ZeModule created by plugin when the program build is failed. --------- Signed-off-by: Byoungro So <[email protected]>

)

This commit changes target::host_buffer deprecation warning to correctly refer to host_accessor as the replacement. Additionally it gives target::host_task an explicit value, similar to the other enumerators. Signed-off-by: Larsen, Steffen <[email protected]>

The new lsc_slm_block_load() has an additional operand 'old_values' that contains the values returned from the function if the predicate passed to it is 0. The corresponding LIT test: intel/llvm-test-suite#1637 Signed-off-by: Vyacheslav N Klochkov <[email protected]>

Currently we see a problem with llvm-test-suite checks on CUDA in AWS cloud, which looks like AWS was never allocated to execute the job. Current hypothesis is that we have a matrix job spawning 6 jobs, but we limit number of runners working in parallel to 4. This means that llvm-test-suite check on CUDA might wait for 1 or 2 llvm-test-suite checks on other machines to complete before we can run llvm-test-suite on CUDA machine. This strategy conflicts with the rules established for AWS machines, which waits for 1 hour after machine is allocated to start the job. If job has not arrived, machine returns to AWS. Since AWS allocation is speicific to each workflow run, once machine is returned to the AWS, GitHub is not able to use it anymore. GitHub waits for 24 hours and kills the workflow due to missing runner. This change removes the limitation for the number of runners we can use in parallel to run llvm-test-suite. This should fix the problem with missing results on CUDA machines.

LLVM: llvm/llvm-project@890146b SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@4af277c

This change fixes two issues reported by static analyzer: 1. Uninitialized pointer field 2. AUTO_CAUSES_COPY Signed-off-by: Arvind Sudarsanam <[email protected]>

This reverts commit 02f9cf9.

Signed-off-by: Tsang, Whitney <[email protected]>

sommerlukas · 2023-03-10T08:36:09Z

Looks good overall, just one question: What's the reason to switch from FailureOr<MemoryAccessMode> to plain MemoryAccessNode in SYCLOpTypes.{h,cpp}? Some upstream API change?

victor-eds · 2023-03-10T14:17:07Z

Looks good overall, just one question: What's the reason to switch from FailureOr<MemoryAccessMode> to plain MemoryAccessNode in SYCLOpTypes.{h,cpp}? Some upstream API change?

By looking at the docs, this change seems to be correct. However, I don't see at which point did this change.

Note that this will be obsolete when we switch to defining these attributes as enum attributes (builtin parser). I have that in the pipeline.

etiotto · 2023-03-10T15:36:16Z

LGTM. Now we have more tests passing. Thanks @whitneywhtsang !

etiotto · 2023-03-10T15:53:31Z

Guys I have asked the MLIR folks whether they are willing to revert their breaking change. See https://reviews.llvm.org/rG02f9cf9266c35795aa44ded0c37f0c44d8c49069 and subscribe to the issue to follow the conversation (and weight in).

sommerlukas · 2023-03-10T16:35:50Z

Guys I have asked the MLIR folks whether they are willing to revert their breaking change. See https://reviews.llvm.org/rG02f9cf9266c35795aa44ded0c37f0c44d8c49069 and subscribe to the issue to follow the conversation (and weight in).

I get that this would avoid some work right now, but the switch probably is inevitable, right?

Currently, status is:

Typed pointers are not supported on the main branch as a matter of policy. Fixes for typed pointer support are not accepted. Typed pointer support code may be removed without notice at any time.

IIUC, typed pointers will defacto be unsupported with LLVM 17, so it might be a matter of weeks/months before that happens.

etiotto · 2023-03-10T16:39:46Z

Guys I have asked the MLIR folks whether they are willing to revert their breaking change. See https://reviews.llvm.org/rG02f9cf9266c35795aa44ded0c37f0c44d8c49069 and subscribe to the issue to follow the conversation (and weight in).

I get that this would avoid some work right now, but the switch probably is inevitable, right?

Currently, status is:

Typed pointers are not supported on the main branch as a matter of policy. Fixes for typed pointer support are not accepted. Typed pointer support code may be removed without notice at any time.

IIUC, typed pointers will defacto be unsupported with LLVM 17, so it might be a matter of weeks/months before that happens.

Yes I actually missed that LLVM 17 (in principle) no longer allows typed pointers. So we (and the DPC++ compiler as well) will need to switch.

whitneywhtsang · 2023-03-10T16:48:05Z

Created #8616 to handle opaque pointers.

nemanjai and others added 30 commits February 16, 2023 16:15

[Object][NFC] Remove unneeded llvm_unreachable

956c7dc

Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D139452

[gn build] Manually port D143983

b66fdba

[MLIR][OpenMP] Add Conversion for Atomic Update Op

22cdeb5

Reviewed By: TIFitis Differential Revision: https://reviews.llvm.org/D143964

[mlir][sparse] split reshape.mlir into expand/collapse_shape.mlir.

0af5f84

Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144231

[PowerPC] remove XXSWAPD after load from CP which is a splat value

52a774f

If the value from constant-pool is a splat value of vector type, do not need swap after load from constant-pool. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D139491

[mlir][sparse] comment out test cases in sparse_conversion with simil…

e6758c8

…iar behavior. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D144236

[WebAssembly] Update wasm.c with update_cc_test_checks.py. NFC

e073de8

Revert "[LLDB] Enable 64 bit debug/type offset"

8116fc5

This reverts commit 2062e90.

[ComplexLogicCombine] Precommit tests for complex logic combine init …

2738dc0

…patch; NFC

[Attributor][FIX] Ensure we adjust types properly

7495a2e

When we simplify loads we need to adjust types (esp. null-values) properly to avoid inconsinstencies down the line. Add a cast and an error message. Fixes: llvm/llvm-project#60788

maarquitos14 and others added 13 commits March 7, 2023 16:31

[SYCL] Adds sycl::stream size functions definition for device. (intel…

368b660

…#8519) Fixes intel#8326. Signed-off-by: Maronas, Marcos <[email protected]>

[SYCL]Allow parameter pack expansion in device_has attribute (intel#8536

ecc3bc6

)

Merge remote-tracking branch 'intel_llvm/sycl' into llvmspirv_pulldow…

8bfee24

…n_ww8

[SYCL] Add size_type to accessor and local_accessor classes. (intel#8379

0a17e16

)

LLVM and SPIRV-LLVM-Translator pulldown (WW08)

230d562

LLVM: llvm/llvm-project@890146b SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@4af277c

Fix static analyzer issues (intel#8506)

cf75290

This change fixes two issues reported by static analyzer: 1. Uninitialized pointer field 2. AUTO_CAUSES_COPY Signed-off-by: Arvind Sudarsanam <[email protected]>

Merge remote-tracking branch 'upstream/sycl' into sycl-mlir

b67cd9f

Revert "[MLIR] Remove LLVM typed pointer support"

328f5fc

This reverts commit 02f9cf9.

[SYCL-MLIR] Fix merge

1e90026

Signed-off-by: Tsang, Whitney <[email protected]>

whitneywhtsang added disable-lint Skip linter check step and proceed with build jobs sycl-mlir Pull requests or issues for sycl-mlir branch labels Mar 9, 2023

whitneywhtsang self-assigned this Mar 9, 2023

whitneywhtsang requested a review from etiotto as a code owner March 9, 2023 23:34

whitneywhtsang requested review from victor-eds and sommerlukas March 9, 2023 23:34

etiotto approved these changes Mar 10, 2023

View reviewed changes

whitneywhtsang merged commit f8f6916 into intel:sycl-mlir Mar 10, 2023

whitneywhtsang deleted the merge branch March 10, 2023 15:53

whitneywhtsang mentioned this pull request Mar 10, 2023

[SYCL-MLIR] Opaque pointer handling #8616

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SYCL-MLIR] Merge from intel/llvm sycl branch #8600

[SYCL-MLIR] Merge from intel/llvm sycl branch #8600

whitneywhtsang commented Mar 9, 2023 •

edited

Loading

sommerlukas commented Mar 10, 2023

victor-eds commented Mar 10, 2023

etiotto commented Mar 10, 2023 •

edited

Loading

etiotto commented Mar 10, 2023

sommerlukas commented Mar 10, 2023

etiotto commented Mar 10, 2023

whitneywhtsang commented Mar 10, 2023

[SYCL-MLIR] Merge from intel/llvm sycl branch #8600

[SYCL-MLIR] Merge from intel/llvm sycl branch #8600

Conversation

whitneywhtsang commented Mar 9, 2023 • edited Loading

sommerlukas commented Mar 10, 2023

victor-eds commented Mar 10, 2023

etiotto commented Mar 10, 2023 • edited Loading

etiotto commented Mar 10, 2023

sommerlukas commented Mar 10, 2023

etiotto commented Mar 10, 2023

whitneywhtsang commented Mar 10, 2023

whitneywhtsang commented Mar 9, 2023 •

edited

Loading

etiotto commented Mar 10, 2023 •

edited

Loading