LLVM and SPIRV-LLVM-Translator pulldown (WW16 2025) #18105

iclsrc · 2025-04-21T02:42:47Z

LLVM: llvm/llvm-project@16c84c4
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@610059c

This is a mostly straightforward replacement of the previous `std::pair<int, std::set<std::pair<...>>>` data structure used in `SLPVectorizerPass::vectorizeStores()` with slightly more readable alternatives. I had done that change in my local tree to help me better understand the code. It’s not very invasive, so I thought I’d create a PR for it.
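To illustrate the kind of readability change this message describes, here is a minimal sketch. The names `StoreOffset`, `StoreGroup`, `BaseIndex`, and `Members` are hypothetical, and the element types of the inner pair are assumed for illustration only; the actual types in `SLPVectorizerPass::vectorizeStores()` may differ.

```cpp
#include <set>
#include <tuple>
#include <utility>

// Before (shape only): callers must remember what .first/.second mean.
// The inner pair's element types are an assumption made for this sketch.
using OpaqueStoreGroup = std::pair<int, std::set<std::pair<int, int>>>;

// After: hypothetical named types carrying the same information.
struct StoreOffset {
  int Index;    // position of the store in a candidate list (assumed meaning)
  int Distance; // distance from the group's base store (assumed meaning)

  // Order by distance first, then by index, so iteration is deterministic.
  bool operator<(const StoreOffset &RHS) const {
    return std::tie(Distance, Index) < std::tie(RHS.Distance, RHS.Index);
  }
};

struct StoreGroup {
  int BaseIndex = 0;
  std::set<StoreOffset> Members;

  // Returns true if the (Index, Distance) entry was newly inserted.
  bool add(int Index, int Distance) {
    return Members.insert(StoreOffset{Index, Distance}).second;
  }
};
```

With named fields, a call such as `Group.add(1, 4)` reads without having to look up which half of a pair holds the distance.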

Add function and subroutine forms of FSEEK and FTELL as intrinsic procedures. Accept common aliases from legacy compilers as well. A separate patch to llvm-test-suite will enable tests for these procedures once this patch has merged. Depends on llvm/llvm-project#132423; CI builds will likely fail until that patch is merged and this PR is rebased.

A function or subroutine can allow an object of the same name to appear in its scope, so long as the name is not used. This is similar to the case of a name being imported from multiple distinct modules, and implemented by the same representation. It's not clear whether this is conforming behavior or a common extension.

Fortran::runtime::Descriptor::BytesFor() only works for Fortran intrinsic types for which a C++ type counterpart exists, so it crashes on some types that are legitimate Fortran types like REAL(2). Move some logic from Evaluate into a new header in flang/Common, then use it to avoid this needless dependence on C++.

The RUNTIME_CHECK in question doesn't allow for the possibility that an allocatable or pointer component could be processed by defined I/O. Remove it in favor of a dynamic allocation check.

…34270) The optional second argument to IEEE_SUPPORT_FLAG (and related functions from the intrinsic IEEE_ARITHMETIC module) is needed only for its type, not its value. Restrictions on local objects as arguments to function references in specification expressions shouldn't apply to it. Define a new attribute for dummy data object characteristics to distinguish such arguments, set it for the appropriate intrinsic function references, and test it during specification expression validation.

…#134149) When a compiler directive continuation line starts with keyword macro names that have empty expansions, skip them.

…ers (#134302) The preprocessor can perform macro replacement within identifiers when they are split up with Fortran line continuation, but is failing to do macro replacement on a continued identifier when none of its parts are replaced.

…921) There were some remaining headers that were not guarded with _LIBCPP_HAS_LOCALIZATION, leading to errors when trying to use modules on platforms that don't support localization (since all the headers get pulled in when building the 'std' module). This patch brings these headers in line with what we do for every other header that depends on localization. This patch also requires including <picolibc.h> from <__configuration/platform.h> in order to define _NEWLIB_VERSION. In the long term, we should use a better approach for doing that, such as defining a macro in the __config_site header.

…race. This way we emit the error message that explains the full syntax for a register list. parseZcmpStackAdj had to be modified to not assume the previous operand had been successfully parsed as a register list.

The try-compile mechanism requires that `CMAKE_REQUIRED_FLAGS` is a space-separated string instead of a list of flags. The original code expanded `BUILTIN_FLAGS` into `CMAKE_REQUIRED_FLAGS` as a space-separated string and then would overwrite `CMAKE_REQUIRED_FLAGS` with `TARGET_${arch}_CFLAGS` prepended to the unexpanded `BUILTIN_CFLAGS_${arch}`. This resulted in the first two arguments being passed into the try-compile invocation, but dropping the other arguments listed in `BUILTIN_CFLAGS_${arch}`. This patch appends `TARGET_${arch}_CFLAGS` and `BUILTIN_CFLAGS_${arch}` to `CMAKE_REQUIRED_FLAGS` before expanding CMAKE_REQUIRED_FLAGS as a space-separated string. This passes any pre-set required flags, in addition to all of the builtin and target flags to the Float16 detection.

After llvm/llvm-project#133220 we had some empty complex literals (`tensor<0xcomplex<f32>>`) failing to parse. This was largely due to the ambiguity between `shape.empty()` meaning splat (`dense<1>`) or empty literal (`dense<>`). Used type's numel to disambiguate during verification.
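The disambiguation described above can be modeled outside MLIR roughly as follows. This is a sketch under assumptions: `DenseLiteralKind`, `numElements`, and `classify` are hypothetical names, and shapes are plain vectors rather than MLIR types.

```cpp
#include <cstdint>
#include <vector>

enum class DenseLiteralKind { Splat, EmptyLiteral, Shaped };

// Product of the type's dimensions. An empty dimension list yields 1, and
// any zero-sized dimension (as in tensor<0xcomplex<f32>>) yields 0.
std::int64_t numElements(const std::vector<std::int64_t> &TypeShape) {
  std::int64_t N = 1;
  for (std::int64_t Dim : TypeShape)
    N *= Dim;
  return N;
}

// An empty parsed shape is ambiguous on its own, so the element count of
// the declared type decides between a splat and an empty literal.
DenseLiteralKind classify(const std::vector<std::int64_t> &ParsedShape,
                          const std::vector<std::int64_t> &TypeShape) {
  if (!ParsedShape.empty())
    return DenseLiteralKind::Shaped;
  return numElements(TypeShape) == 0 ? DenseLiteralKind::EmptyLiteral
                                     : DenseLiteralKind::Splat;
}
```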

When a pending relocation is created, it is also marked as optional or not. It can be optional when the relocation is added as part of an optimization (i.e., `scanExternalRefs`). When BOLT tries to `flushPendingRelocations`, it safely skips any optional relocations that cannot be encoded due to being out of range. A prerequisite for that is the use of the `-force-patch` flag; otherwise, BOLT bails out with a relevant message. Background: BOLT, as part of `scanExternalRefs`, identifies external references from calls and creates pending relocations for them. When flushed, these update the references to point to the optimized functions. This optimization can be disabled using `--no-scan`. BOLT can assert if any of these pending relocations cannot be encoded. This patch does not disable this optimization but instead applies it selectively, given that a pending relocation is optional and `-force-patch` is enabled.
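The skip-or-bail decision can be sketched as below. This is not BOLT's code: `PendingRelocation`, `FlushResult`, `fitsInSignedBits`, and `flush` are hypothetical names, and the encodable range is modeled as a signed field of `Bits` bits purely as an assumption.

```cpp
#include <cstdint>
#include <vector>

struct PendingRelocation {
  std::int64_t Value; // value that would have to be encoded
  bool Optional;      // true if added only as part of an optimization
};

struct FlushResult {
  unsigned Applied = 0;
  unsigned Skipped = 0;
  bool Failed = false; // models "bail out with a relevant message"
};

// True if Value fits in a signed field of Bits bits (requires 1 <= Bits <= 63).
bool fitsInSignedBits(std::int64_t Value, unsigned Bits) {
  const std::int64_t Min = -(std::int64_t{1} << (Bits - 1));
  const std::int64_t Max = (std::int64_t{1} << (Bits - 1)) - 1;
  return Value >= Min && Value <= Max;
}

FlushResult flush(const std::vector<PendingRelocation> &Relocs, unsigned Bits,
                  bool ForcePatch) {
  FlushResult R;
  for (const PendingRelocation &Rel : Relocs) {
    if (fitsInSignedBits(Rel.Value, Bits)) {
      ++R.Applied;
      continue;
    }
    // Out of range: only optional relocations may be dropped, and only
    // when the user opted in with the force-patch setting.
    if (Rel.Optional && ForcePatch) {
      ++R.Skipped;
      continue;
    }
    R.Failed = true;
    break;
  }
  return R;
}
```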

Compute the result types and bail out before modifying any IR. That is more efficient when type conversion failed, because no modifications must be rolled back. Note: This is in preparation of the One-Shot Dialect Conversion refactoring.
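The ordering described here (validate everything first, mutate afterwards) can be sketched generically. `Op`, `convertType`, and `rewrite` are hypothetical stand-ins, with types modeled as strings; this is not the MLIR API.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical converter: fails for one made-up type name.
std::optional<std::string> convertType(const std::string &T) {
  if (T == "unsupported")
    return std::nullopt;
  return "converted_" + T;
}

struct Op {
  std::vector<std::string> ResultTypes;
  bool Modified = false;
};

// Converts all result types up front and only then touches the op, so a
// failed conversion leaves nothing to roll back.
bool rewrite(Op &O) {
  std::vector<std::string> NewTypes;
  NewTypes.reserve(O.ResultTypes.size());
  for (const std::string &T : O.ResultTypes) {
    std::optional<std::string> Converted = convertType(T);
    if (!Converted)
      return false; // bail out before any modification
    NewTypes.push_back(*Converted);
  }
  O.ResultTypes = std::move(NewTypes);
  O.Modified = true;
  return true;
}
```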

…(#129960) This patch introduces a new option `-preserve-merged-debug-info` to preserve an arbitrary but deterministic version of debug information when DILocations are merged. This is intended to be used in production environments from which sample-based profiles are derived, such as AutoFDO and MemProf. With this patch we have seen a 0.2% improvement on an internal workload at Google when generating AutoFDO profiles. It also significantly improves the ability for MemProf by preserving debug info for merged call instructions used in the contextual profile. --------- Co-authored-by: Krzysztof Pszeniczny <[email protected]>

This makes it more obvious what the R means. I've kept rlist in places that refer to the encoding.

No longer require -fopenmp or -fopenacc with -E, unless specific version number options are also required for predefined macros. This means that most source can be preprocessed with -E and then later compiled with -fopenmp, -fopenacc, or neither. This means that OpenMP conditional compilation lines (!$) are also passed through to -E output. The tricky part of this patch was dealing with the fact that those conditional lines can also contain regular Fortran line continuation, and that now has to be deferred when !$ lines are interspersed.

…#134397) Currently, these breakpoints are being accumulated every time a new process is created (e.g. through a `run`). Depending on the circumstances, the old breakpoints are even left enabled, interfering with subsequent processes. This is addressed by removing the breakpoints in ProcessGDBRemote::Clear. Note that these breakpoints are more of a PlatformDarwin thing, so in the future we should look into moving them there.

…)interleaved" This reverts commit daab7d0 to fix a crash reported in llvm/llvm-project#134411.

This PR makes it so that `CompilerInvocation` is the sole owner of the `PreprocessorOptions` instance.


This is crucial when recovering from fatal loader errors. Without it, the `Lexer` keeps yielding more tokens and the compiler may access invalid `ASTReader` state. rdar://133388373


The signature was changed from void(char *, char *) to void(void *, void *) to match GCC's signature for the same builtin. Fixes #47833
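A small usage sketch of the builtin, assuming a GCC- or Clang-compatible compiler. The wrapper name `flushCodeRange` is hypothetical. The arguments are passed as `char *` here because that compiles under both the old and the new signature; with the `void(void *, void *)` signature, `void *` arguments can also be passed directly in C++.

```cpp
#include <cstddef>

// Asks the compiler builtin to flush the instruction cache for the range
// [Begin, Begin + Size) and returns the number of bytes covered.
std::size_t flushCodeRange(char *Begin, std::size_t Size) {
  char *End = Begin + Size;
  // char * converts implicitly to void *, so this call is valid with
  // either parameter type.
  __builtin___clear_cache(Begin, End);
  return static_cast<std::size_t>(End - Begin);
}
```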

Trap checks fail at most once (when the program crashes).
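The commit message gives only the rationale. As a source-level analogue (not the sanitizer's actual code generation), the same hint can be expressed with the GCC/Clang builtins `__builtin_expect` and `__builtin_trap`. The function name `checkedIndex` is hypothetical.

```cpp
#include <cstddef>

// Bounds-checked read: the failing branch is marked unlikely because it can
// be taken at most once, right before the program traps.
inline int checkedIndex(const int *Data, std::size_t Size, std::size_t I) {
  if (__builtin_expect(I >= Size, 0))
    __builtin_trap();
  return Data[I];
}
```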


InstCombine will combine this zext of an icmp where the source has a single bit set to a lshr plus trunc (`InstCombinerImpl::transformZExtICmp`):

```llvm
define <vscale x 1 x i8> @f(<vscale x 1 x i64> %x) {
  %1 = and <vscale x 1 x i64> %x, splat (i64 8)
  %2 = icmp ne <vscale x 1 x i64> %1, splat (i64 0)
  %3 = zext <vscale x 1 x i1> %2 to <vscale x 1 x i8>
  ret <vscale x 1 x i8> %3
}
```

```llvm
define <vscale x 1 x i8> @reverse_zexticmp_i64(<vscale x 1 x i64> %x) {
  %1 = trunc <vscale x 1 x i64> %x to <vscale x 1 x i8>
  %2 = lshr <vscale x 1 x i8> %1, splat (i8 2)
  %3 = and <vscale x 1 x i8> %2, splat (i8 1)
  ret <vscale x 1 x i8> %3
}
```

In a loop, this ends up being unprofitable for RISC-V because the codegen now goes from:

```asm
f:                                      # @f
        .cfi_startproc
# %bb.0:
        vsetvli a0, zero, e64, m1, ta, ma
        vand.vi v8, v8, 8
        vmsne.vi v0, v8, 0
        vsetvli zero, zero, e8, mf8, ta, ma
        vmv.v.i v8, 0
        vmerge.vim v8, v8, 1, v0
        ret
```

To a series of narrowing vnsrl.wis:

```asm
f:                                      # @f
        .cfi_startproc
# %bb.0:
        vsetvli a0, zero, e64, m1, ta, ma
        vand.vi v8, v8, 8
        vsetvli zero, zero, e32, mf2, ta, ma
        vnsrl.wi v8, v8, 3
        vsetvli zero, zero, e16, mf4, ta, ma
        vnsrl.wi v8, v8, 0
        vsetvli zero, zero, e8, mf8, ta, ma
        vnsrl.wi v8, v8, 0
        ret
```

In the original form, the vmv.v.i is loop invariant and is hoisted out, and the vmerge.vim usually gets folded away into a masked instruction, so you usually just end up with a vsetvli + vmsne.vi.

The truncate requires multiple instructions and introduces a vtype toggle for each one, and is measurably slower on the BPI-F3.

This reverses the transform in RISCVISelLowering for truncations greater than twice the bitwidth, i.e. it keeps single vnsrl.wis.

Fixes #132245
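The equivalence behind the InstCombine fold can be checked on a single lane with a scalar model. This is only a sketch: `zextOfICmp` and `truncShiftMask` are hypothetical names, and one `uint64_t` stands in for one vector element. The shift amount of 3 is chosen to match the mask of 8 and the `vnsrl.wi v8, v8, 3` in the second assembly listing; the shift of 2 shown in the `@reverse_zexticmp_i64` listing would pair with a mask of 4.

```cpp
#include <cstdint>

// Models: zext (icmp ne (and x, 8), 0) to i8
std::uint8_t zextOfICmp(std::uint64_t X) {
  return (X & 8) != 0 ? 1 : 0;
}

// Models: and (lshr (trunc x to i8), 3), 1
std::uint8_t truncShiftMask(std::uint64_t X) {
  const std::uint8_t Truncated = static_cast<std::uint8_t>(X);
  return static_cast<std::uint8_t>((Truncated >> 3) & 1);
}
```

Both functions extract bit 3 of the input, which is why the two forms are interchangeable and the choice between them is purely a question of which one lowers to cheaper code.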

Add operations for `nvvm.vote.all.sync` and `nvvm.vote.any.sync` intrinsics similar to `nvvm.vote.ballot.sync`.

We have supplied additional versions of atomic intrinsics, to extend the range of data types, for example. Some syntax modifications have also been made. These modifications were dropped during the WW16 pulldown. Not all of these look strictly necessary but until there's time for a proper investigation, all have been restored.

These were dropped during the WW14 pulldown. Code has been added more or less intact, rather than making it more idiomatic to suit the surrounding code. Such work can be achieved later. Some CHECKs had to be amended, swapping around the order of various syntax components, due to how the upstream multiclasses now order things. To minimize downstream changes, it is simpler to adjust our tests than it is to fiddle with the asm syntax.

…mory ed022d9 started to enable partitions in LTO, so the input IR to LocalAccessorToSharedMemory might contain declarations only. We should skip such functions; otherwise we run into assertions or a segfault when getting the entry block.

These tables were split into head/tail tables upstream. The libspirv versions of atan2/atan2pi were still using the old unified table, which was never defined. A previous pulldown observed a build failure and resolved it by adding a declaration of the old table; this table was never defined so could result in a link failure at build time. The correct fix is to have libspirv's atan2/atan2pi builtins call into CLC's atan2/atan2pi, and at the same time remove the use of the old tables from libspirv. We can now also remove the declaration of the missing table.


jsji · 2025-04-30T20:19:49Z

@intel/llvm-gatekeepers This is ready for merge. Can someone help to issue a /merge? Thanks!

uditagarwal97 · 2025-04-30T20:21:20Z

/merge

jsji · 2025-04-30T20:59:55Z

/merge

@DoyleLi Automation not working again...

gbossu and others added 30 commits April 4, 2025 16:27

[flang][runtime] Remove bad runtime assertion (#134176)

ade9d1f

The RUNTIME_CHECK in question doesn't allow for the possibility that an allocatable or pointer component could be processed by defined I/O. Remove it in favor of a dynamic allocation check.

[flang][preprocessor] Directive continuation must skip empty macros (…

507ce46

…#134149) When a compiler directive continuation line starts with keyword macro names that have empty expansions, skip them.

[RISCV] Prefer RegList over Rlist in assembler. NFC

70a1445

This makes it more obvious what the R means. I've kept rlist in places that refer to the encoding.

Revert "[SLP]Initial support for (masked)loads + compress and (masked…

90cf2e3

…)interleaved" This reverts commit daab7d0 to fix a crash reported in llvm/llvm-project#134411.

[clang] Do not share ownership of PreprocessorOptions (#133467)

1688c30

This PR makes it so that `CompilerInvocation` is the sole owner of the `PreprocessorOptions` instance.

[clang][parse] Fix build of ParseHLSLRootSignatureTest.cpp

ea0869c

Fallout from PR #133467.

[clang][deps] Respect Lexer::cutOffLexing() (#134404)

cde90e6

This is crucial when recovering from fatal loader errors. Without it, the `Lexer` keeps yielding more tokens and the compiler may access invalid `ASTReader` state. rdar://133388373

[gn] port 10c6ebc (-gen-clang-diags-compat-ids)

6ee5e69

[gn] Add a missing dependency

8f65519

Needed after 6ee5e69

Merge from 'main' to 'sycl-web' (57 commits)

4f3647f

[gn] port 4a4d41e

d9bf390

...and add missing TargetsToBuild dep.

Fix the signature for __builtin___clear_cache (#134376)

b6b0257

The signature was changed from void(char *, char *) to void(void *, void *) to match GCC's signature for the same builtin. Fixes #47833

[clang] [sanitizer] predict trap checks succeed (#134310)

30f2e92

Trap checks fail at most once (when the program crashes).

[mlir][NVVM] Add ops for vote all and any sync (#134309)

cd2f85a

Add operations for `nvvm.vote.all.sync` and `nvvm.vote.any.sync` intrinsics similar to `nvvm.vote.ballot.sync`.

frasercrmck and others added 7 commits April 30, 2025 08:08

Revert "[HLSL] Add __spirv__ macro (#132848)"

f525be8

This reverts commit 9ce7725.

[SYCL] Revert fix in 84af96b for 9ce7725

6baba64

It does not fix the name collision

[SYCL][E2E] XFAIL 7 work group memory test for spirv-backend

fac2fd8

jsji force-pushed the llvmspirv_pulldown branch from a93a537 to 5d35d4d Compare April 30, 2025 15:10

jsji had a problem deploying to WindowsCILock April 30, 2025 15:10 — with GitHub Actions Error

mdtoguchi approved these changes Apr 30, 2025


jsji force-pushed the llvmspirv_pulldown branch from 5d35d4d to 780ce4f Compare April 30, 2025 15:12

jsji had a problem deploying to WindowsCILock April 30, 2025 15:13 — with GitHub Actions Error

[SYCL][HIP] Remove dup emit-llvm option in tests

066ced2

jsji force-pushed the llvmspirv_pulldown branch from 780ce4f to 066ced2 Compare April 30, 2025 15:17

jsji had a problem deploying to WindowsCILock April 30, 2025 15:17 — with GitHub Actions Error

jsji closed this Apr 30, 2025

jsji reopened this Apr 30, 2025

jsji temporarily deployed to WindowsCILock April 30, 2025 15:20 — with GitHub Actions Inactive

jsji temporarily deployed to WindowsCILock April 30, 2025 15:45 — with GitHub Actions Inactive

jsji temporarily deployed to WindowsCILock April 30, 2025 15:48 — with GitHub Actions Inactive

jsji temporarily deployed to WindowsCILock April 30, 2025 15:55 — with GitHub Actions Inactive

uditagarwal97 merged commit 3ed0666 into sycl Apr 30, 2025
81 of 94 checks passed
