[DO NOT MERGE] Test only #17762

Closed
wants to merge 6,923 commits into from
Contributor

@jsji jsji commented Apr 1, 2025

TEST ONLY

arsenm and others added 30 commits March 28, 2025 23:20
…#133407)

The attribute APIs make this cumbersome. There seem to be missing
overloads using AttrBuilder for the function attrs. Plus there doesn't
seem to be a direct way to set the function attrs on the call.
Convert vector 64-bit shl to 32-bit if shift amt is known to be >= 32.

---------

Signed-off-by: John Lu <[email protected]>
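
The narrowing above rests on a simple identity: for a 64-bit value shifted left by an amount in [32, 63], the low 32 bits of the result are zero and the high 32 bits are the original low half shifted by `amt - 32`. A minimal scalar sketch of that identity follows; the function names are hypothetical and this is not the code from the patch, which operates on vectors inside the backend.

```cpp
#include <cassert>
#include <cstdint>

// Reference: plain 64-bit left shift, defined for 0 <= amt <= 63.
constexpr std::uint64_t shl64(std::uint64_t x, unsigned amt) {
  return x << amt;
}

// Narrowed form (hypothetical helper): only valid when 32 <= amt <= 63.
// Uses a single 32-bit shift; the low half of the result is always zero.
constexpr std::uint64_t shl64Via32(std::uint64_t x, unsigned amt) {
  const std::uint32_t lo = static_cast<std::uint32_t>(x); // high half of x is shifted out
  const std::uint32_t hi = lo << (amt - 32);              // amt - 32 is in [0, 31]
  return static_cast<std::uint64_t>(hi) << 32;
}

static_assert(shl64Via32(0x123456789ABCDEF0ULL, 32) ==
              shl64(0x123456789ABCDEF0ULL, 32), "boundary amt = 32");
static_assert(shl64Via32(0xFFFFFFFFFFFFFFFFULL, 63) ==
              shl64(0xFFFFFFFFFFFFFFFFULL, 63), "boundary amt = 63");

// Runtime spot check over every shift amount the narrowing applies to.
inline void checkAllShiftAmounts(std::uint64_t x) {
  for (unsigned amt = 32; amt < 64; ++amt)
    assert(shl64Via32(x, amt) == shl64(x, amt));
}
```

The precondition matters: for `amt < 32` the high half of the input still contributes to the result, so the 32-bit form would be wrong.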
…guments (#133411)

In undefined mismatch cases, this was fixing the callsite to use the calling
convention of the new function. Preserve the original wrong callsite's calling
convention.
  CONFLICT (content): Merge conflict in libclc/cmake/modules/AddLibclc.cmake
…(#133412)

Invokes and others are not handled, so this was leaving broken callsites
behind for anything other than CallInst
- extract KnownFPClass for future use inside of GISelKnownBits

---------

Co-authored-by: Matt Arsenault <[email protected]>
Write an unused sdst to null to avoid a false VALU->SALU dependency
stall. This requires using the VOP3 encoding.
Avoid the pattern of always calling collectInstsToScalarize after
collectUniformsAndScalars, and call it in collectUniformsAndScalars
instead. Also strengthen checks for early exits in the function.
  CONFLICT (content): Merge conflict in sycl-jit/jit-compiler/lib/translation/KernelTranslation.cpp
…3123)

In 236f938, I introduced a generic version of this routine. I believe
that the SystemZ-specific version is less general than the generic
version, and is thus not required. I wasn't 100% sure given the
differences in sub-registers and multiple uses and defs, but from the
SystemZ code, it looks like those cases simply don't arise?
Front-end option `-print-stats` can be used to print statistics about
the compilation process, but clang crashes with this option when the
input is an IR file. This patch fixes the crash by checking for the
presence of a preprocessor before invoking it.
Fix for #133446.

According to the RISC-V spec: "C.LUI is valid only when rd≠{x0,x2}, and
when the immediate is not equal to zero. The code points with imm=0 are
reserved".

This change makes the disassembler treat code points with imm=0 as
illegal. It introduces a test which exercises every C.LUI opcode
(including the illegal ones but excluding those assigned to C.ADDI16SP).
Output for +c, +c +Zcmop, and +c +no-rvc-hints is checked.
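
The rule quoted from the spec can be sketched as a small classifier over the 16-bit encoding. This is an illustration only, not the LLVM disassembler code; the enum and function names are hypothetical. It assumes the standard RVC layout for this opcode (funct3 in bits 15:13, imm[17] in bit 12, rd in bits 11:7, imm[16:12] in bits 6:2, op in bits 1:0) and treats the remaining rd=x0 code points as HINTs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of a 16-bit code point in the C.LUI opcode space.
enum class CLuiStatus { NotCLui, Valid, Hint, Reserved };

constexpr CLuiStatus classifyCLui(std::uint16_t insn) {
  const unsigned op = insn & 0x3;
  const unsigned funct3 = (insn >> 13) & 0x7;
  const unsigned rd = (insn >> 7) & 0x1F;
  // 6-bit immediate field: bit 12 is imm[17], bits 6:2 are imm[16:12].
  const unsigned imm6 = (((insn >> 12) & 0x1) << 5) | ((insn >> 2) & 0x1F);

  if (op != 0b01 || funct3 != 0b011)
    return CLuiStatus::NotCLui;
  if (rd == 2)
    return CLuiStatus::NotCLui;  // rd = x2 is assigned to C.ADDI16SP
  if (imm6 == 0)
    return CLuiStatus::Reserved; // imm = 0 code points are reserved
  if (rd == 0)
    return CLuiStatus::Hint;     // assumption: remaining rd = x0 code points are HINTs
  return CLuiStatus::Valid;
}

// 0x6505 encodes c.lui a0, 1; clearing the immediate bits gives 0x6501.
static_assert(classifyCLui(0x6505) == CLuiStatus::Valid, "c.lui a0, 1");
static_assert(classifyCLui(0x6501) == CLuiStatus::Reserved, "imm = 0");

// Runtime spot checks on the rd = x2 and rd = x0 cases.
inline void checkSpecialRegisters() {
  assert(classifyCLui(0x6105) == CLuiStatus::NotCLui); // rd = x2
  assert(classifyCLui(0x6005) == CLuiStatus::Hint);    // rd = x0, imm != 0
}
```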
  CONFLICT (modify/delete): libclc/amdgcn-amdhsa/lib/workitem/get_global_size.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn-amdhsa/lib/workitem/get_global_size.cl left in tree.
  CONFLICT (modify/delete): libclc/amdgcn-amdhsa/lib/workitem/get_local_size.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn-amdhsa/lib/workitem/get_local_size.cl left in tree.
  CONFLICT (modify/delete): libclc/amdgcn-amdhsa/lib/workitem/get_num_groups.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn-amdhsa/lib/workitem/get_num_groups.cl left in tree.
  CONFLICT (modify/delete): libclc/amdgcn/lib/workitem/get_global_offset.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn/lib/workitem/get_global_offset.cl left in tree.
  CONFLICT (modify/delete): libclc/amdgcn/lib/workitem/get_global_size.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn/lib/workitem/get_global_size.cl left in tree.
  CONFLICT (modify/delete): libclc/amdgcn/lib/workitem/get_group_id.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn/lib/workitem/get_group_id.cl left in tree.
  CONFLICT (modify/delete): libclc/amdgcn/lib/workitem/get_local_id.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn/lib/workitem/get_local_id.cl left in tree.
  CONFLICT (modify/delete): libclc/amdgcn/lib/workitem/get_local_size.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn/lib/workitem/get_local_size.cl left in tree.
  CONFLICT (modify/delete): libclc/amdgcn/lib/workitem/get_num_groups.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn/lib/workitem/get_num_groups.cl left in tree.
  CONFLICT (modify/delete): libclc/amdgcn/lib/workitem/get_work_dim.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/amdgcn/lib/workitem/get_work_dim.cl left in tree.
  CONFLICT (content): Merge conflict in libclc/clc/include/clc/geometric/floatn.inc
  CONFLICT (content): Merge conflict in libclc/clc/include/clc/integer/gentype.inc
  CONFLICT (content): Merge conflict in libclc/clc/include/clc/relational/floatn.inc
  CONFLICT (content): Merge conflict in libclc/generic/include/as_type.h
  CONFLICT (content): Merge conflict in libclc/generic/include/clc/convert.h
  CONFLICT (content): Merge conflict in libclc/generic/include/clc/integer/integer-gentype.inc
  CONFLICT (content): Merge conflict in libclc/generic/include/clc/math/sincos.inc
  CONFLICT (content): Merge conflict in libclc/generic/include/clc/shared/vload.h
  CONFLICT (content): Merge conflict in libclc/generic/include/clc/shared/vstore.h
  CONFLICT (content): Merge conflict in libclc/generic/include/macros.h
  CONFLICT (content): Merge conflict in libclc/generic/lib/async/async_work_group_strided_copy.inc
  CONFLICT (content): Merge conflict in libclc/generic/lib/async/prefetch.inc
  CONFLICT (content): Merge conflict in libclc/generic/lib/async/wait_group_events.cl
  CONFLICT (modify/delete): libclc/generic/lib/clc_unary.inc deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/generic/lib/clc_unary.inc left in tree.
  CONFLICT (content): Merge conflict in libclc/generic/lib/math/minmag.inc
  CONFLICT (content): Merge conflict in libclc/libspirv/lib/r600/workitem/get_work_dim.cl
  CONFLICT (modify/delete): libclc/ptx-nvidiacl/lib/synchronization/barrier.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/ptx-nvidiacl/lib/synchronization/barrier.cl left in tree.
  CONFLICT (modify/delete): libclc/ptx-nvidiacl/lib/workitem/get_global_id.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/ptx-nvidiacl/lib/workitem/get_global_id.cl left in tree.
  CONFLICT (modify/delete): libclc/ptx-nvidiacl/lib/workitem/get_group_id.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/ptx-nvidiacl/lib/workitem/get_group_id.cl left in tree.
  CONFLICT (modify/delete): libclc/ptx-nvidiacl/lib/workitem/get_local_id.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/ptx-nvidiacl/lib/workitem/get_local_id.cl left in tree.
  CONFLICT (modify/delete): libclc/ptx-nvidiacl/lib/workitem/get_local_size.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/ptx-nvidiacl/lib/workitem/get_local_size.cl left in tree.
  CONFLICT (modify/delete): libclc/ptx-nvidiacl/lib/workitem/get_num_groups.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/ptx-nvidiacl/lib/workitem/get_num_groups.cl left in tree.
  CONFLICT (modify/delete): libclc/r600/lib/workitem/get_global_offset.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/r600/lib/workitem/get_global_offset.cl left in tree.
  CONFLICT (modify/delete): libclc/r600/lib/workitem/get_global_size.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/r600/lib/workitem/get_global_size.cl left in tree.
  CONFLICT (modify/delete): libclc/r600/lib/workitem/get_group_id.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/r600/lib/workitem/get_group_id.cl left in tree.
  CONFLICT (modify/delete): libclc/r600/lib/workitem/get_local_id.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/r600/lib/workitem/get_local_id.cl left in tree.
  CONFLICT (modify/delete): libclc/r600/lib/workitem/get_local_size.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/r600/lib/workitem/get_local_size.cl left in tree.
  CONFLICT (modify/delete): libclc/r600/lib/workitem/get_num_groups.cl deleted in HEAD and modified in 7d04867.  Version 7d04867 of libclc/r600/lib/workitem/get_num_groups.cl left in tree.
… (#131384)

Updates `BuiltinTypeMethodBuilder` helper class to support creation of
constructors and updates the code that creates default constructor for
resource classes to use it.

This enables us to share code when creating builtin methods and
constructors and will come in handy when we add more constructors in the
future.

Depends on #131032.
As discussed in llvm/llvm-project#53680, add
support for ld64's -interposable flag on Apple platforms to lld.
The comment shows that at the time we were worried about producing the
alias in assembly that might be ingested by a binutils version that
doesn't yet support it. binutils gained support over 4 years ago
<https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=c2137f55ad04e451d834048d4bfec1de2daea20e>.
With all the changes in areas such as ELF attributes, if you tried to
use LLVM's RISC-V assembler output with a binutils that old then zext.b
would be the least of your worries.
…NFC (#133364)

The current names look just like predicates we use for regular
immediates, but branches and jumps also allow bare symbols.

While I was there I realized I could use PredicateMethod to have the
AsmMatcher directly call the template function we use in the asm parser.
…nt (#133476)

Fixes dominance verifier error with
`FoldReshapeWithGenericOpByCollapsing` by setting the insertion point
after `producer`. The `tensor.collapse_shape` op only has a single
operand (`producer`) so it is safe to insert after the producer.

Signed-off-by: Ian Wood <[email protected]>
If the entry is SplitVectorize, it can be skipped in favor of its
operands, since the operands allow spill costs to be detected correctly.

Fixes #133288
Currently, when we set URLs from JS, we set them only using the protocol
and host locations. This works fine when docs are served from the base
directory of the site, but if you want to nest it under another
directory, our JS fails to set the correct path, leading to broken
links.

This patch adds a --base option to specify the path prefix to use, which
is set in the generated index_json.js file. index.json can then fill in
the prefix appropriately when generating links in a browser. This flag
has no effect for non-HTML output.

Given an index hosted at: www.docs.com/base_directory/index.html
we used to generate the following link:
    www.docs.com/file.html
Using --base base_directory we now generate:
    www.docs.com/base_directory/file.html

This allows such links to work when hosting pages without using a custom
index.js.
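
The before/after links in the example can be reproduced with a small sketch of the prefixing logic. The real logic runs in the generated JavaScript; the `makeLink` helper and its parameters below are hypothetical and only illustrate how an optional base prefix changes the resulting path.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: joins host, optional base directory, and file name.
// An empty base reproduces the old behavior (links relative to the site root).
inline std::string makeLink(const std::string &host, const std::string &base,
                            const std::string &file) {
  std::string url = host;
  if (!base.empty()) {
    url += '/';
    url += base;
  }
  url += '/';
  url += file;
  return url;
}

// Checks the two links shown in the commit message.
inline void checkExampleLinks() {
  assert(makeLink("www.docs.com", "", "file.html") ==
         "www.docs.com/file.html");
  assert(makeLink("www.docs.com", "base_directory", "file.html") ==
         "www.docs.com/base_directory/file.html");
}
```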
Instead, make the few functions `map` relies on public. This makes it
more clear what is private to `__tree` and what is part of the
library-internal interface.
phoebewang and others added 23 commits March 31, 2025 22:05
… and m[no-]evex512 (#132542)

The 256-bit maximum vector register size control was removed from the AVX10
whitepaper, ref: https://cdrdv2.intel.com/v1/dl/getContent/784343

- Re-target m[no-]avx10.1 to enable AVX10.1 with 512-bit maximum vector
register size;
- Emit warning for mavx10.x-256, noting AVX10/256 is not supported;
- Emit warning for mavx10.x-512, noting to use m[no-]avx10.x instead;
- Emit warning for m[no-]evex512, noting AVX10/256 is not supported;

This patch only changes Clang driver behavior. The features
avx10.x-256/512 remain unchanged and will be removed in the next release.
Implement hypot for Float16 along with tests.
…918)

We only emit v_mov_b32/64_dpp. Don't combine t16 instructions with mov
dpp. Update the test inputs to be legal.

It is future work to emit v_mov_b16_dpp, and then update GCNDPPCombine
to combine it with the 16-bit instructions.
This turns on the unnecessary-virtual-specifier warning in general, but
disables it when building LLVM. It also tweaks the warning description
to be slightly more accurate.

Background: I've been working on cleaning up this warning in two
codebases: LLVM and chromium (plus its dependencies). The chromium
cleanup has been straightforward. Git archaeology shows that there are
two reasons for the warnings: classes to which `final` was added after
they were initially committed, and classes with virtual destructors that
nobody remarks on. Presumably the latter case is because people are just
very used to destructors being virtual.

The LLVM cleanup was more surprising: I discovered that we have an [old
policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers)
about including out-of-line virtual functions in every class with a
vtable, even `final` ones. This means our codebase has many virtual
"anchor" functions which do nothing except control where the vtable is
emitted, and which trigger the warning. I looked into alternatives to
satisfy the policy, such as using destructors instead of introducing a
new function, but it wasn't clear if they had larger implications.

Overall, it seems like the warning is genuinely useful in most codebases
(evidenced by chromium and its dependencies), and LLVM is an unusual
case. Therefore we should enable the warning by default, and turn it off
only for LLVM builds.
This adds DWARF generation for fixed-point types. This feature is needed
by Ada.

Note that a pre-existing GNU extension is used in one case. This has
been emitted by GCC for years, and is needed because standard DWARF is
otherwise incapable of representing these types.
Add a new `CreateIntrinsic` overload with no `Types`, useful for
creating calls to non-overloaded intrinsics that don't need additional
mangling.
Expose the target API mutex through the SB API. This is motivated by
lldb-dap, which is built on top of the SB API and needs a way to execute
a series of SB API calls in an atomic manner (see #131242).

We can solve this problem by either introducing an additional layer of
locking at the DAP level or by exposing the existing locking at the SB
API level. This patch implements the second approach.

This was discussed in an RFC on Discourse [0]. The original
implementation exposed a move-only lock rather than a mutex [1] which
doesn't work well with SWIG 4.0 [2]. This implements the alternative
solution of exposing the mutex rather than the lock. The SBMutex
conforms to the BasicLockable requirement [3] (which is why the methods
are called `lock` and `unlock` rather than Lock and Unlock) so it can be
used as `std::lock_guard<lldb::SBMutex>` and
`std::unique_lock<lldb::SBMutex>`.

[0]: https://discourse.llvm.org/t/rfc-exposing-the-target-api-lock-through-the-sb-api/85215/6
[1]: llvm/llvm-project#131404
[2]: https://discourse.llvm.org/t/rfc-bumping-the-minimum-swig-version-to-4-1-0/85377/9
[3]: https://en.cppreference.com/w/cpp/named_req/BasicLockable
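
To show why the lowercase `lock`/`unlock` names matter, here is a sketch using a stand-in class rather than the real `lldb::SBMutex` (no LLDB headers are used). `ToyMutex`, `depth`, and `runAtomically` are hypothetical, and wrapping a `std::recursive_mutex` is an assumption made for the sketch. Any type with these two methods satisfies BasicLockable and works with the standard RAII guards.

```cpp
#include <cassert>
#include <mutex>

// Stand-in for a BasicLockable wrapper: only lock() and unlock() are required.
class ToyMutex {
public:
  void lock() {
    m_impl.lock();
    ++m_depth; // modified only while the lock is held
  }
  void unlock() {
    --m_depth;
    m_impl.unlock();
  }
  // Lock depth as seen by the owning thread; used here only for checks.
  int depth() const { return m_depth; }

private:
  std::recursive_mutex m_impl; // assumption: the wrapped mutex is recursive
  int m_depth = 0;
};

// Runs a series of calls under one lock; the guard unlocks on scope exit.
inline int runAtomically(ToyMutex &mutex) {
  std::lock_guard<ToyMutex> guard(mutex);
  // ... several API calls that must not interleave with other threads ...
  return mutex.depth();
}

// std::unique_lock works too, and allows releasing the lock early.
inline void checkUniqueLock(ToyMutex &mutex) {
  std::unique_lock<ToyMutex> lock(mutex);
  assert(lock.owns_lock());
  lock.unlock();
  assert(mutex.depth() == 0);
}
```

With methods named `Lock` and `Unlock` instead, neither `std::lock_guard` nor `std::unique_lock` would compile against the type.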
Hey,

This solves an issue where running lldb-server-20 with a non-absolute
path (for example, when it's installed into `/usr/bin` and the user runs
it as `lldb-server-20 ...` and not `/usr/bin/lldb-server-20 ...`) fails
with `error: spawn_process failed: execve failed: No such file or
directory`. The underlying issue is that when run that way, it attempts
to execute a binary named `lldb-server-20` from its current directory.
This is also a mild security hazard because lldb-server is often being
run as root in the directory /tmp, meaning that an unprivileged user can
create the file /tmp/lldb-server-20 and lldb-server will execute it as
root. (although, well, it's a debugging server we're talking about, so
that may not be a real concern)

I haven't previously contributed to this project; if you want me to
change anything in the code please don't hesitate to let me know.
In #133211, Greg suggested making the rate limit configurable through a
setting. Although adding the setting is easy, the two places where we
currently use rate limiting aren't tied to a particular debugger.
Although it'd be possible to hook up, given how few progress events
currently implement rate limiting, I don't think it's worth threading
this through, if that's even possible.

I still think it's a good idea to be consistent and make it easy to pick
the same rate limiting value, so I've moved it into a constant in the
Progress class.
… 31 (#133713)

Fixes #133712.

The change causes `c.slli` instructions whose immediate has bit 5 set to
be rejected when disassembling RV32C. It adds a test to exhaustively cover
c.slli for 32-bit targets, plus a minor tweak to make the debug output a
little more readable.

The spec. (20240411) says:

> For RV32C, shamt[5] must be zero; the code points with shamt[5]=1 are
designated for custom extensions. For RV32C and RV64C, the shift amount
must be non-zero; the code points with shamt=0 are HINTs. For all base
ISAs, the code points with rd=x0 are HINTs, except those with shamt[5]=1
in RV32C.
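
The quoted rules can be sketched as a classifier over the 16-bit encoding. This is an illustration, not the disassembler change itself; the enum and function names are hypothetical. It assumes the standard RVC layout for c.slli (funct3 in bits 15:13, shamt[5] in bit 12, rd in bits 11:7, shamt[4:0] in bits 6:2, op in bits 1:0).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of a code point in the c.slli opcode space.
enum class CSlliStatus { NotCSlli, Valid, Hint, Custom };

constexpr CSlliStatus classifyCSlli(std::uint16_t insn, bool isRV64) {
  const unsigned op = insn & 0x3;
  const unsigned funct3 = (insn >> 13) & 0x7;
  const unsigned rd = (insn >> 7) & 0x1F;
  const unsigned shamt = (((insn >> 12) & 0x1) << 5) | ((insn >> 2) & 0x1F);

  if (op != 0b10 || funct3 != 0b000)
    return CSlliStatus::NotCSlli;
  // RV32C: shamt[5] = 1 is designated for custom extensions, even when rd = x0.
  if (!isRV64 && (shamt & 0x20) != 0)
    return CSlliStatus::Custom;
  // shamt = 0 and the remaining rd = x0 code points are HINTs.
  if (shamt == 0 || rd == 0)
    return CSlliStatus::Hint;
  return CSlliStatus::Valid;
}

// 0x0506 encodes c.slli a0, 1; 0x1502 is the same with shamt = 32.
static_assert(classifyCSlli(0x0506, false) == CSlliStatus::Valid, "RV32 shamt = 1");
static_assert(classifyCSlli(0x1502, false) == CSlliStatus::Custom, "RV32 shamt = 32");
static_assert(classifyCSlli(0x1502, true) == CSlliStatus::Valid, "RV64 shamt = 32");

// Runtime spot checks on the HINT cases.
inline void checkHints() {
  assert(classifyCSlli(0x0502, false) == CSlliStatus::Hint);   // shamt = 0
  assert(classifyCSlli(0x0006, false) == CSlliStatus::Hint);   // rd = x0
  assert(classifyCSlli(0x1002, false) == CSlliStatus::Custom); // rd = x0, shamt[5] = 1
}
```

A disassembler for a plain RV32C target would reject the `Custom` code points, which is the behavior the commit describes.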
This came up during a discussion on #129679, which has been split out as
a preparatory commit.

An example of the AMDGPU codegen is:

    define <2 x float> @_Z10native_expDv2_f(<2 x float> %val) {
      %mul = fmul afn <2 x float> %val, splat (float 0x3FF7154760000000)
      %0 = extractelement <2 x float> %mul, i64 0
      %1 = tail call float @llvm.amdgcn.exp2.f32(float %0)
      %vecinit.i = insertelement <2 x float> poison, float %1, i64 0
      %2 = extractelement <2 x float> %mul, i64 1
      %3 = tail call float @llvm.amdgcn.exp2.f32(float %2)
      %vecinit2.i = insertelement <2 x float> %vecinit.i, float %3, i64 1
      ret <2 x float> %vecinit2.i
    }

    define <2 x float> @_Z11native_exp2Dv2_f(<2 x float> %x) {
      %0 = extractelement <2 x float> %x, i64 0
      %1 = tail call float @llvm.amdgcn.exp2.f32(float %0)
      %vecinit = insertelement <2 x float> poison, float %1, i64 0
      %2 = extractelement <2 x float> %x, i64 1
      %3 = tail call float @llvm.amdgcn.exp2.f32(float %2)
      %vecinit2 = insertelement <2 x float> %vecinit, float %3, i64 1
      ret <2 x float> %vecinit2
    }
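
In the first function above, the multiplier `0x3FF7154760000000` is log2(e) rounded to single precision (LLVM IR prints float constants in double hex form), so the IR computes exp(x) as exp2(x * log2(e)). A small host-side sketch of that identity follows, using the standard library `exp2` in place of the AMDGPU intrinsic; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterprets a 64-bit pattern as a double (the form LLVM IR prints).
inline double bitsToDouble(std::uint64_t bits) {
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// exp(x) computed as exp2(x * log2(e)), mirroring the fmul + exp2 pair in the IR.
inline float expViaExp2(float x) {
  const float log2e = static_cast<float>(bitsToDouble(0x3FF7154760000000ULL));
  return std::exp2(x * log2e);
}

// Confirms the constant and compares against std::exp for one input.
inline void checkExpIdentity() {
  // log2(e) rounded to float, then widened to double, matches the IR constant.
  assert(bitsToDouble(0x3FF7154760000000ULL) ==
         static_cast<double>(1.44269504088896340736f));
  assert(std::fabs(expViaExp2(1.0f) - std::exp(1.0f)) < 1e-5f);
}
```

The second function needs no multiply because `native_exp2` maps directly onto the exp2 intrinsic.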
Splitting the 'ln_tbl' into two in db98e29 wasn't done thoroughly
enough as some references to the old table still remained. This commit
fixes the unresolved references by updating to the new split table.
  CONFLICT (content): Merge conflict in clang/lib/CodeGen/CodeGenModule.cpp
 Conflicts:
	clang/lib/Basic/Targets/NVPTX.cpp
@jsji jsji closed this May 2, 2025
@jsji jsji deleted the syclwebtest branch May 2, 2025 14:04