Skip to content

Noret atomics #2

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Noret atomics #2

wants to merge 1 commit into from

Conversation

madhur13490
Copy link
Owner

No description provided.

madhur13490 pushed a commit that referenced this pull request Sep 13, 2023
The new ACLE PR#225[1] now combines the slice parameters for some
builtins. This patch is the #2 of 3 patches to update the interface.

Slice specifies the ZA slice number directly and needs to be explicity
implemented by the "user" with the base register plus the immediate
offset

[1]https://github.com/ARM-software/acle/pull/225/files
madhur13490 pushed a commit that referenced this pull request Oct 11, 2023
…fine.parallel verifier

This patch updates AffineParallelOp::verify() to check each result type matches
its corresponding reduction op (i.e, the result type must be a `FloatType` if
the reduction attribute is `addf`)

affine.parallel will crash on --lower-affine if the corresponding result type
cannot match the reduction attribute.

```
      %128 = affine.parallel (%arg2, %arg3) = (0, 0) to (8, 7) reduce ("maxf") -> (memref<8x7xf32>) {
        %alloc_33 = memref.alloc() : memref<8x7xf32>
        affine.yield %alloc_33 : memref<8x7xf32>
      }
```
This will crash and report a type conversion issue when we run `mlir-opt --lower-affine`

```
Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file Casting.h, line 572.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: mlir-opt --lower-affine temp.mlir
 #0 0x0000000102a18f18 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/workspacebin/mlir-opt+0x1002f8f18)
 #1 0x0000000102a171b4 llvm::sys::RunSignalHandlers() (/workspacebin/mlir-opt+0x1002f71b4)
 #2 0x0000000102a195c4 SignalHandler(int) (/workspacebin/mlir-opt+0x1002f95c4)
 #3 0x00000001be7894c4 (/usr/lib/system/libsystem_platform.dylib+0x1803414c4)
 #4 0x00000001be771ee0 (/usr/lib/system/libsystem_pthread.dylib+0x180329ee0)
 llvm#5 0x00000001be6ac340 (/usr/lib/system/libsystem_c.dylib+0x180264340)
 llvm#6 0x00000001be6ab754 (/usr/lib/system/libsystem_c.dylib+0x180263754)
 llvm#7 0x0000000106864790 mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (.cold.4) (/workspacebin/mlir-opt+0x104144790)
 llvm#8 0x0000000102ba66ac mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x1004866ac)
 llvm#9 0x0000000102ba6910 mlir::arith::getIdentityValue(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x100486910)
...
```

Fixes llvm#64068

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D157985
madhur13490 pushed a commit that referenced this pull request Feb 27, 2024
…ter partial ordering when determining primary template (llvm#82417)

Consider the following:
```
struct A {
  static constexpr bool x = true;
};

template<typename T, typename U>
void f(T, U) noexcept(T::y); // #1, error: no member named 'y' in 'A'

template<typename T, typename U>
void f(T, U*) noexcept(T::x); // #2

template<>
void f(A, int*) noexcept; // explicit specialization of #2
```

We currently instantiate the exception specification of all candidate
function template specializations when deducting template arguments for
an explicit specialization, which results in a error despite `#1` not
being selected by partial ordering as the most specialized template.
According to [except.spec] p13:
> An exception specification is considered to be needed when: 
> - [...]
> - the exception specification is compared to that of another
declaration (e.g., an explicit specialization or an overriding virtual
function);

Assuming that "comparing declarations" means "determining whether the
declarations correspond and declare the same entity" (per [basic.scope.scope] p4 and
[basic.link] p11.1, respectively), the exception specification does _not_ need to be
instantiated until _after_ partial ordering, at which point we determine
whether the implicitly instantiated specialization and the explicit
specialization declare the same entity (the determination of whether two
functions/function templates correspond does not consider the exception
specifications).

This patch defers the instantiation of the exception specification until
a single function template specialization is selected via partial
ordering, matching the behavior of GCC, EDG, and
MSVC: see https://godbolt.org/z/Ebb6GTcWE.
madhur13490 pushed a commit that referenced this pull request Mar 12, 2024
TestCases/Misc/Linux/sigaction.cpp fails because dlsym() may call malloc
on failure. And then the wrapped malloc appears to access thread local
storage using global dynamic accesses, thus calling
___interceptor___tls_get_addr, before REAL(__tls_get_addr) has
been set, so we get a crash inside ___interceptor___tls_get_addr. For
example, this can happen when looking up __isoc23_scanf which might not
exist in some libcs.

Fix this by marking the thread local variable accessed inside the
debug checks as "initial-exec", which does not require __tls_get_addr.

This is probably a better alternative to llvm#83886.

This fixes a different crash but is related to llvm#46204.

Backtrace:
```
#0 0x0000000000000000 in ?? ()
#1 0x00007ffff6a9d89e in ___interceptor___tls_get_addr (arg=0x7ffff6b27be8) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:2759
#2 0x00007ffff6a46bc6 in __sanitizer::CheckedMutex::LockImpl (this=0x7ffff6b27be8, pc=140737331846066) at /path/to/llvm/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:218
#3 0x00007ffff6a448b2 in __sanitizer::CheckedMutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:129
#4 __sanitizer::Mutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:167
llvm#5 0x00007ffff6abdbb2 in __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock (mu=0x730000000580, this=<optimized out>) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383
llvm#6 __sanitizer::SizeClassAllocator64<__tsan::AP64>::GetFromAllocator (this=0x7ffff7487dc0 <__tsan::allocator_placeholder>, stat=stat@entry=0x7ffff570db68, class_id=11, chunks=chunks@entry=0x7ffff5702cc8, n_chunks=n_chunks@entry=128) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_primary64.h:207
llvm#7 0x00007ffff6abdaa0 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Refill (this=<optimized out>, c=c@entry=0x7ffff5702cb8, allocator=<optimized out>, class_id=<optimized out>)
 at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:103
llvm#8 0x00007ffff6abd731 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Allocate (this=0x7ffff6b27be8, allocator=0x7ffff5702cc8, class_id=140737311157448)
 at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:39
llvm#9 0x00007ffff6abc397 in __sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator64<__tsan::AP64>, __sanitizer::LargeMmapAllocatorPtrArrayDynamic>::Allocate (this=0x7ffff5702cc8, cache=0x7ffff6b27be8, size=<optimized out>, size@entry=175, alignment=alignment@entry=16)
 at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_combined.h:69
llvm#10 0x00007ffff6abaa6a in __tsan::user_alloc_internal (thr=0x7ffff7ebd980, pc=140737331499943, sz=sz@entry=175, align=align@entry=16, signal=true) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:198
llvm#11 0x00007ffff6abb0d1 in __tsan::user_alloc (thr=0x7ffff6b27be8, pc=140737331846066, sz=11, sz@entry=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:223
llvm#12 0x00007ffff6a693b5 in ___interceptor_malloc (size=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:666
llvm#13 0x00007ffff7fce7f2 in malloc (size=175) at ../include/rtld-malloc.h:56
llvm#14 __GI__dl_exception_create_format (exception=exception@entry=0x7fffffffd0d0, objname=0x7ffff7fc3550 "/path/to/llvm/compiler-rt/cmake-build-all-sanitizers/lib/linux/libclang_rt.tsan-x86_64.so",
 fmt=fmt@entry=0x7ffff7ff2db9 "undefined symbol: %s%s%s") at ./elf/dl-exception.c:157
llvm#15 0x00007ffff7fd50e8 in _dl_lookup_symbol_x (undef_name=0x7ffff6af868b "__isoc23_scanf", undef_map=<optimized out>, ref=0x7fffffffd148, symbol_scope=<optimized out>, version=<optimized out>, type_class=0, flags=2, skip_map=0x7ffff7fc35e0) at ./elf/dl-lookup.c:793
--Type <RET> for more, q to quit, c to continue without paging--
llvm#16 0x00007ffff656d6ed in do_sym (handle=<optimized out>, name=0x7ffff6af868b "__isoc23_scanf", who=0x7ffff6a3bb84 <__interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long)+36>, vers=vers@entry=0x0, flags=flags@entry=2) at ./elf/dl-sym.c:146
llvm#17 0x00007ffff656d9dd in _dl_sym (handle=<optimized out>, name=<optimized out>, who=<optimized out>) at ./elf/dl-sym.c:195
llvm#18 0x00007ffff64a2854 in dlsym_doit (a=a@entry=0x7fffffffd3b0) at ./dlfcn/dlsym.c:40
llvm#19 0x00007ffff7fcc489 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd310, operate=0x7ffff64a2840 <dlsym_doit>, args=0x7fffffffd3b0) at ./elf/dl-catch.c:237
llvm#20 0x00007ffff7fcc5af in _dl_catch_error (objname=0x7fffffffd368, errstring=0x7fffffffd370, mallocedp=0x7fffffffd367, operate=<optimized out>, args=<optimized out>) at ./elf/dl-catch.c:256
llvm#21 0x00007ffff64a2257 in _dlerror_run (operate=operate@entry=0x7ffff64a2840 <dlsym_doit>, args=args@entry=0x7fffffffd3b0) at ./dlfcn/dlerror.c:138
llvm#22 0x00007ffff64a28e5 in dlsym_implementation (dl_caller=<optimized out>, name=<optimized out>, handle=<optimized out>) at ./dlfcn/dlsym.c:54
llvm#23 ___dlsym (handle=<optimized out>, name=<optimized out>) at ./dlfcn/dlsym.c:68
llvm#24 0x00007ffff6a3bb84 in __interception::GetFuncAddr (name=0x7ffff6af868b "__isoc23_scanf", trampoline=140737311157448) at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:42
llvm#25 __interception::InterceptFunction (name=0x7ffff6af868b "__isoc23_scanf", ptr_to_real=0x7ffff74850e8 <__interception::real___isoc23_scanf>, func=11, trampoline=140737311157448)
 at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:61
llvm#26 0x00007ffff6a9f2d9 in InitializeCommonInterceptors () at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_common_interceptors.inc:10315
```

Reviewed By: vitalybuka, MaskRay

Pull Request: llvm#83890
madhur13490 pushed a commit that referenced this pull request Mar 19, 2024
…lvm#85653)

This reverts commit daebe5c.

This commit causes the following asan issue:

```
<snip>/llvm-project/build/bin/mlir-opt <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir | <snip>/llvm-project/build/bin/FileCheck <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir
# executed command: <snip>/llvm-project/build/bin/mlir-opt <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir
# .---command stderr------------
# | =================================================================
# | ==2772558==ERROR: AddressSanitizer: stack-use-after-return on address 0x7fd2c2c42b90 at pc 0x55e406d54614 bp 0x7ffc810e4070 sp 0x7ffc810e4068
# | READ of size 8 at 0x7fd2c2c42b90 thread T0
# |     #0 0x55e406d54613 in operator()<long int const*> /usr/include/c++/13/bits/predefined_ops.h:318
# |     #1 0x55e406d54613 in __count_if<long int const*, __gnu_cxx::__ops::_Iter_pred<mlir::verifyListOfOperandsOrIntegers(Operation*, llvm::StringRef, unsigned int, llvm::ArrayRef<long int>, ValueRange)::<lambda(int64_t)> > > /usr/include/c++/13/bits/stl_algobase.h:2125
# |     #2 0x55e406d54613 in count_if<long int const*, mlir::verifyListOfOperandsOrIntegers(Operation*, 
...
```
madhur13490 pushed a commit that referenced this pull request Mar 21, 2024
…oint. (llvm#83821)"

This reverts commit c2c1e6e. It creates
a use after free.

==8342==ERROR: AddressSanitizer: heap-use-after-free on address 0x50f000001760 at pc 0x55b9fb84a8fb bp 0x7ffc18468a10 sp 0x7ffc18468a08
READ of size 1 at 0x50f000001760 thread T0
 #0 0x55b9fb84a8fa in dropPoisonGeneratingFlags llvm/lib/Transforms/Vectorize/VPlan.h:1040:13
 #1 0x55b9fb84a8fa in llvm::VPlanTransforms::dropPoisonGeneratingRecipes(llvm::VPlan&, llvm::function_ref<bool (llvm::BasicBlock*)>)::$_0::operator()(llvm::VPRecipeBase*) const llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:1236:23
 #2 0x55b9fb84a196 in llvm::VPlanTransforms::dropPoisonGeneratingRecipes(llvm::VPlan&, llvm::function_ref<bool (llvm::BasicBlock*)>) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

Can be reproduced with asan on
Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll
Transforms/LoopVectorize/X86/pr81872.ll
Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
madhur13490 pushed a commit that referenced this pull request Aug 5, 2024
```
  UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp
```
`FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the
test isn't Linux-specific at all). With
`UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a
duplicate innermost frame:
```
compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value
    #0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38
```
which isn't seen with `fast_unwind_on_fatal=0`.

This turns out to be another fallout from fixing
`__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In
`sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the
`pc` arg is the return address, while `pc1` from the stack frame
(`fr_savpc`) is the address of the `call` insn, leading to a double
entry for the innermost frame in `trace_buffer[]`.

This patch fixes this by moving the adjustment before all uses.

Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11`
(with the `ubsan/TestCases/Misc/Linux` tests enabled).
madhur13490 pushed a commit that referenced this pull request Aug 27, 2024
…lvm#104148)

`hasOperands` does not always execute matchers in the order they are
written. This can cause issue in code using bindings when one operand
matcher is relying on a binding set by the other. With this change, the
first matcher present in the code is always executed first and any
binding it sets are available to the second matcher.

Simple example with current version (1 match) and new version (2
matches):
```bash
> cat tmp.cpp
int a = 13;
int b = ((int) a) - a;
int c = a - ((int) a);

> clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match #1:

tmp.cpp:1:1: note: "d" binds here
int a = 13;
^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
int b = ((int)a) - a;
        ^~~~~~~~~~~~
1 match.

> ./build/bin/clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match #1:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
    2 | int b = ((int)a) - a;
      |         ^~~~~~~~~~~~

Match #2:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:3:9: note: "root" binds here
    3 | int c = a - ((int)a);
      |         ^~~~~~~~~~~~
2 matches.
```

If this should be documented or regression tested anywhere please let me
know where.
madhur13490 pushed a commit that referenced this pull request Aug 27, 2024
…104523)

Compilers and language runtimes often use helper functions that are
fundamentally uninteresting when debugging anything but the
compiler/runtime itself. This patch introduces a user-extensible
mechanism that allows for these frames to be hidden from backtraces and
automatically skipped over when navigating the stack with `up` and
`down`.

This does not affect the numbering of frames, so `f <N>` will still
provide access to the hidden frames. The `bt` output will also print a
hint that frames have been hidden.

My primary motivation for this feature is to hide thunks in the Swift
programming language, but I'm including an example recognizer for
`std::function::operator()` that I wished for myself many times while
debugging LLDB.

rdar://126629381


Example output. (Yes, my proof-of-concept recognizer could hide even
more frames if we had a method that returned the function name without
the return type or I used something that isn't based off regex, but it's
really only meant as an example).

before:
```
(lldb) thread backtrace --filtered=false
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12
    frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10
    frame llvm#5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12
    frame llvm#6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame llvm#7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame llvm#8: 0x0000000183cdf154 dyld`start + 2476
(lldb) 
```

after

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame llvm#6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame llvm#7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame llvm#8: 0x0000000183cdf154 dyld`start + 2476
Note: Some frames were hidden by frame recognizers
```
madhur13490 pushed a commit that referenced this pull request Aug 27, 2024
)

Currently, process of replacing bitwise operations consisting of
`LSR`/`LSL` with `And` is performed by `DAGCombiner`.

However, in certain cases, the `AND` generated by this process
can be removed.

Consider following case:
```
        lsr x8, x8, llvm#56
        and x8, x8, #0xfc
        ldr w0, [x2, x8]
        ret
```

In this case, we can remove the `AND` by changing the target of `LDR`
to `[X2, X8, LSL #2]` and right-shifting amount change to 56 to 58.

after changed:
```
        lsr x8, x8, llvm#58
        ldr w0, [x2, x8, lsl #2]
        ret
```

This patch checks to see if the `SHIFTING` + `AND` operation on load
target can be optimized and optimizes it if it can.
madhur13490 pushed a commit that referenced this pull request Aug 30, 2024
`JITDylibSearchOrderResolver` local variable can be destroyed before
completion of all callbacks. Capture it together with `Deps` in
`OnEmitted` callback.

Original error:

```
==2035==ERROR: AddressSanitizer: stack-use-after-return on address 0x7bebfa155b70 at pc 0x7ff2a9a88b4a bp 0x7bec08d51980 sp 0x7bec08d51978
READ of size 8 at 0x7bebfa155b70 thread T87 (tf_xla-cpu-llvm)
    #0 0x7ff2a9a88b49 in operator() llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:58
    #1 0x7ff2a9a88b49 in __invoke<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:149:25
    #2 0x7ff2a9a88b49 in __call<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:224:5
    #3 0x7ff2a9a88b49 in operator() libcxx/include/__functional/function.h:210:12
    #4 0x7ff2a9a88b49 in void std::__u::__function::__policy_invoker<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr,
```
madhur13490 pushed a commit that referenced this pull request Sep 3, 2024
Static destructor can race with calls to notify and trigger tsan
warning.

```
WARNING: ThreadSanitizer: data race (pid=5787)
  Write of size 1 at 0x55bec9df8de8 by thread T23:
    #0 pthread_mutex_destroy [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1344](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=1344&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b12affb) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #1 __libcpp_recursive_mutex_destroy [third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h:91](third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h?l=91&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d4e9) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #2 std::__tsan::recursive_mutex::~recursive_mutex() [third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp:52](third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp?l=52&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d4e9)
    #3 ~SmartMutex [third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h:28](third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h?l=28&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaedfe) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #4 (anonymous namespace)::PerfJITEventListener::~PerfJITEventListener() [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:65](third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=65&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaedfe)
    llvm#5 cxa_at_exit_callback_installed_at(void*) [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:437](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=437&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b172cb9) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    llvm#6 llvm::JITEventListener::createPerfJITEventListener() [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:496](third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=496&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcad8f5) (BuildId: ff25ace8b17d9863348bb1759c47246c)
```
```
Previous atomic read of size 1 at 0x55bec9df8de8 by thread T192 (mutexes: write M0, write M1):
    #0 pthread_mutex_unlock [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1387](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=1387&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b12b6bb) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #1 __libcpp_recursive_mutex_unlock [third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h:87](third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h?l=87&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d589) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #2 std::__tsan::recursive_mutex::unlock() [third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp:64](third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp?l=64&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d589)
    #3 unlock [third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h:47](third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h?l=47&cl=669089572):16 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968) (BuildId: ff25ace8b17d9863348bb1759c47246c)
    #4 ~lock_guard [third_party/crosstool/v18/stable/src/libcxx/include/__mutex/lock_guard.h:39](third_party/crosstool/v18/stable/src/libcxx/include/__mutex/lock_guard.h?l=39&cl=669089572):101 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968)
    llvm#5 (anonymous namespace)::PerfJITEventListener::notifyObjectLoaded(unsigned long, llvm::object::ObjectFile const&, llvm::RuntimeDyld::LoadedObjectInfo const&) [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:290](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=290&cl=669089572):1 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968)
    llvm#6 llvm::orc::RTDyldObjectLinkingLayer::onObjEmit(llvm::orc::MaterializationResponsibility&, llvm::object::OwningBinary<llvm::object::ObjectFile>, std::__tsan::unique_ptr<llvm::RuntimeDyld::MemoryManager, std::__tsan::default_delete<llvm::RuntimeDyld::MemoryManager>>, std::__tsan::unique_ptr<llvm::RuntimeDyld::LoadedObjectInfo, std::__tsan::default_delete<llvm::RuntimeDyld::LoadedObjectInfo>>, std::__tsan::unique_ptr<llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>>, std::__tsan::default_delete<llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>>>>, llvm::Error) [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:386](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp?l=386&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bc404a8) (BuildId: ff25ace8b17d9863348bb1759c47246c)
```
madhur13490 pushed a commit that referenced this pull request Sep 11, 2024
…llvm#94981)

This extends default argument deduction to cover class templates as
well, applying only to partial ordering, adding to the provisional
wording introduced in llvm#89807.

This solves some ambuguity introduced in P0522 regarding how template
template parameters are partially ordered, and should reduce the
negative impact of enabling `-frelaxed-template-template-args` by
default.

Given the following example:
```C++
template <class T1, class T2 = float> struct A;
template <class T3> struct B;

template <template <class T4> class TT1, class T5> struct B<TT1<T5>>;   // #1
template <class T6, class T7>                      struct B<A<T6, T7>>; // #2

template struct B<A<int>>;
```
Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This
patch restores the pre-P0522 behavior, `#2` is picked again.
madhur13490 pushed a commit that referenced this pull request Sep 18, 2024
When SPARC Asan testing is enabled by PR llvm#107405, many Linux/sparc64
tests just hang like
```
#0  0xf7ae8e90 in syscall () from /usr/lib32/libc.so.6
#1  0x701065e8 in __sanitizer::FutexWait(__sanitizer::atomic_uint32_t*, unsigned int) ()
    at compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp:766
#2  0x70107c90 in Wait ()
    at compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:35
#3  0x700f7cac in Lock ()
    at compiler-rt/lib/asan/../sanitizer_common/sanitizer_mutex.h:196
#4  Lock ()
    at compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_registry.h:98
llvm#5  LockThreads ()
    at compiler-rt/lib/asan/asan_thread.cpp:489
llvm#6  0x700e9c8c in __asan::BeforeFork() ()
    at compiler-rt/lib/asan/asan_posix.cpp:157
llvm#7  0xf7ac83f4 in ?? () from /usr/lib32/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
```
It turns out that this happens in tests using `internal_fork` (e.g.
invoking `llvm-symbolizer`): unlike most other Linux targets, which use
`clone`, Linux/sparc64 has to use `__fork` instead. While `clone`
doesn't trigger `pthread_atfork` handlers, `__fork` obviously does,
causing the hang.

To avoid this, this patch disables `InstallAtForkHandler` and lets the
ASan tests run to completion.

Tested on `sparc64-unknown-linux-gnu`.
madhur13490 pushed a commit that referenced this pull request Nov 5, 2024
…ates explicitly specialized for an implicitly instantiated class template specialization (llvm#113464)

Consider the following:
```
template<typename T>
struct A {
  template<typename U>
  struct B {
    static constexpr int x = 0; // #1
  };

  template<typename U>
  struct B<U*> {
    static constexpr int x = 1; // #2
  };
};

template<>
template<typename U>
struct A<long>::B {
  static constexpr int x = 2; // #3
};

static_assert(A<short>::B<int>::y == 0); // uses #1
static_assert(A<short>::B<int*>::y == 1); // uses #2

static_assert(A<long>::B<int>::y == 2); // uses #3
static_assert(A<long>::B<int*>::y == 2); // uses #3
```

According to [temp.spec.partial.member] p2:
> If the primary member template is explicitly specialized for a given
(implicit) specialization of the enclosing class template, the partial
specializations of the member template are ignored for this
specialization of the enclosing class template.
If a partial specialization of the member template is explicitly
specialized for a given (implicit) specialization of the enclosing class
template, the primary member template and its other partial
specializations are still considered for this specialization of the
enclosing class template.

The example above fails to compile because we currently don't implement
[temp.spec.partial.member] p2. This patch implements the wording, fixing llvm#51051.
madhur13490 pushed a commit that referenced this pull request Nov 14, 2024
… depobj construct (llvm#114221)

A codegen crash is occurring when a depend object was initialized with
omp_all_memory in the depobj directive.
llvm#114214
The root cause of issue looks to be the improper handling of the
dependency list when omp_all_memory was specified.

The change introduces the use of OMPTaskDataTy to manage dependencies.
The buildDependences function is called to construct the dependency
list, and the list is iterated over to emit and store the dependencies.

Reduced Test Case : 
```
#include <omp.h>

int main()

{ omp_depend_t obj; #pragma omp depobj(obj) depend(inout: omp_all_memory) }
```

```
 #1 0x0000000003de6623 SignalHandler(int) Signals.cpp:0:0
 #2 0x00007f8e4a6b990f (/lib64/libpthread.so.0+0x1690f)
 #3 0x00007f8e4a117d2a raise (/lib64/libc.so.6+0x4ad2a)
 #4 0x00007f8e4a1193e4 abort (/lib64/libc.so.6+0x4c3e4)
 llvm#5 0x00007f8e4a10fc69 __assert_fail_base (/lib64/libc.so.6+0x42c69)
 llvm#6 0x00007f8e4a10fcf1 __assert_fail (/lib64/libc.so.6+0x42cf1)
 llvm#7 0x0000000004114367 clang::CodeGen::CodeGenFunction::EmitOMPDepobjDirective(clang::OMPDepobjDirective const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4114367)
 llvm#8 0x00000000040f8fac clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x40f8fac)
 llvm#9 0x00000000040ff4fb clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x40ff4fb)
llvm#10 0x00000000041847b2 clang::CodeGen::CodeGenFunction::EmitFunctionBody(clang::Stmt const*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41847b2)
llvm#11 0x0000000004199e4a clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4199e4a)
llvm#12 0x00000000041f7b9d clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41f7b9d)
llvm#13 0x00000000041f16a3 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41f16a3)
llvm#14 0x00000000041fd954 clang::CodeGen::CodeGenModule::EmitDeferred() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41fd954)
llvm#15 0x0000000004200277 clang::CodeGen::CodeGenModule::Release() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4200277)
llvm#16 0x00000000046b6a49 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) ModuleBuilder.cpp:0:0
llvm#17 0x00000000046b4cb6 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x46b4cb6)
llvm#18 0x0000000006204d5c clang::ParseAST(clang::Sema&, bool, bool) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x6204d5c)
llvm#19 0x000000000496b278 clang::FrontendAction::Execute() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x496b278)
llvm#20 0x00000000048dd074 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x48dd074)
llvm#21 0x0000000004a38092 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4a38092)
llvm#22 0x0000000000fd4e9c cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xfd4e9c)
llvm#23 0x0000000000fcca73 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
llvm#24 0x0000000000fd140c clang_main(int, char**, llvm::ToolContext const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xfd140c)
llvm#25 0x0000000000ee2ef3 main (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xee2ef3)
llvm#26 0x00007f8e4a10224c __libc_start_main (/lib64/libc.so.6+0x3524c)
llvm#27 0x0000000000fcaae9 _start /home/abuild/rpmbuild/BUILD/glibc-2.31/csu/../sysdeps/x86_64/start.S:120:0
clang: error: unable to execute command: Aborted
```

---------

Co-authored-by: Chandra Ghale <[email protected]>
madhur13490 pushed a commit that referenced this pull request Nov 19, 2024
…onger cause a crash (llvm#116569)

This PR fixes a bug introduced by llvm#110199, which causes any half float
argument to crash the compiler on MIPS64.

Currently compiling this bit of code with `llc -mtriple=mips64`: 
```
define void @half_args(half %a) nounwind {
entry:
        ret void
}
```

Crashes with the following log:
```
LLVM ERROR: unable to allocate function argument #0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: llc -mtriple=mips64
1.	Running pass 'Function Pass Manager' on module '<stdin>'.
2.	Running pass 'MIPS DAG->DAG Pattern Instruction Selection' on function '@half_args'
 #0 0x000055a3a4013df8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x32d0df8)
 #1 0x000055a3a401199e llvm::sys::RunSignalHandlers() (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x32ce99e)
 #2 0x000055a3a40144a8 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f00bde558c0 __restore_rt libc_sigaction.c:0:0
 #4 0x00007f00bdea462c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 llvm#5 0x00007f00bde55822 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 llvm#6 0x00007f00bde3e4af abort ./stdlib/abort.c:81:7
 llvm#7 0x000055a3a3f80e3c llvm::report_fatal_error(llvm::Twine const&, bool) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x323de3c)
 llvm#8 0x000055a3a2e20dfa (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x20dddfa)
 llvm#9 0x000055a3a2a34e20 llvm::MipsTargetLowering::LowerFormalArguments(llvm::SDValue, unsigned int, bool, llvm::SmallVectorImpl<llvm::ISD::InputArg> const&, llvm::SDLoc const&, llvm::SelectionDAG&, llvm::SmallVectorImpl<llvm::SDValue>&) const MipsISelLowering.cpp:0:0
llvm#10 0x000055a3a3d896a9 llvm::SelectionDAGISel::LowerArguments(llvm::Function const&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30466a9)
llvm#11 0x000055a3a3e0b3ec llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c83ec)
llvm#12 0x000055a3a3e09e21 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c6e21)
llvm#13 0x000055a3a2aae1ca llvm::MipsDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) MipsISelDAGToDAG.cpp:0:0
llvm#14 0x000055a3a3e07706 llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c4706)
llvm#15 0x000055a3a3051ed6 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x230eed6)
llvm#16 0x000055a3a35a3ec9 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x2860ec9)
llvm#17 0x000055a3a35ac3b2 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x28693b2)
llvm#18 0x000055a3a35a499c llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x286199c)
llvm#19 0x000055a3a262abbb main (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x18e7bbb)
llvm#20 0x00007f00bde3fc4c __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
llvm#21 0x00007f00bde3fd05 call_init ./csu/../csu/libc-start.c:128:20
llvm#22 0x00007f00bde3fd05 __libc_start_main@GLIBC_2.2.5 ./csu/../csu/libc-start.c:347:5
llvm#23 0x000055a3a2624921 _start /builddir/glibc-2.39/csu/../sysdeps/x86_64/start.S:117:0
```

This is caused by the fact that after the change, `f16`s are no longer
lowered as `f32`s in calls.

Two possible fixes are available:
- Update calling conventions to properly support passing `f16` as
integers.
- Update `useFPRegsForHalfType()` to return `true` so that `f16` are
still kept in `f32` registers, as before llvm#110199.

This PR implements the first solution to not introduce any more ABI
changes as llvm#110199 already did.

As of what is the correct ABI for halfs, I don't think there is a
correct answer. GCC doesn't support halfs on MIPS, and I couldn't find
any information on old MIPS ABI manuals either.
madhur13490 pushed a commit that referenced this pull request Nov 26, 2024
…#116656)

The main issue to solve is that OpenMP modifiers can be specified in any
order, so the parser cannot expect any specific modifier at a given
position. To solve that, define modifier to be a union of all allowable
specific modifiers for a given clause.

Additionally, implement modifier descriptors: for each modifier the
corresponding descriptor contains a set of properties of the modifier
that allow a common set of semantic checks. Start with the syntactic
properties defined in the spec: Required, Unique, Exclusive, Ultimate,
and implement common checks to verify each of them.

OpenMP modifier overhaul: #2/3
madhur13490 pushed a commit that referenced this pull request Nov 29, 2024
…plementation (llvm#108413. llvm#117704) (llvm#117894)

Relands llvm#117704, which relanded changes from llvm#108413 - this was reverted
due to build issues. The new offload library did not build with
`LIBOMPTARGET_OMPT_SUPPORT` enabled, which was not picked up by
pre-merge testing.

The last commit contains the fix; everything else is otherwise identical
to the approved PR.
___

### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
madhur13490 pushed a commit that referenced this pull request Nov 29, 2024
…abort (llvm#117603)

Hey guys, I found that Flang's built-in ABORT function is incomplete
when I was using it. Compared with gfortran's ABORT (which can both
abort and print out a backtrace), flang's ABORT implementation lacks the
function of printing out a backtrace. This feature is essential for
debugging and understanding the call stack at the failure point.

To solve this problem, I completed the "// TODO:" of the abort function,
and then implemented an additional built-in function BACKTRACE for
flang. After a brief reading of the relevant source code, I used
backtrace and backtrace_symbols in "execinfo.h" to quickly implement
this. But since I used the above two functions directly, my
implementation is slightly different from gfortran's implementation (in
the output, the function call stack before main is additionally output,
and the function line number is missing). In addition, since I used the
above two functions, I did not need to add -g to embed debug information
into the ELF file, but needed -rdynamic to ensure that the symbols are
added to the dynamic symbol table (so that the function name will be
printed out).

Here is a comparison of the output between gfortran 's backtrace and my
implementation:
gfortran's implemention output:
```
#0  0x557eb71f4184 in testfun2_
        at /home/hunter/plct/fortran/test.f90:5
#1  0x557eb71f4165 in testfun1_
        at /home/hunter/plct/fortran/test.f90:13
#2  0x557eb71f4192 in test_backtrace
        at /home/hunter/plct/fortran/test.f90:17
#3  0x557eb71f41ce in main
        at /home/hunter/plct/fortran/test.f90:18
```
my impelmention output:
```
Backtrace:
#0 ./test(_FortranABacktrace+0x32) [0x574f07efcf92]
#1 ./test(testfun2_+0x14) [0x574f07efc7b4]
#2 ./test(testfun1_+0xd) [0x574f07efc7cd]
#3 ./test(_QQmain+0x9) [0x574f07efc7e9]
#4 ./test(main+0x12) [0x574f07efc802]
llvm#5 /usr/lib/libc.so.6(+0x25e08) [0x76954694fe08]
llvm#6 /usr/lib/libc.so.6(__libc_start_main+0x8c) [0x76954694fecc]
llvm#7 ./test(_start+0x25) [0x574f07efc6c5]
```
test program is:
```
function testfun2() result(err)
  implicit none
  integer :: err
  err = 1
  call backtrace
end function testfun2

subroutine testfun1()
  implicit none
  integer :: err
  integer :: testfun2

  err = testfun2()
end subroutine testfun1

program test_backtrace
  call testfun1()
end program test_backtrace
```
I am well aware of the importance of line numbers, so I am now working
on implementing line numbers (by parsing DWARF information) and
supporting cross-platform (Windows) support.
madhur13490 pushed a commit that referenced this pull request Nov 29, 2024
madhur13490 pushed a commit that referenced this pull request Dec 3, 2024
…ne symbol size as symbols are created (llvm#117079)"

This reverts commit ba668eb.

Below test started failing again on x86_64 macOS CI. We're unsure
if this patch is the exact cause, but since this patch has broken
this test before, we speculatively revert it to see if it was indeed
the root cause.
```
FAIL: lldb-shell :: Unwind/trap_frame_sym_ctx.test (1692 of 2162)
******************** TEST 'lldb-shell :: Unwind/trap_frame_sym_ctx.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 7: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
clang: warning: argument unused during compilation: '-fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell' [-Wunused-command-line-argument]
RUN: at line 8: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit | /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input
         ^
<stdin>:26:64: note: scanning from here
 frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
                                                               ^
<stdin>:27:2: note: possible intended match here
 frame #2: 0x00007ff7bfeff6c0
 ^

Input file: <stdin>
Check file: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
           21:  0x100003ed1 <+0>: pushq %rbp
           22:  0x100003ed2 <+1>: movq %rsp, %rbp
           23: (lldb) thread backtrace -u
           24: * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
           25:  * frame #0: 0x0000000100003ecc trap_frame_sym_ctx.test.tmp`bar
           26:  frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
check:21'0                                                                    X error: no match found
           27:  frame #2: 0x00007ff7bfeff6c0
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:21'1      ?                             possible intended match
           28:  frame #3: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           29:  frame #4: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           30:  frame llvm#5: 0x00007ff8193cc41f dyld`start + 1903
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           31: (lldb) exit
check:21'0     ~~~~~~~~~~~~
>>>>>>
```
madhur13490 pushed a commit that referenced this pull request Dec 12, 2024
## Description

This PR fixes a segmentation fault that occurs when passing options
requiring arguments via `-Xopenmp-target=<triple>`. The issue was that
the function `Driver::getOffloadArchs` did not properly parse the
extracted option, but instead assumed it was valid, leading to a crash
when incomplete arguments were provided.

## Backtrace

```sh
llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o 
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o
1.      Compilation construction
2.      Building compilation actions
 #0 0x0000562fb21c363b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (llvm-project/build/bin/clang+++0x392f63b)
 #1 0x0000562fb21c0e3c SignalHandler(int) Signals.cpp:0:0
 #2 0x00007fcbf6c81420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #3 0x0000562fb1fa5d70 llvm::opt::Option::matches(llvm::opt::OptSpecifier) const (llvm-project/build/bin/clang+++0x3711d70)
 #4 0x0000562fb2a78e7d clang::driver::Driver::getOffloadArchs(clang::driver::Compilation&, llvm::opt::DerivedArgList const&, clang::driver::Action::OffloadKind, clang::driver::ToolChain const*, bool) const (llvm-project/build/bin/clang+++0x41e4e7d)
 llvm#5 0x0000562fb2a7a9aa clang::driver::Driver::BuildOffloadingActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, std::pair<clang::driver::types::ID, llvm::opt::Arg const*> const&, clang::driver::Action*) const (.part.1164) Driver.cpp:0:0
 llvm#6 0x0000562fb2a7c093 clang::driver::Driver::BuildActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, llvm::SmallVector<std::pair<clang::driver::types::ID, llvm::opt::Arg const*>, 16u> const&, llvm::SmallVector<clang::driver::Action*, 3u>&) const (llvm-project/build/bin/clang+++0x41e8093)
 llvm#7 0x0000562fb2a8395d clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) (llvm-project/build/bin/clang+++0x41ef95d)
 llvm#8 0x0000562faf92684c clang_main(int, char**, llvm::ToolContext const&) (llvm-project/build/bin/clang+++0x109284c)
 llvm#9 0x0000562faf826cc6 main (llvm-project/build/bin/clang+++0xf92cc6)
llvm#10 0x00007fcbf6699083 __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/../csu/libc-start.c:342:3
llvm#11 0x0000562faf923a5e _start (llvm-project/build/bin/clang+++0x108fa5e)
[1]    2628042 segmentation fault (core dumped)   main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu  -o
```
madhur13490 pushed a commit that referenced this pull request Dec 12, 2024
llvm#118923)

…d reentry.

These utilities provide new, more generic and easier to use support for
lazy compilation in ORC.

LazyReexportsManager is an alternative to LazyCallThroughManager. It
takes requests for lazy re-entry points in the form of an alias map:
lazy-reexports = {
  ( <entry point symbol #1>, <implementation symbol #1> ),
  ( <entry point symbol #2>, <implementation symbol #2> ),
  ...
  ( <entry point symbol #n>, <implementation symbol #n> )
}

LazyReexportsManager then:
1. binds the entry points to the implementation names in an internal
table.
2. creates a JIT re-entry trampoline for each entry point.
3. creates a redirectable symbol for each of the entry point name and
binds redirectable symbol to the corresponding reentry trampoline.

When an entry point symbol is first called at runtime (which may be on
any thread of the JIT'd program) it will re-enter the JIT via the
trampoline and trigger a lookup for the implementation symbol stored in
LazyReexportsManager's internal table. When the lookup completes the
entry point symbol will be updated (via the RedirectableSymbolManager)
to point at the implementation symbol, and execution will proceed to the
implementation symbol.

Actual construction of the re-entry trampolines and redirectable symbols
is delegated to an EmitTrampolines functor and the
RedirectableSymbolsManager respectively.

JITLinkReentryTrampolines.h provides a JITLink-based implementation of
the EmitTrampolines functor. (AArch64 only in this patch, but other
architectures will be added in the near future).

Register state save and reentry functionality is added to the ORC
runtime in the __orc_rt_sysv_resolve and __orc_rt_resolve_implementation
functions (the latter is generic, the former will need custom
implementations for each ABI and architecture to be supported, however
this should be much less effort than the existing OrcABISupport
approach, since the ORC runtime allows this code to be written as native
assembly).

The resulting system:
1. Works equally well for in-process and out-of-process JIT'd code.
2. Requires less boilerplate to set up.

Given an ObjectLinkingLayer and PlatformJD (JITDylib containing the ORC
runtime), setup is just:

```c++
auto RSMgr = JITLinkRedirectableSymbolManager::Create(OLL);
if (!RSMgr)
  return RSMgr.takeError();

auto LRMgr = createJITLinkLazyReexportsManager(OLL, **RSMgr, PlatformJD);
if (!LRMgr)
  return LRMgr.takeError();
```

after which lazy reexports can be introduced with:

```c++
JD.define(lazyReexports(LRMgr, <alias map>));
```

LazyObectLinkingLayer is updated to use this new method, but the LLVM-IR
level CompileOnDemandLayer will continue to use LazyCallThroughManager
and OrcABISupport until the new system supports a wider range of
architectures and ABIs.

The llvm-jitlink utility's -lazy option now uses the new scheme. Since
it depends on the ORC runtime, the lazy-link.ll testcase and associated
helpers are moved to the ORC runtime.
madhur13490 pushed a commit that referenced this pull request Dec 12, 2024
The Clang binary (and any binary linking Clang as a library), when built
using PIE, ends up with a pretty shocking number of dynamic relocations
to apply to the executable image: roughly 400k.

Each of these takes up binary space in the executable, and perhaps most
interestingly takes start-up time to apply the relocations.

The largest pattern I identified were the strings used to describe
target builtins. The addresses of these string literals were stored into
huge arrays, each one requiring a dynamic relocation. The way to avoid
this is to design the target builtins to use a single large table of
strings and offsets within the table for the individual strings. This
switches the builtin management to such a scheme.

This saves over 100k dynamic relocations by my measurement, an over 25%
reduction. Just looking at byte size improvements, using the `bloaty`
tool to compare a newly built `clang` binary to an old one:

```
    FILE SIZE        VM SIZE
 --------------  --------------
  +1.4%  +653Ki  +1.4%  +653Ki    .rodata
  +0.0%    +960  +0.0%    +960    .text
  +0.0%    +197  +0.0%    +197    .dynstr
  +0.0%    +184  +0.0%    +184    .eh_frame
  +0.0%     +96  +0.0%     +96    .dynsym
  +0.0%     +40  +0.0%     +40    .eh_frame_hdr
  +114%     +32  [ = ]       0    [Unmapped]
  +0.0%     +20  +0.0%     +20    .gnu.hash
  +0.0%      +8  +0.0%      +8    .gnu.version
  +0.9%      +7  +0.9%      +7    [LOAD #2 [R]]
  [ = ]       0 -75.4% -3.00Ki    .relro_padding
 -16.1%  -802Ki -16.1%  -802Ki    .data.rel.ro
 -27.3% -2.52Mi -27.3% -2.52Mi    .rela.dyn
  -1.6% -2.66Mi  -1.6% -2.66Mi    TOTAL
```

We get a 16% reduction in the `.data.rel.ro` section, and nearly 30%
reduction in `.rela.dyn` where those reloctaions are stored.

This is also visible in my benchmarking of binary start-up overhead at
least:

```
Benchmark 1: ./old_clang --version
  Time (mean ± σ):      17.6 ms ±   1.5 ms    [User: 4.1 ms, System: 13.3 ms]
  Range (min … max):    14.2 ms …  22.8 ms    162 runs

Benchmark 2: ./new_clang --version
  Time (mean ± σ):      15.5 ms ±   1.4 ms    [User: 3.6 ms, System: 11.8 ms]
  Range (min … max):    12.4 ms …  20.3 ms    216 runs

Summary
  './new_clang --version' ran
    1.13 ± 0.14 times faster than './old_clang --version'
```

We get about 2ms faster `--version` runs. While there is a lot of noise
in binary execution time, this delta is pretty consistent, and
represents over 10% improvement. This is particularly interesting to me
because for very short source files, repeatedly starting the `clang`
binary is actually the dominant cost. For example, `configure` scripts
running against the `clang` compiler are slow in large part because of
binary start up time, not the time to process the actual inputs to the
compiler.

----

This PR implements the string tables using `constexpr` code and the
existing macro system. I understand that the builtins are moving towards
a TableGen model, and if complete that would provide more options for
modeling this. Unfortunately, that migration isn't complete, and even
the parts that are migrated still rely on the ability to break out of
the TableGen model and directly expand an X-macro style `BUILTIN(...)`
textually. I looked at trying to complete the move to TableGen, but it
would both require the difficult migration of the remaining targets, and
solving some tricky problems with how to move away from any macro-based
expansion.

I was also able to find a reasonably clean and effective way of doing
this with the existing macros and some `constexpr` code that I think is
clean enough to be a pretty good intermediate state, and maybe give a
good target for the eventual TableGen solution. I was also able to
factor the macros into set of consistent patterns that avoids a
significant regression in overall boilerplate.
madhur13490 pushed a commit that referenced this pull request Jan 22, 2025
llvm#123877)

Reverts llvm#122811 due to buildbot breakage e.g.,
https://lab.llvm.org/buildbot/#/builders/52/builds/5421/steps/11/logs/stdio

ASan output from local re-run:
```
==2780289==ERROR: AddressSanitizer: use-after-poison on address 0x7e0b87e28d28 at pc 0x55a979a99e7e bp 0x7ffe4b18f0b0 sp 0x7ffe4b18f0a8
READ of size 1 at 0x7e0b87e28d28 thread T0
    #0 0x55a979a99e7d in getStorageClass /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:344
    #1 0x55a979a99e7d in isSectionDefinition /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:429:9
    #2 0x55a979a99e7d in getSymbols /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:54:42
    #3 0x55a979a99e7d in lld::coff::writeLLDMapFile(lld::coff::COFFLinkerContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:103:40
    #4 0x55a979a16879 in (anonymous namespace)::Writer::run() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:810:3
    llvm#5 0x55a979a00aac in lld::coff::writeResult(lld::coff::COFFLinkerContext&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:354:15
    llvm#6 0x55a97985f7ed in lld::coff::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:2826:3
    llvm#7 0x55a97984cdd3 in lld::coff::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:97:15
    llvm#8 0x55a9797f9793 in lld::unsafeLldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:163:12
    llvm#9 0x55a9797fa3b6 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:188:15
    llvm#10 0x55a9797fa3b6 in void llvm::function_ref<void ()>::callback_fn<lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>)::$_0>(long) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
    llvm#11 0x55a97966cb93 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:69:12
    llvm#12 0x55a97966cb93 in llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:426:3
    llvm#13 0x55a9797f9dc3 in lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:187:14
    llvm#14 0x55a979627512 in lld_main(int, char**, llvm::ToolContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/tools/lld/lld.cpp:103:14
    llvm#15 0x55a979628731 in main /usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/tools/lld/tools/lld/lld-driver.cpp:17:10
    llvm#16 0x7ffb8b202c89 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    llvm#17 0x7ffb8b202d44 in __libc_start_main csu/../csu/libc-start.c:360:3
    llvm#18 0x55a97953ef60 in _start (/usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/bin/lld+0x8fd1f60)
```
madhur13490 pushed a commit that referenced this pull request Mar 7, 2025
When compiling VLS SVE, the compiler often replaces VL-based offsets
with immediate-based ones. This leads to a mismatch in the allowed
addressing modes due to SVE loads/stores generally expecting immediate
offsets relative to VL. For example, given:
```c

svfloat64_t foo(const double *x) {
  svbool_t pg = svptrue_b64();
  return svld1_f64(pg, x+svcntd());
}
```

When compiled with `-msve-vector-bits=128`, we currently generate:
```gas
foo:
        ptrue   p0.d
        mov     x8, #2
        ld1d    { z0.d }, p0/z, [x0, x8, lsl #3]
        ret
```

Instead, we could be generating:
```gas
foo:
        ldr     z0, [x0, #1, mul vl]
        ret
```

Likewise for other types, stores, and other VLS lengths.

This patch achieves the above by extending `SelectAddrModeIndexedSVE`
to let constants through when `vscale` is known.
madhur13490 pushed a commit that referenced this pull request Mar 18, 2025
…1027)

No codegen regression on either target. The two builtin_ffs implied on
nvptx CSE away.

```
define internal i64 @__gpu_read_first_lane_u64(i64 noundef %__lane_mask, i64 noundef %__x) #2 {
entry:
  %shr = lshr i64 %__x, 32
  %conv = trunc nuw i64 %shr to i32
  %conv1 = trunc i64 %__x to i32
  %conv2 = trunc i64 %__lane_mask to i32
  %0 = tail call range(i32 0, 33) i32 @llvm.cttz.i32(i32 %conv2, i1 true)
  %iszero = icmp eq i32 %conv2, 0
  %sub = select i1 %iszero, i32 -1, i32 %0
  %1 = tail call i32 @llvm.nvvm.shfl.sync.idx.i32(i32 %conv2, i32 %conv, i32 %sub, i32 31)
  %conv4 = sext i32 %1 to i64
  %shl = shl nsw i64 %conv4, 32
  %2 = tail call i32 @llvm.nvvm.shfl.sync.idx.i32(i32 %conv2, i32 %conv1, i32 %sub, i32 31)
  %conv7 = zext i32 %2 to i64
  %or = or disjoint i64 %shl, %conv7
  ret i64 %or
}
; becomes

define internal i64 @__gpu_competing_read_first_lane_u64(i64 noundef %__lane_mask, i64 noundef %__x) #2 {
entry:
  %shr = lshr i64 %__x, 32
  %conv = trunc nuw i64 %shr to i32
  %conv1 = trunc i64 %__x to i32
  %conv.i = trunc i64 %__lane_mask to i32
  %0 = tail call range(i32 0, 33) i32 @llvm.cttz.i32(i32 %conv.i, i1 true)
  %iszero = icmp eq i32 %conv.i, 0
  %sub.i = select i1 %iszero, i32 -1, i32 %0
  %1 = tail call i32 @llvm.nvvm.shfl.sync.idx.i32(i32 %conv.i, i32 %conv, i32 %sub.i, i32 31)
  %conv4 = zext i32 %1 to i64
  %shl = shl nuw i64 %conv4, 32
  %2 = tail call i32 @llvm.nvvm.shfl.sync.idx.i32(i32 %conv.i, i32 %conv1, i32 %sub.i, i32 31)
  %conv7 = zext i32 %2 to i64
  %or = or disjoint i64 %shl, %conv7
  ret i64 %or
}
```

The sext vs zext difference is vaguely interesting but since the bits
are immediately discarded in either case it make no odds. The amdgcn one
doesn't need CSE, the readfirstlane function is a single call to an
intrinsic.

Drive by fix to __gpu_match_all_u32, it was calling first_lane_u64 and
could use first_lane_u32 instead. Added the missing call to gpuintrin.c
test case and a stray missing static as well.
madhur13490 pushed a commit that referenced this pull request Apr 4, 2025
…too. (llvm#132267)

Observed in Wine when trying to intercept `ExitThread`, which forwards
to `ntdll.RtlExitUserThread`.

`gdb` interprets it as `xchg %ax,%ax`.
`llvm-mc` outputs simply `nop`.

```
==Asan-i386-calls-Dynamic-Test.exe==964==interception_win: unhandled instruction at 0x7be27cf0: 66 90 55 89 e5 56 50 8b
```

```
Wine-gdb> bt
#0  0x789a1766 in __interception::GetInstructionSize (address=<optimized out>, rel_offset=<optimized out>) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:983
#1  0x789ab480 in __sanitizer::SharedPrintfCode(bool, char const*, char*) () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp:311
#2  0x789a18e7 in __interception::OverrideFunctionWithHotPatch (old_func=2078440688, new_func=2023702608, orig_old_func=warning: (Internal error: pc 0x792f1a2c in read in CU, but not in symtab.)warning: (Error: pc 0x792f1a2c in address map, but not in symtab.)0x792f1a2c) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:1118
#3  0x789a1f34 in __interception::OverrideFunction (old_func=2078440688, new_func=2023702608, orig_old_func=warning: (Internal error: pc 0x792f1a2c in read in CU, but not in symtab.)warning: (Error: pc 0x792f1a2c in address map, but not in symtab.)0x792f1a2c) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:1224
#4  0x789a24ce in __interception::OverrideFunction (func_name=0x78a0bc43 <vtable for __asan::AsanThreadContext+1163> "ExitThread", new_func=2023702608, orig_old_func=warning: (Internal error: pc 0x792f1a2c in read in CU, but not in symtab.)warning: (Error: pc 0x792f1a2c in address map, but not in symtab.)0x792f1a2c)    at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:1369
llvm#5  0x789f40ef in __asan::InitializePlatformInterceptors () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_win.cpp:190
llvm#6  0x789e0c3c in __asan::InitializeAsanInterceptors () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:802
llvm#7  0x789ee6b5 in __asan::AsanInitInternal () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:442
llvm#8  0x789eefb0 in __asan::AsanInitFromRtl () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:522
llvm#9  __asan::AsanInitializer::AsanInitializer (this=<optimized out>) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:542
llvm#10 __cxx_global_var_init () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:546
...
Wine-gdb> disassemble /r 2078440688,2078440688+20
Dump of assembler code from 0x7be27cf0 to 0x7be27d04:
   0x7be27cf0 <_RtlExitUserThread@4+0>: 66 90                   xchg   %ax,%ax
...
```
madhur13490 pushed a commit that referenced this pull request Apr 4, 2025
…#134130)

This should fix failures caused by
llvm#133967
Attn: @sarnex
Thanks

Signed-off-by: Arvind Sudarsanam <[email protected]>
madhur13490 pushed a commit that referenced this pull request Apr 9, 2025
…d A520 (llvm#132246)

Inefficient SVE codegen occurs on at least two in-order cores,
those being Cortex-A510 and Cortex-A520. For example a simple vector
add

```
void foo(float a, float b, float dst, unsigned n) {
    for (unsigned i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Vectorizes the inner loop into the following interleaved sequence
of instructions.

```
        add     x12, x1, x10
        ld1b    { z0.b }, p0/z, [x1, x10]
        add     x13, x2, x10
        ld1b    { z1.b }, p0/z, [x2, x10]
        ldr     z2, [x12, #1, mul vl]
        ldr     z3, [x13, #1, mul vl]
        dech    x11
        add     x12, x0, x10
        fadd    z0.s, z1.s, z0.s
        fadd    z1.s, z3.s, z2.s
        st1b    { z0.b }, p0, [x0, x10]
        addvl   x10, x10, #2
        str     z1, [x12, #1, mul vl]
```

By adjusting the target features to prefer fixed over scalable if the
cost is equal we get the following vectorized loop.

```
         ldp q0, q3, [x11, #-16]
         subs    x13, x13, llvm#8
         ldp q1, q2, [x10, #-16]
         add x10, x10, llvm#32
         add x11, x11, llvm#32
         fadd    v0.4s, v1.4s, v0.4s
         fadd    v1.4s, v2.4s, v3.4s
         stp q0, q1, [x12, #-16]
         add x12, x12, llvm#32
```

Which is more efficient.
madhur13490 pushed a commit that referenced this pull request Apr 9, 2025
… A510/A520 (llvm#134606)

Recommit. This work was done by llvm#132246 but failed buildbots due to the
test introduced needing updates

Inefficient SVE codegen occurs on at least two in-order cores, those
being Cortex-A510 and Cortex-A520. For example a simple vector add

```
void foo(float a, float b, float dst, unsigned n) {
    for (unsigned i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Vectorizes the inner loop into the following interleaved sequence of
instructions.

```
        add     x12, x1, x10
        ld1b    { z0.b }, p0/z, [x1, x10]
        add     x13, x2, x10
        ld1b    { z1.b }, p0/z, [x2, x10]
        ldr     z2, [x12, #1, mul vl]
        ldr     z3, [x13, #1, mul vl]
        dech    x11
        add     x12, x0, x10
        fadd    z0.s, z1.s, z0.s
        fadd    z1.s, z3.s, z2.s
        st1b    { z0.b }, p0, [x0, x10]
        addvl   x10, x10, #2
        str     z1, [x12, #1, mul vl]
```

By adjusting the target features to prefer fixed over scalable if the
cost is equal we get the following vectorized loop.

```
         ldp q0, q3, [x11, #-16]
         subs    x13, x13, llvm#8
         ldp q1, q2, [x10, #-16]
         add x10, x10, llvm#32
         add x11, x11, llvm#32
         fadd    v0.4s, v1.4s, v0.4s
         fadd    v1.4s, v2.4s, v3.4s
         stp q0, q1, [x12, #-16]
         add x12, x12, llvm#32
```

Which is more efficient.
madhur13490 pushed a commit that referenced this pull request Apr 11, 2025
…ctor-bits=128." (llvm#134997)

Reverts llvm#134068

Caused a stage 2 build failure:
https://lab.llvm.org/buildbot/#/builders/41/builds/6016

```
FAILED: lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o 
/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/include -mcpu=neoverse-512tvb -mllvm -scalable-vectorization=preferred -mllvm -treat-scalable-fixed-error-as-warning=false -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror=global-constructors -O3 -DNDEBUG -std=c++17 -UNDEBUG  -fno-exceptions -funwind-tables -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -c /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support/Caching.cpp
Opcode has unknown scale!
UNREACHABLE executed at ../llvm/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:4530!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/include -mcpu=neoverse-512tvb -mllvm -scalable-vectorization=preferred -mllvm -treat-scalable-fixed-error-as-warning=false -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror=global-constructors -O3 -DNDEBUG -std=c++17 -UNDEBUG -fno-exceptions -funwind-tables -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -c /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support/Caching.cpp
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module '/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support/Caching.cpp'.
4.	Running pass 'AArch64 load / store optimization pass' on function '@"_ZNSt17_Function_handlerIFN4llvm8ExpectedISt8functionIFNS1_ISt10unique_ptrINS0_16CachedFileStreamESt14default_deleteIS4_EEEEjRKNS0_5TwineEEEEEjNS0_9StringRefESB_EZNS0_10localCacheESB_SB_SB_S2_IFvjSB_S3_INS0_12MemoryBufferES5_ISH_EEEEE3$_0E9_M_invokeERKSt9_Any_dataOjOSF_SB_"'
 #0 0x0000b6eae9b67bf0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x81c7bf0)
 #1 0x0000b6eae9b65aec llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x81c5aec)
 #2 0x0000b6eae9acd5f4 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x0000f16c1aff28f8 (linux-vdso.so.1+0x8f8)
 #4 0x0000f16c1aacf1f0 __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 llvm#5 0x0000f16c1aa8a67c gsignal ./signal/../sysdeps/posix/raise.c:27:6
 llvm#6 0x0000f16c1aa77130 abort ./stdlib/abort.c:81:7
 llvm#7 0x0000b6eae9ad6628 (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x8136628)
 llvm#8 0x0000b6eae72e95a8 (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x59495a8)
 llvm#9 0x0000b6eae74ca9a8 (anonymous namespace)::AArch64LoadStoreOpt::findMatchingInsn(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, (anonymous namespace)::LdStPairFlags&, unsigned int, bool) AArch64LoadStoreOptimizer.cpp:0:0
llvm#10 0x0000b6eae74c85a8 (anonymous namespace)::AArch64LoadStoreOpt::tryToPairLdStInst(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&) AArch64LoadStoreOptimizer.cpp:0:0
llvm#11 0x0000b6eae74c624c (anonymous namespace)::AArch64LoadStoreOpt::optimizeBlock(llvm::MachineBasicBlock&, bool) AArch64LoadStoreOptimizer.cpp:0:0
llvm#12 0x0000b6eae74c429c (anonymous namespace)::AArch64LoadStoreOpt::runOnMachineFunction(llvm::MachineFunction&) AArch64LoadStoreOptimizer.cpp:0:0
```
madhur13490 pushed a commit that referenced this pull request Apr 21, 2025
…vailable (llvm#135343)

When a frame is inlined, LLDB will display its name in backtraces as
follows:
```
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.3
  * frame #0: 0x0000000100000398 a.out`func() [inlined] baz(x=10) at inline.cpp:1:42
    frame #1: 0x0000000100000398 a.out`func() [inlined] bar() at inline.cpp:2:37
    frame #2: 0x0000000100000398 a.out`func() at inline.cpp:4:15
    frame #3: 0x00000001000003c0 a.out`main at inline.cpp:7:5
    frame #4: 0x000000026eb29ab8 dyld`start + 6812
```
The longer the names get the more confusing this gets because the first
function name that appears is the parent frame. My assumption (which may
need some more surveying) is that for the majority of cases we only care
about the actual frame name (not the parent). So this patch removes all
the special logic that prints the parent frame.

Another quirk of the current format is that the inlined frame name does
not abide by the `${function.name-XXX}` format variables. We always just
print the raw demangled name. With this patch, we would format the
inlined frame name according to the `frame-format` setting (see the
test-cases).

If we really want to have the `parentFrame [inlined] inlinedFrame`
format, we could expose it through a new `frame-format` variable (e..g.,
`${function.inlined-at-name}` and let the user decide where to place
things.
madhur13490 pushed a commit that referenced this pull request Apr 21, 2025
The linter messed up the order of includes, which is necessary as is.
madhur13490 pushed a commit that referenced this pull request Apr 29, 2025
…mbolConjured" (llvm#137304)

Reverts llvm#128251

ASAN bots reported some errors:
https://lab.llvm.org/buildbot/#/builders/55/builds/10398
Reverting for investigation.

```
Failed Tests (6):
  Clang :: Analysis/loop-widening-ignore-static-methods.cpp
  Clang :: Analysis/loop-widening-notes.cpp
  Clang :: Analysis/loop-widening-preserve-reference-type.cpp
  Clang :: Analysis/loop-widening.c
  Clang :: Analysis/loop-widening.cpp
  Clang :: Analysis/this-pointer.cpp
Testing Time: 411.55s
Total Discovered Tests: 118563
  Skipped          :     33 (0.03%)
  Unsupported      :   2015 (1.70%)
  Passed           : 116291 (98.08%)
  Expectedly Failed:    218 (0.18%)
  Failed           :      6 (0.01%)
FAILED: CMakeFiles/check-all /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/CMakeFiles/check-all 
cd /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan && /usr/bin/python3 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/./bin/llvm-lit -sv --param USE_Z3_SOLVER=0 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/utils/mlgo-utils /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/mlir/test /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/clang/test /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/utils/lit /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test
ninja: build stopped: subcommand failed.
```

```
/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c # RUN: at line 1
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c
1.	<eof> parser at end of file
2.	While analyzing stack: 
	#0 Calling nested_loop_inner_widen
 #0 0x0000c894cca289cc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13
 #1 0x0000c894cca23324 llvm::sys::RunSignalHandlers() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x0000c894cca29bbc SignalHandler(int, siginfo_t*, void*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
 #3 0x0000f6898da4a8f8 (linux-vdso.so.1+0x8f8)
 #4 0x0000f6898d377608 (/lib/aarch64-linux-gnu/libc.so.6+0x87608)
 llvm#5 0x0000f6898d32cb3c raise (/lib/aarch64-linux-gnu/libc.so.6+0x3cb3c)
 llvm#6 0x0000f6898d317e00 abort (/lib/aarch64-linux-gnu/libc.so.6+0x27e00)
 llvm#7 0x0000c894c5e77fec __sanitizer::Atexit(void (*)()) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:168:10
 llvm#8 0x0000c894c5e76680 __sanitizer::Die() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:52:5
 llvm#9 0x0000c894c5e69650 Unlock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:250:16
llvm#10 0x0000c894c5e69650 ~GenericScopedLock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:386:51
llvm#11 0x0000c894c5e69650 __hwasan::ScopedReport::~ScopedReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:54:5
llvm#12 0x0000c894c5e68de0 __hwasan::(anonymous namespace)::BaseReport::~BaseReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:476:7
llvm#13 0x0000c894c5e66b74 __hwasan::ReportTagMismatch(__sanitizer::StackTrace*, unsigned long, unsigned long, bool, bool, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:1091:1
llvm#14 0x0000c894c5e52cf8 Destroy /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:532:31
llvm#15 0x0000c894c5e52cf8 ~InternalMmapVector /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:642:56
llvm#16 0x0000c894c5e52cf8 __hwasan::HandleTagMismatch(__hwasan::AccessInfo, unsigned long, unsigned long, void*, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:245:1
llvm#17 0x0000c894c5e551c8 __hwasan_tag_mismatch4 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:764:1
llvm#18 0x0000c894c5e6a2f8 __interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/interception/interception_linux.cpp:60:0
llvm#19 0x0000c894d166f664 getBlock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CoreEngine.h:217:45
llvm#20 0x0000c894d166f664 getCFGElementRef /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h:230:59
llvm#21 0x0000c894d166f664 clang::ento::ExprEngine::processCFGBlockEntrance(clang::BlockEdge const&, clang::ento::NodeBuilderWithSinks&, clang::ento::ExplodedNode*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp:2570:45
llvm#22 0x0000c894d15f3a1c hasGeneratedNodes /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CoreEngine.h:333:37
llvm#23 0x0000c894d15f3a1c clang::ento::CoreEngine::HandleBlockEdge(clang::BlockEdge const&, clang::ento::ExplodedNode*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:319:20
llvm#24 0x0000c894d15f2c34 clang::ento::CoreEngine::dispatchWorkItem(clang::ento::ExplodedNode*, clang::ProgramPoint, clang::ento::WorkListUnit const&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:220:7
llvm#25 0x0000c894d15f2398 operator-> /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__memory/unique_ptr.h:267:101
llvm#26 0x0000c894d15f2398 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>)::$_0::operator()(unsigned int) const /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:140:12
llvm#27 0x0000c894d15f14b4 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:165:7
llvm#28 0x0000c894d0ebb9dc release /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:232:9
llvm#29 0x0000c894d0ebb9dc ~IntrusiveRefCntPtr /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:196:27
llvm#30 0x0000c894d0ebb9dc ExecuteWorkList /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h:192:5
llvm#31 0x0000c894d0ebb9dc RunPathSensitiveChecks /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:772:7
llvm#32 0x0000c894d0ebb9dc (anonymous namespace)::AnalysisConsumer::HandleCode(clang::Decl*, unsigned int, clang::ento::ExprEngine::InliningModes, llvm::DenseSet<clang::Decl const*, llvm::DenseMapInfo<clang::Decl const*, void>>*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:741:5
llvm#33 0x0000c894d0eb6ee4 begin /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/DenseMap.h:0:0
llvm#34 0x0000c894d0eb6ee4 begin /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/DenseSet.h:187:45
llvm#35 0x0000c894d0eb6ee4 HandleDeclsCallGraph /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:516:29
llvm#36 0x0000c894d0eb6ee4 runAnalysisOnTranslationUnit /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:584:5
llvm#37 0x0000c894d0eb6ee4 (anonymous namespace)::AnalysisConsumer::HandleTranslationUnit(clang::ASTContext&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:647:3
llvm#38 0x0000c894d18a7a38 clang::ParseAST(clang::Sema&, bool, bool) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:0:13
llvm#39 0x0000c894ce81ed70 clang::FrontendAction::Execute() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1231:10
llvm#40 0x0000c894ce6f2144 getPtr /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:278:42
llvm#41 0x0000c894ce6f2144 operator bool /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:241:16
llvm#42 0x0000c894ce6f2144 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1058:23
llvm#43 0x0000c894cea718cc operator-> /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__memory/shared_ptr.h:635:12
llvm#44 0x0000c894cea718cc getFrontendOpts /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/Frontend/CompilerInstance.h:307:12
llvm#45 0x0000c894cea718cc clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:301:14
llvm#46 0x0000c894c5e9cf28 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/tools/driver/cc1_main.cpp:294:15
llvm#47 0x0000c894c5e92a9c ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/tools/driver/driver.cpp:223:12
llvm#48 0x0000c894c5e902ac clang_main(int, char**, llvm::ToolContext const&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/tools/driver/driver.cpp:0:12
llvm#49 0x0000c894c5eb2e34 main /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/clang/tools/driver/clang-driver.cpp:17:3
llvm#50 0x0000f6898d3184c4 (/lib/aarch64-linux-gnu/libc.so.6+0x284c4)
llvm#51 0x0000f6898d318598 __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x28598)
llvm#52 0x0000c894c5e52a30 _start (/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang+0x6512a30)
/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/clang/test/Analysis/Output/loop-widening.c.script: line 2: 2870204 Aborted                 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c
```
madhur13490 pushed a commit that referenced this pull request Apr 29, 2025
`clang-repl --cuda` was previously crashing with a segmentation fault,
instead of reporting a clean error
```
(base) anutosh491@Anutoshs-MacBook-Air bin % ./clang-repl --cuda
#0 0x0000000111da4fbc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x150fbc)
#1 0x0000000111da31dc llvm::sys::RunSignalHandlers() (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x14f1dc)
#2 0x0000000111da5628 SignalHandler(int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x151628)
#3 0x000000019b242de4 (/usr/lib/system/libsystem_platform.dylib+0x180482de4)
#4 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0)
llvm#5 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0)
llvm#6 0x0000000107f6bac8 clang::Interpreter::createWithCUDA(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x2173ac8)
llvm#7 0x000000010206f8a8 main (/opt/local/libexec/llvm-20/bin/clang-repl+0x1000038a8)
llvm#8 0x000000019ae8c274 
Segmentation fault: 11
```


The underlying issue was that the `DeviceCompilerInstance` (used for
device-side CUDA compilation) was never initialized with a `Sema`, which
is required before constructing the `IncrementalCUDADeviceParser`.


https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/DeviceOffload.cpp#L32


https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/IncrementalParser.cpp#L31

Unlike the host-side `CompilerInstance` which runs `ExecuteAction`
inside the Interpreter constructor (thereby setting up Sema), the
device-side CI was passed into the parser uninitialized, leading to an
assertion or crash when accessing its internals.

To fix this, I refactored the `Interpreter::create` method to include an
optional `DeviceCI` parameter. If provided, we know we need to take care
of this instance too. Only then do we construct the
`IncrementalCUDADeviceParser`.
madhur13490 pushed a commit that referenced this pull request Apr 29, 2025
This were failing on Windows CI with errors like:
```
22: (lldb) bt
23: * thread #1, stop reason = breakpoint 1.1
24:  frame #0: 0x00007ff7c5e41000 TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`func at main.m:2
25:  frame #1: 0x00007ff7c5e4101c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`bar + 12 at main.m:3
26:  frame #2: 0x00007ff7c5e4103c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`main + 16 at main.m:5
27:  custom-frame '()'
     !~~~~~~~~~~~       error: no match expected
28:  custom-frame '(__formal=<unavailable>)'
```
madhur13490 pushed a commit that referenced this pull request May 13, 2025
llvm#138091)

Check this error for more context
(https://github.com/compiler-research/CppInterOp/actions/runs/14749797085/job/41407625681?pr=491#step:10:531)

This fails with 
```
* thread #1, name = 'CppInterOpTests', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x55500356d6d3)
  * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
    frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830
    frame #2: 0x00007fffee20917a libclangCppInterOp.so.21.0gitstd::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 58
    frame #3: 0x00007fffee224796 libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 838
    frame #4: 0x00007fffee22494d libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 13
    frame llvm#5: 0x00007fffed95ec62 libclangCppInterOp.so.21.0gitclang::IncrementalCUDADeviceParser::~IncrementalCUDADeviceParser() + 98
    frame llvm#6: 0x00007fffed9551b6 libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 102
    frame llvm#7: 0x00007fffed95598d libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 13
    frame llvm#8: 0x00007fffed9181e7 libclangCppInterOp.so.21.0gitcompat::createClangInterpreter(std::vector<char const*, std::allocator<char const*>>&) + 2919
```

Problem : 

1) The destructor currently handles no clearance for the DeviceParser
and the DeviceAct. We currently only have this

https://github.com/llvm/llvm-project/blob/976493822443c52a71ed3c67aaca9a555b20c55d/clang/lib/Interpreter/Interpreter.cpp#L416-L419

2) The ownership for DeviceCI currently is present in
IncrementalCudaDeviceParser. But this should be similar to how the
combination for hostCI, hostAction and hostParser are managed by the
Interpreter. As on master the DeviceAct and DeviceParser are managed by
the Interpreter but not DeviceCI. This is problematic because :
IncrementalParser holds a Sema& which points into the DeviceCI. On
master, DeviceCI is destroyed before the base class ~IncrementalParser()
runs, causing Parser::reset() to access a dangling Sema (and as Sema
holds a reference to Preprocessor which owns PragmaNamespace) we see
this
```
  * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
    frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830
    
```
madhur13490 pushed a commit that referenced this pull request May 13, 2025
Fix for:
`Assertion failed: (false && "Architecture or OS not supported"),
function CreateRegisterContextForFrame, file
/usr/src/contrib/llvm-project/lldb/source/Plugins/Process/elf-core/ThreadElfCore.cpp,
line 182.
PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and
include the crash backtrace.
#0 0x000000080cd857c8 llvm::sys::PrintStackTrace(llvm::raw_ostream&,
int)
/usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
#1 0x000000080cd85ed4
/usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:797:3
#2 0x000000080cd82ae8 llvm::sys::RunSignalHandlers()
/usr/src/contrib/llvm-project/llvm/lib/Support/Signals.cpp:104:5
#3 0x000000080cd861f0 SignalHandler
/usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:403:3 #4
0x000000080f159644 handle_signal
/usr/src/lib/libthr/thread/thr_sig.c:298:3
`
madhur13490 pushed a commit that referenced this pull request May 13, 2025
The mcmodel=tiny memory model is only valid on ARM targets. While trying
this on X86 compiler throws an internal error along with stack dump.
llvm#125641
This patch resolves the issue.
Reduced test case:
```
#include <stdio.h>
int main( void )
{
printf( "Hello, World!\n" ); 
return 0; 
}
```
```
0.	Program arguments: /opt/compiler-explorer/clang-trunk/bin/clang++ -gdwarf-4 -g -o /app/output.s -fno-verbose-asm -S --gcc-toolchain=/opt/compiler-explorer/gcc-snapshot -fcolor-diagnostics -fno-crash-diagnostics -mcmodel=tiny <source>
1.	<eof> parser at end of file
 #0 0x0000000003b10218 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b10218)
 #1 0x0000000003b0e35c llvm::sys::CleanupOnSignal(unsigned long) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b0e35c)
 #2 0x0000000003a5dbc3 llvm::CrashRecoveryContext::HandleExit(int) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a5dbc3)
 #3 0x0000000003b05cfe llvm::sys::Process::Exit(int, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b05cfe)
 #4 0x0000000000d4e3eb LLVMErrorHandler(void*, char const*, bool) cc1_main.cpp:0:0
 llvm#5 0x0000000003a67c93 llvm::report_fatal_error(llvm::Twine const&, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a67c93)
 llvm#6 0x0000000003a67df8 (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a67df8)
 llvm#7 0x0000000002549148 llvm::X86TargetMachine::X86TargetMachine(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, std::optional<llvm::Reloc::Model>, std::optional<llvm::CodeModel::Model>, llvm::CodeGenOptLevel, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x2549148)
 llvm#8 0x00000000025491fc llvm::RegisterTargetMachine<llvm::X86TargetMachine>::Allocator(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, std::optional<llvm::Reloc::Model>, std::optional<llvm::CodeModel::Model>, llvm::CodeGenOptLevel, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x25491fc)
 llvm#9 0x0000000003db74cc clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3db74cc)
llvm#10 0x0000000004460d95 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4460d95)
llvm#11 0x00000000060005ec clang::ParseAST(clang::Sema&, bool, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x60005ec)
llvm#12 0x00000000044614b5 clang::CodeGenAction::ExecuteAction() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x44614b5)
llvm#13 0x0000000004737121 clang::FrontendAction::Execute() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4737121)
llvm#14 0x00000000046b777b clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x46b777b)
llvm#15 0x00000000048229e3 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x48229e3)
llvm#16 0x0000000000d50621 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd50621)
llvm#17 0x0000000000d48e2d ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
llvm#18 0x00000000044acc99 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
llvm#19 0x0000000003a5dac3 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a5dac3)
llvm#20 0x00000000044aceb9 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
llvm#21 0x00000000044710dd clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x44710dd)
llvm#22 0x0000000004472071 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4472071)
llvm#23 0x000000000447c3fc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x447c3fc)
llvm#24 0x0000000000d4d2b1 clang_main(int, char**, llvm::ToolContext const&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd4d2b1)
llvm#25 0x0000000000c12464 main (/opt/compiler-explorer/clang-trunk/bin/clang+++0xc12464)
llvm#26 0x00007ae43b029d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
llvm#27 0x00007ae43b029e40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
llvm#28 0x0000000000d488c5 _start (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd488c5)
```

---------

Co-authored-by: Shashwathi N <[email protected]>
madhur13490 pushed a commit that referenced this pull request May 27, 2025
… `getForwardSlice` matchers (llvm#115670)

Improve mlir-query tool by implementing `getBackwardSlice` and
`getForwardSlice` matchers. As an addition `SetQuery` also needed to be
added to enable custom configuration for each query. e.g: `inclusive`,
`omitUsesFromAbove`, `omitBlockArguments`.

Note: backwardSlice and forwardSlice algoritms are the same as the ones
in `mlir/lib/Analysis/SliceAnalysis.cpp`
Example of current matcher. The query was made to the file:
`mlir/test/mlir-query/complex-test.mlir`

```mlir
./mlir-query /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir -c "match getDefinitions(hasOpName(\"arith.add
f\"),2)"

Match #1:

/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:5:8:
  %0 = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0 : tensor<5x5xf32>) outs(%arg1 : tensor<5x5xf32>) {
       ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:7:10: note: "root" binds here
    %2 = arith.addf %in, %in : f32
         ^
Match #2:

/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:10:16:
  %collapsed = tensor.collapse_shape %0 [[0, 1]] : tensor<5x5xf32> into tensor<25xf32>
               ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:13:11:
    %c2 = arith.constant 2 : index
          ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:14:18:
    %extracted = tensor.extract %collapsed[%c2] : tensor<25xf32>
                 ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:15:10: note: "root" binds here
    %2 = arith.addf %extracted, %extracted : f32
         ^
2 matches.
```
madhur13490 pushed a commit that referenced this pull request Jun 17, 2025
Fixes llvm#123300

What is seen 
```
clang-repl> int x = 42;
clang-repl> auto capture = [&]() { return x * 2; };
In file included from <<< inputs >>>:1:
input_line_4:1:17: error: non-local lambda expression cannot have a capture-default
    1 | auto capture = [&]() { return x * 2; };
      |                 ^
zsh: segmentation fault  clang-repl --Xcc="-v"

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
  * frame #0: 0x0000000107b4f8b8 libclang-cpp.19.1.dylib`clang::IncrementalParser::CleanUpPTU(clang::PartialTranslationUnit&) + 988
    frame #1: 0x0000000107b4f1b4 libclang-cpp.19.1.dylib`clang::IncrementalParser::ParseOrWrapTopLevelDecl() + 416
    frame #2: 0x0000000107b4fb94 libclang-cpp.19.1.dylib`clang::IncrementalParser::Parse(llvm::StringRef) + 612
    frame #3: 0x0000000107b52fec libclang-cpp.19.1.dylib`clang::Interpreter::ParseAndExecute(llvm::StringRef, clang::Value*) + 180
    frame #4: 0x0000000100003498 clang-repl`main + 3560
    frame llvm#5: 0x000000018d39a0e0 dyld`start + 2360
```

Though the error is justified, we shouldn't be interested in exiting
through a segfault in such cases.

The issue is that empty named decls weren't being taken care of
resulting into this assert


https://github.com/llvm/llvm-project/blob/c1a229252617ed58f943bf3f4698bd8204ee0f04/clang/include/clang/AST/DeclarationName.h#L503

Can also be seen when the example is attempted through xeus-cpp-lite.


![image](https://github.com/user-attachments/assets/9b0e6ead-138e-4b06-9ad9-fcb9f8d5bf6e)
madhur13490 pushed a commit that referenced this pull request Jun 17, 2025
…142952)

This was removed in llvm#135343 in
favour of making it a format variable, which we do here. This follows
the precedent of the `[opt]` and `[artificial]` markers.

Before:
```
 thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
 * frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3
   frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
   frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43
   frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
   frame #4: 0x0000000186345be4 dyld`start + 7040
```

After (note the `[inlined]` markers):
```
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
* frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3 [inlined]
  frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
  frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43 [inlined]
  frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
  frame #4: 0x0000000186345be4 dyld`start + 7040
```

rdar://152642178
madhur13490 pushed a commit that referenced this pull request Jul 1, 2025
LLVM prevents the sm_32_intrinsics.hpp header from being included with a
#define SM_32_INTRINSICS_HPP. It also provides drop-in replacements of
the functions defined in the CUDA header.

One issue is that some intrinsics were added after the replacement was
written, and thus have no replacement, breaking code that calls them
(Raft is one example).

This commit backport the code from sm_32_intrinsics.hpp for the missing
intrinsics.

This is the second try after PR llvm#143664 broke tests.
madhur13490 pushed a commit that referenced this pull request Jul 1, 2025
The function already exposes a work list to avoid deep recursion, this
commit starts utilizing it in a helper that could also lead to a deep
recursion.

We have observed this crash on `clang/test/C/C99/n590.c` with our
internal builds that enable aggressive optimizations and hit the limit
earlier than default release builds of Clang.

See the added test for an example with a deeper recursion that used to
crash in upstream Clang before this change with the following stack
trace:

```
  #0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13
  #1 llvm::sys::RunSignalHandlers() /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Signals.cpp:106:18
  #2 SignalHandler(int, siginfo_t*, void*) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
  #3 (/lib/x86_64-linux-gnu/libc.so.6+0x3fdf0)
  #4 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12772:0
  llvm#5 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
  llvm#6 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
  llvm#7 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
  llvm#8 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
  llvm#9 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
 llvm#10 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
 llvm#11 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
 llvm#12 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
 llvm#13 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
 llvm#14 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
 llvm#15 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
 llvm#16 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
 llvm#17 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
 llvm#18 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
 llvm#19 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
... 700+ more stack frames.
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant