[VPlan] Emit llvm.masked.{u,s}{div,rem}#10

Open
lukel97 wants to merge 3121 commits into
mainfrom
loop-vectorize/masked-divrem

Conversation

@lukel97
Owner

@lukel97 lukel97 commented Apr 10, 2026

No description provided.

@lukel97
Owner Author

lukel97 commented Apr 10, 2026

/test-suite

@github-actions

@lukel97 lukel97 force-pushed the loop-vectorize/masked-divrem branch from b3be2f0 to cea36af Compare April 10, 2026 10:00
lukel97 pushed a commit that referenced this pull request Apr 16, 2026
…bols add' (llvm#188377)

Context:
lldb may crash when the debuggee is stopped in a crashed state and a
`target symbols add` command is run.
Backtrace:
```
 #0 0x000055ca6790dc65 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:848:11
 #1 0x000055ca6790e434 PrintStackTraceSignalHandler(void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:931:1
 #2 0x000055ca6790b839 llvm::sys::RunSignalHandlers() /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Signals.cpp:104:5
 #3 0x000055ca6790ff6b SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:430:38
 #4 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0
 #5 0x00007fe9e5f25649 syscall /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/misc/../sysdeps/unix/sysv/linux/x86_64/syscall.S:38:0
 #6 0x00007fe9ec649170 SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:429:7
 #7 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0
 #8 0x00007fe9ebb77bf0 lldb_private::operator<(lldb_private::StackID const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackID.cpp:99:16
 #9 0x00007fe9ebb6863d CompareStackID(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:683:3
#10 0x00007fe9ebb6d049 bool __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>::operator()<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/predefined_ops.h:196:4
#11 0x00007fe9ebb6cefe __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::__lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algobase.h:1464:8
#12 0x00007fe9ebb6cdfc __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algo.h:2062:14
#13 0x00007fe9ebb685fa auto llvm::lower_bound<std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/STLExtras.h:2001:10
#14 0x00007fe9ebb68441 lldb_private::StackFrameList::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:697:11
#15 0x00007fe9ebbee395 lldb_private::Thread::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/include/lldb/Target/Thread.h:459:7
#16 0x00007fe9ebac7cf7 lldb_private::ExecutionContextRef::GetFrameSP() const /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:643:25
#17 0x00007fe9ebac80e1 lldb_private::GetStoppedExecutionContext(lldb_private::ExecutionContextRef const*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:164:34
#18 0x00007fe9eb8903fa lldb_private::Statusline::Redraw(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Statusline.cpp:139:7
#19 0x00007fe9eb7ac8be lldb_private::Debugger::RedrawStatusline(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1233:3
#20 0x00007fe9eb804d1e lldb_private::IOHandlerEditline::RedrawCallback() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:446:3
#21 0x00007fe9eb80aa81 lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2::operator()() const /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:262:73
#22 0x00007fe9eb80aa5d void llvm::detail::UniqueFunctionBase<void>::CallImpl<lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2>(void*) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:213:5
#23 0x00007fe9eb93bfbf llvm::unique_function<void ()>::operator()() /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:365:5
#24 0x00007fe9eb93bb80 lldb_private::Editline::GetCharacter(wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:0:5
#25 0x00007fe9eb941a18 lldb_private::Editline::ConfigureEditor(bool)::$_0::operator()(editline*, wchar_t*) const /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1287:5
#26 0x00007fe9eb9419e2 lldb_private::Editline::ConfigureEditor(bool)::$_0::__invoke(editline*, wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1286:27
#27 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:439:14
#28 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:400:1
#29 0x00007fe9f3384f90 read_getcmd /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:247:14
#30 0x00007fe9f3384f90 el_gets /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:586:14
#31 0x00007fe9eb9409f3 lldb_private::Editline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1636:16
#32 0x00007fe9eb8044d7 lldb_private::IOHandlerEditline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:339:5
#33 0x00007fe9eb805609 lldb_private::IOHandlerEditline::Run() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:600:11
#34 0x00007fe9eb7b214c lldb_private::Debugger::RunIOHandlers() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1280:16
#35 0x00007fe9eb98f00f lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:3620:16
#36 0x00007fe9eb4f0e09 lldb::SBDebugger::RunCommandInterpreter(bool, bool) /home/hyubo/osmeta/external/llvm-project/lldb/source/API/SBDebugger.cpp:1234:42
#37 0x000055ca6788d6b0 Driver::MainLoop() /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:677:3
#38 0x000055ca6788e226 main /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:887:17
#39 0x00007fe9e5e2c657 __libc_start_call_main /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#40 0x00007fe9e5e2c718 call_init /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:128:20
#41 0x00007fe9e5e2c718 __libc_start_main@GLIBC_2.2.5 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:379:5
#42 0x000055ca67889a11 _start /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:118:0
Segmentation fault (core dumped)
```

When `target symbols add` is run, `Symtab::AddSymbol()` can grow the
underlying `std::vector<Symbol>`, reallocating its storage and invalidating
all existing `Symbol*` pointers. While `Process::Flush()` clears stale stack
frames, the statusline caches its own `ExecutionContextRef` containing a
`StackID` with a `SymbolContextScope*` (which can be a `Symbol*`). This
cached reference is not cleared by `Process::Flush()`, so the next
statusline redraw accesses a dangling pointer and crashes.

Fix this by adding `Statusline::Flush()` which clears the cached frame,
`Debugger::Flush()` which forwards to it under the statusline mutex, and
calling `Debugger::Flush()` from `Process::Flush()` so that all flush
paths (symbol add, exec, module load) also invalidate the statusline's
stale state.
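The failure mode and the shape of the fix can be sketched outside lldb. This is a minimal illustration, not the actual patch: `SymtabSketch`, `StatuslineSketch`, and `AddSymbolAndFlush` are hypothetical stand-ins, and the only library fact relied on is that `std::vector::push_back` reallocates when the size already equals the capacity.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the real lldb classes.
struct Symbol {
  int id;
};

struct SymtabSketch {
  std::vector<Symbol> symbols;

  // Appends a symbol and reports whether the vector had to reallocate,
  // which is exactly when previously taken Symbol* values become dangling.
  bool AddSymbol(Symbol s) {
    const bool reallocates = symbols.size() == symbols.capacity();
    symbols.push_back(s);
    return reallocates;
  }
};

struct StatuslineSketch {
  const Symbol *cached = nullptr; // plays the role of the cached frame
  void Flush() { cached = nullptr; }
};

// Mirrors the shape of the fix: every path that may move symbols also
// flushes the cache, so a later redraw never sees a stale pointer.
inline bool AddSymbolAndFlush(SymtabSketch &tab, StatuslineSketch &line,
                              Symbol s) {
  const bool reallocated = tab.AddSymbol(s);
  line.Flush();
  return reallocated;
}
```

The sketch never dereferences the stale pointer; it only shows that the cache has to be cleared on the same path that can move the storage.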

After this fix, lldb no longer crashes, and new symbols from a symbol
file are loaded correctly.

---------

Co-authored-by: George Hu <georgehuyubo@gmail.com>
thurstond and others added 25 commits May 7, 2026 19:08
This adds explicit handling for fpto[us]i_sat, similar to how the
non-saturating versions are handled.

N.B. PR llvm#191365 lowered NEON fcvtz[us] intrinsics into fpto[us]i.sat.
There is a slight inconsistency in MSan insofar as fcvtz[us] were
handled by handleNEONVectorConvertIntrinsic(), which takes an
all-or-nothing propagation approach to the shadows (i.e., even a single
uninitialized bit will result in the corresponding integer being fully
uninitialized), while fpto[us]i were handled by propagating the shadow
unchanged. For now, we choose to have fpto[us]i_sat follow the laxer
behavior of fpto[us]i. Future work may consider changing the behavior of
fpto[us]i and fpto[us]i_sat to use the all-or-nothing approach.
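The two shadow policies contrasted above can be sketched for a single 32-bit element. This is an illustration only, assuming the usual MSan convention that a set shadow bit means "uninitialized"; the function names are hypothetical and do not correspond to MSan's instrumentation code.

```cpp
#include <cstdint>

// Lax policy (the one chosen for fpto[us]i_sat for now): the input shadow
// is passed through to the result unchanged.
constexpr std::uint32_t propagateUnchanged(std::uint32_t shadow) {
  return shadow;
}

// Strict policy (all-or-nothing): a single uninitialized input bit makes
// the whole resulting integer uninitialized.
constexpr std::uint32_t propagateAllOrNothing(std::uint32_t shadow) {
  return shadow != 0 ? 0xFFFFFFFFu : 0u;
}
```

With one poisoned bit (`0x1`), the lax policy reports one poisoned result bit, while the strict policy poisons all 32.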
…d dummy (llvm#196428)

After llvm#195182 introduced the `UseDevice` attribute, a `use_device(...)`
actual was treated as compatible with **any** dummy attribute. Combined
with the matching distance returning ∞ for `UseDevice →
managed/unified`, this caused generic resolution to misreport a clean
"no match" as an **ambiguity** when only managed/unified specifics
existed.

This PR tightens `AreCompatibleCUDADataAttrs`: a `UseDevice` actual is
only compatible with a `Device` dummy or a host (no-attribute) dummy.
Other attributes (`Managed`, `Unified`, `Pinned`, ...) require their
actual to live in that specific kind of memory.
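The tightened rule for a `UseDevice` actual can be written as a small predicate. This is a sketch of that one rule only: the enum and function name are hypothetical, and the other cases handled by the real `AreCompatibleCUDADataAttrs` are not modelled.

```cpp
// Hypothetical attribute set for illustration; None stands for a host
// dummy with no CUDA data attribute.
enum class DataAttr { None, Device, Managed, Unified, Pinned, UseDevice };

// A use_device(...) actual matches only a Device dummy or a host dummy.
constexpr bool useDeviceActualIsCompatible(DataAttr dummy) {
  return dummy == DataAttr::Device || dummy == DataAttr::None;
}
```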
Custom-lower LSX sign-extensions to combinations of `SLTI` + `VILVL` + `VILVH`
where possible.

For example, we could lower a vector sext to the following instructions:
```
%B = sext <4 x i16> %A to <4 x i32>
vslti.h v2, v1, 0
vilvl.h v1, v2, v1 

%B = sext <4 x i32> %A to <4 x i64>
vslti.w v3, v1, 0
vilvh.w v2, v3, v1
vilvl.w v1, v3, v1
```
When these combinations are worse than converting the sext to a shuffle, we
simply use the latter instead.
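Why a compare plus an interleave yields a sign extension can be checked on one element. The sketch below is scalar C++ rather than LSX code, with hypothetical function names: the compare against zero produces an all-ones or all-zeros mask, and interleaving puts the original value in the low half and the mask in the high half of the wider lane (little-endian lane layout assumed).

```cpp
#include <cstdint>

// Models the compare step for one 16-bit element: all ones if x < 0.
constexpr std::uint16_t signMask16(std::int16_t x) {
  return x < 0 ? 0xFFFFu : 0u;
}

// Models the interleave step for one lane: low half is the original
// element, high half is its sign mask.
constexpr std::int32_t sextViaInterleave(std::int16_t x) {
  const std::uint32_t lo = static_cast<std::uint16_t>(x);
  const std::uint32_t hi = signMask16(x);
  return static_cast<std::int32_t>((hi << 16) | lo);
}
```

For the i32 to i64 case the same idea applies per lane, with both the low and high interleaves needed to cover all four source elements.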
…lvm#190732) (llvm#195983)

Close llvm#190333

For the test case, the root cause is that the compiler thought the
declaration of `operator &&` in consumer.cpp might change the meaning of
`&&` in the requires-clause of `F::operator()`, which does not make sense.
Here we skip profiling the callee to solve the problem. Note that we have
already recorded the kind of the operator, so `&&` and `||` won't be
confused.

---

See the discussion in llvm#194283

For the newly found pattern, other binary operators (e.g.,
`operator +`) may appear in the requires-clause, for example:

```C++
template <typename T, typename U>
    requires requires(T t, U u) { t + u; }
  void operator()(T, U) {}
```

This is a new problem that needs to be solved in a separate PR.
…llvm#196452)

This was a recursive function with a Map to cache things that was never filled.
Now it's a worklist and the map is actually used.

Co-authored-by: Johannes Doerfert <johannes@jdoerfert.de>
When running the test in a runner where the source directory is
read-only, this test fails with `error: failed to open instrumentor stub
runtime file for writing: Permission denied`. Run the test in a
writable test dir `%t` to ensure we can actually write to the current
directory.
…95819)

[LWG3884](https://wg21.link/LWG3884) requires allocator-extended
copy/move constructors on the flat container adaptors. All four
container adaptors (flat_map, flat_multimap, flat_set, flat_multiset)
landed with these constructors and their tests already in place:

- flat_map      (llvm#98643)  -- LLVM 20
- flat_multimap (llvm#113835) -- LLVM 20
- flat_set      (llvm#125241) -- LLVM 21
- flat_multiset (llvm#128363) -- LLVM 21

This LWG issue was fully addressed once flat_set/flat_multiset landed in
LLVM 21, so the status is updated to `|Complete|` with first released
version 21.

Closes llvm#105269

Co-authored-by: Hristo Hristov <zingam@outlook.com>
…vm#193918)

EmitLocation and related functions are particularly hot and rederive
current source location metadata (line, column, file). Storing this
metadata when updating the current location in CGDebugInfo::setLocation
and reusing it is a nice compile-time improvement on debug builds:

CTMark geomean:
- stage1-ReleaseLTO-g: -0.65%
- stage1-O0-g: -3.48%
- stage1-aarch64-O0-g: -2.82%
- stage2-O0-g: -3.53%

http://llvm-compile-time-tracker.com/compare.php?from=99c9a1f566df3ab4f37e156b62afd1d743882de0&to=78d281116ba7ee7e6c13625906be325b6495205a&stat=instructions%3Au

Assisted-by: codex
…94606)

Previously, calling a function with a host/device mismatch inside a
discarded `if constexpr` branch would trigger an error. This patch
recognizes that discarded statements are never instantiated and allows
such code.
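The language rule the patch relies on can be shown in plain C++17, without any CUDA attributes. This is only an analogy: `HostOnly`, `DeviceOnly`, and `dispatch` are hypothetical names, and the "wrong" call here is a missing member function rather than a host/device mismatch. In a template, the discarded branch of an `if constexpr` is not instantiated, so a call that would be ill-formed for the current `T` is never checked.

```cpp
#include <type_traits>

struct HostOnly {
  constexpr int run() const { return 1; }
};
struct DeviceOnly {
  constexpr int launch() const { return 2; }
};

template <typename T> constexpr int dispatch(const T &t) {
  if constexpr (std::is_same_v<T, HostOnly>) {
    // Discarded, and therefore never instantiated, when T is DeviceOnly,
    // even though DeviceOnly has no run() member.
    return t.run();
  } else {
    return t.launch();
  }
}
```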
[libc] Include CPU model in overlay CI sccache key

The overlay CI compiles opt_host memory tests with `-march=native`,
which generates object files specific to the runner CPU model. sccache
treats `-march=native` as a literal string in its hash key, so cached
`.o` files compiled on one CPU model get served to runners with a
different CPU. When the cached binary uses instructions the current CPU
lacks, the test crashes with SIGILL.

## Symptoms

The `memcmp_opt_host`, `memmove_opt_host`, `memset_opt_host`,
`bcmp_opt_host`, and `bzero_opt_host` tests crash when SIMD code paths
are first exercised. Simple tests like `CmpZeroByte` pass because they
use small sizes that do not enter SIMD routines. The failures are fully
reproducible on reruns because the cache stays poisoned.

## Evidence

Three consecutive runs of the same fwide PR (llvm#196157), same code:

| Run | Azure Region | Cache Hits | Cache Misses | Result |
|-----|-------------|-----------|-------------|--------|
| [25512875679](https://github.com/llvm/llvm-project/actions/runs/25512875679/job/74876008545) | westus3 | 9 | 5354 | PASS |
| [25524024922](https://github.com/llvm/llvm-project/actions/runs/25524024922/job/74916241365) | northcentralus | 5345 | 0 | CRASH |
| [25524839613](https://github.com/llvm/llvm-project/actions/runs/25524839613/job/74965830435) | westus | 5345 | 0 | CRASH |

The first run had a nearly empty cache and compiled everything locally
(0.17% hit rate). An intermediate [syscall-unistd
run](https://github.com/llvm/llvm-project/actions/runs/25517783708/job/74893495220)
in eastus then populated the cache with object files compiled for that
region's CPU. Subsequent runs on different hardware got 100% cache hits
and crashed because the cached `.o` files use instructions their CPUs
lack.

## Fix

Added a "Detect CPU model" step that reads the CPU model string from
`/proc/cpuinfo` (Linux) or `sysctl` (macOS) and appends it to the
sccache cache key. Runners with different CPUs now get separate cache
buckets.

Assisted-by: Automated tooling, human reviewed.
…lvm#192041)

libclc emits a configure warning on Windows:
clang: error: no such file or directory:
'/clang:--target=amdgcn-amd-amdhsa-llvm' clang: error: no such file or
directory: '/clang:-print-target-triple'
  CMake Warning at CMakeLists.txt:239 (message):
    Failed to execute `llvm-project/build/bin/clang.exe
/clang:--target=amdgcn-amd-amdhsa-llvm /clang:-print-target-triple` to
    normalize target triple.

Switch to checking CMAKE_C_COMPILER_FRONTEND_VARIANT because:
- CMAKE_C_SIMULATE_ID=MSVC: true for both clang and clang-cl.
- CMAKE_C_COMPILER_FRONTEND_VARIANT=MSVC: true for clang-cl; false for clang.
…85027)

Replace the explicit specialization lists in `__is_signed_integer_v` and
`__is_unsigned_integer_v` with detection using `is_integral`,
`is_signed`, and `is_unsigned`. This covers `_BitInt(N)` for any N, in
addition to all standard and extended integer types. Character types and
`bool` are excluded via `__is_character_or_bool_v`.

This unblocks `<bit>` operations (`popcount`, `countl_zero`, `rotl`,
etc.) for `_BitInt(N)`.
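The detection approach can be sketched with public type traits. This is an approximation under stated assumptions, not the libc++ source: the names below are hypothetical, un-underscored analogues of `__is_signed_integer_v`, `__is_unsigned_integer_v`, and `__is_character_or_bool_v`, and the character list is what the sketch assumes that helper covers.

```cpp
#include <type_traits>

template <class T, class... Ts>
inline constexpr bool is_one_of_v = (std::is_same_v<T, Ts> || ...);

// Character types and bool are integral but are not integer types.
template <class T>
inline constexpr bool is_character_or_bool_v =
    is_one_of_v<std::remove_cv_t<T>, bool, char, wchar_t, char16_t, char32_t
#ifdef __cpp_char8_t
                ,
                char8_t
#endif
                >;

// Detection instead of an explicit list of specializations.
template <class T>
inline constexpr bool is_signed_integer_v =
    std::is_integral_v<T> && std::is_signed_v<T> && !is_character_or_bool_v<T>;

template <class T>
inline constexpr bool is_unsigned_integer_v =
    std::is_integral_v<T> && std::is_unsigned_v<T> &&
    !is_character_or_bool_v<T>;
```

Because nothing is enumerated, any type the compiler reports as integral is picked up automatically, which is what makes the approach extend to `_BitInt(N)` on compilers that support it.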

Part of the [_BitInt(N) libc++
effort](https://discourse.llvm.org/t/bitint-n-support-in-libc-investigations-possible-improvements-looking-for-guidance/90063).

Assisted-by: Claude (Anthropic)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
No codegen or instructions are added.
It may be ratified in the future. riscv/riscv-isa-manual#2598
The cost of sub-reductions is either the cost of *mlslb + *mlslt, or the
cost of a dot operation with 2 negations:
```
       partial_reduce_umls acc, lhs, rhs
  <=> -partial_reduce_umla -acc, lhs, rhs
```
(codegen for this was added by llvm#186809)

The cost model was previously a bit of a hack, since sub-reductions were
expanded and therefore expensive, although we made the expansion cost
artificially cheaper so that it would still be a candidate for cdot
instructions.
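The equivalence behind the "dot operation with 2 negations" cost can be checked on a single product. This is a scalar sketch with hypothetical names; a real partial reduction sums several widened products per lane, but the algebra is the same: `acc - p` equals `-((-acc) + p)`.

```cpp
#include <cstdint>

// Direct multiply-subtract: acc - lhs * rhs.
constexpr std::int64_t mlsDirect(std::int64_t acc, std::int64_t lhs,
                                 std::int64_t rhs) {
  return acc - lhs * rhs;
}

// The same result via a multiply-add wrapped in two negations.
constexpr std::int64_t mlsViaMla(std::int64_t acc, std::int64_t lhs,
                                 std::int64_t rhs) {
  return -((-acc) + lhs * rhs);
}
```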
In private ZA functions without any instructions that require "active"
ZA we can omit all ZA setup (and saves/restores). This is equivalent to
removing the `__arm_new("za/zt0")` attribute when ZA state is unused.
Add fwide function and tests. Part 1/11. All build file changes are in
part 11.

Assisted by Gemini
…6350)

This patch adds some extra state collection methods to DebuggerBase and
implements them for DAP only. These methods are used to fetch a
stacktrace without variable information, and to populate variable
information into a StepIR containing only a stacktrace. These methods
are currently unused, making this patch NFC, but this is a necessary
precursor to the new script model, where we examine the stacktrace to
determine what variable info we will collect.

As part of the stacktrace-collection function, we also fetch the
instruction address for each stack frame, if it is made available by the
debugger; to enable this, this patch adds a new value with default
`None` to `FrameIR`.
madhur13490 and others added 27 commits May 11, 2026 09:46
…llvm#196494)

The follow-up [patch](llvm#196080) folds some idempotent binary ops.
This test has a `sub x - x` operation, which is affected by the follow-up
patch. This patch makes the test immune to the fold.
…rPHI for faster compile time. (llvm#183726)

SliceUpIllegalIntegerPHI searches for PHIs that have an illegal type and
are only used by trunc or trunc(lshr) operations. It bails out if it
encounters invoke or EH pad instructions.
It first checks whether it encounters an invoke or EH pad, which is time
consuming because it checks every instruction. Then it checks whether the
PHI is used by trunc or trunc(lshr). The former check is generally loose,
while the latter is stricter. Switching the order of the checks speeds up
compilation.

Signed-off-by: XinlongZHANG-Bob <zhangxinlong.bob@bytedance.com>
Targets supporting SVE prefer SVE for ctpop on fixed-length vectors.
Update the cost model to reflect this.
Moves the declarations of the NVVM dialect and some widely used enums
(`FPRoundingModeAttr` and `SaturationModeAttr`) to separate files to make
them easier to maintain and also use in the NVGPU dialect.
…lvm#195794)

So the attached test case works even though it's just an `InitListExpr`.
Add the POSIX setenv() function, with EnvironmentManager::set()
handling environment array management and ownership tracking.

Registered for x86_64, aarch64, and riscv architectures. Integration
tests cover overwrite/no-overwrite semantics, empty/invalid names,
empty values, and repeated replacement.
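The semantics those integration tests cover are the ones POSIX specifies for `setenv()`. The sketch below exercises them against whatever libc the host provides, not against LLVM libc or `EnvironmentManager`; the helper name and the `GR_DEMO_VAR` variable are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <cstring>
#include <stdlib.h> // POSIX declaration of setenv()

// Returns true if the host setenv() shows the POSIX-specified behaviour.
inline bool setenvBehavesAsPosix() {
  // Initial set, then a no-overwrite call that must keep the old value.
  if (setenv("GR_DEMO_VAR", "first", 1) != 0) return false;
  if (setenv("GR_DEMO_VAR", "second", 0) != 0) return false;
  if (std::strcmp(std::getenv("GR_DEMO_VAR"), "first") != 0) return false;

  // A non-zero overwrite flag replaces the value.
  if (setenv("GR_DEMO_VAR", "second", 1) != 0) return false;
  if (std::strcmp(std::getenv("GR_DEMO_VAR"), "second") != 0) return false;

  // An empty name, or a name containing '=', fails with EINVAL.
  errno = 0;
  if (setenv("", "x", 1) != -1 || errno != EINVAL) return false;
  errno = 0;
  if (setenv("A=B", "x", 1) != -1 || errno != EINVAL) return false;

  // An empty value is valid and is stored as an empty string.
  if (setenv("GR_DEMO_VAR", "", 1) != 0) return false;
  return std::getenv("GR_DEMO_VAR")[0] == '\0';
}
```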

Assisted-by: Automated tooling, human reviewed.

---------

Co-authored-by: Michael Jones <michaelrj@google.com>
…6530)

tryCombineAllImpl queries target info for every instruction. Cache
TargetInstrInfo/TargetRegisterInfo/RegisterBankInfo in CombinerHelper
and pass to executeMatchTable instead.

This avoids repeated virtual calls on the combiner executeMatchTable
path.

CTMark -0.08% geomean improvement on aarch64-O0-g.

https://llvm-compile-time-tracker.com/compare.php?from=71fef6d5a306d1adf8bf7d30d2fe9e286380fecf&to=13bc49510657450402c066098e3a4b7d1af9d0e6&stat=instructions%3Au

Assisted-by: codex
Pack DILocation fields before hashing. Now that the column is 16 bits,
Line/Column/ImplicitCode fit in one 64-bit value (32 + 16 + 1 = 49 bits),
and AtomGroup and AtomRank also fit cleanly in one 64-bit value (61 + 3
= 64 bits).

Fewer hash_combine inputs on the hot DILocation path is a small
compile-time improvement.
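The bit budget above can be made concrete with a packing sketch. The field widths come from the description; the bit positions and function names are assumptions made for illustration and are not taken from the patch.

```cpp
#include <cstdint>

// 32-bit line, 16-bit column, 1-bit implicit-code flag: 49 bits in total.
constexpr std::uint64_t packLineColumn(std::uint32_t line, std::uint16_t column,
                                       bool implicitCode) {
  return static_cast<std::uint64_t>(line) |
         (static_cast<std::uint64_t>(column) << 32) |
         (static_cast<std::uint64_t>(implicitCode) << 48);
}

// 61-bit atom group plus 3-bit atom rank: exactly 64 bits.
// Inputs are masked to their stated widths.
constexpr std::uint64_t packAtom(std::uint64_t atomGroup,
                                 std::uint8_t atomRank) {
  constexpr std::uint64_t groupMask = (std::uint64_t{1} << 61) - 1;
  return (atomGroup & groupMask) |
         (static_cast<std::uint64_t>(atomRank & 0x7u) << 61);
}
```

Hashing two packed words instead of five separate fields is what reduces the number of `hash_combine` inputs.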

CTMark geomean:
- stage1-ReleaseLTO-g: -0.10%
- stage1-O0-g: -0.23%
- stage1-aarch64-O0-g: -0.19%
- stage2-O0-g: -0.07%

https://llvm-compile-time-tracker.com/compare.php?from=71fef6d5a306d1adf8bf7d30d2fe9e286380fecf&to=1d80b5f5aa98561d2ba09adc3f20c3eacd24cb88&stat=instructions%3Au

Assisted-by: codex
Loop Fusion has used Dependence Analysis (DA) as the default dependence
check since the option default was flipped in llvm#187309. The SCEV-based
strategy and the combined "all" mode were retained only for fallback and
experimentation, with a comment noting that the SCEV code would be
removed in a follow-up.

This patch removes the SCEV-based dependence path and the now-unused
selector machinery.

Fixes llvm#194821.

Assisted by Cursor.
File::write_unlocked(const wchar_t*, size_t) checked 'write_res.value <
1' after writing a converted UTF-8 sequence. For multi-byte characters,
a short platform write (e.g. 2 of 3 bytes for a 3-byte character) passed
this check and was counted as a successful write. The output stream
would then contain an incomplete UTF-8 sequence with no error reported
to the caller.

Changed the check to 'write_res.value < char_size' and set the error
indicator on the stream when it triggers.

Added a regression test using a mock File subclass that limits
platform_write to 2 bytes per call, simulating short writes on pipes and
sockets.
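The difference between the two checks is easiest to see for the case in the description, a 3-byte character with only 2 bytes written. This is a sketch with hypothetical names: `WriteResult` stands in for the result of the platform write, and the two predicates model only the comparison, not the surrounding `File::write_unlocked` logic.

```cpp
#include <cstddef>

// Hypothetical stand-in: number of bytes the platform write reported.
struct WriteResult {
  std::size_t value;
};

// Old check: only a write of zero bytes was treated as a failure.
constexpr bool oldCheckFails(WriteResult r) { return r.value < 1; }

// New check: anything short of the full encoded character is a failure.
constexpr bool newCheckFails(WriteResult r, std::size_t charSize) {
  return r.value < charSize;
}
```

With `WriteResult{2}` and a character size of 3, the old check lets the short write through while the new one rejects it.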

Assisted-by: Automated tooling, human reviewed.

---------

Co-authored-by: Michael Jones <michaelrj@google.com>
…lvm#193939)

Fences and other synchronizing operations (such as atomic accesses
stronger than monotonic) are modelled as reading and writing all memory,
in order to enforce their implied ordering constraints.

Currently, this happens even for identified function locals that do not
escape. This patch excludes those objects.

Notably, we can *not* reason based on captures-before here, because the
synchronizing operation still has an effect even if the object only
escapes *later*.

The hope here is that with this restriction in place, it may be viable
to respect potential synchronization inside non-nosync function calls.
This fixes ce6605a.

Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>
…lvm#196501)

and re-enable it on more targets.

I don't think this test was intended to check for alignment. Those
expectations were added as part of FileCheck-izing the test in
e29dadb and we've been working around
them or xfailing the test since.
Upstreaming ClangIR PR: llvm/clangir#2052

This PR adds support for lowering the `__builtin_amdgcn_ds_swizzle*` AMDGPU
builtin to ClangIR.
The original address used for the "fake breakpoint" is not valid in
Thumb mode. To be safe, change it to have 0's in the LSBs.
…196010)

Just like we do with the first parameter of a regular
`__builtin_object_size` call.

This still doesn't fix the bigger bos test cases since e.g.
```c++
int NoViableOverloadObjectSize3(void *const p PS(3))
    __attribute__((overloadable)) {
  return __builtin_object_size(p, 3);
}
void test4(struct Foo *t) {
  gi = NoViableOverloadObjectSize3(&t[1].t[1]);
}
```
is still broken because we don't have special handling for
`&t[1].t[1]` here and we can't usually access a one-past-end
pointer.
)

This changes the instruction we use to extract the high half of a vector
register from an `ext v0, v1, v1, 8` to a `dup d0, v1.d[1]`. This is
apparently slightly quicker on certain CPUs and is generally a simpler
instruction. This matches the instruction that GlobalISel produces.

Some of the old patterns for extract_subvector with index of 1 seem
incorrect but were never used as we do not reach selection with such
instructions. They have been repurposed to emit the new DUPi64
instructions.
Following these two discussions:
* https://discourse.llvm.org/t/rfc-mention-our-ai-policy-in-the-greeting-message-for-first-time-contributors/,
* https://discourse.llvm.org/t/concerns-about-influx-of-ai-generated-bug-fixes/,

add a reference to the LLVM AI policy in the GH greeter. 

In addition:
* Update the message to include links to other relevant policies as
  well, since these are often shared during PR review.
* Add FAQ section and move some of the original content there.
* Include a request for people to confirm that they have familiarised themselves with
  the policies.
* Add `Hello @{self.author} 👋` to make the greeting more personal.
lukel97 pushed a commit that referenced this pull request May 11, 2026
…input" (llvm#195551)

Reverts llvm#190863 due to buildbot breakage e.g.,
https://lab.llvm.org/buildbot/#/builders/52/builds/16951

```
Failed Tests (1):
  LLVM :: tools/llvm-profgen/filter-build-id.test
```
```
==llvm-profgen==3809550==ERROR: AddressSanitizer: container-overflow on address 0x6e80441e1762 at pc 0x6216c3f2cdce bp 0x7fff3c3ddf60 sp 0x7fff3c3dd710
READ of size 8 at 0x6e80441e1762 thread T0
    #0 0x6216c3f2cdcd in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:848:7
    #1 0x6216c3f2d25c in bcmp /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:894:10
    #2 0x6216c400b836 in operator== /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/StringRef.h:914:10
    #3 0x6216c400b836 in operator!= /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/StringRef.h:917:69
    #4 0x6216c400b836 in llvm::sampleprof::PerfScriptReader::extractCallstack(llvm::sampleprof::TraceStream&, llvm::SmallVectorImpl<unsigned long>&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:801:36
    #5 0x6216c400d37a in llvm::sampleprof::HybridPerfReader::parseSample(llvm::sampleprof::TraceStream&, unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:881:8
    #6 0x6216c40150d8 in parseSample /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1118:3
    #7 0x6216c40150d8 in llvm::sampleprof::PerfScriptReader::parseEventOrSample(llvm::sampleprof::TraceStream&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1201:5
    #8 0x6216c401539a in llvm::sampleprof::PerfScriptReader::parseAndAggregateTrace() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1210:5
    #9 0x6216c4018c88 in llvm::sampleprof::PerfScriptReader::parsePerfTraces() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1457:3
    #10 0x6216c3ff2c7a in main /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/llvm-profgen.cpp:229:19
    #11 0x72404502a8c0  (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a8c0) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb)
    #12 0x72404502a9d7 in __libc_start_main (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a9d7) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb)
    #13 0x6216c3f0f3d4 in _start (/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-profgen+0x2f083d4)
0x6e80441e1762 is located 18 bytes inside of 48-byte region [0x6e80441e1750,0x6e80441e1780)
allocated by thread T0 here:
    #0 0x6216c3feab0d in operator new(unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:109:35
    #1 0x724045511c07 in __libcpp_allocate<char> /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__new/allocate.h:42:28
    #2 0x724045511c07 in allocate /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocator.h:92:14
    #3 0x724045511c07 in allocate_at_least /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocator.h:99:13
    #4 0x724045511c07 in allocate_at_least<std::__1::allocator<char> > /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocator_traits.h:340:22
    #5 0x724045511c07 in __allocate_at_least<std::__1::allocator<char> > /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocate_at_least.h:36:16
    #6 0x724045511c07 in __allocate_long_buffer /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/string:2259:21
    #7 0x724045511c07 in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__grow_by(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/string:2769:25
    #8 0x6216c401d90a in __grow_by_without_replace /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/string:2795:3
    #9 0x6216c401d90a in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>& std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::append[abi:sqn230000]<char const*, 0>(char const*, char const*) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/string:1431:9
    #10 0x6216c401d1a6 in std::__1::basic_istream<char, std::__1::char_traits<char>>& std::__1::getline[abi:sqn230000]<char, std::__1::char_traits<char>, std::__1::allocator<char>>(std::__1::basic_istream<char, std::__1::char_traits<char>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, char) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/istream:1309:15
    #11 0x6216c4014a76 in getline<char, std::__1::char_traits<char>, std::__1::allocator<char> > /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/istream:1343:10
    #12 0x6216c4014a76 in advance /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.h:52:10
    #13 0x6216c4014a76 in llvm::sampleprof::PerfScriptReader::parseAggregatedCount(llvm::sampleprof::TraceStream&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1110:13
    #14 0x6216c4015095 in parseSample /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1116:20
    #15 0x6216c4015095 in llvm::sampleprof::PerfScriptReader::parseEventOrSample(llvm::sampleprof::TraceStream&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1201:5
    #16 0x6216c401539a in llvm::sampleprof::PerfScriptReader::parseAndAggregateTrace() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1210:5
    #17 0x6216c4018c88 in llvm::sampleprof::PerfScriptReader::parsePerfTraces() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1457:3
    #18 0x6216c3ff2c7a in main /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/llvm-profgen.cpp:229:19
    #19 0x72404502a8c0  (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a8c0) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb)
    #20 0x72404502a9d7 in __libc_start_main (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a9d7) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb)
    #21 0x6216c3f0f3d4 in _start (/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-profgen+0x2f083d4)
```
lukel97 pushed a commit that referenced this pull request May 11, 2026
… new constant interpreter (llvm#194851)

**Problem:**

A crash is triggered by clangd's hover feature when using C++23 and the
new bytecode interpreter, which calls `Expr::EvaluateAsRValue()` to
attempt constant folding on an expression under the cursor, even when it
is not a valid constant expression.

Tested versions: 22.1.3, Trunk (x86_64-pc-linux-gnu)

**How to reproduce:**
```cpp
struct S { void f(); };
void g() { S s; s.f(); }
```
Running `clangd --check=repro.cpp` with a `compile_flags.txt` containing
`-std=c++23 -fexperimental-new-constant-interpreter` crashes with:

`Assertion ItemTypes.back() == toPrimType<T>() failed.`

You can observe the same crash by hovering over STL iterators like
`vec.begin()`.

**Relevant Stack Trace:**
```text
#8  clang::interp::InterpStack::pop<MemberPointer>()
#9  clang::interp::EvalEmitter::emitRet(PrimType, SourceInfo)
#10 clang::interp::Compiler<EvalEmitter>::visitExpr(Expr const*, bool)
#11 clang::interp::EvalEmitter::interpretExpr(Expr const*, bool, bool)
#12 clang::interp::Context::evaluateAsRValue(State&, Expr const*, APValue&)
#13 EvaluateAsRValue(EvalInfo&, Expr const*, APValue&)
#14 clang::Expr::EvaluateAsRValue(EvalResult&, ASTContext const&, bool) const
#15 clangd::(anon)::printExprValue(Expr const*, ASTContext const&)
#16 clangd::(anon)::printExprValue(SelectionTree::Node const*, ASTContext const&)
#17 clangd::getHover(...)
```
*Basically: `textDocument/hover` → `getHover` → `EvaluateAsRValue` → new
constant interpreter → `MemberPointer` type mismatch on stack pop.*

When `Compiler<Emitter>::VisitMemberExpr()` encounters a non-static
`CXXMethodDecl` member (a bound member function expression such as `s.f`
in `s.f()`), it falls through to `visitDeclRef()`. This pushes a `FnPtr`
onto the interpreter stack. However, the caller expects a
`MemberPointer`, causing an assertion failure in `InterpStack::pop()`.

**Fix:**

* In `VisitMemberExpr()`, bail out early (`return false`) when the
member is a non-static `CXXMethodDecl`, before reaching
`visitDeclRef()`. This causes `EvaluateAsRValue()` to report failure
gracefully rather than crashing. Bound member function expressions
(`s.f`) are not valid constant expressions, so returning `false` should
be semantically correct.
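
The mismatch and the early bail-out can be illustrated with a toy model. This
is only a sketch under stated assumptions, not the actual interpreter code:
`ToyStack`, `ToyMember`, `Outcome`, `visitMemberUnguarded`,
`visitMemberGuarded` and `evaluate` are hypothetical names invented for
illustration, and the toy stack reports a mismatch instead of asserting.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the interpreter's primitive type tags.
enum class PrimKind { FnPtr, MemberPointer };

// Toy typed stack: remembers the kind of every pushed item.
class ToyStack {
public:
  void push(PrimKind K) { Kinds.push_back(K); }

  // Returns false on a kind mismatch (the real stack asserts instead).
  bool pop(PrimKind Expected) {
    if (Kinds.empty() || Kinds.back() != Expected)
      return false;
    Kinds.pop_back();
    return true;
  }

private:
  std::vector<PrimKind> Kinds;
};

// Hypothetical model of the member referenced by a member expression.
struct ToyMember {
  bool IsNonStaticMethod;
};

enum class Outcome { Ok, GracefulFailure, StackMismatch };

// Without the guard: every member falls through and pushes a FnPtr.
bool visitMemberUnguarded(ToyStack &S, const ToyMember &) {
  S.push(PrimKind::FnPtr);
  return true;
}

// With the guard: non-static methods are rejected before anything is pushed.
bool visitMemberGuarded(ToyStack &S, const ToyMember &M) {
  if (M.IsNonStaticMethod)
    return false;
  S.push(PrimKind::FnPtr);
  return true;
}

// Toy caller: expects a MemberPointer for a bound member function.
template <typename Visit>
Outcome evaluate(Visit V, const ToyMember &M) {
  ToyStack S;
  if (!V(S, M))
    return Outcome::GracefulFailure;
  PrimKind Expected =
      M.IsNonStaticMethod ? PrimKind::MemberPointer : PrimKind::FnPtr;
  return S.pop(Expected) ? Outcome::Ok : Outcome::StackMismatch;
}
```

In this model, `evaluate(visitMemberUnguarded, ToyMember{true})` ends in
`Outcome::StackMismatch`, while the guarded visitor yields
`Outcome::GracefulFailure` for the same input.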

**Testing:**
* Added AST unit test
(`EvaluateAsRValue.FailsGracefullyOnBoundMemberExpr`) that directly
isolates a bound `MemberExpr` and passes it to `EvaluateAsRValue()`,
asserting it returns `false` without crashing.

* Added clangd hover test
(`Hover.NoCrashOnBoundMemberFunctionWithNewInterpreter`) that reproduces
the original crash scenario.

* *Note:* I could not add a Lit test because I believe this is
unreachable via normal `clang` invocations. `Sema` strictly catches
isolated bound member functions before constant evaluation. `clangd` has
a unique path to triggering this.

**Root cause:**

This is exposed by C++23 specifically, I think due to P2280R4 / P2448R2,
which:

- Relax the rules around "unknown" objects in constant evaluation,
allowing `s` in `s.f()` to proceed past the base object check even
though `s` is not constexpr, and defer failures to bytecode execution
rather than rejecting them structurally.

Assisted-by: gemini-cli

@tbaederr

---------

Co-authored-by: Timm Baeder <tbaeder@redhat.com>