Loop vectorize/simplify select or#6

Open
lukel97 wants to merge 3336 commits into main from loop-vectorize/simplify-select-or

Conversation


@lukel97 lukel97 commented Apr 2, 2026

No description provided.

@lukel97 lukel97 force-pushed the loop-vectorize/simplify-select-or branch from c1a80ac to 998e999 Compare April 2, 2026 12:08

lukel97 commented Apr 2, 2026

/test-suite

@lukel97 lukel97 force-pushed the loop-vectorize/simplify-select-or branch from 998e999 to 0765065 Compare April 2, 2026 15:07
@lukel97 lukel97 force-pushed the main branch 7 times, most recently from a4b2f02 to cc3e4c6 Compare April 6, 2026 02:59
lukel97 pushed a commit that referenced this pull request Apr 9, 2026
Running gcc test c-c++-common/tsan/tls_race.c on s390 we get:

ThreadSanitizer: CHECK failed: tsan_platform_linux.cpp:618 "((thr_beg))
>= ((tls_addr))" (0x3ffaa35e140, 0x3ffaa35e250) (tid=2419930)
#0 __tsan::CheckUnwind() /devel/src/libsanitizer/tsan/tsan_rtl.cpp:696
(libtsan.so.2+0x91b57)
#1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long
long, unsigned long long)
/devel/src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86
(libtsan.so.2+0xd211b)
#2 __tsan::ImitateTlsWrite(__tsan::ThreadState*, unsigned long, unsigned
long) /devel/src/libsanitizer/tsan/tsan_platform_linux.cpp:618
(libtsan.so.2+0x8faa3)
#3 __tsan::ThreadStart(__tsan::ThreadState*, unsigned int, unsigned long
long, __sanitizer::ThreadType)
/devel/src/libsanitizer/tsan/tsan_rtl_thread.cpp:225
(libtsan.so.2+0xaadb5)
#4 __tsan_thread_start_func
/devel/src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1065
(libtsan.so.2+0x3d34d)
#5 start_thread <null> (libc.so.6+0xae70d) (BuildId:
d3b08de1b543c2d15d419bf861b3c2e4c01ac75b)
#6 thread_start <null> (libc.so.6+0x12d2ff) (BuildId:
d3b08de1b543c2d15d419bf861b3c2e4c01ac75b)

In order to determine the static TLS blocks in GetStaticTlsBoundary we
iterate over the modules and try to find the largest range without a
gap. Here we might have that modules are spaced exactly by the
alignment. For example, for the failing test we have:

(gdb) p/x ranges.data_[0]
$1 = {begin = 0x3fff7f9e6b8, end = 0x3fff7f9e740, align = 0x8, tls_modid = 0x3}
(gdb) p/x ranges.data_[1]
$2 = {begin = 0x3fff7f9e740, end = 0x3fff7f9eed0, align = 0x40, tls_modid = 0x2}
(gdb) p/x ranges.data_[2]
$3 = {begin = 0x3fff7f9eed8, end = 0x3fff7f9eef8, align = 0x8, tls_modid = 0x4}
(gdb) p/x ranges.data_[3]
$4 = {begin = 0x3fff7f9eefc, end = 0x3fff7f9ef00, align = 0x4, tls_modid = 0x1}

where ranges[3].begin == ranges[2].end + ranges[3].align holds. Since in
the loop a strict inequality test is used we compute the wrong address

(gdb) p/x *addr
$5 = 0x3fff7f9eefc

whereas 0x3fff7f9e6b8 is expected, which is why we bail out in the
subsequent check.
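
The effect of the strict comparison can be reproduced with a small sketch. Everything here is illustrative: `TlsRange` and `LowestAdjacentBegin` are hypothetical names, the walk is assumed to start at the range with `tls_modid == 1` (index 3), and a non-strict comparison is only one plausible way to address the problem, not necessarily the change that was committed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the range records printed by gdb above.
struct TlsRange {
  uint64_t begin;
  uint64_t end;
  uint64_t align;
  unsigned tls_modid;
};

// Starting at ranges[start], extend downwards while the previous range is
// considered adjacent, and return the lowest begin address reached.
// `inclusive` selects between the strict (<) and non-strict (<=) gap test.
uint64_t LowestAdjacentBegin(const std::vector<TlsRange> &ranges,
                             size_t start, bool inclusive) {
  assert(start < ranges.size());
  size_t l = start;
  while (l != 0) {
    const uint64_t limit = ranges[l - 1].end + ranges[l].align;
    const bool adjacent =
        inclusive ? ranges[l].begin <= limit : ranges[l].begin < limit;
    if (!adjacent)
      break;
    --l;
  }
  return ranges[l].begin;
}

// The four ranges from the failing s390 run.
std::vector<TlsRange> FailingRanges() {
  return {
      {0x3fff7f9e6b8, 0x3fff7f9e740, 0x8, 3},
      {0x3fff7f9e740, 0x3fff7f9eed0, 0x40, 2},
      {0x3fff7f9eed8, 0x3fff7f9eef8, 0x8, 4},
      {0x3fff7f9eefc, 0x3fff7f9ef00, 0x4, 1},
  };
}
```

With the strict test the walk stops immediately at index 3 and yields 0x3fff7f9eefc; with the non-strict test it reaches index 0 and yields 0x3fff7f9e6b8.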
lukel97 pushed a commit that referenced this pull request Apr 16, 2026
…bols add' (llvm#188377)

Context: 
lldb might crash when the debuggee is run to a crashing state and a
`target symbols add` command is then issued.
Backtrace: 
```
 #0 0x000055ca6790dc65 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:848:11
 #1 0x000055ca6790e434 PrintStackTraceSignalHandler(void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:931:1
 #2 0x000055ca6790b839 llvm::sys::RunSignalHandlers() /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Signals.cpp:104:5
 #3 0x000055ca6790ff6b SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:430:38
 #4 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0
 #5 0x00007fe9e5f25649 syscall /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/misc/../sysdeps/unix/sysv/linux/x86_64/syscall.S:38:0
 #6 0x00007fe9ec649170 SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:429:7
 #7 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0
 #8 0x00007fe9ebb77bf0 lldb_private::operator<(lldb_private::StackID const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackID.cpp:99:16
 #9 0x00007fe9ebb6863d CompareStackID(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:683:3
#10 0x00007fe9ebb6d049 bool __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>::operator()<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/predefined_ops.h:196:4
#11 0x00007fe9ebb6cefe __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::__lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algobase.h:1464:8
#12 0x00007fe9ebb6cdfc __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algo.h:2062:14
#13 0x00007fe9ebb685fa auto llvm::lower_bound<std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/STLExtras.h:2001:10
#14 0x00007fe9ebb68441 lldb_private::StackFrameList::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:697:11
#15 0x00007fe9ebbee395 lldb_private::Thread::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/include/lldb/Target/Thread.h:459:7
#16 0x00007fe9ebac7cf7 lldb_private::ExecutionContextRef::GetFrameSP() const /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:643:25
#17 0x00007fe9ebac80e1 lldb_private::GetStoppedExecutionContext(lldb_private::ExecutionContextRef const*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:164:34
#18 0x00007fe9eb8903fa lldb_private::Statusline::Redraw(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Statusline.cpp:139:7
#19 0x00007fe9eb7ac8be lldb_private::Debugger::RedrawStatusline(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1233:3
#20 0x00007fe9eb804d1e lldb_private::IOHandlerEditline::RedrawCallback() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:446:3
#21 0x00007fe9eb80aa81 lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2::operator()() const /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:262:73
#22 0x00007fe9eb80aa5d void llvm::detail::UniqueFunctionBase<void>::CallImpl<lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2>(void*) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:213:5
#23 0x00007fe9eb93bfbf llvm::unique_function<void ()>::operator()() /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:365:5
#24 0x00007fe9eb93bb80 lldb_private::Editline::GetCharacter(wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:0:5
#25 0x00007fe9eb941a18 lldb_private::Editline::ConfigureEditor(bool)::$_0::operator()(editline*, wchar_t*) const /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1287:5
#26 0x00007fe9eb9419e2 lldb_private::Editline::ConfigureEditor(bool)::$_0::__invoke(editline*, wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1286:27
#27 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:439:14
#28 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:400:1
#29 0x00007fe9f3384f90 read_getcmd /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:247:14
#30 0x00007fe9f3384f90 el_gets /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:586:14
#31 0x00007fe9eb9409f3 lldb_private::Editline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1636:16
#32 0x00007fe9eb8044d7 lldb_private::IOHandlerEditline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:339:5
#33 0x00007fe9eb805609 lldb_private::IOHandlerEditline::Run() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:600:11
#34 0x00007fe9eb7b214c lldb_private::Debugger::RunIOHandlers() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1280:16
#35 0x00007fe9eb98f00f lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:3620:16
#36 0x00007fe9eb4f0e09 lldb::SBDebugger::RunCommandInterpreter(bool, bool) /home/hyubo/osmeta/external/llvm-project/lldb/source/API/SBDebugger.cpp:1234:42
#37 0x000055ca6788d6b0 Driver::MainLoop() /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:677:3
#38 0x000055ca6788e226 main /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:887:17
#39 0x00007fe9e5e2c657 __libc_start_call_main /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#40 0x00007fe9e5e2c718 call_init /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:128:20
#41 0x00007fe9e5e2c718 __libc_start_main@GLIBC_2.2.5 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:379:5
#42 0x000055ca67889a11 _start /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:118:0
Segmentation fault (core dumped)
```

When `target symbols add` is run, `Symtab::AddSymbol()` can reallocate
the underlying `std::vector<Symbol>` and resize it, invalidating all
existing Symbol* pointers. While `Process::Flush()` clears stale stack
frames, the statusline caches its own `ExecutionContextRef` containing a
`StackID` with a `SymbolContextScope*` (which can be a `Symbol*`). This
cached reference is not cleared by `Process::Flush()`, so the next
statusline redraw accesses a dangling pointer and crashes.

Fix this by adding `Statusline::Flush()` which clears the cached frame,
`Debugger::Flush()` which forwards to it under the statusline mutex, and
calling `Debugger::Flush()` from `Process::Flush()` so that all flush
paths (symbol add, exec, module load) also invalidate the statusline's
stale state.
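
The flush chain described above can be pictured with a toy model. This is a sketch only: the classes below borrow lldb's names for readability but are simplified stand-ins, not lldb's real interfaces, and `Cache`, `HasCachedSymbol`, and `AddSymbols` are hypothetical helpers.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct Symbol {
  std::string name;
};

// Owns the symbols. Adding one may reallocate the vector and thereby
// invalidate every Symbol* handed out earlier.
struct Symtab {
  std::vector<Symbol> symbols;
  void AddSymbol(std::string name) { symbols.push_back({std::move(name)}); }
};

// Caches a raw pointer into the symbol table, standing in for the cached
// frame state described above.
class Statusline {
public:
  void Cache(const Symbol *symbol) { cached_ = symbol; }
  void Flush() { cached_ = nullptr; }
  bool HasCachedSymbol() const { return cached_ != nullptr; }

private:
  const Symbol *cached_ = nullptr;
};

class Debugger {
public:
  Statusline &statusline() { return statusline_; }
  // Forwards to the statusline while holding its mutex.
  void Flush() {
    std::lock_guard<std::mutex> guard(statusline_mutex_);
    statusline_.Flush();
  }

private:
  std::mutex statusline_mutex_;
  Statusline statusline_;
};

struct Process {
  Debugger &debugger;
  Symtab &symtab;
  // Every flush path also drops the statusline's cached pointer.
  void Flush() { debugger.Flush(); }
  void AddSymbols(const std::vector<std::string> &names) {
    for (const std::string &name : names)
      symtab.AddSymbol(name);
    Flush();
  }
};
```

The point of routing through `Process::Flush()` in this model is that the cache is cleared before anything can redraw with a pointer into the old vector storage.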

After this fix, lldb no longer crashes, and new symbols from a symbol
file are loaded correctly.

---------

Co-authored-by: George Hu <georgehuyubo@gmail.com>
GkvJwa and others added 17 commits May 11, 2026 10:46
Add an MSVC-compatible <arm64_neon.h> resource header that forwards to
Clang's generated <arm_neon.h>. This lets ARM64 Windows code using the
MSVC header name lower NEON intrinsics through Clang builtins instead of
leaving external neon_* calls such as neon_ld1m4_q32.

Fixes llvm#195683
…llvm#196494)

The follow-up [patch](llvm#196080)
folds some idempotent binary ops. This test has a `sub x - x`
operation, which is affected by the follow-up patch. This patch makes
the test immune to the fold.
…rPHI for faster compile time. (llvm#183726)

SliceUpIllegalIntegerPHI searches for PHIs that have an illegal type and
are only used by trunc or trunc(lshr) operations. It bails out if it
encounters invoke or EH pad instructions.
It first checks whether it encounters an invoke or EH pad, which is time
consuming because it checks every instruction. Then it checks whether the
PHI is used by trunc or trunc(lshr). The former check is generally loose,
while the latter one is stricter. Switching the order of the checks speeds
up compilation.

Signed-off-by: XinlongZHANG-Bob <zhangxinlong.bob@bytedance.com>
Targets supporting SVE prefer SVE for ctpop with fixed-length vectors.
Update the cost model to reflect this.
Moves the declarations of the NVVM dialect and some widely used enums
(`FPRoundingModeAttr` and `SaturationModeAttr`) to separate files to make
them easier to maintain and also use in the NVGPU dialect.
…lvm#195794)

So the attached test case works even though it's just an `InitListExpr`.
Add the POSIX setenv() function, with EnvironmentManager::set()
handling environment array management and ownership tracking.

Registered for x86_64, aarch64, and riscv architectures. Integration
tests cover overwrite/no-overwrite semantics, empty/invalid names,
empty values, and repeated replacement.
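
For reference, the POSIX semantics that these integration tests cover can be exercised against the host C library. This sketch calls the host's `setenv`, not the LLVM-libc implementation added by this commit, and `ReadEnv`, `SetAndRead`, and `RejectedAsInvalid` are hypothetical helpers.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <stdlib.h>  // setenv is POSIX and is declared here
#include <string>

// Returns the value of `name`, or an empty string when it is unset.
std::string ReadEnv(const char *name) {
  const char *value = std::getenv(name);
  return value ? std::string(value) : std::string();
}

// Calls setenv and returns the value visible afterwards.
std::string SetAndRead(const char *name, const char *value, bool overwrite) {
  const int rc = setenv(name, value, overwrite ? 1 : 0);
  assert(rc == 0);
  (void)rc;
  return ReadEnv(name);
}

// Returns true when setenv rejects `name` with EINVAL, which POSIX requires
// for an empty name or a name containing '='.
bool RejectedAsInvalid(const char *name) {
  errno = 0;
  return setenv(name, "x", 1) == -1 && errno == EINVAL;
}
```

With `overwrite` set to zero an existing variable keeps its old value, while a non-zero `overwrite` replaces it, including with an empty value.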

Assisted-by: Automated tooling, human reviewed.

---------

Co-authored-by: Michael Jones <michaelrj@google.com>
…6530)

tryCombineAllImpl queries target info for every instruction. Cache
TargetInstrInfo/TargetRegisterInfo/RegisterBankInfo in CombinerHelper
and pass to executeMatchTable instead.

This avoids repeated virtual calls on the combiner executeMatchTable
path.

CTMark -0.08% geomean improvement on aarch64-O0-g.

https://llvm-compile-time-tracker.com/compare.php?from=71fef6d5a306d1adf8bf7d30d2fe9e286380fecf&to=13bc49510657450402c066098e3a4b7d1af9d0e6&stat=instructions%3Au

Assisted-by: codex
Pack DILocation fields before hashing. Now that column is 16 bits,
Line/Column/ImplicitCode fit in one 64-bit value (32 + 16 + 1 = 49 bits),
and AtomGroup and AtomRank also fit cleanly in one 64-bit value (61 + 3
= 64 bits).

Fewer hash_combine inputs on the hot DILocation path is a small
compile-time improvement.
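
The bit budget can be checked with a short sketch. The field order and the helper names `PackLineColumn` and `PackAtom` are assumptions made for illustration; the commit message only states the field widths.

```cpp
#include <cassert>
#include <cstdint>

// Line (32 bits) | Column (16 bits) | ImplicitCode (1 bit) = 49 bits.
constexpr uint64_t PackLineColumn(uint32_t line, uint16_t column,
                                  bool implicit_code) {
  return static_cast<uint64_t>(line) |
         (static_cast<uint64_t>(column) << 32) |
         (static_cast<uint64_t>(implicit_code) << 48);
}

// AtomGroup (61 bits) | AtomRank (3 bits) = 64 bits.
inline uint64_t PackAtom(uint64_t atom_group, uint8_t atom_rank) {
  assert(atom_group < (1ULL << 61) && atom_rank < 8);
  return atom_group | (static_cast<uint64_t>(atom_rank) << 61);
}

// With every field at its maximum, exactly the low 49 bits are set.
static_assert(PackLineColumn(0xFFFFFFFFu, 0xFFFFu, true) == (1ULL << 49) - 1,
              "line, column and implicit-code occupy 49 bits");
```

Five separate hash inputs become two 64-bit words in this layout, which is where the reduction in `hash_combine` inputs would come from.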

CTMark geomean:
- stage1-ReleaseLTO-g: -0.10%
- stage1-O0-g: -0.23%
- stage1-aarch64-O0-g: -0.19%
- stage2-O0-g: -0.07%

https://llvm-compile-time-tracker.com/compare.php?from=71fef6d5a306d1adf8bf7d30d2fe9e286380fecf&to=1d80b5f5aa98561d2ba09adc3f20c3eacd24cb88&stat=instructions%3Au

Assisted-by: codex
Loop Fusion has used Dependence Analysis (DA) as the default dependence
check since the option default was flipped in llvm#187309. The SCEV-based
strategy and the combined "all" mode were retained only for fallback and
experimentation, with a comment noting that the SCEV code would be
removed in a follow-up.

This patch removes the SCEV-based dependence path and the now-unused
selector machinery.

Fixes llvm#194821.

Assisted by Cursor.
File::write_unlocked(const wchar_t*, size_t) checked 'write_res.value <
1' after writing a converted UTF-8 sequence. For multi-byte characters,
a short platform write (e.g. 2 of 3 bytes for a 3-byte character) passed
this check and was counted as a successful write. The output stream
would then contain an incomplete UTF-8 sequence with no error reported
to the caller.

Changed the check to 'write_res.value < char_size' and set the error
indicator on the stream when it triggers.

Added a regression test using a mock File subclass that limits
platform_write to 2 bytes per call, simulating short writes on pipes and
sockets.
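
A reduced model of the check, assuming a platform write that reports how many bytes it accepted. `PlatformWrite`, `WriteEncodedChar`, and `MakeShortWriter` are hypothetical names and do not mirror the actual `File` class.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical platform write: returns the number of bytes it accepted.
using PlatformWrite = std::function<size_t(const char *, size_t)>;

// Writes one already-encoded UTF-8 character and reports success only if all
// `char_size` bytes reached the platform layer. On a short write the error
// flag is set, which models setting the stream's error indicator.
bool WriteEncodedChar(const PlatformWrite &platform_write, const char *bytes,
                      size_t char_size, bool &error_flag) {
  assert(char_size >= 1 && char_size <= 4);
  const size_t written = platform_write(bytes, char_size);
  if (written < char_size) {  // the previous check was `written < 1`
    error_flag = true;
    return false;
  }
  return true;
}

// Mock sink that accepts at most `limit` bytes per call, simulating short
// writes on pipes and sockets.
PlatformWrite MakeShortWriter(std::string &sink, size_t limit) {
  return [&sink, limit](const char *data, size_t len) {
    const size_t n = len < limit ? len : limit;
    sink.append(data, n);
    return n;
  };
}
```

With a 2-byte limit, a 3-byte character such as U+20AC is now reported as a failure, whereas a 2-byte character still succeeds.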

Assisted-by: Automated tooling, human reviewed.

---------

Co-authored-by: Michael Jones <michaelrj@google.com>
…lvm#193939)

Fences and other synchronizing operations (such as atomic accesses
stronger than monotonic) are modelled as reading and writing all memory,
in order to enforce their implied ordering constraints.

Currently, this happens even for identified function locals that do not
escape. This patch excludes those objects.

Notably, we can *not* reason based on captures-before here, because the
synchronizing operation still has an effect even if the object only
escapes *later*.

The hope here is that with this restriction in place, it may be viable
to respect potential synchronization inside non-nosync function calls.
This fixes ce6605a.

Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>
RKSimon and others added 30 commits May 12, 2026 11:46
…lvm#197034)

Summary:
The current build uses a curated + deduplicated source list. This PR
seeks to simplify this a little bit and canonicalize the behavior.

Now we create the target up-front, `clc` and `opencl`. We add the
directories which add sources to this target. We normalize the
architecture to the variants. We always add target specific versions
first. When we add sources we check if the file already exists and defer
to the architecture specific one.

This normalizes the behavior: the directories are now laid out as
`clc/<arch>/<os>`. We normalize these to `amdgpu`, `nvptx`, and `spirv`
respectively. We use the OS for the newly created vulkan target. We now
control variants by checking whether the directory for that variant exists,
so it's nested more naturally.

Hopefully this makes more sense. The goal is to exercise the fact that
we have individual builds now. Previously this did not work because you
could not add_subdirectory more than once.
…97097)

Use `computeKnownBits` to tighten the high bit width bound via
`countMaxActiveBits()`, which accounts for known leading zeros.
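
As a rough illustration of how known leading zeros tighten a bit-width bound, here is a toy 32-bit model. `KnownBits32` and the free functions are hypothetical and only imitate the idea behind LLVM's `KnownBits`; they are not its API.

```cpp
#include <cassert>
#include <cstdint>

// Toy known-bits record: a set bit in `zero` means that bit of the value is
// known to be 0, and a set bit in `one` means it is known to be 1.
struct KnownBits32 {
  uint32_t zero;
  uint32_t one;
};

// Number of leading bits that are known to be zero.
unsigned CountMinLeadingZeros(const KnownBits32 &known) {
  unsigned count = 0;
  for (uint32_t mask = 0x80000000u; mask != 0 && (known.zero & mask) != 0;
       mask >>= 1)
    ++count;
  return count;
}

// Upper bound on the number of bits needed to represent the value.
unsigned CountMaxActiveBits(const KnownBits32 &known) {
  assert((known.zero & known.one) == 0);  // a bit cannot be both 0 and 1
  return 32 - CountMinLeadingZeros(known);
}
```

If the top 24 bits are known to be zero, for example, the value needs at most 8 bits, regardless of what is known about the lower bits.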


Co-Authored-By: Simon Pilgrim <RKSimon@users.noreply.github.com>
llvm#197163)

It obviously should use the `NotC` created 4 lines above
Add LIT_UNSUPPORTED support to lit, mirroring the existing LIT_XFAIL
implementation. This allows tests to be marked as UNSUPPORTED via
command line arguments (--unsupported, --unsupported-not) or environment
variables (LIT_UNSUPPORTED, LIT_UNSUPPORTED_NOT).

This feature enables users to dynamically mark tests as unsupported
without modifying test files, useful for CI/CD pipelines and
platform-specific test filtering.

Assisted by AI.
…lvm#197160)

This was using format() instead of formatv() by accident.
Instead of using the Cortex-A510 scheduling model, C1-Nano now uses
its own scheduling model, based off of the C1-Nano Software
Optimization Guide:

https://developer.arm.com/documentation/109590/0001
…-checks` is passed) (llvm#194006)

Fixes llvm#192713.

Currently, clang-tidy exits immediately if the only enabled checks are
`clang-diagnostic-*` ones. This prevents the reasonable use case where a
user isn't interested in any "native" clang-tidy checks and just wants
to use clang-tidy as a frontend for builtin clang warnings.
…rter.h (llvm#196860)

fixed_converter.h and float_hex_converter.h have local declarations with
the same name shadowing these, causing -Wshadow warnings. For now, just
don't have global declarations for these.
…nside OffloadBinary images (llvm#184774)

Enhance the llvm-offload-binary tool so that it can unbundle, with logic to
handle different cases related to spirv64-intel offload binary images.

It also allows extracting all images without requiring the use of --image
options, to simplify its use.

Assisted by Claude.
…lvm#195634)

Fixes llvm#180093
LINEAR variables on composite DO SIMD were being lowered onto omp.simd, which writes back unconditionally, causing a race inside PARALLEL. Move them to omp.wsloop instead, which already has correct last-iteration write-back with a barrier.
…#197110)

Pull the load-folding bookkeeping out of `PeepholeOptimizer::run` into a
new `PeepholeOptimizer::foldLoadInto` helper.

No functional change intended. 

This is a preliminary NFC split out from llvm#194662 per @RKSimon's review
suggestion:

  > Still think this is worth pulling out as its own NFC PR

The follow-up patch (llvm#194662) adds a second call site that folds a load
into an EFLAGS producer after `optimizeCmpInstr` erases the compare, and
will reuse this helper instead of duplicating the bookkeeping.
llvm-libc-types/stdint-macros.h does not exist. Not sure why this was
passing the CMake build, but this causes the bazel build to fail.
…ons (llvm#194709)

This patch extracts the three diagnostic forms currently duplicated
across the Custom and non-Custom branches of `check()` into a single
`emitDiag()` helper.
…m#197015)

[CWG743](https://wg21.link/cwg743) allows using `decltype` in a
*nested-name-specifier*, i.e.: `decltype(foo)::type`.
[CWG950](https://wg21.link/cwg950) allows using it as a
*base-specifier*, i.e.: `struct B : decltype(foo)`. Both these DRs were
resolved by [N3049](https://wg21.link/n3049).


Clang supports both of these since 3.1: https://godbolt.org/z/aohPs5zaa
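
A minimal example of the two forms, with `Foo`, `foo`, `FromNested`, and `B` as illustrative names:

```cpp
#include <cassert>
#include <type_traits>

struct Foo {
  using type = int;
  int value = 7;
};

Foo foo;

// CWG743: decltype used in a nested-name-specifier.
using FromNested = decltype(foo)::type;
static_assert(std::is_same<FromNested, int>::value,
              "decltype(foo)::type names Foo::type");

// CWG950: decltype used as a base-specifier.
struct B : decltype(foo) {};
static_assert(std::is_base_of<Foo, B>::value, "B derives from Foo");
```

Because `foo` is an unparenthesized variable name, `decltype(foo)` is `Foo` itself rather than a reference type, which is what makes both uses well-formed here.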
…88287)

This allows detecting when G_UNMERGE_VALUES extracts a hi16 element and
selecting `s_cvt_hi_f32_f16`, removing the need for a shift.
This feature was already implemented by
llvm#153641.

---------

Signed-off-by: yronglin <yronglin777@gmail.com>
Add support for reading `z/OS` archives, which use EBCDIC-encoded header
fields and an EBCDIC magic string. The `z/OS` archive format shares the
same structural layout as traditional Unix archives but all text fields
(member names, timestamps, permissions, and symbol names) are in EBCDIC.
This patch adds:

- `K_ZOS` archive kind
- `ZOSArchiveMemberHeader`: converts EBCDIC header fields to ASCII on
  read
- `ZOSArchive`: parses the __.SYMDEF symbol table, converting EBCDIC
  symbol names to ASCII
- Updates to symbol table traversal for `K_ZOS`, which uses big-endian
  4-byte offsets paired with 4-byte attribute words per symbol

This is part 2 of a patch series adding `z/OS` archive support to LLVM.

Part 1: llvm#186854
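
The two conversions mentioned above can be sketched as follows. This is not the patch's code: the EBCDIC table is deliberately partial (space, digits, and letters only), the record layout with the offset preceding the attribute word is an assumption, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Partial EBCDIC to ASCII mapping covering space, digits, and letters.
// Any other byte is returned as '?' in this sketch.
char EbcdicToAscii(uint8_t c) {
  if (c == 0x40) return ' ';
  if (c >= 0xF0 && c <= 0xF9) return static_cast<char>('0' + (c - 0xF0));
  if (c >= 0x81 && c <= 0x89) return static_cast<char>('a' + (c - 0x81));
  if (c >= 0x91 && c <= 0x99) return static_cast<char>('j' + (c - 0x91));
  if (c >= 0xA2 && c <= 0xA9) return static_cast<char>('s' + (c - 0xA2));
  if (c >= 0xC1 && c <= 0xC9) return static_cast<char>('A' + (c - 0xC1));
  if (c >= 0xD1 && c <= 0xD9) return static_cast<char>('J' + (c - 0xD1));
  if (c >= 0xE2 && c <= 0xE9) return static_cast<char>('S' + (c - 0xE2));
  return '?';
}

// Converts a fixed-width EBCDIC header field to an ASCII string.
std::string ConvertField(const uint8_t *data, size_t len) {
  std::string out;
  out.reserve(len);
  for (size_t i = 0; i < len; ++i)
    out.push_back(EbcdicToAscii(data[i]));
  return out;
}

// Reads a big-endian 32-bit value.
uint32_t ReadBE32(const uint8_t *p) {
  return (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) | static_cast<uint32_t>(p[3]);
}

// Assumed 8-byte record: a 4-byte offset followed by a 4-byte attribute word.
struct SymbolEntry {
  uint32_t offset;
  uint32_t attributes;
};

SymbolEntry ReadSymbolEntry(const uint8_t *p) {
  return {ReadBE32(p), ReadBE32(p + 4)};
}
```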
When sve2/sme are available, this sequence provides faster and smaller
codegen than the current lowering:
```
    clmul.i16(a, b) = xor(pmullb(a_lo, b_lo),
                          lsl(xor(pmul(a_hi, b_lo),
                                  pmul(a_lo, b_hi)),
                              8))
``` 
Assisted-by: codex with gpt-5.5
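
The identity can be checked on scalars. The sketch below models a single lane in plain C++, under the assumptions that `pmullb` behaves as an 8x8 to 16-bit polynomial multiply, `pmul` keeps the low 8 bits of the same product, and `clmul.i16` returns the low 16 bits of the carry-less product. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reference carry-less multiply of two 16-bit values, truncated to 16 bits.
uint16_t Clmul16(uint16_t a, uint16_t b) {
  uint16_t r = 0;
  for (int i = 0; i < 16; ++i)
    if ((b >> i) & 1)
      r ^= static_cast<uint16_t>(a << i);
  return r;
}

// 8x8 -> 16 widening polynomial multiply (models pmullb on one lane).
uint16_t PmulWide8(uint8_t a, uint8_t b) {
  uint16_t r = 0;
  for (int i = 0; i < 8; ++i)
    if ((b >> i) & 1)
      r ^= static_cast<uint16_t>(a << i);
  return r;
}

// 8x8 -> 8 polynomial multiply (models pmul on one lane).
uint8_t Pmul8(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>(PmulWide8(a, b));
}

// The decomposition from the commit message, applied to one 16-bit lane.
uint16_t Clmul16ViaBytes(uint16_t a, uint16_t b) {
  const uint8_t a_lo = static_cast<uint8_t>(a & 0xFF);
  const uint8_t a_hi = static_cast<uint8_t>(a >> 8);
  const uint8_t b_lo = static_cast<uint8_t>(b & 0xFF);
  const uint8_t b_hi = static_cast<uint8_t>(b >> 8);
  const uint8_t cross =
      static_cast<uint8_t>(Pmul8(a_hi, b_lo) ^ Pmul8(a_lo, b_hi));
  return static_cast<uint16_t>(PmulWide8(a_lo, b_lo) ^
                               static_cast<uint16_t>(cross << 8));
}

// Compares both implementations on a strided grid of inputs.
bool DecompositionHoldsOnSamples() {
  for (uint32_t a = 0; a <= 0xFFFF; a += 251)
    for (uint32_t b = 0; b <= 0xFFFF; b += 241)
      if (Clmul16(static_cast<uint16_t>(a), static_cast<uint16_t>(b)) !=
          Clmul16ViaBytes(static_cast<uint16_t>(a), static_cast<uint16_t>(b)))
        return false;
  return true;
}
```

The `a_hi * b_hi` term never appears because it only contributes to bits 16 and above, and only the low 8 bits of the cross terms survive the shift by 8.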
This way, clients including libc/shared/math.h don't need to `#pragma
GCC diagnostic ignored "-Wshadow"` around the include.

This works locally after llvm#196337 llvm#196342 llvm#196346.

CI also needed llvm#196529 llvm#196810 llvm#196850 llvm#196851 llvm#196852 llvm#196853 llvm#196855
llvm#196857 llvm#196858 llvm#196859 llvm#196860.
Reverts llvm#196519

Passed CI on the PR, but apparently breaks several bots.
This work optimizes fneg and fsub when packed half math instructions are supported.
On the GlobalISel path, wider G_FSUB vectors with an f16 element type should be
split to v2f16 so that v_pk_add_f16 is selected.
On the SelectionDAG path, we make FNEG legal and also make sure to split wider vectors
to v2f16. This way, we can fold fneg into the source modifiers for packed half ops.
…ode (llvm#194856)

This is a reland PR, related to llvm#183988 

I added an extra check in handleBlocksAttr to ensure that illegal Decl
values are not passed to downstream functions.
I also removed an unnecessary check in `CheckCompleteVariableDeclaration`.

Also added an extra regression test.

Fixes llvm#183974
With `EnableDebugBuffering`, the debug log is stored in a circular
buffer and printed, with a nice banner, on program termination - this is
achieved via a signal handler. For in-process tool execution, such as
for running the regression tests using daemon versions of the tools, we
need to be able to trigger the printing/flushing of the debug log from
the process itself. This PR just adds a small function `printDebugLog`
which checks if debug output and debug log buffering are enabled and, if
so, prints the debug log.

The code for printing the debug log in the signal handler is moved to a
new function `printDebugLogImpl` which is called by the signal handler
and `printDebugLog` - the reason this is separate from `printDebugLog`
is to avoid running the option check in the signal handler
implementation, in case options were reset before the signal handler is
called, as this would be an unintentional behavioral change.
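
The split between a checked entry point and an unchecked implementation might look roughly like this. It is a simplified sketch: `DebugLogBuffer`, `DebugOptions`, and the stream-based interface are hypothetical and do not reflect the real signatures of `printDebugLog` or `printDebugLogImpl`.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical fixed-capacity circular buffer of log lines.
class DebugLogBuffer {
public:
  explicit DebugLogBuffer(size_t capacity) : lines_(capacity) {
    assert(capacity > 0);
  }

  // Stores a line, overwriting the oldest one once the buffer is full.
  void Append(std::string line) {
    lines_[next_ % lines_.size()] = std::move(line);
    ++next_;
  }

  // Unconditional print, oldest retained line first. This plays the role of
  // the implementation function that a signal handler would call directly.
  void PrintImpl(std::ostream &os) const {
    const size_t n = lines_.size();
    const size_t count = next_ < n ? next_ : n;
    const size_t first = next_ - count;
    for (size_t i = 0; i < count; ++i)
      os << lines_[(first + i) % n] << '\n';
  }

private:
  std::vector<std::string> lines_;
  size_t next_ = 0;
};

struct DebugOptions {
  bool debug_enabled;
  bool buffering_enabled;
};

// In-process entry point: checks the options first, then delegates.
// Returns true if the log was printed.
bool PrintDebugLog(const DebugOptions &options, const DebugLogBuffer &buffer,
                   std::ostream &os) {
  if (!options.debug_enabled || !options.buffering_enabled)
    return false;
  buffer.PrintImpl(os);
  return true;
}
```

Keeping the option check out of `PrintImpl` means the signal-handler path prints whatever was buffered even if the options have been reset in the meantime.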