[BOLT] Gadget scanner: reformulate the state for data-flow analysis #131898

atrosinenko · 2025-03-18T19:25:15Z

In preparation for implementing support for detection of non-protected
call instructions, refine the definition of the state that is computed for
each register by the data-flow analysis.

Explicitly marking the registers that are known to be trusted at
function entry is crucial for finding non-protected calls. In addition,
it fixes less-common false negatives for pac-ret, such as `ret x1` in the
`f_nonx30_ret_non_auted` test case.
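
As an illustration of the reformulated state, here is a minimal sketch over straight-line code. It is not the BOLT implementation: `RegSet`, `Inst`, `Kind`, `entryState`, `step` and `findUnprotectedRets` are hypothetical names, registers are plain indices for x0..x30, and sub-/super-registers are ignored. It only shows why starting from "x30 is trusted at entry, nothing else is" makes `ret x1` reportable while a plain `ret` is not.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

// Hypothetical miniature model; these are not the BOLT data structures.
constexpr unsigned NumRegs = 31; // x0..x30
constexpr unsigned LR = 30;      // x30, the link register
using RegSet = std::bitset<NumRegs>;

enum class Kind { Write, Authenticate, Call, Ret };

struct Inst {
  Kind K;
  unsigned Reg; // register written, authenticated, or returned through
};

// State at function entry: only the link register is assumed to be trusted.
RegSet entryState() {
  RegSet S;
  S.set(LR);
  return S;
}

// Transfer function: clobbering clears the bit, authenticating sets it.
RegSet step(RegSet S, const Inst &I) {
  switch (I.K) {
  case Kind::Write:
    S.reset(I.Reg);
    break;
  case Kind::Authenticate:
    S.set(I.Reg);
    break;
  case Kind::Call:
    S.reset(); // assume a call may clobber every register
    break;
  case Kind::Ret:
    break;
  }
  return S;
}

// Returns the indices of returns through a register that is not
// safe-to-dereference at that point.
std::vector<unsigned> findUnprotectedRets(const std::vector<Inst> &Code) {
  std::vector<unsigned> Found;
  RegSet S = entryState();
  for (unsigned Idx = 0; Idx < Code.size(); ++Idx) {
    const Inst &I = Code[Idx];
    assert(I.Reg < NumRegs && "register index out of range");
    if (I.K == Kind::Ret && !S.test(I.Reg))
      Found.push_back(Idx);
    S = step(S, I);
  }
  return Found;
}
```

With this model, a lone `ret x1` is reported because x1 was never trusted, whereas under a "clobbered since last authentication" formulation it would have gone unnoticed.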

atrosinenko · 2025-03-18T19:25:29Z

llvmbot · 2025-03-18T19:32:25Z

@llvm/pr-subscribers-bolt

Author: Anatoly Trosinenko (atrosinenko)

Full diff: https://github.com/llvm/llvm-project/pull/131898.diff

6 Files Affected:

(modified) bolt/include/bolt/Core/MCPlusBuilder.h (+10)
(modified) bolt/include/bolt/Passes/PAuthGadgetScanner.h (+3-4)
(modified) bolt/lib/Passes/PAuthGadgetScanner.cpp (+78-51)
(modified) bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp (+4)
(modified) bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s (+8-11)
(modified) bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s (+1-2)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index b285138b77fe7..76ea2489e7038 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -551,6 +551,16 @@ class MCPlusBuilder {
     return Analysis->isReturn(Inst);
   }
 
+  /// Returns the registers that are trusted at function entry.
+  ///
+  /// Each register should be treated as if a successfully authenticated
+  /// pointer was written to it before entering the function (i.e. the
+  /// pointer is safe to jump to as well as to be signed).
+  virtual SmallVector<MCPhysReg> getTrustedLiveInRegs() const {
+    llvm_unreachable("not implemented");
+    return {};
+  }
+
   virtual ErrorOr<MCPhysReg> getAuthenticatedReg(const MCInst &Inst) const {
     llvm_unreachable("not implemented");
     return getNoRegister();
diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index f102f1080e2e8..404dde2901767 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -209,13 +209,12 @@ struct Report {
 
 struct GadgetReport : public Report {
   const GadgetKind &Kind;
-  SmallVector<MCPhysReg> AffectedRegisters;
+  SmallVector<MCPhysReg, 1> AffectedRegisters;
   std::vector<MCInstReference> OverwritingInstrs;
 
   GadgetReport(const GadgetKind &Kind, MCInstReference Location,
-               const BitVector &AffectedRegisters)
-      : Report(Location), Kind(Kind),
-        AffectedRegisters(AffectedRegisters.set_bits()) {}
+               MCPhysReg AffectedRegister)
+      : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {}
 
   void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
 
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 04b0923d34b0c..ebfa606ceb7c3 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -126,18 +126,16 @@ class TrackedRegisters {
 
 // The security property that is checked is:
 // When a register is used as the address to jump to in a return instruction,
-// that register must either:
-// (a) never be changed within this function, i.e. have the same value as when
-//     the function started, or
+// that register must be safe-to-dereference. It must either
+// (a) be safe-to-dereference at function entry and never be changed within this
+//     function, i.e. have the same value as when the function started, or
 // (b) the last write to the register must be by an authentication instruction.
 
 // This property is checked by using dataflow analysis to keep track of which
-// registers have been written (def-ed), since last authenticated. Those are
-// exactly the registers containing values that should not be trusted (as they
-// could have changed since the last time they were authenticated). For pac-ret,
-// any return instruction using such a register is a gadget to be reported. For
-// PAuthABI, probably at least any indirect control flow using such a register
-// should be reported.
+// registers have been written (def-ed), since last authenticated. For pac-ret,
+// any return instruction using a register which is not safe-to-dereference is
+// a gadget to be reported. For PAuthABI, probably at least any indirect control
+// flow using such a register should be reported.
 
 // Furthermore, when producing a diagnostic for a found non-pac-ret protected
 // return, the analysis also lists the last instructions that wrote to the
@@ -156,10 +154,29 @@ class TrackedRegisters {
 //    in the gadgets to be reported. This information is used in the second run
 //    to also track which instructions last wrote to those registers.
 
+/// A state representing which registers are safe to use by an instruction
+/// at a given program point.
+///
+/// To simplify reasoning, let's stick with the following approach:
+/// * when state is updated by the data-flow analysis, the sub-, super- and
+///   overlapping registers are marked as needed
+/// * when the particular instruction is checked if it represents a gadget,
+///   the specific bit of BitVector should be usable to answer this.
+///
+/// For example, on AArch64:
+/// * An AUTIZA X0 instruction marks both X0 and W0 (as well as W0_HI) as
+///   safe-to-dereference. It does not change the state of X0_X1, for example,
+///   as super-registers partially retain their old, unsafe values.
+/// * LDR X1, [X0] marks as unsafe both X1 itself and anything it overlaps
+///   with: W1, W1_HI, X0_X1 and so on.
+/// * RET (which is implicitly RET X30) is a protected return if and only if
+///   X30 is safe-to-dereference - the state computed for sub- and
+///   super-registers is not inspected.
 struct State {
-  /// A BitVector containing the registers that have been clobbered, and
-  /// not authenticated.
-  BitVector NonAutClobRegs;
+  /// A BitVector containing the registers that are either safe at function
+  /// entry and were not clobbered yet, or those not clobbered since being
+  /// authenticated.
+  BitVector SafeToDerefRegs;
   /// A vector of sets, only used in the second data flow run.
   /// Each element in the vector represents one of the registers for which we
   /// track the set of last instructions that wrote to this register. For
@@ -169,16 +186,26 @@ struct State {
   std::vector<SmallPtrSet<const MCInst *, 4>> LastInstWritingReg;
   State() {}
   State(unsigned NumRegs, unsigned NumRegsToTrack)
-      : NonAutClobRegs(NumRegs), LastInstWritingReg(NumRegsToTrack) {}
-  State &operator|=(const State &StateIn) {
-    NonAutClobRegs |= StateIn.NonAutClobRegs;
+      : SafeToDerefRegs(NumRegs), LastInstWritingReg(NumRegsToTrack) {}
+
+  /// Returns S, so that S.merge(S1) == S1.merge(S) == S1.
+  static State getMergeNeutralElement(unsigned NumRegs,
+                                      unsigned NumRegsToTrack) {
+    State S(NumRegs, NumRegsToTrack);
+    S.SafeToDerefRegs.set();
+    return S;
+  }
+
+  State &merge(const State &StateIn) {
+    SafeToDerefRegs &= StateIn.SafeToDerefRegs;
     for (unsigned I = 0; I < LastInstWritingReg.size(); ++I)
       for (const MCInst *J : StateIn.LastInstWritingReg[I])
         LastInstWritingReg[I].insert(J);
     return *this;
   }
+
   bool operator==(const State &RHS) const {
-    return NonAutClobRegs == RHS.NonAutClobRegs &&
+    return SafeToDerefRegs == RHS.SafeToDerefRegs &&
            LastInstWritingReg == RHS.LastInstWritingReg;
   }
   bool operator!=(const State &RHS) const { return !((*this) == RHS); }
@@ -199,7 +226,7 @@ static void printLastInsts(
 
 raw_ostream &operator<<(raw_ostream &OS, const State &S) {
   OS << "pacret-state<";
-  OS << "NonAutClobRegs: " << S.NonAutClobRegs << ", ";
+  OS << "SafeToDerefRegs: " << S.SafeToDerefRegs << ", ";
   printLastInsts(OS, S.LastInstWritingReg);
   OS << ">";
   return OS;
@@ -217,8 +244,8 @@ class PacStatePrinter {
 void PacStatePrinter::print(raw_ostream &OS, const State &S) const {
   RegStatePrinter RegStatePrinter(BC);
   OS << "pacret-state<";
-  OS << "NonAutClobRegs: ";
-  RegStatePrinter.print(OS, S.NonAutClobRegs);
+  OS << "SafeToDerefRegs: ";
+  RegStatePrinter.print(OS, S.SafeToDerefRegs);
   OS << ", ";
   printLastInsts(OS, S.LastInstWritingReg);
   OS << ">";
@@ -257,12 +284,24 @@ class PacRetAnalysis
 
   void preflight() {}
 
+  State createEntryState() {
+    State S(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
+    for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs())
+      S.SafeToDerefRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+    return S;
+  }
+
   State getStartingStateAtBB(const BinaryBasicBlock &BB) {
-    return State(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
+    if (BB.isEntryPoint())
+      return createEntryState();
+
+    return State::getMergeNeutralElement(
+        NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
   }
 
   State getStartingStateAtPoint(const MCInst &Point) {
-    return State(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
+    return State::getMergeNeutralElement(
+        NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
   }
 
   void doConfluence(State &StateOut, const State &StateIn) {
@@ -277,7 +316,7 @@ class PacRetAnalysis
       dbgs() << ")\n";
     });
 
-    StateOut |= StateIn;
+    StateOut.merge(StateIn);
 
     LLVM_DEBUG({
       dbgs() << "   merged state: ";
@@ -298,7 +337,7 @@ class PacRetAnalysis
     });
 
     State Next = Cur;
-    BitVector Written = BitVector(NumRegs, false);
+    BitVector Clobbered(NumRegs, false);
     // Assume a call can clobber all registers, including callee-saved
     // registers. There's a good chance that callee-saved registers will be
     // saved on the stack at some point during execution of the callee.
@@ -307,36 +346,27 @@ class PacRetAnalysis
     // Also, not all functions may respect the AAPCS ABI rules about
     // caller/callee-saved registers.
     if (BC.MIB->isCall(Point))
-      Written.set();
+      Clobbered.set();
     else
-      // FIXME: `getWrittenRegs` only sets the register directly written in the
-      // instruction, and the smaller aliasing registers. It does not set the
-      // larger aliasing registers. To also set the larger aliasing registers,
-      // we'd have to call `getClobberedRegs`.
-      // It is unclear if there is any test case which shows a different
-      // behaviour between using `getWrittenRegs` vs `getClobberedRegs`. We'd
-      // first would like to see such a test case before making a decision
-      // on whether using `getClobberedRegs` below would be better.
-      // Also see the discussion on this at
-      // https://github.com/llvm/llvm-project/pull/122304#discussion_r1939511909
-      BC.MIB->getWrittenRegs(Point, Written);
-    Next.NonAutClobRegs |= Written;
+      BC.MIB->getClobberedRegs(Point, Clobbered);
+    Next.SafeToDerefRegs.reset(Clobbered);
     // Keep track of this instruction if it writes to any of the registers we
     // need to track that for:
     for (MCPhysReg Reg : RegsToTrackInstsFor.getRegisters())
-      if (Written[Reg])
+      if (Clobbered[Reg])
         lastWritingInsts(Next, Reg) = {&Point};
 
     ErrorOr<MCPhysReg> AutReg = BC.MIB->getAuthenticatedReg(Point);
     if (AutReg && *AutReg != BC.MIB->getNoRegister()) {
-      // FIXME: should we use `OnlySmaller=false` below? See similar
-      // FIXME about `getWrittenRegs` above and further discussion about this
-      // at
-      // https://github.com/llvm/llvm-project/pull/122304#discussion_r1939515516
-      Next.NonAutClobRegs.reset(
-          BC.MIB->getAliases(*AutReg, /*OnlySmaller=*/true));
-      if (RegsToTrackInstsFor.isTracked(*AutReg))
-        lastWritingInsts(Next, *AutReg).clear();
+      // The sub-registers of *AutReg are also trusted now, but not its
+      // super-registers (as they retain untrusted register units).
+      BitVector AuthenticatedSubregs =
+          BC.MIB->getAliases(*AutReg, /*OnlySmaller=*/true);
+      for (MCPhysReg Reg : AuthenticatedSubregs.set_bits()) {
+        Next.SafeToDerefRegs.set(Reg);
+        if (RegsToTrackInstsFor.isTracked(Reg))
+          lastWritingInsts(Next, Reg).clear();
+      }
     }
 
     LLVM_DEBUG({
@@ -397,14 +427,11 @@ static std::shared_ptr<Report> tryCheckReturn(const BinaryContext &BC,
   });
   if (BC.MIB->isAuthenticationOfReg(Inst, RetReg))
     return nullptr;
-  BitVector UsedDirtyRegs = S.NonAutClobRegs;
-  LLVM_DEBUG({ traceRegMask(BC, "NonAutClobRegs at Ret", UsedDirtyRegs); });
-  UsedDirtyRegs &= BC.MIB->getAliases(RetReg, /*OnlySmaller=*/true);
-  LLVM_DEBUG({ traceRegMask(BC, "Intersection with RetReg", UsedDirtyRegs); });
-  if (!UsedDirtyRegs.any())
+  LLVM_DEBUG({ traceRegMask(BC, "SafeToDerefRegs", S.SafeToDerefRegs); });
+  if (S.SafeToDerefRegs[RetReg])
     return nullptr;
 
-  return std::make_shared<GadgetReport>(RetKind, Inst, UsedDirtyRegs);
+  return std::make_shared<GadgetReport>(RetKind, Inst, RetReg);
 }
 
 FunctionAnalysisResult
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 613b24c4553e2..d238a1df5c7d7 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -191,6 +191,10 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     return false;
   }
 
+  SmallVector<MCPhysReg> getTrustedLiveInRegs() const override {
+    return {AArch64::LR};
+  }
+
   ErrorOr<MCPhysReg> getAuthenticatedReg(const MCInst &Inst) const override {
     switch (Inst.getOpcode()) {
     case AArch64::AUTIAZ:
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index 0d263199b376f..586da6d2a92e4 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -183,17 +183,14 @@ f_tail_called:
         .globl  f_nonx30_ret_non_auted
         .type   f_nonx30_ret_non_auted,@function
 f_nonx30_ret_non_auted:
-// FIXME: x1 is not authenticated, so should this be reported?
-//        Note that we assume it's fine for x30 to not be authenticated before
-//        returning to, as assuming that x30 is not attacker controlled at function
-//        entry is part (implicitly) of the pac-ret hardening scheme.
-//        It's probably an open question whether for other hardening schemes, such as
-//        PAuthABI, which registers should be considered "clean" or not at function entry.
-//        In other words, which registers have to be authenticated before being used as
-//        a pointer and which ones not?
-//        For a more detailed discussion, see
-//        https://github.com/llvm/llvm-project/pull/122304#discussion_r1923662744
-// CHECK-NOT: f_nonx30_ret_non_auted
+// x1 is neither authenticated nor implicitly considered safe at function entry.
+// Note that we assume it's fine for x30 to not be authenticated before
+// returning to, as assuming that x30 is not attacker controlled at function
+// entry is part (implicitly) of the pac-ret hardening scheme.
+//
+// CHECK-LABEL: GS-PAUTH: non-protected ret found in function f_nonx30_ret_non_auted, basic block {{[0-9a-zA-Z.]+}}, at address
+// CHECK-NEXT:    The instruction is     {{[0-9a-f]+}}:       ret
+// CHECK-NEXT:    The 0 instructions that write to the affected registers after any authentication are:
         ret     x1
         .size f_nonx30_ret_non_auted, .-f_nonx30_ret_non_auted
 
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
index f74e825ed8fc1..bd8edbc676c34 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-multi-bb.s
@@ -17,9 +17,8 @@ f_crossbb1:
         .size f_crossbb1, .-f_crossbb1
 // CHECK-LABEL: GS-PAUTH: non-protected ret found in function f_crossbb1, basic block {{[^,]+}}, at address
 // CHECK-NEXT:  The instruction is     {{[0-9a-f]+}}:       ret
-// CHECK-NEXT:  The 2 instructions that write to the affected registers after any authentication are:
+// CHECK-NEXT:  The 1 instructions that write to the affected registers after any authentication are:
 // CHECK-NEXT:  1.     {{[0-9a-f]+}}:      ldp     x29, x30, [sp], #0x10
-// CHECK-NEXT:  2.     {{[0-9a-f]+}}:      autiasp
 
 // A test that checks that the dataflow state tracking across when merging BBs
 // seems to work:
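
The sub-/super-register rules described in the `State` comment of the patch can be sketched with a toy register file. This is an illustration under stated assumptions, not the BOLT code: the five registers, the `Units` table and the helpers `subRegsInclusive`, `overlapping`, `afterClobber` and `afterAuth` are all hypothetical, and the high half of each X register stands in for the `W0_HI`-style artificial registers.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical toy register file.
enum Reg : unsigned { W0, X0, W1, X1, X0_X1, NumRegs };
using RegSet = std::bitset<NumRegs>;

// Register units covered by each register. Bit 0: low half of X0, bit 1: high
// half of X0, bit 2: low half of X1, bit 3: high half of X1.
constexpr std::array<std::uint8_t, NumRegs> Units = {
    0b0001, // W0
    0b0011, // X0
    0b0100, // W1
    0b1100, // X1
    0b1111, // X0_X1
};

// Registers fully contained in R, including R itself.
RegSet subRegsInclusive(Reg R) {
  RegSet Out;
  for (unsigned I = 0; I < NumRegs; ++I)
    if ((Units[I] & ~Units[R]) == 0)
      Out.set(I);
  return Out;
}

// Registers sharing at least one unit with R (sub-, super- and R itself).
RegSet overlapping(Reg R) {
  RegSet Out;
  for (unsigned I = 0; I < NumRegs; ++I)
    if ((Units[I] & Units[R]) != 0)
      Out.set(I);
  return Out;
}

// A write to R makes everything overlapping R unsafe.
RegSet afterClobber(RegSet Safe, Reg R) { return Safe & ~overlapping(R); }

// Authenticating R makes R and its sub-registers safe, but not super-registers.
RegSet afterAuth(RegSet Safe, Reg R) { return Safe | subRegsInclusive(R); }
```

Because every update keeps all aliasing bits consistent, a check such as "is the register used by `ret` safe?" only needs to test one bit, which is the simplification the patch relies on.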

kbeyls

Mostly looks good to me, I just have a few minor comments.

kbeyls · 2025-03-24T16:17:10Z

bolt/include/bolt/Core/MCPlusBuilder.h

+  /// Returns the registers that are trusted at function entry.
+  ///
+  /// Each register should be treated as if a successfully authenticated
+  /// pointer was written to it before entering the function (i.e. the
+  /// pointer is safe to jump to as well as to be signed).
+  virtual SmallVector<MCPhysReg> getTrustedLiveInRegs() const {
+    llvm_unreachable("not implemented");
+    return {};
+  }


I'm wondering if MCPlusBuilder is the right place for this to live...
The reason why I'm not sure is that to me it seems that MCPlusBuilder is mostly about querying the property of instructions, maybe at most relative to an assumed ABI.
It seems to me that getTrustedLiveInRegs might be encoding an assumed, implicit, threat model too?

Apologies for not explaining this very well. I'm just trying to make sure this function goes into the most appropriate place.

atrosinenko · 2025-03-25

It looks like the set of registers returned by getTrustedLiveInRegs on AArch64 can be derived from the fact that LR is set by branch-with-link instructions. To some extent, this does look more like a property of the ABI, but as far as I can see, target-specific hooks are placed either in lib/Target/XYZ/XYZMCPlusBuilder.cpp or in lib/Target/XYZ/XYZMCSymbolizer.(h|cpp), so there don't seem to be many places where such a target-specific hook can be defined.

kbeyls · 2025-03-25T16:26:24Z

Ah yes, you're right, there aren't many places in BOLT that look target-specific... OK, let's just keep it in MCPlusBuilder.
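
The design settled on in this thread, a target-specific hook that the target-independent analysis queries for its entry state, might be sketched as follows. The names `TargetHooks`, `AArch64LikeHooks`, `trustedLiveInRegs` and `entryState` are hypothetical stand-ins, and the base-class default here returns an empty list, whereas the patch's default calls `llvm_unreachable`.

```cpp
#include <cassert>
#include <set>
#include <vector>

using PhysReg = unsigned;
constexpr PhysReg LinkReg = 30; // stand-in for AArch64::LR

// Hypothetical stand-in for the target-independent builder interface.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  // Assumption for this sketch: a target that does not override the hook
  // trusts no register at function entry.
  virtual std::vector<PhysReg> trustedLiveInRegs() const { return {}; }
};

// Hypothetical AArch64-like target: only the link register is trusted,
// since it is written by branch-with-link instructions.
struct AArch64LikeHooks : TargetHooks {
  std::vector<PhysReg> trustedLiveInRegs() const override {
    return std::vector<PhysReg>(1, LinkReg);
  }
};

// The analysis stays target-independent and only asks the hook.
std::set<PhysReg> entryState(const TargetHooks &Target) {
  std::vector<PhysReg> Regs = Target.trustedLiveInRegs();
  return std::set<PhysReg>(Regs.begin(), Regs.end());
}
```

Keeping the assumption behind one virtual method means a different threat model or ABI only has to change the override, not the analysis.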

kbeyls · 2025-03-25T08:45:13Z

bolt/lib/Passes/PAuthGadgetScanner.cpp

+      : SafeToDerefRegs(NumRegs), LastInstWritingReg(NumRegsToTrack) {}
+
+  /// Returns S, so that S.merge(S1) == S1.merge(S) == S1.
+  static State getMergeNeutralElement(unsigned NumRegs,


I wasn't fully sure what getMergeNeutralElement meant when I first read this code linearly.
Reading further, I see this method is used to initialize the state object in the dataflow analysis.
I'm wondering if it would be easier to understand for most readers of this code if the name instead were something like createInitialState or something similar?

atrosinenko · 2025-03-25

My idea was that this static function returns a neutral element with respect to the merge operation (in an algebraic sense). That is the reason for keeping it right in the State class, as opposed to the getStartingStateAt* methods of PacRetAnalysis: this function should stay aligned with merge. Maybe this is a bit of an "overengineered" name for a function returning an initial state, but after all, the "starting state" returned by PacRetAnalysis::getStartingStateAt* is "initial" as well, just for a different purpose.

On the other hand, this function definitely deserves an explanation that this "initial" state is initial in the same sense as 0 being the initial value of the accumulator when computing the sum of a set of integers. I will add a comment explaining the expected usage of this function, thank you for pointing this out!
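
The algebraic point can be made concrete with a small sketch: since merge is a bitwise AND, its neutral element is the all-ones set, just as 0 is the neutral element for a sum. `RegSet`, `mergeNeutralElement` and `mergeAll` below are illustrative names, not the patch's API.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

using RegSet = std::bitset<8>;

// For AND-based merging, the neutral element has every bit set:
// neutral & S == S & neutral == S for any S.
RegSet mergeNeutralElement() { return RegSet().set(); }

// Fold the states of all predecessors, starting from the neutral element,
// in the same way a sum starts from 0.
RegSet mergeAll(const std::vector<RegSet> &Preds) {
  RegSet Acc = mergeNeutralElement();
  for (const RegSet &P : Preds)
    Acc &= P;
  return Acc;
}
```

Note that folding over an empty list of predecessors yields "every register is safe", so in this sketch a point that is never reached from the entry block keeps the most optimistic state.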

atrosinenko · 2025-03-25

After thinking a bit more, I finally dropped this function and switched to handling the "empty" state explicitly. It is indeed rather hard to reason about when, or whether, the state of the entry block (which is the only state known at the beginning of the analysis) is finally propagated into every other program point's state.
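
One way to handle an "empty" state explicitly is sketched below. This is an assumption about the general shape, not the code from the follow-up commit: the `State` struct here wraps the register set in `std::optional` (C++17), and `merge` treats the empty state as "no information yet".

```cpp
#include <bitset>
#include <cassert>
#include <optional>

using RegSet = std::bitset<8>;

// Hypothetical state with an explicit "no information yet" value.
struct State {
  std::optional<RegSet> SafeToDeref; // nullopt: this point was not reached yet

  State() = default;
  explicit State(RegSet Regs) : SafeToDeref(Regs) {}

  bool empty() const { return !SafeToDeref.has_value(); }

  State &merge(const State &In) {
    if (In.empty())
      return *this; // nothing to learn from an unreached predecessor
    if (empty()) {
      SafeToDeref = In.SafeToDeref; // first real information: copy it
      return *this;
    }
    *SafeToDeref &= *In.SafeToDeref; // both known: keep what is safe on both
    return *this;
  }

  bool operator==(const State &RHS) const {
    return SafeToDeref == RHS.SafeToDeref;
  }
};
```

Compared with an all-ones neutral element, a state that was never updated is distinguishable from a state in which every register is genuinely safe.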

kbeyls · 2025-03-25T16:39:04Z

Ah yes, so there is now an additional "empty" or "uninitialized" state. Or, in dataflow "lattice" theory terms, the bottom (⟂) state. I'm not sure if lattice theory is taught using the same symbols and terms everywhere in the world. Apologies if the terms I use here don't make sense.

FWIW, in my prototype scanner for stack clash, I did end up writing some supporting classes to make handling the "uninitialized" state more generic. See commits such as 95119f4, 5c7dca7, and 728abe1.

That being said, I don't think it makes sense to use that here; it's simpler the way you've written it.
If we end up adding lots of data-flow analyses (maybe also for other binary analyses), we could introduce the generic LatticeT class I'm pointing to above.
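
For readers unfamiliar with the idea, a generic wrapper that adds a bottom element to an arbitrary state type could look roughly like this. It is a sketch only and is not the LatticeT class from the prototype mentioned above: `WithBottom` and `DefSet` are hypothetical names, and the payload type is assumed to provide a `merge` method.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <utility>

// Hypothetical wrapper: adds a bottom element to any type T that has
// a `void merge(const T &)` method.
template <typename T> class WithBottom {
  std::optional<T> Value; // nullopt represents bottom

public:
  WithBottom() = default;
  explicit WithBottom(T V) : Value(std::move(V)) {}

  bool isBottom() const { return !Value.has_value(); }

  const T &get() const {
    assert(Value && "bottom carries no value");
    return *Value;
  }

  // Bottom is the identity of merge; otherwise defer to T::merge.
  WithBottom &merge(const WithBottom &Other) {
    if (Other.isBottom())
      return *this;
    if (isBottom())
      Value = Other.Value;
    else
      Value->merge(*Other.Value);
    return *this;
  }
};

// Example payload: a set of instruction ids merged by union, similar in
// spirit to tracking the last instructions that wrote a register.
struct DefSet {
  std::set<int> Defs;
  void merge(const DefSet &Other) {
    Defs.insert(Other.Defs.begin(), Other.Defs.end());
  }
};
```

The wrapper lets each analysis define only its own merge, while the handling of "not reached yet" is written once.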

kbeyls · 2025-03-25T16:42:26Z

bolt/lib/Passes/PAuthGadgetScanner.cpp

+      // to Inst, assume it to be unreachable and report a warning.
+      if (S.empty()) {
+        Result.Diagnostics.push_back(std::make_shared<GenericReport>(
+            Inst, "Warning: unreachable instruction found"));


I seem to remember that if you run this analysis on a large amount of code (for example, all libraries in a Linux distro), it will find unreachable code.
I'm happy to keep this warning, but this diagnostic may turn out to be too noisy when scanning large amounts of code.
We can fix or tune that in another PR, though.

atrosinenko · 2025-03-25

My initial intention was not to skip instructions silently (in case they are skipped due to some error in PacRetAnalysis). On one hand, such errors should be caught by the tests; on the other hand, this warning helps when implementing the tests. Anyway, it should be trivial to remove this warning later.
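
To show where such a warning comes from, here is a minimal worklist sketch over a toy CFG in which a block whose state is still empty after the fixed point is reached is treated as unreachable. The types and functions (`Block`, `analyze`, `unreachableBlocks`) are hypothetical; real BOLT code works on `BinaryBasicBlock` and per-instruction states.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

using RegSet = std::bitset<8>;
using State = std::optional<RegSet>; // nullopt: never reached

// Hypothetical basic block: successor indices plus the registers it writes.
struct Block {
  std::vector<std::size_t> Succs;
  RegSet Clobbers;
};

// Propagates states from block 0 and returns the state at the start of
// every block. Blocks that are never reached keep an empty state.
std::vector<State> analyze(const std::vector<Block> &CFG, RegSet Entry) {
  std::vector<State> In(CFG.size());
  if (CFG.empty())
    return In;
  In[0] = Entry;
  std::vector<std::size_t> Worklist;
  Worklist.push_back(0);
  while (!Worklist.empty()) {
    std::size_t B = Worklist.back();
    Worklist.pop_back();
    RegSet Out = *In[B] & ~CFG[B].Clobbers;
    for (std::size_t S : CFG[B].Succs) {
      State Merged = In[S] ? State(*In[S] & Out) : State(Out);
      if (Merged != In[S]) {
        In[S] = Merged;
        Worklist.push_back(S);
      }
    }
  }
  return In;
}

// Blocks whose state stayed empty are candidates for an "unreachable" warning.
std::vector<std::size_t> unreachableBlocks(const std::vector<State> &In) {
  std::vector<std::size_t> Result;
  for (std::size_t I = 0; I < In.size(); ++I)
    if (!In[I].has_value())
      Result.push_back(I);
  return Result;
}
```

In this sketch an empty state can also result from a bug in the propagation itself, which is the case the warning is meant to surface rather than skip silently.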

This was referenced Mar 18, 2025

[BOLT] Gadget scanner: factor out utility code #131895

Merged

[BOLT] Gadget scanner: streamline issue reporting #131896

Merged

This was referenced Mar 18, 2025

[BOLT] Gadget scanner: refactor analysis of RET instructions #131897

Merged

[BOLT] Gadget scanner: detect non-protected indirect calls #131899

Merged

atrosinenko requested review from asl, jacobbramley, kbeyls and smithp35 March 18, 2025 19:30

atrosinenko marked this pull request as ready for review March 18, 2025 19:30

atrosinenko requested review from aaupov, maksfb, rafaelauler, ayermolo, dcci and yota9 as code owners March 18, 2025 19:30

llvmbot added the BOLT label Mar 18, 2025

atrosinenko force-pushed the users/atrosinenko/bolt-gs-reformulate-dfa-state branch from aa9215f to 7fecfc4 Compare March 20, 2025 13:34

atrosinenko force-pushed the users/atrosinenko/bolt-gs-refactor-pacret-analysis branch 2 times, most recently from 023876d to a5d954c Compare March 20, 2025 16:39

atrosinenko force-pushed the users/atrosinenko/bolt-gs-reformulate-dfa-state branch from 7fecfc4 to 9de64af Compare March 20, 2025 16:39

atrosinenko force-pushed the users/atrosinenko/bolt-gs-refactor-pacret-analysis branch from a5d954c to 856d59c Compare March 20, 2025 18:15

atrosinenko force-pushed the users/atrosinenko/bolt-gs-reformulate-dfa-state branch from 9de64af to 3f11341 Compare March 20, 2025 18:15

atrosinenko force-pushed the users/atrosinenko/bolt-gs-refactor-pacret-analysis branch from 856d59c to d9dbf0a Compare March 21, 2025 08:21

atrosinenko force-pushed the users/atrosinenko/bolt-gs-reformulate-dfa-state branch 2 times, most recently from 709008c to 479bf68 Compare March 21, 2025 15:52

Base automatically changed from users/atrosinenko/bolt-gs-refactor-pacret-analysis to main March 21, 2025 16:54

atrosinenko force-pushed the users/atrosinenko/bolt-gs-reformulate-dfa-state branch from 479bf68 to 7e77ccb Compare March 21, 2025 16:57

atrosinenko mentioned this pull request Mar 22, 2025

[BOLT] Gadget scanner: detect address materialization and arithmetic #132540

Merged

kbeyls reviewed Mar 25, 2025

Handle empty state explicitly

345ad00

kbeyls approved these changes Mar 25, 2025

atrosinenko merged commit b6b40e9 into main Mar 25, 2025
10 checks passed

atrosinenko deleted the users/atrosinenko/bolt-gs-reformulate-dfa-state branch March 25, 2025 18:45

This was referenced Mar 27, 2025

[BOLT] Make DataflowAnalysis::getStateBefore() const (NFC) #133308

Merged

[BOLT] Gadget scanner: analyze functions without CFG information #133461

Merged

[BOLT] Gadget scanner: detect signing oracles #134146

Merged
