Skip to content

[BOLT] Gadget scanner: refactor analysis of RET instructions #131897

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 21, 2025

Conversation

atrosinenko
Copy link
Contributor

In preparation for implementing detection of more gadget kinds,
refactor checking for non-protected return instructions.

@llvmbot
Copy link
Member

llvmbot commented Mar 18, 2025

@llvm/pr-subscribers-bolt

Author: Anatoly Trosinenko (atrosinenko)

Changes

In preparation for implementing detection of more gadget kinds,
refactor checking for non-protected return instructions.


Full diff: https://github.com/llvm/llvm-project/pull/131897.diff

2 Files Affected:

  • (modified) bolt/include/bolt/Passes/PAuthGadgetScanner.h (+18-5)
  • (modified) bolt/lib/Passes/PAuthGadgetScanner.cpp (+77-61)
diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
index 2d8109f8ca43b..f102f1080e2e8 100644
--- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h
+++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h
@@ -199,19 +199,34 @@ struct Report {
   virtual void generateReport(raw_ostream &OS,
                               const BinaryContext &BC) const = 0;
 
+  virtual const ArrayRef<MCPhysReg> getAffectedRegisters() const { return {}; }
+  virtual void
+  setOverwritingInstrs(const std::vector<MCInstReference> &Instrs) {}
+
   void printBasicInfo(raw_ostream &OS, const BinaryContext &BC,
                       StringRef IssueKind) const;
 };
 
 struct GadgetReport : public Report {
   const GadgetKind &Kind;
+  SmallVector<MCPhysReg> AffectedRegisters;
   std::vector<MCInstReference> OverwritingInstrs;
 
   GadgetReport(const GadgetKind &Kind, MCInstReference Location,
-               std::vector<MCInstReference> OverwritingInstrs)
-      : Report(Location), Kind(Kind), OverwritingInstrs(OverwritingInstrs) {}
+               const BitVector &AffectedRegisters)
+      : Report(Location), Kind(Kind),
+        AffectedRegisters(AffectedRegisters.set_bits()) {}
 
   void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
+
+  const ArrayRef<MCPhysReg> getAffectedRegisters() const override {
+    return AffectedRegisters;
+  }
+
+  void
+  setOverwritingInstrs(const std::vector<MCInstReference> &Instrs) override {
+    OverwritingInstrs = Instrs;
+  }
 };
 
 /// Report with a free-form message attached.
@@ -224,7 +239,6 @@ struct GenericReport : public Report {
 };
 
 struct FunctionAnalysisResult {
-  SmallSet<MCPhysReg, 1> RegistersAffected;
   std::vector<std::shared_ptr<Report>> Diagnostics;
 };
 
@@ -232,8 +246,7 @@ class Analysis : public BinaryFunctionPass {
   void runOnFunction(BinaryFunction &Function,
                      MCPlusBuilder::AllocatorIdTy AllocatorId);
   FunctionAnalysisResult
-  computeDfState(PacRetAnalysis &PRA, BinaryFunction &BF,
-                 MCPlusBuilder::AllocatorIdTy AllocatorId);
+  computeDfState(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId);
 
   std::map<const BinaryFunction *, FunctionAnalysisResult> AnalysisResults;
   std::mutex AnalysisResultsMutex;
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 813fa43c85dc5..04b0923d34b0c 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -353,7 +353,7 @@ class PacRetAnalysis
 public:
   std::vector<MCInstReference>
   getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF,
-                         const BitVector &UsedDirtyRegs) const {
+                         const ArrayRef<MCPhysReg> UsedDirtyRegs) const {
     if (RegsToTrackInstsFor.empty())
       return {};
     auto MaybeState = getStateAt(Ret);
@@ -362,7 +362,7 @@ class PacRetAnalysis
     const State &S = *MaybeState;
     // Due to aliasing registers, multiple registers may have been tracked.
     std::set<const MCInst *> LastWritingInsts;
-    for (MCPhysReg TrackedReg : UsedDirtyRegs.set_bits()) {
+    for (MCPhysReg TrackedReg : UsedDirtyRegs) {
       for (const MCInst *Inst : lastWritingInsts(S, TrackedReg))
         LastWritingInsts.insert(Inst);
     }
@@ -376,57 +376,81 @@ class PacRetAnalysis
   }
 };
 
+static std::shared_ptr<Report> tryCheckReturn(const BinaryContext &BC,
+                                              const MCInstReference &Inst,
+                                              const State &S) {
+  static const GadgetKind RetKind("non-protected ret found");
+  if (!BC.MIB->isReturn(Inst))
+    return nullptr;
+
+  ErrorOr<MCPhysReg> MaybeRetReg = BC.MIB->getRegUsedAsRetDest(Inst);
+  if (MaybeRetReg.getError()) {
+    return std::make_shared<GenericReport>(
+        Inst, "Warning: pac-ret analysis could not analyze this return "
+              "instruction");
+  }
+  MCPhysReg RetReg = *MaybeRetReg;
+  LLVM_DEBUG({
+    traceInst(BC, "Found RET inst", Inst);
+    traceReg(BC, "RetReg", RetReg);
+    traceReg(BC, "Authenticated reg", BC.MIB->getAuthenticatedReg(Inst));
+  });
+  if (BC.MIB->isAuthenticationOfReg(Inst, RetReg))
+    return nullptr;
+  BitVector UsedDirtyRegs = S.NonAutClobRegs;
+  LLVM_DEBUG({ traceRegMask(BC, "NonAutClobRegs at Ret", UsedDirtyRegs); });
+  UsedDirtyRegs &= BC.MIB->getAliases(RetReg, /*OnlySmaller=*/true);
+  LLVM_DEBUG({ traceRegMask(BC, "Intersection with RetReg", UsedDirtyRegs); });
+  if (!UsedDirtyRegs.any())
+    return nullptr;
+
+  return std::make_shared<GadgetReport>(RetKind, Inst, UsedDirtyRegs);
+}
+
 FunctionAnalysisResult
-Analysis::computeDfState(PacRetAnalysis &PRA, BinaryFunction &BF,
+Analysis::computeDfState(BinaryFunction &BF,
                          MCPlusBuilder::AllocatorIdTy AllocatorId) {
+  FunctionAnalysisResult Result;
+
+  PacRetAnalysis PRA(BF, AllocatorId, {});
   PRA.run();
   LLVM_DEBUG({
     dbgs() << " After PacRetAnalysis:\n";
     BF.dump();
   });
 
-  FunctionAnalysisResult Result;
-  // Now scan the CFG for non-authenticating return instructions that use an
-  // overwritten, non-authenticated register as return address.
   BinaryContext &BC = BF.getBinaryContext();
   for (BinaryBasicBlock &BB : BF) {
-    for (int64_t I = BB.size() - 1; I >= 0; --I) {
-      MCInst &Inst = BB.getInstructionAtIndex(I);
-      if (BC.MIB->isReturn(Inst)) {
-        ErrorOr<MCPhysReg> MaybeRetReg = BC.MIB->getRegUsedAsRetDest(Inst);
-        if (MaybeRetReg.getError()) {
-          Result.Diagnostics.push_back(std::make_shared<GenericReport>(
-              MCInstInBBReference(&BB, I),
-              "Warning: pac-ret analysis could not analyze this return "
-              "instruction"));
-          continue;
-        }
-        MCPhysReg RetReg = *MaybeRetReg;
-        LLVM_DEBUG({
-          traceInst(BC, "Found RET inst", Inst);
-          traceReg(BC, "RetReg", RetReg);
-          traceReg(BC, "Authenticated reg", BC.MIB->getAuthenticatedReg(Inst));
-        });
-        if (BC.MIB->isAuthenticationOfReg(Inst, RetReg))
-          break;
-        BitVector UsedDirtyRegs = PRA.getStateAt(Inst)->NonAutClobRegs;
-        LLVM_DEBUG(
-            { traceRegMask(BC, "NonAutClobRegs at Ret", UsedDirtyRegs); });
-        UsedDirtyRegs &= BC.MIB->getAliases(RetReg, /*OnlySmaller=*/true);
-        LLVM_DEBUG(
-            { traceRegMask(BC, "Intersection with RetReg", UsedDirtyRegs); });
-        if (UsedDirtyRegs.any()) {
-          static const GadgetKind RetKind("non-protected ret found");
-          // This return instruction needs to be reported
-          Result.Diagnostics.push_back(std::make_shared<GadgetReport>(
-              RetKind, MCInstInBBReference(&BB, I),
-              PRA.getLastClobberingInsts(Inst, BF, UsedDirtyRegs)));
-          for (MCPhysReg RetRegWithGadget : UsedDirtyRegs.set_bits())
-            Result.RegistersAffected.insert(RetRegWithGadget);
-        }
-      }
+    for (int64_t I = 0, E = BB.size(); I < E; ++I) {
+      MCInstReference Inst(&BB, I);
+      const State &S = *PRA.getStateAt(Inst);
+
+      if (auto Report = tryCheckReturn(BC, Inst, S))
+        Result.Diagnostics.push_back(Report);
     }
   }
+
+  if (Result.Diagnostics.empty())
+    return Result;
+
+  // Redo the analysis, but now also track which instructions last wrote
+  // to any of the registers in RetRegsWithGadgets, so that better
+  // diagnostics can be produced.
+
+  SmallSet<MCPhysReg, 4> RegsToTrack;
+  for (auto Report : Result.Diagnostics)
+    for (MCPhysReg Reg : Report->getAffectedRegisters())
+      RegsToTrack.insert(Reg);
+
+  std::vector<MCPhysReg> RegsToTrackVec(RegsToTrack.begin(), RegsToTrack.end());
+
+  PacRetAnalysis PRWIA(BF, AllocatorId, RegsToTrackVec);
+  PRWIA.run();
+  for (auto Report : Result.Diagnostics) {
+    Report->setOverwritingInstrs(PRWIA.getLastClobberingInsts(
+        Report->Location, BF, Report->getAffectedRegisters()));
+  }
+
   return Result;
 }
 
@@ -438,27 +462,19 @@ void Analysis::runOnFunction(BinaryFunction &BF,
     BF.dump();
   });
 
-  if (BF.hasCFG()) {
-    PacRetAnalysis PRA(BF, AllocatorId, {});
-    FunctionAnalysisResult FAR = computeDfState(PRA, BF, AllocatorId);
-    if (!FAR.RegistersAffected.empty()) {
-      // Redo the analysis, but now also track which instructions last wrote
-      // to any of the registers in RetRegsWithGadgets, so that better
-      // diagnostics can be produced.
-      std::vector<MCPhysReg> RegsToTrack;
-      for (MCPhysReg R : FAR.RegistersAffected)
-        RegsToTrack.push_back(R);
-      PacRetAnalysis PRWIA(BF, AllocatorId, RegsToTrack);
-      FAR = computeDfState(PRWIA, BF, AllocatorId);
-    }
+  if (!BF.hasCFG())
+    return;
 
-    // `runOnFunction` is typically getting called from multiple threads in
-    // parallel. Therefore, use a lock to avoid data races when storing the
-    // result of the analysis in the `AnalysisResults` map.
-    {
-      std::lock_guard<std::mutex> Lock(AnalysisResultsMutex);
-      AnalysisResults[&BF] = FAR;
-    }
+  FunctionAnalysisResult FAR = computeDfState(BF, AllocatorId);
+  if (FAR.Diagnostics.empty())
+    return;
+
+  // `runOnFunction` is typically getting called from multiple threads in
+  // parallel. Therefore, use a lock to avoid data races when storing the
+  // result of the analysis in the `AnalysisResults` map.
+  {
+    std::lock_guard<std::mutex> Lock(AnalysisResultsMutex);
+    AnalysisResults[&BF] = FAR;
   }
 }
 

@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-streamline-issue-reporting branch from cf9baa5 to fa3befd Compare March 20, 2025 13:34
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-refactor-pacret-analysis branch from b4e9bd6 to 023876d Compare March 20, 2025 13:34
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-streamline-issue-reporting branch from fa3befd to dcbdeb3 Compare March 20, 2025 16:39
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-refactor-pacret-analysis branch from 023876d to a5d954c Compare March 20, 2025 16:39
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-streamline-issue-reporting branch from dcbdeb3 to af6f92b Compare March 20, 2025 18:15
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-refactor-pacret-analysis branch from a5d954c to 856d59c Compare March 20, 2025 18:15
Base automatically changed from users/atrosinenko/bolt-gs-streamline-issue-reporting to main March 21, 2025 08:19
In preparation for implementing detection of more gadget kinds,
refactor checking for non-protected return instructions.
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-refactor-pacret-analysis branch from 856d59c to d9dbf0a Compare March 21, 2025 08:21
Copy link
Collaborator

@kbeyls kbeyls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ready to be merged to me!

@atrosinenko atrosinenko merged commit 72d1058 into main Mar 21, 2025
10 checks passed
@atrosinenko atrosinenko deleted the users/atrosinenko/bolt-gs-refactor-pacret-analysis branch March 21, 2025 16:55
@kazutakahirata
Copy link
Contributor

@atrosinenko I've landed 9933117 to fix a warning from this PR. Thanks!

@atrosinenko
Copy link
Contributor Author

@kazutakahirata Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants