Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move X86-specific MCSymbolRefExpr::VariantKind to X86MCExpr::Specifier #132149

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Mar 20, 2025

Move target-specific members outside of MCSymbolRefExpr::VariantKind
(a legacy interface I am eliminating). Most changes are mechanic,
except:

  • ELFObjectWriter::shouldRelocateWithSymbol
  • The legacy generic code uses ELFObjectWriter::fixSymbolsInTLSFixups
    to set STT_TLS (and use an unnecessary expression walk). The better
    way is to do this in getRelocType, which I have done for
    AArch64, PowerPC, and RISC-V.

In the future, we should encode expressions with a relocation specifier
as X86MCExpr and use MCValue::RefKind to hold the specifier of the
relocatable expression.
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers

While here, rename "Modifier' to "Specifier":

"Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.

Created using spr 1.3.5-bogner
@llvmbot llvmbot added backend:X86 mc Machine (object) code labels Mar 20, 2025
@MaskRay MaskRay requested review from phoebewang and KanRobert March 20, 2025 05:49
@llvmbot
Copy link
Member

llvmbot commented Mar 20, 2025

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-x86

Author: Fangrui Song (MaskRay)

Changes

Move target-specific members outside of MCSymbolRefExpr::VariantKind
(a legacy interface I am eliminating). Most changes are mechanic,
except:

  • ELFObjectWriter::shouldRelocateWithSymbol
  • ELFObjectWriter::fixSymbolsInTLSFixups

In the future, we should encode expressions with a relocation specifier
as X86MCExpr and use MCValue::RefKind to hold the specifier of the
relocatable expression.
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers

While here, rename "Modifier' to "Specifier":

> "Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.


Patch is 44.64 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/132149.diff

17 Files Affected:

  • (modified) llvm/include/llvm/MC/MCExpr.h (+1-10)
  • (modified) llvm/lib/MC/ELFObjectWriter.cpp (-4)
  • (modified) llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp (+2-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp (+3-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp (+81-46)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86EncodingOptimization.cpp (+3-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp (+28-27)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp (+7-6)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MCExpr.h (+35)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp (+13-12)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFObjectWriter.cpp (+3-3)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFStreamer.cpp (+1)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86WinCOFFTargetStreamer.cpp (+1)
  • (modified) llvm/lib/Target/X86/X86ISelLoweringCall.cpp (+2-2)
  • (modified) llvm/lib/Target/X86/X86MCInstLower.cpp (+40-41)
  • (modified) llvm/lib/Target/X86/X86TargetObjectFile.cpp (+9-4)
  • (modified) llvm/lib/Target/X86/X86TargetObjectFile.h (+1-3)
diff --git a/llvm/include/llvm/MC/MCExpr.h b/llvm/include/llvm/MC/MCExpr.h
index bc300ecdc6ee5..7fa5332700b90 100644
--- a/llvm/include/llvm/MC/MCExpr.h
+++ b/llvm/include/llvm/MC/MCExpr.h
@@ -199,14 +199,10 @@ class MCSymbolRefExpr : public MCExpr {
     VK_GOT,
     VK_GOTENT,
     VK_GOTOFF,
-    VK_GOTREL,
-    VK_PCREL,
     VK_GOTPCREL,
-    VK_GOTPCREL_NORELAX,
     VK_GOTTPOFF,
     VK_INDNTPOFF,
     VK_NTPOFF,
-    VK_GOTNTPOFF,
     VK_PLT,
     VK_TLSGD,
     VK_TLSLD,
@@ -223,7 +219,6 @@ class MCSymbolRefExpr : public MCExpr {
     VK_GOTPAGE,
     VK_GOTPAGEOFF,
     VK_SECREL,
-    VK_SIZE,    // symbol@SIZE
     VK_WEAKREF, // The link between the symbols in .weakref foo, bar
     VK_FUNCDESC,
     VK_GOTFUNCDESC,
@@ -232,9 +227,6 @@ class MCSymbolRefExpr : public MCExpr {
     VK_TLSLDM_FDPIC,
     VK_GOTTPOFF_FDPIC,
 
-    VK_X86_ABS8,
-    VK_X86_PLTOFF,
-
     VK_ARM_NONE,
     VK_ARM_GOT_PREL,
     VK_ARM_TARGET1,
@@ -261,8 +253,7 @@ class MCSymbolRefExpr : public MCExpr {
     VK_AMDGPU_ABS32_LO,      // symbol@abs32@lo
     VK_AMDGPU_ABS32_HI,      // symbol@abs32@hi
 
-    VK_TPREL,
-    VK_DTPREL
+    FirstTargetSpecifier,
   };
 
 private:
diff --git a/llvm/lib/MC/ELFObjectWriter.cpp b/llvm/lib/MC/ELFObjectWriter.cpp
index dbef2875c6777..884633ea0d79a 100644
--- a/llvm/lib/MC/ELFObjectWriter.cpp
+++ b/llvm/lib/MC/ELFObjectWriter.cpp
@@ -1269,7 +1269,6 @@ bool ELFObjectWriter::shouldRelocateWithSymbol(const MCAssembler &Asm,
   case MCSymbolRefExpr::VK_GOT:
   case MCSymbolRefExpr::VK_PLT:
   case MCSymbolRefExpr::VK_GOTPCREL:
-  case MCSymbolRefExpr::VK_GOTPCREL_NORELAX:
     return true;
   }
 
@@ -1526,16 +1525,13 @@ void ELFObjectWriter::fixSymbolsInTLSFixups(MCAssembler &Asm,
     case MCSymbolRefExpr::VK_GOTTPOFF:
     case MCSymbolRefExpr::VK_INDNTPOFF:
     case MCSymbolRefExpr::VK_NTPOFF:
-    case MCSymbolRefExpr::VK_GOTNTPOFF:
     case MCSymbolRefExpr::VK_TLSCALL:
     case MCSymbolRefExpr::VK_TLSDESC:
     case MCSymbolRefExpr::VK_TLSGD:
     case MCSymbolRefExpr::VK_TLSLD:
     case MCSymbolRefExpr::VK_TLSLDM:
     case MCSymbolRefExpr::VK_TPOFF:
-    case MCSymbolRefExpr::VK_TPREL:
     case MCSymbolRefExpr::VK_DTPOFF:
-    case MCSymbolRefExpr::VK_DTPREL:
       break;
     }
     Asm.registerSymbol(symRef.getSymbol());
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index a6285a55f4155..cd38ac85dac7a 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -2116,7 +2116,7 @@ bool X86AsmParser::ParseIntelExpression(IntelExprStateMachine &SM, SMLoc &End) {
         if (IDVal == "f" || IDVal == "b") {
           MCSymbol *Sym =
               getContext().getDirectionalLocalSymbol(IntVal, IDVal == "b");
-          MCSymbolRefExpr::VariantKind Variant = MCSymbolRefExpr::VK_None;
+          auto Variant = X86MCExpr::VK_None;
           const MCExpr *Val =
               MCSymbolRefExpr::create(Sym, Variant, getContext());
           if (IDVal == "b" && Sym->isUndefined())
@@ -2263,7 +2263,7 @@ bool X86AsmParser::ParseIntelInlineAsmIdentifier(
     return false;
   // Create the symbol reference.
   MCSymbol *Sym = getContext().getOrCreateSymbol(Identifier);
-  MCSymbolRefExpr::VariantKind Variant = MCSymbolRefExpr::VK_None;
+  auto Variant = X86MCExpr::VK_None;
   Val = MCSymbolRefExpr::create(Sym, Variant, getParser().getContext());
   return false;
 }
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
index 11ae2a90cbdec..b5b36e7bea2c7 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
@@ -9,6 +9,7 @@
 #include "MCTargetDesc/X86BaseInfo.h"
 #include "MCTargetDesc/X86EncodingOptimization.h"
 #include "MCTargetDesc/X86FixupKinds.h"
+#include "MCTargetDesc/X86MCExpr.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/BinaryFormat/ELF.h"
 #include "llvm/BinaryFormat/MachO.h"
@@ -360,7 +361,7 @@ static bool hasVariantSymbol(const MCInst &MI) {
       continue;
     const MCExpr &Expr = *Operand.getExpr();
     if (Expr.getKind() == MCExpr::SymbolRef &&
-        cast<MCSymbolRefExpr>(Expr).getKind() != MCSymbolRefExpr::VK_None)
+        getSpecifier(cast<MCSymbolRefExpr>(&Expr)) != X86MCExpr::VK_None)
       return true;
   }
   return false;
@@ -746,7 +747,7 @@ bool X86AsmBackend::fixupNeedsRelaxationAdvanced(const MCAssembler &Asm,
     MCValue Target;
     if (Fixup.getValue()->evaluateAsRelocatable(Target, &Asm) &&
         Target.getSymA() &&
-        Target.getSymA()->getKind() == MCSymbolRefExpr::VK_X86_ABS8)
+        getSpecifier(Target.getSymA()) == X86MCExpr::VK_ABS8)
       return false;
   }
   return true;
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
index 927a566f8c2bf..112137eda163a 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86ELFObjectWriter.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "MCTargetDesc/X86FixupKinds.h"
+#include "MCTargetDesc/X86MCExpr.h"
 #include "MCTargetDesc/X86MCTargetDesc.h"
 #include "llvm/BinaryFormat/ELF.h"
 #include "llvm/MC/MCAsmInfo.h"
@@ -32,6 +33,8 @@ class X86ELFObjectWriter : public MCELFObjectTargetWriter {
 protected:
   unsigned getRelocType(MCContext &Ctx, const MCValue &Target,
                         const MCFixup &Fixup, bool IsPCRel) const override;
+  bool needsRelocateWithSymbol(const MCValue &Val, const MCSymbol &Sym,
+                               unsigned Type) const override;
 };
 
 } // end anonymous namespace
@@ -47,8 +50,7 @@ X86ELFObjectWriter::X86ELFObjectWriter(bool IsELF64, uint8_t OSABI,
 enum X86_64RelType { RT64_NONE, RT64_64, RT64_32, RT64_32S, RT64_16, RT64_8 };
 
 static X86_64RelType getType64(MCFixupKind Kind,
-                               MCSymbolRefExpr::VariantKind &Modifier,
-                               bool &IsPCRel) {
+                               X86MCExpr::Specifier &Specifier, bool &IsPCRel) {
   switch (unsigned(Kind)) {
   default:
     llvm_unreachable("Unimplemented");
@@ -58,11 +60,11 @@ static X86_64RelType getType64(MCFixupKind Kind,
     return RT64_64;
   case X86::reloc_signed_4byte:
   case X86::reloc_signed_4byte_relax:
-    if (Modifier == MCSymbolRefExpr::VK_None && !IsPCRel)
+    if (Specifier == X86MCExpr::VK_None && !IsPCRel)
       return RT64_32S;
     return RT64_32;
   case X86::reloc_global_offset_table:
-    Modifier = MCSymbolRefExpr::VK_GOT;
+    Specifier = X86MCExpr::VK_GOT;
     IsPCRel = true;
     return RT64_32;
   case FK_Data_4:
@@ -76,7 +78,7 @@ static X86_64RelType getType64(MCFixupKind Kind,
   case X86::reloc_riprel_4byte_relax_evex:
     return RT64_32;
   case X86::reloc_branch_4byte_pcrel:
-    Modifier = MCSymbolRefExpr::VK_PLT;
+    Specifier = X86MCExpr::VK_PLT;
     return RT64_32;
   case FK_PCRel_2:
   case FK_Data_2:
@@ -100,17 +102,17 @@ static void checkIs64(MCContext &Ctx, SMLoc Loc, X86_64RelType Type) {
 }
 
 static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
-                               MCSymbolRefExpr::VariantKind Modifier,
+                               X86MCExpr::Specifier Specifier,
                                X86_64RelType Type, bool IsPCRel,
                                MCFixupKind Kind) {
-  switch (Modifier) {
+  switch (Specifier) {
   default:
     llvm_unreachable("Unimplemented");
-  case MCSymbolRefExpr::VK_None:
-  case MCSymbolRefExpr::VK_X86_ABS8:
+  case X86MCExpr::VK_None:
+  case X86MCExpr::VK_ABS8:
     switch (Type) {
     case RT64_NONE:
-      if (Modifier == MCSymbolRefExpr::VK_None)
+      if (Specifier == X86MCExpr::VK_None)
         return ELF::R_X86_64_NONE;
       llvm_unreachable("Unimplemented");
     case RT64_64:
@@ -125,7 +127,7 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
       return IsPCRel ? ELF::R_X86_64_PC8 : ELF::R_X86_64_8;
     }
     llvm_unreachable("unexpected relocation type!");
-  case MCSymbolRefExpr::VK_GOT:
+  case X86MCExpr::VK_GOT:
     switch (Type) {
     case RT64_64:
       return IsPCRel ? ELF::R_X86_64_GOTPC64 : ELF::R_X86_64_GOT64;
@@ -138,12 +140,12 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
       llvm_unreachable("Unimplemented");
     }
     llvm_unreachable("unexpected relocation type!");
-  case MCSymbolRefExpr::VK_GOTOFF:
+  case X86MCExpr::VK_GOTOFF:
     assert(!IsPCRel);
     if (Type != RT64_64)
       Ctx.reportError(Loc, "unsupported relocation type");
     return ELF::R_X86_64_GOTOFF64;
-  case MCSymbolRefExpr::VK_TPOFF:
+  case X86MCExpr::VK_TPOFF:
     assert(!IsPCRel);
     switch (Type) {
     case RT64_64:
@@ -157,7 +159,7 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
       llvm_unreachable("Unimplemented");
     }
     llvm_unreachable("unexpected relocation type!");
-  case MCSymbolRefExpr::VK_DTPOFF:
+  case X86MCExpr::VK_DTPOFF:
     assert(!IsPCRel);
     switch (Type) {
     case RT64_64:
@@ -171,7 +173,7 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
       llvm_unreachable("Unimplemented");
     }
     llvm_unreachable("unexpected relocation type!");
-  case MCSymbolRefExpr::VK_SIZE:
+  case X86MCExpr::VK_SIZE:
     assert(!IsPCRel);
     switch (Type) {
     case RT64_64:
@@ -185,16 +187,16 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
       llvm_unreachable("Unimplemented");
     }
     llvm_unreachable("unexpected relocation type!");
-  case MCSymbolRefExpr::VK_TLSCALL:
+  case X86MCExpr::VK_TLSCALL:
     return ELF::R_X86_64_TLSDESC_CALL;
-  case MCSymbolRefExpr::VK_TLSDESC:
+  case X86MCExpr::VK_TLSDESC:
     return ((unsigned)Kind == X86::reloc_riprel_4byte_relax_rex2)
                ? ELF::R_X86_64_CODE_4_GOTPC32_TLSDESC
                : ELF::R_X86_64_GOTPC32_TLSDESC;
-  case MCSymbolRefExpr::VK_TLSGD:
+  case X86MCExpr::VK_TLSGD:
     checkIs32(Ctx, Loc, Type);
     return ELF::R_X86_64_TLSGD;
-  case MCSymbolRefExpr::VK_GOTTPOFF:
+  case X86MCExpr::VK_GOTTPOFF:
     checkIs32(Ctx, Loc, Type);
     if ((unsigned)Kind == X86::reloc_riprel_4byte_movq_load_rex2 ||
         (unsigned)Kind == X86::reloc_riprel_4byte_relax_rex2)
@@ -202,13 +204,13 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
     else if ((unsigned)Kind == X86::reloc_riprel_4byte_relax_evex)
       return ELF::R_X86_64_CODE_6_GOTTPOFF;
     return ELF::R_X86_64_GOTTPOFF;
-  case MCSymbolRefExpr::VK_TLSLD:
+  case X86MCExpr::VK_TLSLD:
     checkIs32(Ctx, Loc, Type);
     return ELF::R_X86_64_TLSLD;
-  case MCSymbolRefExpr::VK_PLT:
+  case X86MCExpr::VK_PLT:
     checkIs32(Ctx, Loc, Type);
     return ELF::R_X86_64_PLT32;
-  case MCSymbolRefExpr::VK_GOTPCREL:
+  case X86MCExpr::VK_GOTPCREL:
     checkIs32(Ctx, Loc, Type);
     // Older versions of ld.bfd/ld.gold/lld
     // do not support GOTPCRELX/REX_GOTPCRELX/CODE_4_GOTPCRELX,
@@ -228,10 +230,10 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
       return ELF::R_X86_64_CODE_4_GOTPCRELX;
     }
     llvm_unreachable("unexpected relocation type!");
-  case MCSymbolRefExpr::VK_GOTPCREL_NORELAX:
+  case X86MCExpr::VK_GOTPCREL_NORELAX:
     checkIs32(Ctx, Loc, Type);
     return ELF::R_X86_64_GOTPCREL;
-  case MCSymbolRefExpr::VK_X86_PLTOFF:
+  case X86MCExpr::VK_X86_PLTOFF:
     checkIs64(Ctx, Loc, Type);
     return ELF::R_X86_64_PLTOFF64;
   }
@@ -240,17 +242,17 @@ static unsigned getRelocType64(MCContext &Ctx, SMLoc Loc,
 enum X86_32RelType { RT32_NONE, RT32_32, RT32_16, RT32_8 };
 
 static unsigned getRelocType32(MCContext &Ctx, SMLoc Loc,
-                               MCSymbolRefExpr::VariantKind Modifier,
+                               X86MCExpr::Specifier Specifier,
                                X86_32RelType Type, bool IsPCRel,
                                MCFixupKind Kind) {
-  switch (Modifier) {
+  switch (Specifier) {
   default:
     llvm_unreachable("Unimplemented");
-  case MCSymbolRefExpr::VK_None:
-  case MCSymbolRefExpr::VK_X86_ABS8:
+  case X86MCExpr::VK_None:
+  case X86MCExpr::VK_ABS8:
     switch (Type) {
     case RT32_NONE:
-      if (Modifier == MCSymbolRefExpr::VK_None)
+      if (Specifier == X86MCExpr::VK_None)
         return ELF::R_386_NONE;
       llvm_unreachable("Unimplemented");
     case RT32_32:
@@ -261,7 +263,7 @@ static unsigned getRelocType32(MCContext &Ctx, SMLoc Loc,
       return IsPCRel ? ELF::R_386_PC8 : ELF::R_386_8;
     }
     llvm_unreachable("unexpected relocation type!");
-  case MCSymbolRefExpr::VK_GOT:
+  case X86MCExpr::VK_GOT:
     if (Type != RT32_32)
       break;
     if (IsPCRel)
@@ -274,55 +276,55 @@ static unsigned getRelocType32(MCContext &Ctx, SMLoc Loc,
     return Kind == MCFixupKind(X86::reloc_signed_4byte_relax)
                ? ELF::R_386_GOT32X
                : ELF::R_386_GOT32;
-  case MCSymbolRefExpr::VK_GOTOFF:
+  case X86MCExpr::VK_GOTOFF:
     assert(!IsPCRel);
     if (Type != RT32_32)
       break;
     return ELF::R_386_GOTOFF;
-  case MCSymbolRefExpr::VK_TLSCALL:
+  case X86MCExpr::VK_TLSCALL:
     return ELF::R_386_TLS_DESC_CALL;
-  case MCSymbolRefExpr::VK_TLSDESC:
+  case X86MCExpr::VK_TLSDESC:
     return ELF::R_386_TLS_GOTDESC;
-  case MCSymbolRefExpr::VK_TPOFF:
+  case X86MCExpr::VK_TPOFF:
     if (Type != RT32_32)
       break;
     assert(!IsPCRel);
     return ELF::R_386_TLS_LE_32;
-  case MCSymbolRefExpr::VK_DTPOFF:
+  case X86MCExpr::VK_DTPOFF:
     if (Type != RT32_32)
       break;
     assert(!IsPCRel);
     return ELF::R_386_TLS_LDO_32;
-  case MCSymbolRefExpr::VK_TLSGD:
+  case X86MCExpr::VK_TLSGD:
     if (Type != RT32_32)
       break;
     assert(!IsPCRel);
     return ELF::R_386_TLS_GD;
-  case MCSymbolRefExpr::VK_GOTTPOFF:
+  case X86MCExpr::VK_GOTTPOFF:
     if (Type != RT32_32)
       break;
     assert(!IsPCRel);
     return ELF::R_386_TLS_IE_32;
-  case MCSymbolRefExpr::VK_PLT:
+  case X86MCExpr::VK_PLT:
     if (Type != RT32_32)
       break;
     return ELF::R_386_PLT32;
-  case MCSymbolRefExpr::VK_INDNTPOFF:
+  case X86MCExpr::VK_INDNTPOFF:
     if (Type != RT32_32)
       break;
     assert(!IsPCRel);
     return ELF::R_386_TLS_IE;
-  case MCSymbolRefExpr::VK_NTPOFF:
+  case X86MCExpr::VK_NTPOFF:
     if (Type != RT32_32)
       break;
     assert(!IsPCRel);
     return ELF::R_386_TLS_LE;
-  case MCSymbolRefExpr::VK_GOTNTPOFF:
+  case X86MCExpr::VK_GOTNTPOFF:
     if (Type != RT32_32)
       break;
     assert(!IsPCRel);
     return ELF::R_386_TLS_GOTIE;
-  case MCSymbolRefExpr::VK_TLSLDM:
+  case X86MCExpr::VK_TLSLDM:
     if (Type != RT32_32)
       break;
     assert(!IsPCRel);
@@ -338,10 +340,29 @@ unsigned X86ELFObjectWriter::getRelocType(MCContext &Ctx, const MCValue &Target,
   MCFixupKind Kind = Fixup.getKind();
   if (Kind >= FirstLiteralRelocationKind)
     return Kind - FirstLiteralRelocationKind;
-  MCSymbolRefExpr::VariantKind Modifier = Target.getAccessVariant();
-  X86_64RelType Type = getType64(Kind, Modifier, IsPCRel);
+  auto Specifier = X86MCExpr::Specifier(Target.getAccessVariant());
+  switch (Specifier) {
+  case X86MCExpr::VK_GOTTPOFF:
+  case X86MCExpr::VK_INDNTPOFF:
+  case X86MCExpr::VK_NTPOFF:
+  case X86MCExpr::VK_GOTNTPOFF:
+  case X86MCExpr::VK_TLSCALL:
+  case X86MCExpr::VK_TLSDESC:
+  case X86MCExpr::VK_TLSGD:
+  case X86MCExpr::VK_TLSLD:
+  case X86MCExpr::VK_TLSLDM:
+  case X86MCExpr::VK_TPOFF:
+  case X86MCExpr::VK_DTPOFF:
+    if (auto *S = Target.getSymA())
+      cast<MCSymbolELF>(S->getSymbol()).setType(ELF::STT_TLS);
+    break;
+  default:
+    break;
+  }
+
+  X86_64RelType Type = getType64(Kind, Specifier, IsPCRel);
   if (getEMachine() == ELF::EM_X86_64)
-    return getRelocType64(Ctx, Fixup.getLoc(), Modifier, Type, IsPCRel, Kind);
+    return getRelocType64(Ctx, Fixup.getLoc(), Specifier, Type, IsPCRel, Kind);
 
   assert((getEMachine() == ELF::EM_386 || getEMachine() == ELF::EM_IAMCU) &&
          "Unsupported ELF machine type.");
@@ -364,7 +385,21 @@ unsigned X86ELFObjectWriter::getRelocType(MCContext &Ctx, const MCValue &Target,
     RelType = RT32_8;
     break;
   }
-  return getRelocType32(Ctx, Fixup.getLoc(), Modifier, RelType, IsPCRel, Kind);
+  return getRelocType32(Ctx, Fixup.getLoc(), Specifier, RelType, IsPCRel, Kind);
+}
+
+bool X86ELFObjectWriter::needsRelocateWithSymbol(const MCValue &V,
+                                                 const MCSymbol &Sym,
+                                                 unsigned Type) const {
+  switch (getSpecifier(V.getSymA())) {
+  case X86MCExpr::VK_GOT:
+  case X86MCExpr::VK_PLT:
+  case X86MCExpr::VK_GOTPCREL:
+  case X86MCExpr::VK_GOTPCREL_NORELAX:
+    return true;
+  default:
+    return false;
+  }
 }
 
 std::unique_ptr<MCObjectTargetWriter>
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86EncodingOptimization.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86EncodingOptimization.cpp
index 1e9d44068b3f3..e4fdd85929a4b 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86EncodingOptimization.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86EncodingOptimization.cpp
@@ -11,6 +11,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "X86EncodingOptimization.h"
+#include "MCTargetDesc/X86MCExpr.h"
 #include "X86BaseInfo.h"
 #include "llvm/MC/MCExpr.h"
 #include "llvm/MC/MCInst.h"
@@ -374,7 +375,7 @@ bool X86::optimizeMOV(MCInst &MI, bool In64BitMode) {
   if (MI.getOperand(AddrOp).isExpr()) {
     const MCExpr *MCE = MI.getOperand(AddrOp).getExpr();
     if (const MCSymbolRefExpr *SRE = dyn_cast<MCSymbolRefExpr>(MCE))
-      if (SRE->getKind() == MCSymbolRefExpr::VK_TLVP)
+      if (getSpecifier(SRE) == X86MCExpr::VK_TLVP)
         Absolute = false;
   }
   if (Absolute && (MI.getOperand(AddrBase + X86::AddrBaseReg).getReg() ||
@@ -485,7 +486,7 @@ static bool optimizeToShortImmediateForm(MCInst &MI) {
   MCOperand &LastOp = MI.getOperand(MI.getNumOperands() - 1 - SkipOperands);
   if (LastOp.isExpr()) {
     const MCSymbolRefExpr *SRE = dyn_cast<MCSymbolRefExpr>(LastOp.getExpr());
-    if (!SRE || SRE->getKind() != MCSymbolRefExpr::VK_X86_ABS8)
+    if (!SRE || getSpecifier(SRE) != X86MCExpr::VK_ABS8)
       return false;
   } else if (LastOp.isImm()) {
     if (!isInt<8>(LastOp.getImm()))
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp
index cc108c5aa688c..e03a68e34dd7c 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp
@@ -11,6 +11,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "X86MCAsmInfo.h"
+#include "MCTargetDesc/X86MCExpr.h"
 #include "llvm/MC/MCExpr.h"
 #include "llvm/MC/MCStreamer.h"
 #include "llvm/Support/CommandLine.h"
@@ -35,34 +36,34 @@ MarkedJTDataRegions("mark-data-regions", cl::init(true),
   cl::Hidden);
 
 const MCAsmInfo::VariantKindDesc variantKindDescs[] = {
-    {MCSymbolRefExpr::VK_X86_ABS8, "ABS8"},
-    {MCSymbolRefExpr::VK_DTPOFF, "DTPOFF"},
-    {MCSymbolRefExpr::VK_DTPREL, "DTPREL"},
-    {MCSymbolRefExpr::VK_GOT, "GOT"},
-    {MCSymbolRefExpr::VK_GOTENT, "GOTENT"},
-    {MCSymbolRefExpr::VK_GOTNTPOFF, "GOTNTPOFF"},
-    {MCSymbolRefExpr::VK_GOTOFF, "GOTOFF"},
-    {MCSymbolRefExpr::VK_GOTPCREL, "GOTPCREL"},
-    {MCSymbolRefExpr::VK_GOTPCREL_NORELAX, "GOTPCREL_NORELAX"},
-    {MCSymbolRefExpr::VK_GOTREL, "GOTREL"},
-    {MCSymbolRefExpr::VK_GOTTPOFF, "GOTTPOFF"},
-    {MCSymbolRefExpr::VK_INDNTPOFF, "INDNTPOFF"},
+    {X86MCExpr::VK_ABS8, "ABS8"},
+    {X86MCExpr::VK_DTPOFF, "DTPOFF"},
+    {X86MCExpr::VK_DTPREL, "DTPREL"},
+    {X86MCExpr::VK_GOT, "GOT"},
+    {X86MCExpr::VK_GOTENT, "GOTENT"},
+    {X86MCExpr::VK_GOTNTPOFF, "GOTNTPOFF"},
+    {X86MCExpr::VK_GOTOFF, "GOTOFF"},
+    {X86MCExpr::VK_GOTPCREL, "GOTPCREL"},
+    {X86MCExpr::VK_GOTPCREL_NORELAX, "GOTPCREL_NORELAX"},
+    {X86MCExpr::VK_GOTREL, "GOTREL"},
+    {X86MCExpr::VK_GOTTPOFF, "GOTTPOFF"},
+    {X86MCExpr::VK_INDNTPOFF, "INDNTPOFF"},
     {MCSymbolRefExpr::VK_COFF_IMGREL32, "IMGREL"},
-    {MCSymbolRefExpr::VK_NTPOFF, "NTPOFF"},
-    {MCSymbolRefExpr::VK_PCREL, "PCREL"},
-    {MCSymbolRefExpr::VK_PLT, "PLT"},
-    {MCSymbolRefExpr::VK_X86_PLTOFF, "PLTOFF"},
+    {X86MCExpr::VK_NTPOFF, "NTPOFF"},
+    {X86MCExpr::VK_PCREL, "PCREL"},
+    {X86MCExpr::VK_PLT, "PLT"},
+    {X86MCExpr::VK_X8...
[truncated]

@MaskRay MaskRay requested a review from RKSimon March 20, 2025 05:50
Comment on lines +356 to +357
if (auto *S = Target.getSymA())
cast<MCSymbolELF>(S->getSymbol()).setType(ELF::STT_TLS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this used for? I don't find corresponding code in LHS.

Copy link
Member Author

@MaskRay MaskRay Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers#tls-symbols

When a symbol is defined in a section with the SHF_TLS flag (Thread-Local Storage), GNU assembler assigns it the type STT_TLS in the symbol table. For undefined TLS symbols, the process differs: GCC and Clang don’t emit explicit labels. Instead, assemblers identify these symbols through TLS-specific relocation specifiers in the code, deduce their thread-local nature, and set their type to STT_TLS accordingly.

The legacy generic code uses ELFObjectWriter::fixSymbolsInTLSFixups to set STT_TLS (and use an unnecessary expression walk). The better way is to do this in getRelocType, which I have done for AArch64, PowerPC, and RISC-V.

Comment on lines +44 to +45
VK_GOTOFF,
VK_GOTPCREL,
Copy link
Contributor

@KanRobert KanRobert Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are only for X86? I thought the specifier w/o "X86" is target-independent.
If so, should we align the names, i.e. remove "X86" in all the names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every ELF constant in MCSymbolRefExpr::VariantKind is target-specific while some targets might share a specifier of a similar name (e.g., @got, @plt). It looks cleaner for targets to define relocation specifiers they use

Created using spr 1.3.5-bogner
@MaskRay MaskRay merged commit c177dbe into main Mar 21, 2025
11 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/move-x86-specific-mcsymbolrefexprvariantkind-to-x86mcexprspecifier branch March 21, 2025 03:28
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 21, 2025
…r::Specifier

Move target-specific members outside of MCSymbolRefExpr::VariantKind
(a legacy interface I am eliminating). Most changes are mechanic,
except:

* ELFObjectWriter::shouldRelocateWithSymbol
* The legacy generic code uses `ELFObjectWriter::fixSymbolsInTLSFixups`
  to set `STT_TLS` (and use an unnecessary expression walk). The better
  way is to do this in `getRelocType`, which I have done for
  AArch64, PowerPC, and RISC-V.

In the future, we should encode expressions with a relocation specifier
as X86MCExpr and use MCValue::RefKind to hold the specifier of the
relocatable expression.
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers

While here, rename "Modifier' to "Specifier":

> "Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.

Pull Request: llvm/llvm-project#132149
MaskRay added a commit to MaskRay/llvm-project that referenced this pull request Mar 26, 2025
Move wasm-specific members outside of MCSymbolRefExpr::VariantKind (a
legacy interface I am eliminating). Most changes are mechanic and
similar to what I've done for many ELF targets (e.g. X86 llvm#132149)

Notes:

* `fixSymbolsInTLSFixups` is replaced with `setTLS` in
  `WebAssemblyWasmObjectWriter::getRelocType`, similar to what I've done
  for many ELF targets.

In the future, we should encode expressions with a relocation specifier
as WebAssemblyMCExpr and use `MCValue::RefKind` to hold the specifier of
the relocatable expression.
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers

While here, rename "Modifier' to "Specifier":

> "Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.
MaskRay added a commit to MaskRay/llvm-project that referenced this pull request Mar 26, 2025
Move wasm-specific members outside of MCSymbolRefExpr::VariantKind (a
legacy interface I am eliminating). Most changes are mechanic and
similar to what I've done for many ELF targets (e.g. X86 llvm#132149)

Notes:

* `fixSymbolsInTLSFixups` is replaced with `setTLS` in
  `WebAssemblyWasmObjectWriter::getRelocType`, similar to what I've done
  for many ELF targets.
* `SymA->setUsedInGOT()` in `recordRelocation` is moved to
  `getRelocType`.

In the future, we should encode expressions with a relocation specifier
as WebAssemblyMCExpr and use `MCValue::RefKind` to hold the specifier of
the relocatable expression (e.g. AArch64, RISCV).
https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers

While here, rename "Modifier' to "Specifier":

> "Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants