[LLD][ELF] Allow memory region in OVERLAY #133540

Merged on Mar 31, 2025 (3 commits)

mysterymath
Contributor

@mysterymath mysterymath commented Mar 28, 2025

This allows the contents of OVERLAYs to be attributed to memory regions. This is the only clean way to overlap VMAs in linker scripts that choose to primarily use memory regions to lay out addresses.

This also simplifies OVERLAY expansion to better match GNU LD. Expressions for the first section's LMA and VMA are not generated if the user did not provide them. This allows the LMA/VMA offset to be preserved across multiple overlays in the same region, as with regular sections.

Closes #129816
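
The address behavior described above can be sketched with a small model. This is a simplified illustration, not lld's implementation: the `RegionLayout` and `Placed` types and the `layoutRegionExample` function are hypothetical names introduced here. It assumes every overlay member shares the overlay's base VMA, that the first member's LMA is the VMA plus the offset carried over from the previous section, that later members are loaded back to back, and that the location counter advances by the largest member. The sizes are taken from the `region.t` test in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model types for illustration; not lld's data structures.
struct Placed {
  std::string name;
  uint64_t vma;
  uint64_t lma;
  uint64_t size;
};

struct RegionLayout {
  uint64_t dot = 0;       // next free VMA in the region
  uint64_t lmaOffset = 0; // LMA minus VMA, carried over from the last section
  std::vector<Placed> out;

  // Place one OVERLAY that has no explicit address and no AT().
  void overlay(const std::vector<std::pair<std::string, uint64_t>> &secs) {
    const uint64_t base = dot;
    // Assumption: the first member is treated like a regular section, so it
    // inherits the LMA/VMA offset of whatever came before it.
    uint64_t nextLma = base + lmaOffset;
    uint64_t maxSize = 0;
    for (const auto &s : secs) {
      out.push_back(Placed{s.first, base, nextLma, s.second});
      lmaOffset = nextLma - base; // offset seen by the next section
      nextLma += s.second;        // members are consecutive in load memory
      maxSize = std::max(maxSize, s.second);
    }
    dot = base + maxSize; // members share VMAs, so only the largest counts
  }

  // Place a regular section after the overlays.
  void section(const std::string &name, uint64_t size) {
    out.push_back(Placed{name, dot, dot + lmaOffset, size});
    dot += size;
  }
};

// Sizes mirror region.t: 8-byte "big" and 4-byte "small" members, 1-byte .text.
RegionLayout layoutRegionExample() {
  RegionLayout l;
  l.dot = 0x1000; // ORIGIN of the region
  l.overlay({{".big1", 8}, {".small1", 4}});
  l.overlay({{".big2", 8}, {".small2", 4}});
  l.section(".text", 1);
  return l;
}
```

Under these assumptions the model yields the same VMAs and physical addresses that the test's `REGION` check lines expect, for example `.big2` at VMA 0x1008 with LMA 0x1010 and `.text` at VMA 0x1010 with LMA 0x1020.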

@llvmbot
Member

llvmbot commented Mar 28, 2025

@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lld

Author: Daniel Thornburgh (mysterymath)

Changes

This allows the contents of OVERLAYs to be attributed to memory regions. This is the only clean way to overlap VMAs in linker scripts that choose to primarily use memory regions to lay out addresses.

This also simplifies OVERLAY expansion to better match GNU LD. Expressions for the first section's LMA and VMA are not generated if the user did not provide them. This allows the LMA/VMA offset to be preserved across multiple overlays in the same region, as with regular sections.

Fixes #129816


Full diff: https://github.com/llvm/llvm-project/pull/133540.diff

5 Files Affected:

  • (modified) lld/ELF/LinkerScript.cpp (+14-1)
  • (modified) lld/ELF/LinkerScript.h (+1)
  • (modified) lld/ELF/OutputSections.h (+1)
  • (modified) lld/ELF/ScriptParser.cpp (+13-10)
  • (modified) lld/test/ELF/linkerscript/overlay.test (+34)
diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index e19823f2ea752..8437049a209d9 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -182,7 +182,18 @@ void LinkerScript::expandMemoryRegions(uint64_t size) {
 
 void LinkerScript::expandOutputSection(uint64_t size) {
   state->outSec->size += size;
-  expandMemoryRegions(size);
+  size_t regionSize = size;
+  if (state->outSec->inOverlay) {
+    // Expand the overlay if necessary, and expand the region by the
+    // corresponding amount.
+    if (state->outSec->size > state->overlaySize) {
+      regionSize = state->outSec->size - state->overlaySize;
+      state->overlaySize = state->outSec->size;
+    } else {
+      regionSize = 0;
+    }
+  }
+  expandMemoryRegions(regionSize);
 }
 
 void LinkerScript::setDot(Expr e, const Twine &loc, bool inSec) {
@@ -1273,6 +1284,8 @@ bool LinkerScript::assignOffsets(OutputSection *sec) {
     // NOBITS TLS sections are similar. Additionally save the end address.
     state->tbssAddr = dot;
     dot = savedDot;
+  } else if (sec->lastInOverlay) {
+    state->overlaySize = 0;
   }
   return addressChanged;
 }
diff --git a/lld/ELF/LinkerScript.h b/lld/ELF/LinkerScript.h
index 0a2dda13f4ef8..65c443ba74d27 100644
--- a/lld/ELF/LinkerScript.h
+++ b/lld/ELF/LinkerScript.h
@@ -311,6 +311,7 @@ class LinkerScript final {
     MemoryRegion *lmaRegion = nullptr;
     uint64_t lmaOffset = 0;
     uint64_t tbssAddr = 0;
+    uint64_t overlaySize = 0;
   };
 
   Ctx &ctx;
diff --git a/lld/ELF/OutputSections.h b/lld/ELF/OutputSections.h
index 3ab36a21ce488..33516a62ecb43 100644
--- a/lld/ELF/OutputSections.h
+++ b/lld/ELF/OutputSections.h
@@ -102,6 +102,7 @@ class OutputSection final : public SectionBase {
   bool expressionsUseSymbols = false;
   bool usedInExpression = false;
   bool inOverlay = false;
+  bool lastInOverlay = false;
 
   // Tracks whether the section has ever had an input section added to it, even
   // if the section was later removed (e.g. because it is a synthetic section
diff --git a/lld/ELF/ScriptParser.cpp b/lld/ELF/ScriptParser.cpp
index 4c52bfda7a70e..09786aebaf337 100644
--- a/lld/ELF/ScriptParser.cpp
+++ b/lld/ELF/ScriptParser.cpp
@@ -561,37 +561,40 @@ void ScriptParser::readSearchDir() {
 // https://sourceware.org/binutils/docs/ld/Overlay-Description.html#Overlay-Description
 SmallVector<SectionCommand *, 0> ScriptParser::readOverlay() {
   Expr addrExpr;
-  if (consume(":")) {
-    addrExpr = [s = ctx.script] { return s->getDot(); };
-  } else {
+  if (!consume(":")) {
     addrExpr = readExpr();
     expect(":");
   }
-  // When AT is omitted, LMA should equal VMA. script->getDot() when evaluating
-  // lmaExpr will ensure this, even if the start address is specified.
-  Expr lmaExpr = consume("AT") ? readParenExpr()
-                               : [s = ctx.script] { return s->getDot(); };
+  Expr lmaExpr = consume("AT") ? readParenExpr() : Expr{};
   expect("{");
 
   SmallVector<SectionCommand *, 0> v;
   OutputSection *prev = nullptr;
   while (!errCount(ctx) && !consume("}")) {
     // VA is the same for all sections. The LMAs are consecutive in memory
-    // starting from the base load address specified.
+    // starting from the base load address.
     OutputDesc *osd = readOverlaySectionDescription();
     osd->osec.addrExpr = addrExpr;
     if (prev) {
       osd->osec.lmaExpr = [=] { return prev->getLMA() + prev->size; };
     } else {
       osd->osec.lmaExpr = lmaExpr;
-      // Use first section address for subsequent sections as initial addrExpr
-      // can be DOT. Ensure the first section, even if empty, is not discarded.
+      // Use first section address for subsequent sections. Ensure the first
+      // section, even if empty, is not discarded.
       osd->osec.usedInExpression = true;
       addrExpr = [=]() -> ExprValue { return {&osd->osec, false, 0, ""}; };
     }
     v.push_back(osd);
     prev = &osd->osec;
   }
+  if (!v.empty())
+    static_cast<OutputDesc *>(v.back())->osec.lastInOverlay = true;
+  if (consume(">")) {
+    StringRef regionName = readName();
+    for (SectionCommand *od : v)
+      static_cast<OutputDesc *>(od)->osec.memoryRegionName =
+          std::string(regionName);
+  }
 
   // According to the specification, at the end of the overlay, the location
   // counter should be equal to the overlay base address plus size of the
diff --git a/lld/test/ELF/linkerscript/overlay.test b/lld/test/ELF/linkerscript/overlay.test
index 7c64303b45659..8324c81a57092 100644
--- a/lld/test/ELF/linkerscript/overlay.test
+++ b/lld/test/ELF/linkerscript/overlay.test
@@ -9,6 +9,7 @@
 ## .text does not cause overlapping error and that
 ## .text's VA is 0x1000 + max(sizeof(.out.big), sizeof(.out.small)).
 
+# RUN: ld.lld a.o -T a.t -o a
 # RUN: llvm-readelf --sections -l a | FileCheck %s
 
 # CHECK:      Name       Type     Address          Off    Size
@@ -41,6 +42,23 @@
 # ERR2-NEXT:>>>     .out.aaa { *(.aaa) } > AX AT>FLASH
 # ERR2-NEXT:>>>                            ^
 
+# RUN: ld.lld a.o -T region.t -o region
+# RUN: llvm-readelf --sections -l region | FileCheck --check-prefix=REGION %s
+
+# REGION:      Name       Type     Address          Off    Size
+# REGION:      .big1      PROGBITS 0000000000001000 001000 000008
+# REGION-NEXT: .small1    PROGBITS 0000000000001000 002000 000004
+# REGION:      .big2      PROGBITS 0000000000001008 002008 000008
+# REGION-NEXT: .small2    PROGBITS 0000000000001008 003008 000004
+# REGION-NEXT: .text      PROGBITS 0000000000001010 003010 000001
+
+# REGION:      Program Headers:
+# REGION:      Type Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
+# REGION-NEXT: LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x000008 0x000008 R   0x1000
+# REGION-NEXT: LOAD 0x002000 0x0000000000001000 0x0000000000001008 0x000010 0x000010 R   0x1000
+# REGION-NEXT: LOAD 0x003008 0x0000000000001008 0x0000000000001018 0x000004 0x000004 R   0x1000
+# REGION-NEXT: LOAD 0x003010 0x0000000000001010 0x0000000000001020 0x000001 0x000001 R E 0x1000
+
 #--- a.s
 .globl _start
 _start:
@@ -76,6 +94,22 @@ SECTIONS {
   .text : { *(.text) }
 }
 
+#--- region.t
+MEMORY { region : ORIGIN = 0x1000, LENGTH = 0x1000 }
+SECTIONS {
+## Memory region instead of explicit address.
+  OVERLAY : {
+    .big1 { *(.big1) }
+    .small1 { *(.small1) }
+  } >region
+  OVERLAY : {
+    .big2 { *(.big2) }
+    .small2 { *(.small2) }
+  } >region
+  .text : { *(.text) } >region
+  /DISCARD/ : { *(.big* .small*) }
+}
+
 #--- err1.t
 SECTIONS {
   OVERLAY 0x1000 : AT ( 0x2000 ) {
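
The region accounting in the `expandOutputSection` hunk above can be read as "charge the region only for growth beyond the largest overlay member seen so far". A minimal standalone sketch of that arithmetic follows. `OverlayState`, `Sec`, `growSection`, and `accountingExample` are hypothetical names for illustration; only the branch structure is taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the linker state touched by the diff.
struct OverlayState {
  uint64_t overlaySize = 0; // largest member size seen in the current overlay
  uint64_t regionUsed = 0;  // bytes charged to the memory region so far
};

struct Sec {
  uint64_t size = 0;
  bool inOverlay = false;
};

// Grow `sec` by `growth` bytes and charge the region, following the same
// branches as the patched expandOutputSection.
void growSection(OverlayState &st, Sec &sec, uint64_t growth) {
  sec.size += growth;
  uint64_t regionGrowth = growth;
  if (sec.inOverlay) {
    if (sec.size > st.overlaySize) {
      // Only the part that extends past the current overlay size is new.
      regionGrowth = sec.size - st.overlaySize;
      st.overlaySize = sec.size;
    } else {
      // The section still fits inside space already charged to the overlay.
      regionGrowth = 0;
    }
  }
  st.regionUsed += regionGrowth;
}

// One overlay with an 8-byte and a 4-byte member, then a 1-byte regular section.
uint64_t accountingExample() {
  OverlayState st;
  Sec big, small, text;
  big.inOverlay = small.inOverlay = true;
  growSection(st, big, 4);   // region +4
  growSection(st, big, 4);   // region +4 (big is now 8 bytes)
  growSection(st, small, 4); // region +0 (fits within the 8 bytes)
  growSection(st, text, 1);  // region +1 (not in an overlay)
  return st.regionUsed;
}
```

In this sketch the overlay costs the region 8 bytes rather than 12, which is the point of tracking `overlaySize` separately from the individual section sizes.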

Member

@MaskRay MaskRay left a comment

Thanks! Code looks great. Consider renaming lastInOverlay to firstInOverlay and moving the code to expand* to place overlay code together.

diff --git i/lld/ELF/LinkerScript.cpp w/lld/ELF/LinkerScript.cpp
index 8437049a209d..58bea55e4e97 100644
--- i/lld/ELF/LinkerScript.cpp
+++ w/lld/ELF/LinkerScript.cpp
@@ -186,2 +186,4 @@ void LinkerScript::expandOutputSection(uint64_t size) {
   if (state->outSec->inOverlay) {
+    if (state->outSec->firstInOverlay)
+      state->overlaySize = 0;
     // Expand the overlay if necessary, and expand the region by the
@@ -1286,4 +1288,2 @@ bool LinkerScript::assignOffsets(OutputSection *sec) {
     dot = savedDot;
-  } else if (sec->lastInOverlay) {
-    state->overlaySize = 0;
   }
diff --git i/lld/ELF/ScriptParser.cpp w/lld/ELF/ScriptParser.cpp
index 09786aebaf33..555fceccc4a7 100644
--- i/lld/ELF/ScriptParser.cpp
+++ w/lld/ELF/ScriptParser.cpp
@@ -590,3 +590,3 @@ SmallVector<SectionCommand *, 0> ScriptParser::readOverlay() {
   if (!v.empty())
-    static_cast<OutputDesc *>(v.back())->osec.lastInOverlay = true;
+    static_cast<OutputDesc *>(v[0])->osec.firstInOverlay = true;
   if (consume(">")) {

@MaskRay
Member

MaskRay commented Mar 29, 2025

> Fixes #129816

As this is a feature request, we prefer "Closes".

@mysterymath
Contributor Author

mysterymath commented Mar 31, 2025

> Thanks! Code looks great. Consider renaming lastInOverlay to firstInOverlay and moving the code to expand* to place overlay code together.

As written, this wouldn't be correct when the first overlay section is empty, since expandOutputSection would never be called for it. I do like firstInOverlay better, though; it seems simpler to think about. I've made the rename, but kept the reset in assignOffsets.
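
The empty-first-section concern can be illustrated with a toy model. This is a sketch under stated assumptions, not lld code: `Sec`, `ResetPoint`, `regionBytes`, and `emptyFirstExample` are hypothetical names, and the model assumes, per the remark above, that the expansion step never runs for an empty section while a per-section step (standing in for assignOffsets) always does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of an overlay member; not lld's OutputSection.
struct Sec {
  uint64_t size;
  bool firstInOverlay;
};

// Where the overlay size is reset when a new overlay starts.
enum class ResetPoint { PerSection, InsideExpand };

// Returns how many bytes the region is charged for a sequence of overlay
// members, given where the reset of the running overlay size happens.
uint64_t regionBytes(const std::vector<Sec> &secs, ResetPoint where) {
  uint64_t overlaySize = 0;
  uint64_t used = 0;
  for (const Sec &s : secs) {
    // Per-section step: runs for every section, empty or not.
    if (where == ResetPoint::PerSection && s.firstInOverlay)
      overlaySize = 0;
    // Assumption: the expansion step is skipped entirely for empty sections.
    if (s.size == 0)
      continue;
    if (where == ResetPoint::InsideExpand && s.firstInOverlay)
      overlaySize = 0;
    if (s.size > overlaySize) {
      used += s.size - overlaySize;
      overlaySize = s.size;
    }
  }
  return used;
}

// Overlay 1 holds 8- and 4-byte members. Overlay 2 starts with an empty
// member followed by a 12-byte one.
std::vector<Sec> emptyFirstExample() {
  return {{8, true}, {4, false}, {0, true}, {12, false}};
}
```

With the reset in the per-section step the region is charged 8 + 12 = 20 bytes. With the reset inside the expansion step, the second overlay never resets, the stale size of 8 carries over, and the region is charged only 12 bytes.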

mysterymath and others added 3 commits March 31, 2025 10:18
@mysterymath mysterymath merged commit 2d7add6 into llvm:main Mar 31, 2025
10 of 11 checks passed
@mysterymath mysterymath deleted the lld/overlay branch April 7, 2025 22:30
Successfully merging this pull request may close these issues.

[LLD] Support placing OVERLAY in a specific MEMORY region in linker scripts
3 participants