Skip to content

[LLD][COFF] Use appropriate symbol table for -include argument on ARM64X #122554

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 13, 2025

Conversation

cjacek
Copy link
Contributor

@cjacek cjacek commented Jan 11, 2025

Move LinkerDriver::addUndefined to SymbolTable to allow its use with both symbol tables on ARM64X and rename it to addGCRoot to clarify its distinct role compared to the existing SymbolTable::addUndefined.

Command-line -include arguments now apply to the EC symbol table, with mainSymtab introduced in linkerMain. There will be more similar cases. For .drectve sections, the corresponding symbol table is used based on the context.

@llvmbot
Copy link
Member

llvmbot commented Jan 11, 2025

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-lld-coff

Author: Jacek Caban (cjacek)

Changes

Move LinkerDriver::addUndefined to SymbolTable to allow its use with both symbol tables on ARM64X and rename it to addGCRoot to clarify its distinct role compared to the existing SymbolTable::addUndefined.

Command-line -include arguments now apply to the EC symbol table, with mainSymtab introduced in linkerMain. There will be more similar cases. For .drectve sections, the corresponding symbol table is used based on the context.


Full diff: https://github.com/llvm/llvm-project/pull/122554.diff

5 Files Affected:

  • (modified) lld/COFF/Driver.cpp (+19-45)
  • (modified) lld/COFF/Driver.h (-2)
  • (modified) lld/COFF/SymbolTable.cpp (+29)
  • (modified) lld/COFF/SymbolTable.h (+2)
  • (added) lld/test/COFF/arm64x-incl.s (+66)
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index 791382fd9bdd4a..0d89457046a500 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -479,7 +479,7 @@ void LinkerDriver::parseDirectives(InputFile *file) {
 
   // Handle /include: in bulk.
   for (StringRef inc : directives.includes)
-    addUndefined(inc);
+    file->symtab.addGCRoot(inc);
 
   // Handle /exclude-symbols: in bulk.
   for (StringRef e : directives.excludes) {
@@ -505,13 +505,13 @@ void LinkerDriver::parseDirectives(InputFile *file) {
     case OPT_entry:
       if (!arg->getValue()[0])
         Fatal(ctx) << "missing entry point symbol name";
-      ctx.config.entry = addUndefined(mangle(arg->getValue()), true);
+      ctx.config.entry = file->symtab.addGCRoot(mangle(arg->getValue()), true);
       break;
     case OPT_failifmismatch:
       checkFailIfMismatch(arg->getValue(), file);
       break;
     case OPT_incl:
-      addUndefined(arg->getValue());
+      file->symtab.addGCRoot(arg->getValue());
       break;
     case OPT_manifestdependency:
       ctx.config.manifestDependencies.insert(arg->getValue());
@@ -805,35 +805,6 @@ void LinkerDriver::addLibSearchPaths() {
   }
 }
 
-Symbol *LinkerDriver::addUndefined(StringRef name, bool aliasEC) {
-  Symbol *b = ctx.symtab.addUndefined(name);
-  if (!b->isGCRoot) {
-    b->isGCRoot = true;
-    ctx.config.gcroot.push_back(b);
-  }
-
-  // On ARM64EC, a symbol may be defined in either its mangled or demangled form
-  // (or both). Define an anti-dependency symbol that binds both forms, similar
-  // to how compiler-generated code references external functions.
-  if (aliasEC && isArm64EC(ctx.config.machine)) {
-    if (std::optional<std::string> mangledName =
-            getArm64ECMangledFunctionName(name)) {
-      auto u = dyn_cast<Undefined>(b);
-      if (u && !u->weakAlias) {
-        Symbol *t = ctx.symtab.addUndefined(saver().save(*mangledName));
-        u->setWeakAlias(t, true);
-      }
-    } else if (std::optional<std::string> demangledName =
-                   getArm64ECDemangledFunctionName(name)) {
-      Symbol *us = ctx.symtab.addUndefined(saver().save(*demangledName));
-      auto u = dyn_cast<Undefined>(us);
-      if (u && !u->weakAlias)
-        u->setWeakAlias(b, true);
-    }
-  }
-  return b;
-}
-
 void LinkerDriver::addUndefinedGlob(StringRef arg) {
   Expected<GlobPattern> pat = GlobPattern::create(arg);
   if (!pat) {
@@ -849,7 +820,7 @@ void LinkerDriver::addUndefinedGlob(StringRef arg) {
   });
 
   for (Symbol *sym : syms)
-    addUndefined(sym->getName());
+    ctx.symtab.addGCRoot(sym->getName());
 }
 
 StringRef LinkerDriver::mangleMaybe(Symbol *s) {
@@ -1487,7 +1458,7 @@ void LinkerDriver::maybeCreateECExportThunk(StringRef name, Symbol *&sym) {
     expName = saver().save("EXP+" + *mangledName);
   else
     expName = saver().save("EXP+" + name);
-  sym = addUndefined(expName);
+  sym = ctx.symtabEC->addGCRoot(expName);
   if (auto undef = dyn_cast<Undefined>(sym)) {
     if (!undef->getWeakAlias()) {
       auto thunk = make<ECExportThunkChunk>(def);
@@ -1537,7 +1508,8 @@ void LinkerDriver::createECExportThunks() {
 
 void LinkerDriver::pullArm64ECIcallHelper() {
   if (!ctx.config.arm64ECIcallHelper)
-    ctx.config.arm64ECIcallHelper = addUndefined("__icall_helper_arm64ec");
+    ctx.config.arm64ECIcallHelper =
+        ctx.symtabEC->addGCRoot("__icall_helper_arm64ec");
 }
 
 // In MinGW, if no symbols are chosen to be exported, then all symbols are
@@ -1976,6 +1948,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
       setMachine(machine);
     }
   }
+  SymbolTable &mainSymtab = ctx.hybridSymtab ? *ctx.hybridSymtab : ctx.symtab;
 
   // Handle /nodefaultlib:<filename>
   {
@@ -2062,7 +2035,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
   // Handle /include
   for (auto *arg : args.filtered(OPT_incl))
-    addUndefined(arg->getValue());
+    mainSymtab.addGCRoot(arg->getValue());
 
   // Handle /implib
   if (auto *arg = args.getLastArg(OPT_implib))
@@ -2493,22 +2466,22 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
     if (auto *arg = args.getLastArg(OPT_entry)) {
       if (!arg->getValue()[0])
         Fatal(ctx) << "missing entry point symbol name";
-      config->entry = addUndefined(mangle(arg->getValue()), true);
+      config->entry = ctx.symtab.addGCRoot(mangle(arg->getValue()), true);
     } else if (!config->entry && !config->noEntry) {
       if (args.hasArg(OPT_dll)) {
         StringRef s = (config->machine == I386) ? "__DllMainCRTStartup@12"
                                                 : "_DllMainCRTStartup";
-        config->entry = addUndefined(s, true);
+        config->entry = ctx.symtab.addGCRoot(s, true);
       } else if (config->driverWdm) {
         // /driver:wdm implies /entry:_NtProcessStartup
-        config->entry = addUndefined(mangle("_NtProcessStartup"), true);
+        config->entry = ctx.symtab.addGCRoot(mangle("_NtProcessStartup"), true);
       } else {
         // Windows specific -- If entry point name is not given, we need to
         // infer that from user-defined entry name.
         StringRef s = findDefaultEntry();
         if (s.empty())
           Fatal(ctx) << "entry point must be defined";
-        config->entry = addUndefined(s, true);
+        config->entry = ctx.symtab.addGCRoot(s, true);
         Log(ctx) << "Entry name inferred: " << s;
       }
     }
@@ -2520,9 +2493,10 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
     for (auto *arg : args.filtered(OPT_delayload)) {
       config->delayLoads.insert(StringRef(arg->getValue()).lower());
       if (config->machine == I386) {
-        config->delayLoadHelper = addUndefined("___delayLoadHelper2@8");
+        config->delayLoadHelper = ctx.symtab.addGCRoot("___delayLoadHelper2@8");
       } else {
-        config->delayLoadHelper = addUndefined("__delayLoadHelper2", true);
+        config->delayLoadHelper =
+            ctx.symtab.addGCRoot("__delayLoadHelper2", true);
       }
     }
   }
@@ -2659,7 +2633,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
       for (Export &e : config->exports) {
         if (!e.forwardTo.empty())
           continue;
-        e.sym = addUndefined(e.name, !e.data);
+        e.sym = ctx.symtab.addGCRoot(e.name, !e.data);
         if (e.source != ExportSource::Directives)
           e.symbolName = mangleMaybe(e.sym);
       }
@@ -2701,13 +2675,13 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
       // Windows specific -- if __load_config_used can be resolved, resolve it.
       if (ctx.symtab.findUnderscore("_load_config_used"))
-        addUndefined(mangle("_load_config_used"));
+        ctx.symtab.addGCRoot(mangle("_load_config_used"));
 
       if (args.hasArg(OPT_include_optional)) {
         // Handle /includeoptional
         for (auto *arg : args.filtered(OPT_include_optional))
           if (isa_and_nonnull<LazyArchive>(ctx.symtab.find(arg->getValue())))
-            addUndefined(arg->getValue());
+            ctx.symtab.addGCRoot(arg->getValue());
       }
     } while (run());
   }
diff --git a/lld/COFF/Driver.h b/lld/COFF/Driver.h
index 51325689042981..9d4f1cbfcb5841 100644
--- a/lld/COFF/Driver.h
+++ b/lld/COFF/Driver.h
@@ -173,8 +173,6 @@ class LinkerDriver {
 
   std::set<std::string> visitedLibs;
 
-  Symbol *addUndefined(StringRef sym, bool aliasEC = false);
-
   void addUndefinedGlob(StringRef arg);
 
   StringRef mangleMaybe(Symbol *s);
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index ae88675ab93a1f..b2f3ffe780e5dc 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -651,6 +651,35 @@ Symbol *SymbolTable::addUndefined(StringRef name, InputFile *f,
   return s;
 }
 
+Symbol *SymbolTable::addGCRoot(StringRef name, bool aliasEC) {
+  Symbol *b = addUndefined(name);
+  if (!b->isGCRoot) {
+    b->isGCRoot = true;
+    ctx.config.gcroot.push_back(b);
+  }
+
+  // On ARM64EC, a symbol may be defined in either its mangled or demangled form
+  // (or both). Define an anti-dependency symbol that binds both forms, similar
+  // to how compiler-generated code references external functions.
+  if (aliasEC && isEC()) {
+    if (std::optional<std::string> mangledName =
+            getArm64ECMangledFunctionName(name)) {
+      auto u = dyn_cast<Undefined>(b);
+      if (u && !u->weakAlias) {
+        Symbol *t = addUndefined(saver().save(*mangledName));
+        u->setWeakAlias(t, true);
+      }
+    } else if (std::optional<std::string> demangledName =
+                   getArm64ECDemangledFunctionName(name)) {
+      Symbol *us = addUndefined(saver().save(*demangledName));
+      auto u = dyn_cast<Undefined>(us);
+      if (u && !u->weakAlias)
+        u->setWeakAlias(b, true);
+    }
+  }
+  return b;
+}
+
 // On ARM64EC, a function symbol may appear in both mangled and demangled forms:
 // - ARM64EC archives contain only the mangled name, while the demangled symbol
 //   is defined by the object file as an alias.
diff --git a/lld/COFF/SymbolTable.h b/lld/COFF/SymbolTable.h
index 5443815172dfd9..436eff1ab4caf7 100644
--- a/lld/COFF/SymbolTable.h
+++ b/lld/COFF/SymbolTable.h
@@ -85,6 +85,8 @@ class SymbolTable {
   // added and before the writer writes results to a file.
   void compileBitcodeFiles();
 
+  Symbol *addGCRoot(StringRef sym, bool aliasEC = false);
+
   // Creates an Undefined symbol for a given name.
   Symbol *addUndefined(StringRef name);
 
diff --git a/lld/test/COFF/arm64x-incl.s b/lld/test/COFF/arm64x-incl.s
new file mode 100644
index 00000000000000..15ddc3b78fb458
--- /dev/null
+++ b/lld/test/COFF/arm64x-incl.s
@@ -0,0 +1,66 @@
+// REQUIRES: aarch64
+// RUN: split-file %s %t.dir && cd %t.dir
+
+// RUN: llvm-mc -filetype=obj -triple=arm64ec-windows sym-arm64ec.s -o sym-arm64ec.obj
+// RUN: llvm-mc -filetype=obj -triple=aarch64-windows sym-aarch64.s -o sym-aarch64.obj
+// RUN: llvm-mc -filetype=obj -triple=arm64ec-windows drectve.s -o drectve-arm64ec.obj
+// RUN: llvm-mc -filetype=obj -triple=aarch64-windows drectve.s -o drectve-aarch64.obj
+// RUN: llvm-mc -filetype=obj -triple=arm64ec-windows %S/Inputs/loadconfig-arm64ec.s -o loadconfig-arm64ec.obj
+// RUN: llvm-mc -filetype=obj -triple=aarch64-windows %S/Inputs/loadconfig-arm64.s -o loadconfig-arm64.obj
+// RUN: llvm-lib -machine:arm64x -out:sym.lib sym-arm64ec.obj sym-aarch64.obj
+
+// Check that the command-line -include argument ensures the EC symbol is included.
+
+// RUN: lld-link -machine:arm64x -out:out-arg.dll -dll -noentry loadconfig-arm64.obj loadconfig-arm64ec.obj sym.lib -include:sym
+// RUN: llvm-readobj --hex-dump=.test out-arg.dll | FileCheck --check-prefix=EC %s
+// EC: 0x180004000 02000000                            ....
+
+// Check that the native .rectve -include argument ensures the native symbol is included.
+
+// RUN: lld-link -machine:arm64x -out:out-native.dll -dll -noentry loadconfig-arm64.obj loadconfig-arm64ec.obj sym.lib drectve-aarch64.obj
+// RUN: llvm-readobj --hex-dump=.test out-native.dll | FileCheck --check-prefix=NATIVE %s
+// NATIVE: 0x180004000 01000000                            ....
+
+// Check that the EC .rectve -include argument ensures the EC symbol is included.
+
+// RUN: lld-link -machine:arm64x -out:out-ec.dll -dll -noentry loadconfig-arm64.obj loadconfig-arm64ec.obj sym.lib drectve-arm64ec.obj
+// RUN: llvm-readobj --hex-dump=.test out-ec.dll | FileCheck --check-prefix=EC %s
+
+// Check that both native and EC .rectve -include arguments ensure both symbols are included.
+
+// RUN: lld-link -machine:arm64x -out:out-arg-native.dll -dll -noentry loadconfig-arm64.obj loadconfig-arm64ec.obj sym.lib \
+// RUN:          -include:sym drectve-aarch64.obj
+// RUN: llvm-readobj --hex-dump=.test out-arg-native.dll | FileCheck --check-prefix=BOTH %s
+// BOTH: 0x180004000 02000000 01000000                   ........
+
+// RUN: lld-link -machine:arm64x -out:out-both.dll -dll -noentry loadconfig-arm64.obj loadconfig-arm64ec.obj sym.lib \
+// RUN:          drectve-arm64ec.obj drectve-aarch64.obj
+// RUN: llvm-readobj --hex-dump=.test out-both.dll | FileCheck --check-prefix=BOTH %s
+
+// Check that including a missing symbol results in an error.
+
+// RUN: not lld-link -machine:arm64x -out:err.dll -dll -noentry loadconfig-arm64.obj loadconfig-arm64ec.obj -include:sym sym-aarch64.obj \
+// RUN:              2>&1 | FileCheck --check-prefix=ERR %s
+// ERR: lld-link: error: <root>: undefined symbol: sym
+
+// RUN: not lld-link -machine:arm64x -out:err.dll -dll -noentry loadconfig-arm64.obj loadconfig-arm64ec.obj drectve-arm64ec.obj sym-aarch64.obj \
+// RUN:              2>&1 | FileCheck --check-prefix=ERR %s
+
+// RUN: not lld-link -machine:arm64x -out:err.dll -dll -noentry loadconfig-arm64.obj loadconfig-arm64ec.obj drectve-aarch64.obj sym-arm64ec.obj \
+// RUN:              2>&1 | FileCheck --check-prefix=ERR %s
+
+#--- sym-aarch64.s
+        .section ".test","dr"
+        .globl sym
+sym:
+        .word 1
+
+#--- sym-arm64ec.s
+        .section ".test","dr"
+        .globl sym
+sym:
+        .word 2
+
+#--- drectve.s
+        .section .drectve, "yn"
+        .ascii " -include:sym"

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall

// RUN: llvm-readobj --hex-dump=.test out-arg.dll | FileCheck --check-prefix=EC %s
// EC: 0x180004000 02000000 ....

// Check that the native .rectve -include argument ensures the native symbol is included.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo .rectve

// RUN: llvm-readobj --hex-dump=.test out-native.dll | FileCheck --check-prefix=NATIVE %s
// NATIVE: 0x180004000 01000000 ....

// Check that the EC .rectve -include argument ensures the EC symbol is included.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo repeated

// RUN: lld-link -machine:arm64x -out:out-ec.dll -dll -noentry loadconfig-arm64.obj loadconfig-arm64ec.obj sym.lib drectve-arm64ec.obj
// RUN: llvm-readobj --hex-dump=.test out-ec.dll | FileCheck --check-prefix=EC %s

// Check that both native and EC .rectve -include arguments ensure both symbols are included.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again

Move LinkerDriver::addUndefined to SymbolTable to allow its use with both symbol tables
on ARM64X and rename it to addGCRoot to clarify its distinct role compared to the existing
SymbolTable::addUndefined.

Command-line -include arguments now apply to the EC symbol table, with mainSymtab introduced
in linkerMain. There will be more similar cases. For .drectve sections, the corresponding
symbol table is used based on the context.
@cjacek cjacek merged commit 251ef3f into llvm:main Jan 13, 2025
5 of 6 checks passed
@cjacek cjacek deleted the arm64x-include branch January 13, 2025 22:17
@cjacek
Copy link
Contributor Author

cjacek commented Jan 13, 2025

Merged with typos fixed, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants