When you run gcc foo.c bar.c, the linker runs silently as the last step. If you've ever seen undefined reference to 'printf', that message came from the linker. At runtime a second linker, ld.so, runs before your main(), and DLL hell is a real thing. This guide covers both, plus the static vs dynamic trade-off.
The Problem Linkers Solve
foo.c:
#include <stdio.h>
extern int g_counter;            /* defined in bar.c */
void inc(void) { g_counter++; printf("inc\n"); }
bar.c:
void inc(void);                  /* defined in foo.c */
int g_counter = 0;
int main(void) { inc(); return g_counter; }
Each .c → .o (compile):
foo.o: defines inc(); g_counter and printf are "undefined"
bar.o: defines main() and g_counter; inc is "undefined"
Linker's job:
- Match symbols across foo.o + bar.o + libc.so
- Resolve undefined references
- Produce the final executable or library
ELF File Layout (Linux)
$ readelf -S foo.o
ELF sections:
.text ← executable code
.data ← initialized globals
.bss ← zero-init globals (metadata only on disk)
.rodata ← read-only data (string literals)
.symtab ← symbol table (defined/referenced)
.strtab ← symbol name strings
.rela.text ← relocation info (slots the linker must fill; named .rel.text on targets such as 32-bit x86)
$ nm foo.o
0000000000000000 T inc ← T = defined in .text
U g_counter ← U = undefined
U printf ← U = undefined
Static Linking
gcc -static foo.o bar.o -o program
→ Extract used symbols from libc.a (static lib) → include in program
→ Result is self-contained (no .so required to run)
→ Binary is large (Go binaries are large for the same reason: they are statically linked by default)
Pros:
- Easy deploy (single file)
- Immune to environment differences (different libc, no problem)
- Security: no DLL hijacking risk
Cons:
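- Size (every binary carries its own copy of the libc code it uses; the full libc is on the order of 2 MB)
- Relink and redeploy every binary on a libc update (security patches!)
- Memory: libc code is duplicated in every binary, so unrelated programs can't share one copy in RAM
Dynamic Linking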
gcc foo.o bar.o -o program (default — dynamic)
→ printf etc. are recorded as "in libc.so" (no actual code copied)
→ Binary is small
→ At runtime ld.so loads libc.so and resolves symbols
$ ldd program
linux-vdso.so.1 (0x...)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x...)
/lib64/ld-linux-x86-64.so.2 (0x...)
ld.so: The Program That Runs Before main()
Kernel behavior on exec:
1. execve("./program", ...)
2. Parse the ELF header and its PT_INTERP entry → "interpreter = /lib64/ld-linux-x86-64.so.2"
3. Map the program and ld.so into memory, then jump to ld.so ← that's the dynamic linker
4. ld.so:
- Loads dependent .so files (libc, libstdc++, ...)
- Recursively loads their transitive dependencies
- Performs relocations (fills in symbol addresses)
5. Finally jumps to the program's _start, which leads to main()
GOT / PLT: Runtime Address Resolution
printf's real address is only known at runtime: libc.so is mapped when the program loads, and ASLR randomizes where.
It is unknown at link time. How do we call it?
GOT (Global Offset Table) — runtime address storage
PLT (Procedure Linkage Table) — trigger resolution on first call
call printf:
→ jump to PLT entry for printf
→ first time, calls ld.so → finds printf in libc.so → stores in GOT
→ subsequent calls jump directly via GOT (resolved once, "lazy binding")
$ readelf -r program | grep printf
... R_X86_64_JUMP_SLOT printf ← runtime-filled slot
What "undefined reference to foo" Really Means
The linker searched all .o files + specified libraries and couldn't
find a definition for foo.
Common causes:
1. Missed a .c file (not compiled in)
2. Forgot -lfoo (library not linked)
3. C++ function declared but not defined
4. C++ name mangling: a C function called from C++ without an extern "C" declaration
5. Library link order: the linker scans inputs left to right, so a library must come after the objects that use it
Fix:
gcc main.c -lfoo ← right order if main uses foo
gcc -lfoo main.c ← wrong (foo seen before main's references)
DLL Hell: Windows / Generic Dynamic-Linking Trouble
Program A needs libfoo.so 1.0
Program B needs libfoo.so 2.0 (API changed)
If only one version exists on the system, one program breaks.
Strategies:
1. SONAME: libfoo.so.1, libfoo.so.2 (coexist by major version)
2. RPATH: bake the search path into the binary, e.g. -Wl,-rpath,'$ORIGIN/lib' to look in lib/ next to the executable (a bare ./lib would be resolved against the current working directory)
3. Static linking: no runtime deps
4. Containers: isolate each app's deps (Docker)
5. snap / flatpak / AppImage: bundle-based distribution
Symbol Visibility: Namespace Clashes
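Two libraries both define hash(). What does the linker do? Between object files it is a multiple-definition error; between shared libraries the first definition in load order silently wins, so one library may end up calling the other's hash().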
Option 1 — restrict visibility:
__attribute__((visibility("hidden"))) int hash() { ... }
→ not exported from the .so, internal only
Option 2 — namespace (C++):
namespace mylib { int hash() { ... } }
→ mangled name like _ZN5mylib4hashEv — unique
Option 3 — extern "C" + prefix:
extern "C" int mylib_hash() { ... }
Common Pitfalls
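- Forgot -rdynamic: without it, symbols defined in the executable itself are not exported, so dlsym (and plugins calling back into the host) can't find them.
- strip and lookup failure: strip removes .symtab but leaves .dynsym, so dlsym only fails for symbols that were never exported in the first place; what you do lose is names in debuggers and nm.
- Library version mismatch: built against libfoo.so.1 but only libfoo.so.2 deployed → the loader fails at startup with "cannot open shared object file".
- LD_PRELOAD tricks: load a library ahead of all others in every dynamically linked program you start. Used for malloc tracing and intercepting network calls. A double-edged sword for security.
- Mixing static and dynamic linking of the same library: two copies of the same symbol (and its state) in one process is an ODR violation in C++, which is undefined behavior.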
Wrap-up
The linker is the source of most "after compile" problems. Hitting undefined reference or multiple definition (duplicate symbol on macOS) a few times teaches you fast.
Practical advice: small projects benefit from static linking (Go's default). Large systems benefit from dynamic linking for memory and deployment. Containers ease the trade-off significantly: either way, everything's isolated.