Java, Go, JavaScript, Python — all reclaim memory via GC. But if you treat GC as "magic," you're powerless when GC pauses cause production latency spikes. This guide covers the actual mechanics of mark-and-sweep, why the generational hypothesis works, the meaning of the "99% die young" statistic, and the trade-offs between modern GCs (G1, ZGC, Shenandoah).
The Problem GC Solves — Reachability
function foo() {
let a = {x: 1}; // alloc
let b = {y: 2}; // alloc
a.next = b; // a → b reference
return a; // only a returned
}
const root = foo();
// root → a → b (all reachable)
root.next = null; // break a's b reference
// now b is unreachable → GC will reclaim itThe core question of GC: which objects are reachable? Reachable = can be traversed from a root (stack variables, globals, registers) through reference chains. The rest is garbage.
Algorithm 1 — Reference Counting (Python, Swift, ObjC ARC)
# Python (CPython)
a = [1, 2, 3] # refcount = 1
b = a # refcount = 2
b = None # refcount = 1
a = None # refcount = 0 → freed immediately- Pros: instant reclaim (deterministic). No pauses.
- Con 1: increment/decrement on every assignment.
- Con 2: can't catch cyclic references. If a → b → a, both have refcount ≥ 1 forever.
CPython uses refcount + a separate cycle detector. Swift makes cycle avoidance the programmer's responsibility via weak references. Simplicity is the appeal.
Algorithm 2 — Mark-and-Sweep
Phase 1 — Mark (reachability tracing):
worklist = [root1, root2, ...]
while worklist not empty:
obj = worklist.pop()
if not obj.marked:
obj.marked = true
for child in obj.references:
worklist.push(child)
Phase 2 — Sweep:
for obj in heap:
if not obj.marked:
free(obj)
else:
obj.marked = false // reset for next cycle- Handles cycles correctly.
- No increment/decrement overhead.
- But walks the whole heap — pauses get long with large heaps.
The Generational Hypothesis — 99% Die Young
The empirical distribution in real programs:
Object lifetime distribution (typical server app):
Lifetime < 1 GC cycle : ~95–99% ← per-request temporaries
Lifetime ~few cycles : ~1–4%
Lifetime > 100 cycles : ~0.1% (caches, global state)Most garbage is young. So instead of walking the whole heap every time, walk young often, old rarely.
Generational GC Layout
heap:
┌──────────────┬──────────────┬──────────────────┐
│ Young (Eden) │ Survivor 0/1 │ Old Gen │
└──────────────┴──────────────┴──────────────────┘
↑ ↑ ↑
new alloc after minor GC after N survives
lands here survivors here promote to old
Minor GC (frequent, fast):
- walk Young only
- copy survivors → Survivor
- bulk-free the rest of Eden (cost ∝ survivors, dead cost 0!)
Major GC (rare, slow):
- walk whole heap (incl. Old)Key trick — copying collector. Copy only survivors to another area, bulk-free the source. If few objects survive (95% die), this is extremely cheap.
Cross-generation Reference Problem
If an old-gen object references a young-gen one?
→ minor GC walking only young wouldn't know that young object is reachable
Solution: write barrier
- mark a "card" when old writes a young reference
- minor GC additionally scans cards as extra rootsWrite barrier = a tiny extra instruction on every reference store. Costs something, but generational's throughput win is much larger.
Stop-the-World — All Threads Paused
Normal: [T1 running] [T2 running] [T3 running]
GC start: [T1 pause ] [T2 pause ] [T3 pause ]
GC: (single-threaded mark + sweep)
GC end: [T1 resume] [T2 resume] [T3 resume]
Impact: can't serve requests during pause → latency spikeOld Java GCs (Serial, Parallel) STW the entire collection. A Major GC on a 100 GB heap can pause for seconds — enemy of p99 latency and SLAs.
Modern GC — Shrinking the Pause
Concurrent — Mark Phase Runs Alongside the App
STW only for taking the mark snapshot and sweep. Most of the time the app runs concurrently with the collector.
Incremental — Break GC into Small Pieces
Split GC into small chunks and interleave with app execution. No single long pause.
Region-Based — Many Small Regions
G1, ZGC. Divide the heap into hundreds of regions and reclaim those with the most garbage first. Major GC can be partial.
Java GC Choices (Common)
| GC | Pause | Throughput | Best for |
|---|---|---|---|
| Serial | Long | Medium | <100MB heap, single core |
| Parallel | Long | Best | Batch jobs, throughput first |
| G1 (default in 9+) | Med (10–200 ms) | Medium | General server, 4–100 GB |
| ZGC | <1ms most of the time | Slight hit | Large heaps (100GB+), latency SLAs |
| Shenandoah | <10ms | Slight hit | Latency SLAs, OpenJDK |
Go's GC — Concurrent Tri-Color Mark-Sweep
Go's GC is not generational (escape analysis keeps things on stack when possible). Concurrent mark + write barriers + short STW (target 100 μs – few ms).
Tri-color invariant:
- white : not yet marked (garbage candidate)
- gray : marked but children not yet visited
- black : marked + all children visited
During mark phase:
- when gray worklist empties, mark is done
- if app creates a black → white reference, invariant breaks
→ write barrier turns the white object gray (preserves mark)V8 (JavaScript) GC
- Generational (young = "new space" + survivors; old = "old space")
- Young: Cheney's copying algorithm (Scavenge)
- Old: Concurrent mark + sweep + compact (Major GC, incremental)
The Practical Truth About GC Tuning
Most GC tuning is really allocation pattern tuning, not GC option tweaking.
// Bad — thousands of new objects per request
public Response handle(Request req) {
List<Item> items = new ArrayList<>();
for (int i = 0; i < 10000; i++) {
items.add(new Item(i)); // 10000 allocs
}
return build(items);
} // all immediately garbage
// Good — reuse via pool
private final ItemPool pool = new ItemPool(10000);
public Response handle(Request req) {
pool.reset();
for (int i = 0; i < 10000; i++) {
pool.acquire().set(i); // 0 allocs (reused)
}
return build(pool);
}Allocation Rate Drives GC Pause Frequency
Higher allocation rate → young gen fills faster → more minor GCs → higher cumulative GC pause.
Sudden Long-Lived Object Bursts Cause Promotion Storms
Survivor fills, objects get promoted to Old quickly → Old fills → Major GC → long pause.
Common Pitfalls
- "GC handles it" illusion — GC is a large contributor to production latency spikes. You won't know without profiling.
- Cranking heap size up indiscriminately — larger heap = longer GC pauses (pre-G1). ZGC is fine with it.
- Excessive immutable-object churn (String etc.) — every modification allocates. Use StringBuilder / pools.
- Finalizers / WeakReference overuse — extra GC work. Prefer explicit close.
Wrap-up
GC is not magic. It's mark-and-sweep + generational + concurrent, composed into modern collectors. Understanding the machinery lets you answer questions like "why does our p99 spike occasionally?"
Your allocation pattern drives GC behavior — read together with how-memory-allocation-works.