Every program uses memory. Yet most developers write let x = 42 without knowing where it lives, or call new Object() without knowing what it costs. This guide covers the actual mechanics of stack vs heap, how malloc/free talks to the OS, why fragmentation kills long-running servers, and the RAII (C++/Rust) vs GC (Java/Go/JS) trade-off.
Stack vs Heap — Two Memory Regions
Process virtual address space (Linux x86-64):
0xFFFFFFFFFFFFFFFF
┌─────────────────┐
│ kernel space │ (kernel-only)
├─────────────────┤
│ stack (↓) │ ← rsp (current function frame)
│ │
│ ... │ ← free space
│ │
│ heap (↑) │ ← brk pointer
├─────────────────┤
│ bss + data │ (globals)
├─────────────────┤
│ text (code) │
└─────────────────┘
0x0000000000000000Stack grows downward, heap grows upward. When they meet, stack overflow or out-of-memory.
Stack — Fast and Auto-Cleaned
void foo() {
int x = 42; // stack
char buf[1024]; // stack
Point p = {1, 2}; // stack
} // function returns — rsp restored, all variables gone (automatic)- Zero allocation cost — just subtract from rsp register. One cycle.
- Auto-cleanup — return restores rsp. No free() needed.
- Size limit — Linux default 8 MB. Large arrays overflow.
- Scope-bounded — once a function returns, that frame is gone (returning a reference to it = UB).
Heap — Dynamic Size and Lifetime
// C
void* ptr = malloc(1024); // 1024 bytes on heap
free(ptr); // explicit free
// C++
auto* p = new Point(1, 2); // heap
delete p; // explicit free
// Rust — RAII
let s = String::from("hi"); // heap (String's inner buffer)
// at scope end, Drop trait called → freed
// Go / Java / JS — GC
let obj = {x: 1}; // heap
// runtime tracks reachability → GC eventually freesThe heap is regions received from the OS in page (4 KB) units, then carved up by a user-space library. That library is the malloc implementation — ptmalloc (glibc), tcmalloc (Google), jemalloc (Facebook).
Inside malloc — Free Lists
When malloc(1024) is called:
1. Search the allocator's internal "free list" — any free chunk ≥ 1024 B?
2. If yes → split that chunk and return it (tens of ns)
3. If no → request new pages from OS via sbrk() or mmap() → add to free list → return
When free(ptr) is called:
1. Read chunk metadata (size etc.) at ptr
2. Coalesce with adjacent free chunks (prevents fragmentation)
3. Add to free list
→ malloc/free are NOT syscalls in the common case. Just user-space bookkeeping.Chunk metadata is typically 16 bytes before the chunk. A malloc(1024) request actually uses 1040 bytes — overhead.
Fragmentation — Not a Leak, but Out-of-Memory
Initial heap (free):
[████████████████████████████████] 32 MB free
malloc pattern:
- Allocate 1 KB, allocate 1 KB, allocate 1 KB, ...
- Free only even-numbered allocations
Result:
[██░░██░░██░░██░░██░░██░░██░░██░░] 16 MB free (50%)
↑ ↑ ↑ ↑ ↑ ↑
1KB holes
Now malloc(2 KB):
→ 16 MB free total, but no contiguous 2 KB → fails or mmap new pageExternal fragmentation — total free is enough but no contiguous space. The main cause of "memory keeps growing" in long-running servers (not a leak — fragmentation).
Mitigations: jemalloc (size-class separation), per-arena allocation (thread-local), periodic restart.
The Costs You Should Know
| Operation | Approximate cost |
|---|---|
| Stack variable | ~ 1 ns (rsp sub) |
| malloc (hot, free-list hit) | ~ 20–50 ns |
| malloc (new page, mmap) | ~ 1–10 μs (syscall) |
| free | ~ 20–50 ns |
| GC minor (young gen) | ~ 1–10 ms (pause) |
| GC major (full collection) | ~ 100 ms – several seconds (pause) |
RAII vs GC — Two Approaches to Freeing
RAII (C++, Rust)
{
let s = String::from("hello"); // heap allocation
let v = vec![1, 2, 3]; // heap allocation
} // scope ends
// Drop trait auto-called → compile-time-emitted free- Free time = scope end. Deterministic and predictable.
- No GC pause.
- Cost = just a free call (tens of ns).
- But the mental cost of ownership/borrow checker (in Rust).
GC (Java, Go, JS, Python)
let obj = {x: 1};
// runtime tracks reachability → declares unreachable at some point → frees- Free time = decided by GC. Non-deterministic.
- Periodic pauses (mark-and-sweep, generational).
- You can't forget to free — safety.
- Throughput vs latency trade-off (depends on GC algorithm).
See how-garbage-collection-works for the GC mechanics.
Practical Patterns — Memory-Aware Code
1. Prefer the Stack
// Slow (heap)
let v = Vec::with_capacity(8); // malloc
// Fast (stack — smallvec or array)
let arr: [i32; 8] = [0; 8];2. Pre-allocate
// Java — each add() grows ArrayList → repeated malloc + copy
List<Integer> list = new ArrayList<>();
for (int i = 0; i < 1_000_000; i++) list.add(i);
// Better — give capacity upfront
List<Integer> list = new ArrayList<>(1_000_000);3. Object Pool
If the hot path repeatedly allocates and frees the same shape of object, reuse via a pool. Go's sync.Pool, Netty's PooledByteBuf.
4. Arena Allocator
Allocate per-request → free the entire arena when the request ends. Zero individual free() calls. Rust's bumpalo, Nginx's ngx_pool_t.
Common Pitfalls
- Use-after-free — using a freed pointer. C/C++ UB. Detect with AddressSanitizer.
- Double free — freeing the same pointer twice. Crash or silent corruption.
- Memory leak — never calling free. Builds up in long-running servers.
- Stack overflow — large arrays on stack, or deep recursion.
- Fragmentation — see above. Looks like a leak but isn't.
Wrap-up
Memory allocation costs are not zero. A 50 ns malloc in a hot loop adds up. Stack-first / pre-allocate / arena / pool cover most hot paths.
Using a GC language doesn't free you from memory awareness — your allocation pattern determines how often and how long the GC pauses. See how-garbage-collection-works for that mechanism.