How Distributed Locks Actually Work

Inside one process, a single mutex settles who goes first. Spread the same work across many servers and it suddenly gets hard. A distributed lock is the mechanism that guarantees "only one at a time" across machines. Redis Redlock, ZooKeeper, and etcd are the common tools, but used naively they all break on clock skew, GC pauses, and network partitions. This guide covers why you need distributed locks, what breaks them, and how to prevent it.

Why You Need Them — The Race Condition

Two servers run the same operation at once:
  server A: read balance 100 → -30 → write 70
  server B: read balance 100 → -50 → write 50   (at nearly the same instant)
  Result: last write wins → one withdrawal vanishes (lost update)

One machine? An in-process mutex settles it.
Many machines? No shared memory → you need an external coordinator.
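
The interleaving above can be reproduced deterministically. This is a minimal sketch, assuming a hypothetical in-memory `store` dict in place of the shared database; the steps are ordered by hand rather than with real threads so the outcome is repeatable.

```python
# Hypothetical shared store standing in for a database row.
store = {"balance": 100}


def read_balance():
    return store["balance"]


def write_balance(value):
    store["balance"] = value


# Both servers read before either one writes (the problematic interleaving).
a_seen = read_balance()       # server A reads 100
b_seen = read_balance()       # server B reads 100
write_balance(a_seen - 30)    # server A writes 70
write_balance(b_seen - 50)    # server B writes 50, overwriting A's result

final_balance = store["balance"]
expected_balance = 100 - 30 - 50
print(final_balance, expected_balance)  # 50 20
```

The stored balance ends at 50 although two withdrawals totalling 80 were made, which is the lost update described above.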

Uses:
- Run a scheduled job on exactly one cluster instance (leader-like)
- Serialize concurrent edits to one resource (file/account/inventory)
- External calls that must take effect "exactly once" (payments, etc.)

Note: a lock is a last resort. If you can, solve it with an atomic DB operation (UPDATE ... WHERE, conditional write, unique constraint) or idempotency — safer and faster. Reach for a distributed lock only when those don't fit.
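
As an illustration of the lock-free route, here is a sketch of a conditional update using Python's built-in sqlite3 module. The `accounts` table, its columns, and the `withdraw` helper are assumptions made for this example; the point is only that the check and the write happen in one statement, so no separate lock is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
conn.commit()


def withdraw(conn, account_id, amount):
    """Subtract `amount` only if funds suffice; return True if a row changed."""
    cur = conn.execute(
        "UPDATE accounts SET balance = balance - ? WHERE id = ? AND balance >= ?",
        (amount, account_id, amount),
    )
    conn.commit()
    # rowcount is 1 when the WHERE clause matched, 0 when the guard failed.
    return cur.rowcount == 1


results = [withdraw(conn, 1, 30), withdraw(conn, 1, 50), withdraw(conn, 1, 50)]
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(results, balance)  # [True, True, False] 20
```

Because the database evaluates the guard and applies the change atomically, neither withdrawal is lost and the third one is refused instead of overdrawing.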

The Most Naive Version — and Why It Breaks

With one Redis:
  SET lock_key <random_value> NX PX 30000
    NX = set only if absent (acquire), PX = auto-expire after 30s (TTL)
  do work...
  release: DEL only if my random_value matches (atomic via Lua)

Why random_value? — so you don't accidentally delete someone else's lock.
Why TTL?          — if you die holding it, it'd lock forever; TTL prevents.

Two dangers already here:
  1) One Redis = single point of failure → if it dies, no one can acquire or release a lock
  2) TTL assumes "work finishes within it" → what if it doesn't? (see The Traps below)
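
The acquire and release steps above could look roughly like this in Python. The two functions only assume a client object exposing `set(key, value, nx=..., px=...)` and `eval(script, numkeys, key, arg)`, which is the shape of the redis-py client; `FakeRedis` is a hypothetical in-memory stand-in included so the sketch runs without a server, and it imitates the release script instead of interpreting Lua.

```python
import secrets
import time

# Compare-and-delete: remove the key only if it still holds our value.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""


def acquire(client, key, ttl_ms=30000):
    """Return a random owner value on success, or None if the lock is held."""
    value = secrets.token_hex(16)
    # Equivalent to: SET key value NX PX ttl_ms
    if client.set(key, value, nx=True, px=ttl_ms):
        return value
    return None


def release(client, key, value):
    """Delete the lock only if we still own it. True means we deleted it."""
    return client.eval(RELEASE_SCRIPT, 1, key, value) == 1


class FakeRedis:
    """Hypothetical in-memory stand-in so the sketch runs without a server."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def _get(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._data[key]  # lazily expire, like a TTL running out
            return None
        return entry

    def set(self, key, value, nx=False, px=None):
        if nx and self._get(key) is not None:
            return None
        expires_at = float("inf") if px is None else self._clock() + px / 1000
        self._data[key] = (value, expires_at)
        return True

    def eval(self, script, numkeys, key, value):
        # Mimics RELEASE_SCRIPT only; it does not interpret Lua.
        entry = self._get(key)
        if entry is not None and entry[0] == value:
            del self._data[key]
            return 1
        return 0
```

If the TTL runs out and another client takes the lock, `release` with the old value returns False and leaves the new holder's key alone, which is exactly what the random value is for.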

Redis Redlock — Across Multiple Nodes

An algorithm meant to reduce single-Redis SPOF. It requires acquiring a majority across N independent Redis instances (usually 5).

Redlock (N=5):
  1) Record the current time
  2) Try SET NX PX with the same key/value on all 5 (short per-try timeout)
  3) Majority (3/5) succeed + elapsed < TTL → lock acquired
  4) Effective validity = TTL - elapsed (treat only this window as safe)
  5) No majority (or no validity left) → release on every instance, then retry after a random delay

Pro: with N=5, survives up to two dead Redis instances (a majority is enough)
Debate: Martin Kleppmann sharply criticized Redlock's safety.
  - Depends on clocks (assumptions about time across nodes)
  - Vulnerable to GC pauses / process stalls (below)
  → Fine for an "efficiency lock" (rare double-hold is no big deal),
    but a "correctness lock" (never two holders) needs fencing tokens.
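
The five steps above can be sketched as follows. This is an illustration under stated assumptions, not a production client: each node is assumed to expose redis-py style `set` and `eval` methods, the short per-try timeout is assumed to be configured on the node clients themselves, and `FakeNode` is a hypothetical in-memory stand-in that does not model TTL expiry.

```python
import secrets
import time

RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""


def redlock_release(nodes, key, value):
    """Best-effort release on every node, including ones we may not hold."""
    for node in nodes:
        try:
            node.eval(RELEASE_SCRIPT, 1, key, value)
        except Exception:
            pass  # an unreachable node will drop the key when its TTL expires


def redlock_acquire(nodes, key, ttl_ms, clock=time.monotonic):
    """Return (value, validity_ms) on success, or None on failure."""
    value = secrets.token_hex(16)
    start = clock()                                   # step 1
    acquired = 0
    for node in nodes:                                # step 2
        try:
            if node.set(key, value, nx=True, px=ttl_ms):
                acquired += 1
        except Exception:
            pass  # a node that errors or times out counts as a failed try
    elapsed_ms = (clock() - start) * 1000
    validity_ms = ttl_ms - elapsed_ms                 # step 4
    majority = len(nodes) // 2 + 1
    if acquired >= majority and validity_ms > 0:      # step 3
        return value, validity_ms
    redlock_release(nodes, key, value)                # step 5
    return None


class FakeNode:
    """Hypothetical stand-in for one Redis instance (TTL expiry not modeled)."""

    def __init__(self, up=True):
        self.up = up
        self.data = {}

    def set(self, key, value, nx=False, px=None):
        if not self.up:
            raise ConnectionError("node down")
        if nx and key in self.data:
            return None
        self.data[key] = value
        return True

    def eval(self, script, numkeys, key, value):
        if not self.up:
            raise ConnectionError("node down")
        if self.data.get(key) == value:
            del self.data[key]
            return 1
        return 0
```

With three of five `FakeNode` objects up, `redlock_acquire` succeeds; with only two up it returns None and cleans up the partial acquisition. Note that `validity_ms` is computed from the caller's own clock, which is the time dependence the criticism above targets.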

ZooKeeper / etcd — Lease + Ephemeral Node

Strong-guarantee locks built on consensus (Zab in ZooKeeper, Raft in etcd). Both run directly on top of the agreement mechanism covered in the how-consensus-actually-works guide.

ZooKeeper approach (ephemeral sequential node):
  - Create an ephemeral + sequential child under the lock node
    /lock/req-0001, /lock/req-0002 ... (sequence auto-assigned)
  - The client with the lowest number holds the lock
  - The rest watch only the node "just before" theirs (avoids the herd)
  - Client dies or session drops → the ephemeral node auto-deletes
    → the next-lowest number takes over automatically. No per-lock TTL
      guessing (the session is the source of truth, though it has its own timeout).
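
The "lowest number holds, everyone else watches their predecessor" rule can be expressed as a small pure function. This sketch does not talk to ZooKeeper; `lock_status` and `sequence_of` are hypothetical helpers that take the child names a client would have listed under /lock, using the req-NNNN naming from the example above.

```python
def sequence_of(node_name):
    """Extract the numeric suffix appended to a sequential node name."""
    return int(node_name.rsplit("-", 1)[1])


def lock_status(children, my_node):
    """Return (holds_lock, node_to_watch) for `my_node` among `children`."""
    # Children are not assumed to arrive in order, so sort by sequence number.
    ordered = sorted(children, key=sequence_of)
    position = ordered.index(my_node)
    if position == 0:
        return True, None                 # lowest number: we hold the lock
    return False, ordered[position - 1]   # watch only our direct predecessor


children = ["req-0003", "req-0001", "req-0002"]
print(lock_status(children, "req-0001"))  # (True, None)
print(lock_status(children, "req-0003"))  # (False, 'req-0002')

# req-0001's session ends, so its ephemeral node disappears.
children.remove("req-0001")
print(lock_status(children, "req-0002"))  # (True, None)
```

Watching only the predecessor means a release wakes exactly one waiter instead of all of them.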

etcd approach (lease):
  - Create a lease (a token with a TTL) → bind the key to the lease
  - Keep the lease alive with keep-alives while you live
  - Client dies, keep-alives stop → lease expires → key deleted
  - Consistency guaranteed internally by Raft consensus
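
To make the lease lifecycle concrete, here is a toy single-process model. It is not the etcd API: `LeaseStore` and its method names are invented for this sketch, there is no Raft or networking, and the clock is injected so expiry can be stepped through by hand.

```python
class LeaseStore:
    """Toy model of lease-bound keys (hypothetical, single node, no Raft)."""

    def __init__(self, clock):
        self._clock = clock
        self._leases = {}   # lease_id -> (ttl_seconds, expires_at)
        self._keys = {}     # key -> lease_id
        self._next_id = 1

    def _expire(self):
        """Drop expired leases and every key bound to them."""
        now = self._clock()
        dead = {lid for lid, (_, exp) in self._leases.items() if exp <= now}
        for lid in dead:
            del self._leases[lid]
        self._keys = {k: lid for k, lid in self._keys.items() if lid not in dead}

    def grant(self, ttl):
        lease_id = self._next_id
        self._next_id += 1
        self._leases[lease_id] = (ttl, self._clock() + ttl)
        return lease_id

    def keep_alive(self, lease_id):
        """Extend the lease by its TTL; False if it has already expired."""
        self._expire()
        if lease_id not in self._leases:
            return False
        ttl, _ = self._leases[lease_id]
        self._leases[lease_id] = (ttl, self._clock() + ttl)
        return True

    def put_if_absent(self, key, lease_id):
        """Bind `key` to a live lease if nobody holds it; True means acquired."""
        self._expire()
        if key in self._keys or lease_id not in self._leases:
            return False
        self._keys[key] = lease_id
        return True


now = [0.0]
store = LeaseStore(clock=lambda: now[0])

a = store.grant(ttl=10)
first = store.put_if_absent("/lock", a)     # A acquires at t=0
now[0] = 8.0
renewed = store.keep_alive(a)               # A renews, lease now ends at t=18
now[0] = 15.0
b = store.grant(ttl=10)
blocked = store.put_if_absent("/lock", b)   # A is still alive, B is refused
now[0] = 20.0                               # A stopped sending keep-alives
taken = store.put_if_absent("/lock", b)     # A's lease expired, B acquires
late = store.keep_alive(a)                  # A's lease is gone

print(first, renewed, blocked, taken, late)  # True True False True False
```

The last value is the important one: a client whose keep-alive fails has lost the lock and must stop acting as the holder.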

Difference from Redis:
  Session/lease is managed by consensus → far more resistant to
  split-brain. Cost: lower throughput, higher operational complexity
  (you maintain a consensus cluster).

Fencing Tokens — The Lock's Last Line of Defense

No lock can fully rule out the moment when you believe you still hold it but it has already been handed to someone else (see GC pause below). So a lock alone isn't enough; a fencing token backs it up.

Idea: issue a monotonically increasing number on every acquisition.
  client A acquires the lock → token 33
  A stalls in a GC pause → lock TTL expires → B acquires → token 34
  A wakes, believes it still holds the lock, tries to write (token 33)

Protected storage (DB/file server):
  "Reject anything below the highest token I've seen"
  → A's 33 is rejected (already saw 34); only B's 34 passes.

Key: even if the lock service hands out a token, it only matters if the
     receiver of the actual write checks it. (etcd's revision or ZooKeeper's
     zxid can serve as the token.)
  A lock without fencing tokens is only "usually right, sometimes both pass".
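
The check on the storage side is small. A minimal sketch, assuming a hypothetical `FencedStore` class in place of the protected DB or file server; in a real system the comparison and the write would have to be a single atomic step inside the storage layer.

```python
class FencedStore:
    """Rejects writes carrying a token lower than the highest one seen."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        # Must be atomic with the write itself in a real storage system.
        if token < self.highest_token:
            return False          # stale holder, reject
        self.highest_token = token
        self.value = value
        return True


store = FencedStore()
print(store.write(34, "from B"))  # True  (B holds the lock with token 34)
print(store.write(33, "from A"))  # False (A woke up late with token 33)
print(store.value)                # from B
```

A repeated write with token 34 is still accepted, so the current holder can write more than once; only lower tokens are turned away.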

The Traps — Where Nearly Every Naive Implementation Breaks

1) Clock skew:
   TTL/expiry depend on time. If node clocks differ or jump, one side
   says "still valid" and another "expired" → both hold at once.
   NTP isn't perfect either (jumps/drift exist).

2) GC pause / STW (stop-the-world):
   A client holds the lock and works → JVM full GC freezes it for seconds.
   Meanwhile the TTL expires → another client acquires the lock.
   The frozen one wakes, believes "I still hold it", and carries on with its work → clash.
   → Only fencing tokens are a real defense.

3) Network partition:
   The client believes the lock is alive but is cut off from the lock server.
   - Lock server's view: no renewals → expire → hand to someone else
   - Severed client: by its own clock still valid → keeps working
   Both proceed → clash. A trade-off forced by partition tolerance (the P in CAP).

4) Dying while holding (no TTL):
   No TTL → locked forever (deadlock). With TTL → risks 1–3 above.
   → Inherently a "safety vs progress" tension. No free lunch.
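
Traps 1 and 2 come down to simple timeline arithmetic. The numbers below are made up for illustration (a lock expiring at t=30 s on the lock server's clock, a client clock 5 s behind, a 5 s pause); `is_valid` is a hypothetical helper, not part of any lock library.

```python
def is_valid(expires_at, clock_reading):
    """A lock looks valid to whoever reads a time earlier than its expiry."""
    return clock_reading < expires_at


expires_at = 30.0  # lock expiry on the lock server's clock, in seconds

# Trap 1: the client's clock runs 5 s behind the server's.
server_now = 32.0
client_clock_offset = -5.0
server_view = is_valid(expires_at, server_now)                        # expired
client_view = is_valid(expires_at, server_now + client_clock_offset)  # "valid"

# Trap 2: the client checks at t=29, then stalls for 5 s before writing.
checked_at = 29.0
pause = 5.0
valid_at_check = is_valid(expires_at, checked_at)
valid_at_write = is_valid(expires_at, checked_at + pause)

print(server_view, client_view)        # False True
print(valid_at_check, valid_at_write)  # True False
```

In both cases the client's last check said "valid" while the server had already moved on, and no amount of re-checking on the client side closes the gap between the check and the write.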

Choosing

Situation	Recommended
Serialize row edits within a single DB	DB row lock / conditional UPDATE / unique constraint (no distributed lock)
Rare double-hold is cheap (efficiency lock)	Single Redis / Redlock — simple, fast
Two holders must never happen (correctness lock)	etcd/ZooKeeper + fencing tokens, always
Already running a consensus cluster	etcd/ZooKeeper (reuse the infra)

Wrap-up

A distributed lock isn't "a common problem solved the hard way"; it's an inherently hard problem. Clocks, GC, and partitions break every simple assumption. Remember two things: (1) prefer atomic DB operations / idempotency over a lock when you can, and (2) if you truly need correctness, don't trust a lock without fencing tokens. For why consensus-based locks (etcd/ZooKeeper) are stronger, see how-consensus-actually-works and how-replication-actually-works.