How the Testing Pyramid Actually Works

The testing pyramid (Mike Cohn, 2009) is commonly summarized as 70% unit / 20% integration / 10% E2E. The ratio is not an absolute law but a way to visualize the trade-offs between speed, cost, and signal. This guide covers what each layer verifies and why a modern alternative, the testing trophy, emerged.

Why a Pyramid — Speed × Cost

         /\
        / E\  ← E2E (slow: seconds; expensive, brittle)
       / 2  \
      /  E   \
     /────────\
    / Integ.   \  ← Integration (DB / service combined, medium)
   /────────────\
  /              \  ← Unit (fast: μs; cheap, precise)
 /     Unit       \
/──────────────────\

The 70/20/10 split boils down to one rule:
"Many fast, cheap tests; few slow, expensive tests."

The payoff: a fast feedback loop, low CI cost, and easier debugging,
because a unit failure points to a precise location.
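
To make the speed × cost argument concrete, here is a small arithmetic sketch. The per-test durations are assumptions picked from the ranges this guide quotes for each layer (about 1 ms per unit test, 200 ms per integration test, 15 s per E2E test), and the 1,000-test suite is hypothetical; real numbers vary widely.

```python
# Rough serial runtime of a hypothetical 1,000-test suite under two distributions.
# All durations are illustrative assumptions, not measurements.
SECONDS_PER_TEST = {"unit": 0.001, "integration": 0.2, "e2e": 15.0}


def suite_runtime(counts):
    """Return total serial runtime in seconds for a {layer: test_count} mapping."""
    return sum(SECONDS_PER_TEST[layer] * n for layer, n in counts.items())


pyramid = {"unit": 700, "integration": 200, "e2e": 100}   # 70/20/10
inverted = {"unit": 100, "integration": 200, "e2e": 700}  # 10/20/70

print(f"pyramid:  {suite_runtime(pyramid) / 60:.1f} min")
print(f"inverted: {suite_runtime(inverted) / 60:.1f} min")
```

Under these assumptions the E2E layer dominates the runtime of both suites, so moving tests down a layer is where most of the time savings would come from.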

Layer by Layer

Unit Test

def add(a, b):
    return a + b

def test_add():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0

Properties:
- One function/class behavior only
- No external dependencies (DB, network, filesystem); any that exist are mocked
- Sub-millisecond (usually μs)
- A one-file change typically affects only a handful of unit tests (roughly 1-10)

Pros: fast feedback, precise failure location, stable (no env dependency)
Cons: doesn't verify integration; mock-heavy → "implementation coupling"
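
The "mocked if any" property and the "implementation coupling" risk can be seen side by side in one small test. This is a sketch only: `price_with_tax`, `tax_service`, `rate_for`, and the "DE" country code are hypothetical names invented for illustration.

```python
from unittest.mock import Mock


def price_with_tax(amount, tax_service):
    """Return amount plus tax. In production, rate_for() would call a remote service."""
    return amount * (1 + tax_service.rate_for("DE"))


def test_price_with_tax():
    tax_service = Mock()
    tax_service.rate_for.return_value = 0.25  # stubbed rate, no network call

    # Behavior-level assertion: survives refactors of how the rate is obtained.
    assert price_with_tax(80.0, tax_service) == 100.0

    # Call-level assertion: breaks if the lookup is renamed or batched,
    # even when the computed price is unchanged. This is implementation coupling.
    tax_service.rate_for.assert_called_once_with("DE")
```

Keeping most assertions at the behavior level and using call-level assertions sparingly is one way to get the speed of mocks without tying tests to internals.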

Integration Test

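def test_user_signup_saves_to_db():
    # create_user and db are the application's own helpers, wired to a real test database.
    user = create_user("alice", "pw")
    found = db.users.find_by_name("alice")
    assert found is not None
    assert found.id == user.id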

Properties:
- Multiple modules / external systems integrated
- Real DB (or testcontainer), filesystem, in-memory broker
- 10ms - 1s
- Verifies integration boundaries (SQL query, ORM mapping, transaction)

Pros: real integration verified, no mocks
Cons: slow (CI time ↑), env-dependent (DB schema, etc.)
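
A self-contained version of the signup test can be sketched with the standard library's `sqlite3` module. The in-memory SQLite database stands in for the real DB or testcontainer mentioned above, and the schema plus the `make_db`, `create_user`, and `find_by_name` helpers are assumptions made for illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory SQLite database with a minimal, assumed users schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL, password_hash TEXT NOT NULL)"
    )
    return conn


def create_user(conn, name, password_hash):
    """Insert a user and return the generated row id."""
    cur = conn.execute(
        "INSERT INTO users (name, password_hash) VALUES (?, ?)", (name, password_hash)
    )
    conn.commit()
    return cur.lastrowid


def find_by_name(conn, name):
    """Return (id, name) for the matching user, or None if absent."""
    return conn.execute("SELECT id, name FROM users WHERE name = ?", (name,)).fetchone()


def test_user_signup_saves_to_db():
    conn = make_db()
    user_id = create_user(conn, "alice", "hashed-pw")
    found = find_by_name(conn, "alice")
    # The SQL, the schema, and the commit are all exercised for real here.
    assert found is not None
    assert found[0] == user_id
```

Because the SQL is actually parsed and executed, a wrong column name or a missing commit would fail this test, which is the boundary a mocked unit test cannot check.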

E2E Test

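async def test_signup_flow(browser):
    # `browser` is assumed to be a launched Playwright Browser provided by a fixture.
    page = await browser.new_page()
    # A relative URL resolves only if the browser context was created with a base_url.
    await page.goto("/signup")
    await page.fill('input[name="email"]', "a@b.com")
    await page.click('button[type="submit"]')
    # Wait for the welcome banner instead of sleeping for a fixed time.
    await page.wait_for_selector(".welcome")
    assert "Welcome" in await page.text_content(".welcome")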

Properties:
- Full flow from user perspective
- Real browser (Playwright, Cypress) + real backend + DB
- Seconds (usually 5-30s per test)
- Verifies frontend + backend + infra together

Pros: directly verifies user experience
Cons: very slow, very flaky (timing / network), brittle (selector changes)
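
Much of the timing flakiness comes from fixed sleeps that are either too short (the test fails) or too long (the test is slow). Calls such as `wait_for_selector` in the example above poll until a condition holds instead. The idea can be sketched with the standard library alone; `wait_until` is a hypothetical helper written for this guide, not a Playwright API.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll condition() until it returns truthy or the timeout elapses.

    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would then write something like `assert wait_until(lambda: order_is_confirmed())` (with `order_is_confirmed` being whatever check the application needs) rather than `time.sleep(3)` followed by an assertion.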

Ice Cream Cone — The Anti-Pattern

\──────────────────/
 \   E2E (most)   /
  \──────────────/
   \   Integ.   /
    \──────────/
     \  Unit  /
      \──────/

Symptoms:
- 100+ E2E tests, almost no unit tests
- CI 1 hour+
- One change → 5 E2E fail (which actually broke?)
- "Just retry them all"

Causes:
- "Just test from the user view" — partial truth, but cost explodes
- Legacy / hard-to-mock codebase
- Code changes outpace test writing, so lower-level tests never get backfilled

Defense: unit first per new feature, E2E only on critical paths
        (signup, checkout — core user journeys).

The Testing Trophy — Modern Alternative (Kent C. Dodds)

     ___
    |   |  ← E2E (few, critical paths)
   _|___|_
  |       |  ← Integration (widest) ← the focus
   \_____/
   |     |  ← Unit (logic / computation)
    \___/
      |    ← Static (TypeScript, ESLint) ← base
     ___

Difference from the pyramid:
- Integration is widest — for frontend, component integration
  (state + render + interaction) is highest-value
- Static (typecheck, lint) as the base — near-free verification
  in modern toolchains

Background:
- The pyramid was backend-centric (2009). On frontends, slicing
  unit too thin causes implementation coupling.
- React Testing Library, Vitest, Playwright Component Test —
  integration-style tests became easy to write.
- "Test from the user's view" naturally lands at integration shape.
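
With the pyramid, the ice cream cone, and the trophy all described, a rough way to see which one an existing suite resembles is to compare per-layer test counts. The function below is a coarse heuristic invented for this guide (the name `suite_shape` and its rules are assumptions), it ignores the static layer, and it is meant as a smoke alarm rather than a target to optimize.

```python
def suite_shape(unit, integration, e2e):
    """Classify a test suite by which layer holds the most tests (coarse heuristic)."""
    if unit + integration + e2e == 0:
        return "empty"
    if e2e > unit and e2e > integration:
        return "ice cream cone"  # E2E is the widest layer
    if integration > unit and integration > e2e:
        return "trophy"          # integration is the widest layer
    if unit >= integration >= e2e:
        return "pyramid"         # each layer is no wider than the one below it
    return "mixed"


print(suite_shape(700, 200, 100))  # pyramid
print(suite_shape(5, 20, 120))     # ice cream cone
print(suite_shape(150, 300, 30))   # trophy
```

Counts alone say nothing about test quality or runtime, so a result of "ice cream cone" is a prompt to look closer, not a verdict.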

Which Shape Is Right?

Code type	Recommended shape	Why
Pure algorithm (parser, math)	Unit-heavy (classic pyramid)	Few external deps, many edge cases
Backend API service	Pyramid + integration emphasized	DB / external service integration is the point
Frontend SPA	Trophy (integration-heavy)	Component integration + user interaction
Microservice (consumer-provider)	Add contract tests	Service-to-service boundaries
Data pipeline	Unit + small integration samples	Full E2E on huge data is impractical

Common Pitfalls

- Obsessing over ratios: 70/20/10 is a guideline, not a law. Even modest coverage (say 30%) can be enough if it hits the critical paths.
- Unit-testing every function: trivial code such as getters and DTO mappers is largely covered by the typechecker.
- Mock-heavy unit tests: coverage can reach 100% while real integration stays unverified, so integration tests are still needed.
- Adding only E2E tests: a 30-minute feedback loop and 1-hour debugging sessions make for an expensive signal.
- Retrying flaky tests: without a root-cause fix, trust in CI erodes (see the flaky tests guide).
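
The mock-heavy pitfall is easy to reproduce. In the sketch below, `count_active_users` contains a deliberate bug (it queries a column that does not exist in the assumed schema). The mocked test passes anyway, while the same function fails against a real in-memory SQLite database. All names and the schema are hypothetical.

```python
import sqlite3
from unittest.mock import MagicMock


def count_active_users(conn):
    # Deliberate bug: the assumed schema names the column `is_active`, not `active`.
    row = conn.execute("SELECT COUNT(*) FROM users WHERE active = 1").fetchone()
    return row[0]


def test_with_mock():
    conn = MagicMock()
    conn.execute.return_value.fetchone.return_value = (3,)
    # Passes: the SQL string is never parsed, so the bad column name goes unnoticed.
    assert count_active_users(conn) == 3


def fails_against_real_db():
    """Return True if the same function raises when run against a real SQL engine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, is_active INTEGER)")
    conn.execute("INSERT INTO users (is_active) VALUES (1)")
    try:
        count_active_users(conn)
    except sqlite3.OperationalError:  # raised for the unknown column
        return True
    return False
```

The mocked test would count toward line coverage of `count_active_users`, yet only the database-backed check exposes the defect.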

Wrap-up

The pyramid / trophy aren't ratio rules but visualizations of cost vs value. Pick the shape matching your codebase. Not every org uses the same ratio.

A practical starting order: typecheck and lint as the near-free base, then unit tests for pure logic and integration tests for boundaries in moderation, then E2E only for critical user journeys (signup, checkout). When an E2E test turns flaky, fix it immediately or quarantine it.