How Pagination Actually Works

You can't return a million-row list at once, so you split it into pages. Almost everyone starts with an offset style like ?page=2&limit=20, but as data grows and changes often it breaks in two ways — deep-offset performance and consistency. This guide covers the real difference between offset and cursor (keyset), why deep pages get slow, and how to design a clean API with opaque cursors.

offset / limit — Familiar but a Trap

SELECT * FROM posts ORDER BY created_at DESC
LIMIT 20 OFFSET 40;     -- page 3 (20 * 2)

Pros:
  - Intuitive: page N = OFFSET (N-1)*limit
  - Jump to an arbitrary page ("go to page 47")
  - Show "1234 total pages" if you know the total count

Traps (detailed in the next two sections):
  - The larger the OFFSET, the slower it gets (deep offset)
  - If data changes mid-paging, rows shift or duplicate

Why Deep Offsets Get Slow

The core misconception: "OFFSET 1000000 makes the DB jump to that position." It doesn't. The DB actually reads the first 1,000,000 rows and throws them away.

LIMIT 20 OFFSET 1000000:
  DB: scans rows in sort order...
      reads 1,000,000 → discards all of them → returns the next 20
  i.e. the cost of "reading to discard" is proportional to OFFSET → O(offset)

Page 1 (OFFSET 0): reads 20 rows → fast
Page 50000 (OFFSET 1000000): reads 1,000,020 rows → tens to hundreds × slower

Why an index doesn't save you:
  An index follows sort order quickly, but it still has to "count and
  skip" OFFSET entries. The skipping itself is work.
  (how-database-indexes-work explains how indexes operate)

So in infinite scroll or deep lists, offset pagination's felt speed collapses the further you go — slowest at the "last page". A backwards performance curve.

cursor (keyset) pagination

Drop OFFSET and instead condition on "after the last value you saw". No counting of position — seek straight to that point through the index.

First page:
  SELECT * FROM posts ORDER BY id DESC LIMIT 20;
  → say the last row's id = 9981. That's the cursor.

Next page:
  SELECT * FROM posts
  WHERE id < 9981          -- "after the last one I just saw"
  ORDER BY id DESC LIMIT 20;

Why it's fast:
  WHERE id < 9981 seeks straight to that point via the index → reads 20.
  No discarded rows → O(limit), always fast regardless of depth.

Cost:
  - No arbitrary page jump ("go to page 47" isn't possible) — only next/prev
  - Hard to show "N total pages" (you'd have to count everything)
  - The sort key must be indexed and (effectively) unique

Tie-Break — When the Sort Key Isn't Unique

A keyset trap: if the sort column has duplicate values, rows can leak or get skipped at the boundary.

Sorting by created_at, but many rows share a timestamp?
  WHERE created_at < '2026-05-28 10:00:00'
  → rows at exactly that instant may be cut off or duplicated at the edge.

Fix: a composite (sort key, unique key) cursor + composite sort
  ORDER BY created_at DESC, id DESC
  WHERE (created_at, id) < ('2026-05-28 10:00:00', 9981)
    -- row-value comparison: same created_at → tie-break by id

Now the boundary is always uniquely determined → no gaps/duplicates.
A composite index (created_at, id) is needed for speed.

Consistency — Shifting and Duplicate Rows

Offset's subtler problem: if data changes while the user pages through, results go out of sync.

Viewing a newest-first list with OFFSET, after seeing page 1,
one new post is inserted at the front:

  Before:  [A B C D E] [F G H I J] ...   (page1) (page2)
  Insert X at front → everything shifts by one:
  After:   [X A B C D] [E F G H I] ...
                          ↑ page2 now starts at E
  User: never saw E on page1 (it was page2's) → missed
        or conversely sees the same row on two pages → duplicate

Why cursor breaks less:
  "id < 9981" is an absolute anchor. Whatever is added in front, anything
  below 9981 is unchanged. A new post has an id above 9981 → it appears on
  the "previous" side, not next → no shift.
  (Not fully immune — if 9981 itself is deleted the anchor disappears,
   usually mitigated with boundary handling / delete tombstones)

API Design — Opaque Cursors

When you expose a cursor to clients, don't hand them a raw value like id<9981. Wrap it in an opaque token.

Bad — leaks the internal implementation:
  GET /posts?after_id=9981&sort=created_at
  → clients couple to your column names / sort structure.
    Later change the sort key to (created_at,id) → the API breaks.

Good — opaque cursor:
  GET /posts?limit=20
  Response:
    {
      "data": [ ...20 items... ],
      "next_cursor": "eyJjcmVhdGVkX2F0IjoiMjAyNi0wNS0yOCIsImlkIjo5OTgxfQ=="
    }
  Next request:
    GET /posts?limit=20&cursor=eyJjcmVhdGVkX2F0...

  The cursor encodes (sort value, unique id, direction, ...) (base64/JSON).
  The client can't read it → just echoes it back → only the server decodes.
  Pros:
   - Change the internal sort structure freely; only the server owns the
     cursor format → API stays stable
   - Add signing/expiry to prevent tampering
   - Express both next/prev via cursors (including forward/backward)

When to Use What

Situation	Recommended
Admin table, needs page numbers/jumps, small data	offset/limit (simplicity wins)
Infinite scroll, feeds, large lists, changes often	cursor (keyset) — fast and consistent
Public API (external clients)	opaque cursor (hides internals + stability)
"How many pages total" is a hard requirement	offset + a separate COUNT (or show an estimate)

Common Pitfalls

Cursor with a non-unique sort key — gaps/dupes at the boundary. Tie-break with a (sort value, id) composite cursor + composite index.
COUNT(*) on every request — on a big table, counting the total is itself a heavy query. Avoid with an estimate (approximate row count) or a "load more" UI.
Raw id exposed in the cursor — clients couple to internals. Wrap it in an opaque token to keep room for future change.
No index on the sort column — keyset's WHERE seek degrades to a full scan and gets as slow as offset. An indexed sort key is the premise.
OFFSET for infinite scroll — the worst combo, slower the deeper you scroll. Infinite scroll is almost always cursor.

Wrap-up

One core insight: offset "counts a position", cursor "finds by value". Counting costs scale with depth; finding costs stay flat. So for large, frequently changing lists, cursor wins on both performance and consistency. For a public API, wrap that cursor in an opaque token so you can keep changing your internal sort structure in the future. Why indexes make all of this possible is filled in by how-database-indexes-work.