
How QR Codes Are Made

A practical tour through QR code structure: the finder / alignment / timing patterns, four data encoding modes, Reed-Solomon error correction levels (L/M/Q/H), and the 8 mask patterns that keep scanners reliable. Backed by the ISO/IEC 18004 spec.


QR codes were invented in 1994 by Denso Wave in Japan for tracking automotive parts. They're now everywhere — menus, payments, tickets, Wi-Fi handoff. What looks like a grid of random black squares is actually a precisely engineered structure where finder patterns, error correction, and mask selection work together so a camera can read it from a tilted angle or a soiled surface. This guide walks through the key building blocks based on the ISO/IEC 18004 specification.

The big picture — function patterns + data area

A QR code splits into two regions:

  • Function patterns — fixed cells the scanner uses to locate, orient, and size the code. Finder / alignment / timing / format / version.
  • Encoding region — the data plus Reed-Solomon error correction codewords. Every cell that isn't a function pattern.

Versions range from 1 to 40. Version N is (4N + 17) × (4N + 17) cells: V1 = 21×21, V40 = 177×177.
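The size formula is easy to check in code. A minimal sketch; the function name `qr_size` is just an illustrative choice:

```python
def qr_size(version: int) -> int:
    """Side length in cells for a QR version from 1 to 40."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 4 * version + 17


for v in (1, 2, 10, 40):
    print(f"V{v}: {qr_size(v)} x {qr_size(v)}")
```

Each version step adds 4 cells per side, so the cell count grows quadratically while the printed size usually stays fixed.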

Finder patterns — the three big corner squares

███████
█     █
█ ███ █
█ ███ █
█ ███ █
█     █
███████

7×7 nested squares in three corners (top-left, top-right, bottom-left). Black outer (7×7) → white middle (5×5) → black center (3×3) with a 1:1:3:1:1 ratio across any line through the center.

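Image processing scans for that 1:1:3:1:1 ratio in any direction, which is how a scanner can find a QR code from a tilted phone. The three finder centers give the code's position, scale, and rotation; a full perspective correction needs a fourth reference point, which decoders typically take from the bottom-right alignment pattern or estimate from the other three. The empty fourth corner identifies the code's rotation (which way is up).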
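The ratio test can be sketched on a single scan line of binarized pixels (1 = dark, 0 = light). This is a simplified illustration, not how any particular decoder does it: the helper names and the 50% tolerance are assumptions, and a real scanner would also confirm each candidate vertically and diagonally.

```python
from itertools import groupby


def run_lengths(line):
    """Collapse a scan line of 0/1 pixels into (color, length) runs."""
    return [(color, len(list(group))) for color, group in groupby(line)]


def looks_like_finder(lengths, tolerance=0.5):
    """True if five consecutive run lengths are close to 1:1:3:1:1."""
    if len(lengths) != 5:
        return False
    module = sum(lengths) / 7.0  # the pattern spans 7 cells in total
    expected = (1, 1, 3, 1, 1)
    return all(abs(n - e * module) <= tolerance * e * module
               for n, e in zip(lengths, expected))


def find_finder_candidates(line):
    """Yield pixel offsets where a dark-light-dark-light-dark run fits the ratio."""
    runs = run_lengths(line)
    offset = 0
    for i, (color, length) in enumerate(runs):
        window = runs[i:i + 5]
        if color == 1 and len(window) == 5 and looks_like_finder([n for _, n in window]):
            yield offset
        offset += length


# A finder crossed at 2 pixels per cell, with light margin on both sides.
scan = [0] * 4 + [1] * 2 + [0] * 2 + [1] * 6 + [0] * 2 + [1] * 2 + [0] * 4
print(list(find_finder_candidates(scan)))
```

Because the test only compares run lengths against their own total, it works at any scale, which is what makes the pattern robust to distance and tilt.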

Separator + alignment + timing

  • Separator — 1-cell white border around each finder. Marks where the finder ends.
  • Alignment patterns — 5×5 mini-squares that appear from Version 2 onward. For large codes, three corner finders aren't enough to keep perspective accurate across the middle of the grid; alignment patterns refine the mapping. V40 has 46 of them.
  • Timing patterns — alternating black/white horizontal and vertical lines between finders. Lets the scanner measure the cell size precisely even when print/photo introduces small distortions.
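The alignment pattern count follows from how the patterns are laid out: their centers sit on a square grid of coordinates, and the three grid points that would collide with finder patterns are skipped. A small sketch, assuming the grid has `version // 7 + 2` coordinates per axis (this matches the published counts, including 46 for V40):

```python
def alignment_pattern_count(version: int) -> int:
    """Number of alignment patterns in a QR code of the given version."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    if version == 1:
        return 0  # Version 1 has no alignment pattern
    per_axis = version // 7 + 2      # grid coordinates along each axis
    return per_axis ** 2 - 3         # three grid points overlap the finders


for v in (1, 2, 7, 40):
    print(v, alignment_pattern_count(v))
```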

Format information — ECC level + mask

A 15-bit sequence placed around the finders and stored twice for redundancy. It encodes the ECC level (L/M/Q/H, 2 bits) and the mask pattern (0-7, 3 bits). Those 5 data bits are protected by a BCH (15, 5) code, so the format can still be recovered if part of the region is damaged.
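A sketch of how the 15 format bits can be computed. The 2-bit level codes, the generator polynomial, and the final XOR constant below are the values commonly published for ISO/IEC 18004; treat them as assumptions and check them against the spec. The function name is illustrative.

```python
ECC_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}
FORMAT_GENERATOR = 0b10100110111     # x^10 + x^8 + x^5 + x^4 + x^2 + x + 1
FORMAT_XOR_MASK = 0b101010000010010  # keeps the result from being all zeros


def format_bits(ecc_level: str, mask: int) -> int:
    """Return the 15-bit format information as an integer."""
    if not 0 <= mask <= 7:
        raise ValueError("mask pattern must be 0-7")
    data = (ECC_BITS[ecc_level] << 3) | mask   # 5 data bits
    remainder = data << 10                     # make room for 10 BCH bits
    for shift in range(4, -1, -1):             # polynomial division over GF(2)
        if remainder & (1 << (shift + 10)):
            remainder ^= FORMAT_GENERATOR << shift
    return ((data << 10) | remainder) ^ FORMAT_XOR_MASK


print(format(format_bits("L", 0), "015b"))
print(format(format_bits("M", 0), "015b"))
```

Level M with mask 0 has all-zero data bits, so its output is just the XOR constant. That constant exists precisely so this case does not produce an all-light format region.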

Scanner sequence: find finders → decode format → identify mask → XOR the mask out of the data region to recover the original bits.

Version information — large-code metadata

From Version 7 upward, an 18-bit version block sits next to the top-right and bottom-left finders. BCH (18, 6) protected. Encodes the 34 possible version numbers (7-40).
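The version block can be computed the same way as any other binary polynomial remainder. A sketch assuming the commonly published generator polynomial for the (18, 6) code; the function name is illustrative:

```python
VERSION_GENERATOR = 0b1111100100101  # x^12 + x^11 + x^10 + x^9 + x^8 + x^5 + x^2 + 1


def version_bits(version: int) -> int:
    """Return the 18-bit version information for versions 7 to 40."""
    if not 7 <= version <= 40:
        raise ValueError("version information exists only for versions 7-40")
    remainder = version << 12                  # 6 data bits, room for 12 check bits
    for shift in range(5, -1, -1):             # polynomial division over GF(2)
        if remainder & (1 << (shift + 12)):
            remainder ^= VERSION_GENERATOR << shift
    return (version << 12) | remainder


print(format(version_bits(7), "018b"))
```

Unlike the format bits, no XOR constant is applied, since the version number is never zero.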

Data encoding — four modes

QR uses one of four encoding modes per data segment. Bits per character differ, so the right mode squeezes more data into the same code:

  • Numeric (0001) — digits only. 3 chars → 10 bits. Densest mode (V40 L fits 7,089 digits).
  • Alphanumeric (0010) — digits + uppercase Latin + nine specials (space and $ % * + - . / :), 45 characters in total. 2 chars → 11 bits.
  • Byte (0100) — arbitrary 8-bit bytes (UTF-8 etc.). 1 byte → 8 bits, so a multi-byte UTF-8 character costs 8 bits per byte. URLs typically use this.
  • Kanji (1000) — Shift-JIS 2-byte chars. 1 char → 13 bits. More efficient than byte mode for Japanese.
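The packing rules for the first three modes can be sketched as follows. The function names are illustrative, and the output is space-separated bit groups for readability. Leftover characters use shorter groups: 7 or 4 bits for two or one trailing digits, and 6 bits for a trailing alphanumeric character.

```python
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_numeric(digits: str) -> str:
    """Pack digits in groups of three (10 bits); leftovers use 7 or 4 bits."""
    groups = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        width = {3: 10, 2: 7, 1: 4}[len(group)]
        groups.append(format(int(group), f"0{width}b"))
    return " ".join(groups)


def encode_alphanumeric(text: str) -> str:
    """Pack characters in pairs as 45*a + b (11 bits); a leftover uses 6 bits."""
    values = [ALPHANUMERIC.index(ch) for ch in text]  # raises on unsupported chars
    groups = []
    for i in range(0, len(values), 2):
        pair = values[i:i + 2]
        if len(pair) == 2:
            groups.append(format(pair[0] * 45 + pair[1], "011b"))
        else:
            groups.append(format(pair[0], "06b"))
    return " ".join(groups)


def encode_bytes(data: bytes) -> str:
    """Byte mode: every byte is written as 8 bits."""
    return " ".join(format(b, "08b") for b in data)


print(encode_numeric("01234567"))
print(encode_alphanumeric("AC-42"))
print(encode_bytes("Hi".encode("utf-8")))
```

These are payload bits only. Each segment also carries a 4-bit mode indicator and a character count field.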

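A single code can switch modes mid-stream. Lowercase letters are not in the alphanumeric set, so mixed text like a URL needs byte mode, but a long run of digits inside it can be split off into a numeric segment. Every switch costs a new mode indicator and character count, so splitting only pays off when the run is long enough. QR Code Generator picks the optimal split automatically.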
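The trade-off can be estimated by adding up segment costs. A rough sketch, assuming versions 1 to 9, where the character count field is 10, 9, and 8 bits for numeric, alphanumeric, and byte mode; the URL prefix is made up for illustration:

```python
MODE_INDICATOR_BITS = 4
COUNT_BITS_V1_TO_V9 = {"numeric": 10, "alphanumeric": 9, "byte": 8}


def payload_bits(mode: str, n: int) -> int:
    """Bits needed for n characters of payload in the given mode."""
    if mode == "numeric":
        return 10 * (n // 3) + (0, 4, 7)[n % 3]
    if mode == "alphanumeric":
        return 11 * (n // 2) + 6 * (n % 2)
    if mode == "byte":
        return 8 * n
    raise ValueError(f"unsupported mode: {mode}")


def segment_bits(mode: str, n: int) -> int:
    """Total cost of one segment: mode indicator + count field + payload."""
    return MODE_INDICATOR_BITS + COUNT_BITS_V1_TO_V9[mode] + payload_bits(mode, n)


prefix, digits = "https://ex.co/", "12345678901234567890"  # hypothetical input
byte_only = segment_bits("byte", len(prefix) + len(digits))
split = segment_bits("byte", len(prefix)) + segment_bits("numeric", len(digits))
print(byte_only, split)

# For a short string, the second header outweighs the savings.
print(segment_bits("alphanumeric", 6),
      segment_bits("alphanumeric", 3) + segment_bits("numeric", 3))
```

With 20 digits the split saves 79 bits. For "ABC123", a single alphanumeric segment is 8 bits shorter than alphanumeric plus numeric.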

Error correction — the Reed-Solomon trick

The headline feature of QR. Damaged codes still scan. Four ECC levels:

Level          Recoverable   Use case
L (Low)        ~7%           Clean screens (apps, web)
M (Medium)     ~15%          General print
Q (Quartile)   ~25%          Industrial / outdoor
H (High)       ~30%          Logo overlay / rough environments

Reed-Solomon adds extra ECC codewords next to the data codewords. Higher ECC level = more ECC codewords = larger code for the same payload.

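The math runs over the Galois field GF(2^8), so every codeword is one byte. The encoder treats the data codewords as coefficients of a polynomial, divides it by a generator polynomial, and appends the remainder as the ECC codewords. The decoder locates and corrects errors by solving an error-locator polynomial. The intuition: space valid codewords far enough apart that even a damaged codeword decodes to the closest valid one.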
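The encoding half fits in a short sketch. It assumes the field is built from the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D) and that the generator has roots at successive powers of 2 starting from 2^0, which are the values usually quoted for QR. The function names are illustrative, and decoding is not covered.

```python
def build_tables():
    """Exponent and log tables for GF(2^8) with primitive polynomial 0x11D."""
    exp, log = [0] * 512, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
    for i in range(255, 512):        # duplicate so gf_mul needs no modulo
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a: int, b: int) -> int:
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def generator_poly(n: int) -> list:
    """Product of (x + 2^i) for i in 0..n-1, highest-degree coefficient first."""
    gen = [1]
    for i in range(n):
        nxt = [0] * (len(gen) + 1)
        for j, coef in enumerate(gen):
            nxt[j] ^= coef                       # coef * x
            nxt[j + 1] ^= gf_mul(coef, EXP[i])   # coef * 2^i
        gen = nxt
    return gen


def rs_ecc(data: list, n: int) -> list:
    """Remainder of data(x) * x^n divided by the generator: the n ECC codewords."""
    gen = generator_poly(n)
    rem = list(data) + [0] * n
    for i in range(len(data)):
        coef = rem[i]
        if coef:
            for j in range(1, len(gen)):         # gen[0] is 1, so rem[i] cancels
                rem[i + j] ^= gf_mul(gen[j], coef)
    return rem[len(data):]


def poly_eval(poly: list, x: int) -> int:
    """Evaluate a polynomial (highest degree first) at x using Horner's rule."""
    y = 0
    for coef in poly:
        y = gf_mul(y, x) ^ coef
    return y


data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
ecc = rs_ecc(data, 10)
codeword = data + ecc
print(ecc)
print(all(poly_eval(codeword, EXP[i]) == 0 for i in range(10)))
```

The final check is the property the decoder relies on: an undamaged codeword evaluates to zero at every root of the generator, so any nonzero result signals an error.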

The logo trick

Centering a logo on top of a QR code works because ECC treats the obscured cells as "errors." With level H (~30%) you can cover up to roughly the central 25% and still scan.

Masking — pick the best of 8 patterns

Raw encoded data can produce big black blobs or sequences that confuse a scanner (e.g. resembling a finder). The QR spec XORs the data area with one of 8 mask patterns to balance the visual distribution:

Mask 0: (row + col) % 2 == 0
Mask 1: row % 2 == 0
Mask 2: col % 3 == 0
Mask 3: (row + col) % 3 == 0
Mask 4: (floor(row/2) + floor(col/3)) % 2 == 0
Mask 5: (row*col)%2 + (row*col)%3 == 0
Mask 6: ((row*col)%2 + (row*col)%3) % 2 == 0
Mask 7: ((row+col)%2 + (row*col)%3) % 2 == 0
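The eight conditions translate directly to code. In this sketch a cell is flipped wherever the condition is true, and function-pattern cells are left alone. The `is_function` callback is a hypothetical stand-in for a real function-pattern map.

```python
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,   # integer division
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]


def apply_mask(grid, mask_id, is_function=lambda r, c: False):
    """XOR the mask into every cell that is not part of a function pattern."""
    mask = MASKS[mask_id]
    return [
        [cell ^ int(mask(r, c) and not is_function(r, c))
         for c, cell in enumerate(row)]
        for r, row in enumerate(grid)
    ]


blank = [[0] * 6 for _ in range(6)]
masked = apply_mask(blank, 0)
for row in masked:
    print("".join("#" if cell else "." for cell in row))
```

Because masking is an XOR, applying the same mask a second time restores the original bits, which is exactly what the scanner does after reading the format information.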

Each candidate mask is scored with four penalty rules (runs of 5+ same-color cells / 2×2 same-color blocks / finder-like sequences / black-vs-white balance). The encoder picks the lowest total penalty. Generators like QR Code Generator compute this for you.
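The four penalty rules can be sketched as follows. The weights (3, 3, 40, 10), the two finder-like sequences, and the rounding in the balance rule follow the commonly published description of the spec and should be treated as assumptions. The grid is a list of rows of 0/1 cells, and the function names are illustrative.

```python
FINDER_LIKE = ("10111010000", "00001011101")  # 1:1:3:1:1 with 4 light cells beside it


def run_penalty(line):
    """Rule 1: 3 points for a run of 5 same-color cells, +1 per extra cell."""
    penalty, run = 0, 1
    for prev, cur in zip(line, line[1:]):
        if cur == prev:
            run += 1
            continue
        if run >= 5:
            penalty += run - 2
        run = 1
    if run >= 5:
        penalty += run - 2
    return penalty


def finder_like_penalty(line):
    """Rule 3: 40 points per finder-like sequence in a row or column."""
    text = "".join(str(cell) for cell in line)
    hits = sum(text.startswith(p, i) for i in range(len(text)) for p in FINDER_LIKE)
    return 40 * hits


def penalty_score(grid):
    lines = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    rule1 = sum(run_penalty(line) for line in lines)
    # Rule 2: 3 points per 2x2 block of one color (overlapping blocks count).
    rule2 = 3 * sum(
        grid[r][c] == grid[r][c + 1] == grid[r + 1][c] == grid[r + 1][c + 1]
        for r in range(len(grid) - 1) for c in range(len(grid[0]) - 1)
    )
    rule3 = sum(finder_like_penalty(line) for line in lines)
    # Rule 4: 10 points per full 5% step the dark share is away from 50%.
    dark, total = sum(map(sum, grid)), len(grid) * len(grid[0])
    rule4 = 10 * (abs(20 * dark - 10 * total) // total)
    return rule1 + rule2 + rule3 + rule4


all_light = [[0] * 6 for _ in range(6)]
checkerboard = [[(r + c) % 2 for c in range(6)] for r in range(6)]
print(penalty_score(all_light), penalty_score(checkerboard))
```

An encoder would run this on all eight masked candidates and keep the mask with the lowest total.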

End-to-end — encoding "Hello!"

  1. Mode selection — "Hello!" contains lowercase letters and "!", none of which are in the alphanumeric set, so it uses byte mode (0100).
  2. Character count — V1-V9 byte mode uses an 8-bit count. "Hello!" = 6 chars → 00000110.
  3. Data bits — each ASCII byte expanded to 8 bits: 01001000 01100101 01101100 01101100 01101111 00100001.
  4. Terminator + padding — append 0000, pad to a byte boundary, then alternate 11101100 / 00010001 pad bytes until codewords are full.
  5. Reed-Solomon ECC — append ECC codewords determined by the chosen level.
  6. Module placement — skip function-pattern cells and zigzag the codeword bits into the rest.
  7. Mask selection — evaluate 8 masks, apply the lowest-penalty one.
  8. Format info — encode the ECC level + chosen mask into the 15-bit format region (BCH protected).
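Steps 1 to 4 can be sketched in a few lines. The walkthrough above does not fix a version or ECC level, so this example assumes Version 1 at level L, which holds 19 data codewords; the function name is illustrative.

```python
PAD_BYTES = (0b11101100, 0b00010001)
V1_L_DATA_CODEWORDS = 19  # assumption: Version 1, ECC level L


def build_data_codewords(text: str, capacity: int = V1_L_DATA_CODEWORDS) -> list:
    """Byte-mode bit stream for versions 1-9, padded to `capacity` codewords."""
    data = text.encode("ascii")
    bits = "0100"                                    # step 1: byte mode indicator
    bits += format(len(data), "08b")                 # step 2: 8-bit character count
    bits += "".join(format(b, "08b") for b in data)  # step 3: data bits
    max_bits = capacity * 8
    if len(bits) > max_bits:
        raise ValueError("text does not fit in this version/level")
    bits += "0" * min(4, max_bits - len(bits))       # step 4: terminator (up to 4 zeros)
    bits += "0" * (-len(bits) % 8)                   # pad to a byte boundary
    codewords = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    pad_index = 0
    while len(codewords) < capacity:                 # alternate the two pad bytes
        codewords.append(PAD_BYTES[pad_index % 2])
        pad_index += 1
    return codewords


codewords = build_data_codewords("Hello!")
print(" ".join(f"{cw:02X}" for cw in codewords))
```

The 4-bit mode indicator shifts everything by half a byte, which is why the data codewords do not line up with the ASCII bytes of the text. The remaining steps (ECC, placement, masking, format info) build on this list.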

Common pitfalls

1. URL too long

V40 H byte mode tops out at 1,273 bytes. Long query strings push the version up; cells get tiny and scanners struggle. Shorten the URL first.

2. Poor contrast

Stick to dark-on-light. Reversed (light QR on dark) breaks many decoders that assume "dark = 1." Most apps don't auto-invert.

3. Small print + damage

A 6 mm QR on a business card is risky at L. Use M+ for print; H for outdoor signage where dirt and weather are a factor.

4. Missing quiet zone

The spec requires a 4-cell white margin around the code (the quiet zone). Designers trimming it tight is a top cause of "this QR doesn't scan." Always verify the final print has the margin.
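If you render codes yourself, the margin can be added at the matrix level before drawing. A minimal sketch, assuming the code is a list of rows with 1 for dark and 0 for light; the function name is illustrative:

```python
def add_quiet_zone(grid, margin=4):
    """Return a copy of the matrix surrounded by `margin` light cells."""
    width = len(grid[0]) + 2 * margin
    padded = [[0] * width for _ in range(margin)]
    padded += [[0] * margin + list(row) + [0] * margin for row in grid]
    padded += [[0] * width for _ in range(margin)]
    return padded


v1 = [[1] * 21 for _ in range(21)]   # placeholder matrix, not a valid code
framed = add_quiet_zone(v1)
print(len(framed), len(framed[0]))
```

Building the margin into the matrix means it scales with the cell size, so a later crop to the dark area is the only way to lose it.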

5. Confusing micro QR with regular QR

Micro QR (M1-M4) is a separate, smaller symbol format with only one finder and much lower capacity. Many scanners don't read it. Stick with full QR for payments and general use.

Summary

  • QR = five function patterns (finder / alignment / timing / format / version) + a data area.
  • Four encoding modes (numeric / alphanumeric / byte / kanji) — picked automatically based on the input.
  • Reed-Solomon ECC at four levels (L 7% / M 15% / Q 25% / H 30%). Use H for logo overlay.
  • One of 8 mask patterns chosen by lowest penalty to avoid finder-like artifacts.
  • Print checks: 4-cell quiet zone, dark-on-light contrast, ECC matched to use case.
  • Try it directly with QR Code Generator — switch ECC L/M/Q/H and compare.