Kubernetes uses YAML. Cargo (Rust) uses TOML. npm uses JSON. Legacy desktop apps use INI. Maven uses XML. They're all "config files." Why the variety? This guide compares the five, with each format's strengths, weaknesses, gotchas, and the situations where it earns its keep.
The five at a glance
JSON
{
  "name": "yutils",
  "version": "0.1.0",
  "tools": 66
}
YAML
name: yutils
version: 0.1.0
tools: 66
TOML
name = "yutils"
version = "0.1.0"
tools = 66
XML
<project>
  <name>yutils</name>
  <version>0.1.0</version>
  <tools>66</tools>
</project>
INI
[project]
name = yutils
version = 0.1.0
tools = 66
JSON — machine-friendly, human-tolerable
Strengths:
- RFC 8259 standard, supported by every language
- Native to JavaScript (it's literally in the name)
- Six clean types: object / array / string / number / boolean / null
- Parser fits in ~100 lines
Weaknesses:
- No comments — Crockford intentionally designed JSON for data interchange, not configuration. Reality uses it for config anyway, and the lack of comments stings.
- No trailing commas — diffs get noisier
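- Duplicate keys have no defined behavior: RFC 8259 only says names SHOULD be unique, and parsers differ on which value wins
- IEEE 754 limit: many parsers store numbers as doubles, so integers above 2^53 lose precision (see the how-json-parsing-works guide)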
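The last two weaknesses are easy to reproduce. The sketch below uses only Python's standard `json` module; `DuplicateKeyError`, `reject_duplicates` and `has_duplicates` are illustrative names, not part of any library. Because Python integers are arbitrary precision, `parse_int=float` is used to imitate a parser that stores every number as a double.

```python
import json


class DuplicateKeyError(ValueError):
    """Raised by the hook below when an object repeats a key."""


def reject_duplicates(pairs):
    """object_pairs_hook that refuses repeated keys instead of keeping the last one."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise DuplicateKeyError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def has_duplicates(text):
    """Return True if any object in the JSON text repeats a key."""
    try:
        json.loads(text, object_pairs_hook=reject_duplicates)
    except DuplicateKeyError:
        return True
    return False


# Python's json module silently keeps the last value for a repeated key.
last_wins = json.loads('{"port": 80, "port": 8080}')

# 2^53 + 1 cannot be represented as a double, so it rounds to 2^53.
as_double = json.loads("9007199254740993", parse_int=float)
```

Other parsers may keep the first value or raise an error, which is why a strict hook like this can be worth adding when config files are edited by hand.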
Derivatives — JSON5 / JSONC:
- JSON5 (Aseem Kishore): comments, trailing commas, hex numbers, single quotes
- JSONC (Microsoft): comments + trailing commas only (tsconfig.json)
Best for — API responses, JSON Schema, package.json.
YAML — human-friendly, gotcha-heavy
Strengths:
- Indentation expresses structure → easier visual scan
- Comments with #
- JSON is a strict subset of YAML
- Multi-document streams (---), used by Kubernetes for many resources in one file
- Anchors + aliases for DRY
The famous Norway bug
countries:
  - GB
  - NO   ← Norway? Nope, NO is false!
  - SE
# In YAML 1.1, NO parses as boolean false
YAML 1.1 booleans:
true: yes, Yes, YES, true, True, TRUE, on, On, ON
false: no, No, NO, false, False, FALSE, off, Off, OFF
The ISO code for Norway (NO) gets reinterpreted as false, while GB and SE stay strings. Always quote country codes. YAML 1.2 (2009) cut booleans down to just true/false, but plenty of parsers still default to 1.1.
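The difference between the two spec versions can be modelled in a few lines. This is a simplified sketch of boolean resolution only, not a YAML parser: the regexes cover the spellings listed above, and `resolve_bool` is a hypothetical helper name.

```python
import re

# Boolean spellings for unquoted scalars, per the lists above.
YAML11_TRUE = re.compile(r"^(?:yes|Yes|YES|true|True|TRUE|on|On|ON)$")
YAML11_FALSE = re.compile(r"^(?:no|No|NO|false|False|FALSE|off|Off|OFF)$")
YAML12_TRUE = re.compile(r"^(?:true|True|TRUE)$")
YAML12_FALSE = re.compile(r"^(?:false|False|FALSE)$")


def resolve_bool(text, version="1.1"):
    """Return True/False if an unquoted scalar reads as a boolean, else the string."""
    if version == "1.1":
        true_re, false_re = YAML11_TRUE, YAML11_FALSE
    else:
        true_re, false_re = YAML12_TRUE, YAML12_FALSE
    if true_re.match(text):
        return True
    if false_re.match(text):
        return False
    return text


codes = ["GB", "NO", "SE"]
countries_11 = [resolve_bool(code, "1.1") for code in codes]
countries_12 = [resolve_bool(code, "1.2") for code in codes]
```

Under the 1.1 rules the list comes back with a boolean in the middle; under 1.2 all three entries stay strings. Quoting the value sidesteps the question in both versions.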
More traps
# Numeric-looking strings get coerced
version: 1.10      ← becomes float 1.1 ("0" lost)
time: 12:30:00     ← YAML 1.1 parses as sexagesimal (base-60) integer 45000
# Indentation mix
items:
  - one
   - two           ← 3 spaces, no longer a sibling of "one"
# null gotcha
key:               ← value is null (intended empty string?)
key: ""            ← explicit empty string
YAML's "human friendliness" hides quiet quirks. Kubernetes / Docker / Ansible users have seen them all.
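To see what the numeric coercions actually produce, the sketch below mimics two of the YAML 1.1 rules in plain Python. It assumes the base-60 integer pattern from the YAML 1.1 type definitions; `sexagesimal_to_int` is a hypothetical helper, and a real parser applies many more rules than these.

```python
import re

# YAML 1.1 base-60 integer pattern: a leading digit 1-9, then :NN groups.
SEXAGESIMAL_INT = re.compile(r"^[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+$")


def sexagesimal_to_int(text):
    """Return the base-60 integer for text, or None if the pattern does not match."""
    if not SEXAGESIMAL_INT.match(text):
        return None
    sign = -1 if text.startswith("-") else 1
    total = 0
    for part in text.lstrip("+-").replace("_", "").split(":"):
        total = total * 60 + int(part)
    return sign * total


# An unquoted clock-like value becomes a plain integer.
seconds = sexagesimal_to_int("12:30:00")

# An unquoted version number becomes a float and loses its trailing zero.
version = float("1.10")
version_text = str(version)
```

Here `seconds` is 12 * 3600 + 30 * 60, and `version_text` no longer matches what was typed. Quoting both values in the YAML source keeps them as strings.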
Best for — Kubernetes manifests, GitHub Actions, Ansible playbooks, Docker Compose.
Debugging — convert to JSON with YAML ↔ JSON. If a value parses as boolean or number when you wanted a string, you'll see it immediately.
TOML — Tom's "Obvious, Minimal Language"
Tom Preston-Werner (GitHub co-founder) published TOML in 2013. Goal: "INI with a real type system, but without YAML's gotchas."
title = "My Project"
[database]
server = "192.168.1.1"
ports = [8000, 8001, 8002]
connection_max = 5000
enabled = true
[servers]
[servers.alpha]
ip = "10.0.0.1"
dc = "eqdc10"
[servers.beta]
ip = "10.0.0.2"
dc = "eqdc10"
Strengths:
- Explicit types: strings require quotes, booleans are strictly true/false, datetimes are RFC 3339
- Comments with #
- Sections ([section]), feels like INI
- Nested tables ([a.b.c]) for deep structure
- None of YAML's coercion surprises
Weaknesses:
- Deeply nested structures get verbose (XML-ish)
- Not as ubiquitous as JSON or YAML
Best for — Cargo (Rust), Poetry (Python), Hugo, Caddy. Convert with TOML ↔ JSON.
XML — powerful, verbose
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>my-app</artifactId>
  <version>1.0.0</version>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
Strengths:
- XML Schema (XSD) for strict validation
- Namespaces for mixing vocabularies
- Attribute vs element distinction (metadata vs content)
- XSLT / XPath — powerful query and transform languages
- Self-describing — tags carry meaning
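Namespaces and path queries can be tried with Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. The sketch below queries the POM above (without the XML declaration); the prefix `m` is an arbitrary choice made for this example.

```python
import xml.etree.ElementTree as ET

POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>my-app</artifactId>
  <version>1.0.0</version>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>"""

# Map a prefix of our choosing to the namespace URI declared in the document.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}

root = ET.fromstring(POM)

# Collect the artifactId of every dependency whose scope is "test".
test_deps = [
    dep.findtext("m:artifactId", namespaces=NS)
    for dep in root.findall("m:dependencies/m:dependency", NS)
    if dep.findtext("m:scope", namespaces=NS) == "test"
]

# Without the namespace, the same path finds nothing.
unqualified = root.find("dependencies")
```

The empty result for the unqualified path is a common first stumbling block: every element in the POM lives in the Maven namespace, so queries have to name it.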
Weaknesses:
- Very verbose — 3-5× the size of JSON for the same data
- The "attribute vs child element?" debate has been going for 25 years (<port>8080</port> vs <server port="8080"/>?)
- XXE (XML External Entity) attacks: external entity resolution can expose the filesystem
- Comments can't appear inside attributes
XXE example:
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root>&xxe;</root>
<!-- A naive parser expands the entity → /etc/passwd contents leak -->
Whether external entities are enabled by default varies by parser. Older Jackson XML / Spring releases were repeat offenders.
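One conservative defence is to refuse any untrusted document that carries a DOCTYPE, since entities can only be declared there. The sketch below is a deliberately crude illustration of that idea using Python's standard library; `UnsafeXMLError` and `parse_untrusted` are hypothetical names. In practice, disabling DTD processing in the parser's own settings, or using a hardened wrapper such as the third-party defusedxml package, is likely the better route.

```python
import xml.etree.ElementTree as ET

XXE_DOC = """<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root>&xxe;</root>"""


class UnsafeXMLError(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def parse_untrusted(xml_text):
    """Parse XML only if it declares no DTD, so no entities can be defined.

    The substring check is crude: it also rejects harmless documents that
    merely mention <!DOCTYPE inside a comment or CDATA section.
    """
    if "<!DOCTYPE" in xml_text:
        raise UnsafeXMLError("DOCTYPE declarations are not accepted")
    return ET.fromstring(xml_text)


try:
    parse_untrusted(XXE_DOC)
    xxe_rejected = False
except UnsafeXMLError:
    xxe_rejected = True

safe_root = parse_untrusted("<root><port>8080</port></root>")
```

Rejecting the DOCTYPE outright trades flexibility for safety, which is usually acceptable for config files and API payloads.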
Best for — Maven (pom.xml), Android resources, RSS / Atom, SOAP, SVG, even MS Office's docx internals. XML Formatter / XML ↔ JSON.
INI — the unstandard standard
[Database]
Server=192.168.1.1
Port=5432
Enabled=true
# Comment (or ; also works)
[Logging]
Level=info
File=/var/log/app.log
Strengths:
- Dead simple — you can write a parser in 30 minutes
- Section-based grouping
- Widely deployed: Windows config files, .gitconfig, php.ini, systemd units
Weaknesses:
- No real standard — quote handling, escapes, and "nested" semantics vary by parser
- No nesting (or only via dotted keys / colons)
- No arrays
- Everything is a string until the consumer parses it
Best for — legacy Windows configs, .gitconfig, php.ini, systemd unit files, MySQL my.cnf. INI ↔ JSON.
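The "everything is a string" point shows up immediately in code. Below is a minimal sketch with Python's standard `configparser`, which is only one INI dialect among many; other parsers may treat the same file differently.

```python
import configparser

INI_TEXT = """
[Database]
Server=192.168.1.1
Port=5432
Enabled=true
# Comment (or ; also works)

[Logging]
Level=info
File=/var/log/app.log
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Plain lookups always return strings.
raw_port = parser["Database"]["Port"]

# Typed values exist only because the consumer asks for a conversion.
port = parser.getint("Database", "Port")
enabled = parser.getboolean("Database", "Enabled")
```

The file itself never says that Port is a number. That knowledge lives in the consuming code, which is why two programs can disagree about the same INI file.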
Side-by-side matrix
| | JSON | YAML | TOML | XML | INI |
|---|---|---|---|---|---|
| Comments | ❌ | ✅ (#) | ✅ (#) | ✅ (<!-- -->) | ✅ (; or #) |
| Nesting | ✅ unlimited | ✅ unlimited | ✅ unlimited | ✅ unlimited | 1 level (kind of) |
| Arrays | ✅ | ✅ | ✅ | repeated elements | ❌ |
| Types | 6 native | fluid (gotchas) | 7 native (incl. datetime) | all string + XSD | all string |
| Standard | RFC 8259 | YAML 1.2 | TOML 1.0 | W3C XML 1.0 | none |
| Size (same data) | 1.0× | 0.7× | 0.8× | 3.0× | 0.6× |
| Schema | JSON Schema | JSON Schema compatible | — | XSD / Relax NG | — |
| Multi-doc | — | ✅ (---) | — | — | — |
When to pick what
- API responses / network payloads — JSON. Universal language support, compact.
- Kubernetes / GitHub Actions / Ansible — YAML. It's the ecosystem standard. Watch the gotchas.
- Cargo / Poetry / Hugo / Caddy — TOML. The modern-developer default.
- Maven / Android resources / RSS / SOAP — XML. Legacy and schema-heavy domains.
- Windows / systemd / .gitconfig — INI. Simple and tradition.
- Fresh project default:
  - Deep structure + frequent human editing: TOML
  - YAML ecosystem (k8s/ansible): YAML
  - API or storage: JSON
- Config that absolutely needs comments — anything but JSON; favor YAML or TOML.
Common pitfalls
1. YAML indent mixing
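Tabs vs spaces: the spec forbids tabs for indentation, so a compliant parser rejects them, yet some parsers silently misinterpret. Mixed space widths are the quieter problem, because a misaligned line can still parse, just with a different structure.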
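A pre-commit style check can catch tab indentation before any parser sees the file. This is a small sketch in plain Python; `find_tab_indents` is a hypothetical helper, and it only looks at leading whitespace, not at tabs elsewhere in a line.

```python
def find_tab_indents(yaml_text):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    offending = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        # Everything before the first character that is neither space nor tab.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            offending.append(number)
    return offending


SAMPLE = "items:\n  - one\n\t- two\n"
bad_lines = find_tab_indents(SAMPLE)
```

For the sample, only the third line is reported. Misaligned space counts need a real YAML parser or linter to detect, since they depend on the surrounding structure.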
2. JSON with comments
// not allowed
{
  "name": "x",
  // description
  "value": 1
}
tsconfig.json can carry comments because the TypeScript compiler and VS Code's JSONC mode parse it leniently. Strict JSON parsers reject them.
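The rejection is easy to confirm with Python's standard `json` module. The sketch also includes a deliberately naive workaround, `strip_line_comments` (a hypothetical helper), which only removes lines consisting entirely of a `//` comment; real JSONC parsers also handle block comments and trailing comments.

```python
import json

JSONC_TEXT = """
{
  "name": "x",
  // description
  "value": 1
}
"""


def strip_line_comments(text):
    """Drop lines that hold nothing but a // comment.

    Naive on purpose: block comments and comments after a value are not
    handled, and lines containing // inside a string are left untouched.
    """
    kept = [line for line in text.splitlines() if not line.lstrip().startswith("//")]
    return "\n".join(kept)


# A strict parser refuses the commented document.
try:
    json.loads(JSONC_TEXT)
    strict_accepts = True
except json.JSONDecodeError:
    strict_accepts = False

# After removing the comment line, the same parser accepts it.
config = json.loads(strip_line_comments(JSONC_TEXT))
```

If comments matter for a config file, choosing a format or parser that supports them is usually more robust than preprocessing like this.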
3. TOML datetime confusion
# Offset datetime (UTC offset present)
date1 = 1979-05-27T07:32:00Z
# Local datetime (no offset)
date2 = 1979-05-27T07:32:00
# Local date
date3 = 1979-05-27
# Local time
time1 = 07:32:00
TOML's four datetime types are explicit. Converting to JSON flattens them to strings, so the distinction between the types is lost.
4. XML entity bombs
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
]>
<root>&lol3;</root>
<!-- As written: 100 copies of "lol". The classic attack nests nine tenfold levels: 10^9 copies, about 3 GB → DoS -->
The "billion laughs" attack. Most modern parsers cap entity expansion by default.
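The growth is simple exponential arithmetic, which a few lines of Python make concrete. `expanded_length` and `expand` are hypothetical helpers written for this illustration; a "level" here means one entity that repeats the previous one ten times.

```python
def expanded_length(base_text, fanout, levels):
    """Characters produced when each level repeats the previous one fanout times."""
    return len(base_text) * fanout ** levels


def expand(base_text, fanout, levels):
    """Build the expanded string the way an entity-expanding parser would."""
    text = base_text
    for _ in range(levels):
        text = text * fanout
    return text


# The snippet above has two tenfold levels (lol2 and lol3).
snippet_size = expanded_length("lol", 10, 2)
snippet_text = expand("lol", 10, 2)

# Nine tenfold levels, as in the classic form of the attack.
classic_size = expanded_length("lol", 10, 9)
```

A payload well under a kilobyte therefore asks the parser for roughly three billion characters, which is why expansion limits matter more than input size limits.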
5. INI escape assumptions
Given the line key=value with = sign:
- Parser A: value is "value with = sign".
- Parser B: splits on the first = only, same result.
- Parser C: rejects the line unless the value is quoted.
Should you quote? No standard exists, so check your parser.
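As one data point, here is how Python's standard `configparser` treats such lines. This shows the behaviour of that one parser only; the `[app]` section and the `quoted` key are made up for the example.

```python
import configparser

INI_TEXT = """
[app]
key=value with = sign
quoted="hello"
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# The line is split at the first delimiter; later = signs belong to the value.
value = parser["app"]["key"]

# Quotes are not interpreted: they stay part of the value.
quoted = parser["app"]["quoted"]
```

With this parser, quoting a value changes the data instead of protecting it, so the safe habit for one tool can be the wrong habit for another.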
References
- JSON RFC 8259 — datatracker
- YAML 1.2 spec — yaml.org
- TOML spec — toml.io
- The Norway Problem — bram.us
- XXE attack — OWASP
Summary
- JSON — the data-interchange standard. No comments or trailing commas. Most ubiquitous.
- YAML — human-friendly with rich gotchas (Norway bug, indentation, type coercion). k8s/CI standard.
- TOML — explicit types without YAML's quirks. The Rust/Python tool default.
- XML — verbose with rich schema (XSD). Watch for XXE.
- INI — simple, parser-dependent semantics. Legacy systems.
Pick based on ecosystem (what reads it?), audience (human vs machine), and complexity (nested vs flat). Convert between formats with YAML ↔ JSON / TOML ↔ JSON / XML ↔ JSON / INI ↔ JSON / XML Formatter / JSON Formatter / Validator.