How XSS Actually Works

Cross-Site Scripting (XSS) is a perennial OWASP Top 10 entry (since the 2021 edition it is counted under Injection). User input is baked into a page and the browser executes it as JavaScript. That sounds simple, but every output context needs a different encoding, and modern SPAs, template engines, and frameworks each have their own traps. This guide covers the three XSS types, context-sensitive escaping, and how CSP and Trusted Types actually behave.

Three Types of XSS

1. Reflected XSS

URL input echoed straight back in the response.

https://example.com/search?q=<script>alert(1)</script>

Server HTML:
  <h1>Results for <script>alert(1)</script></h1>
                  ↑
                  executes

Attack:
  Attacker → sends URL to victim (email, message)
  Victim clicks → the script runs in the victim's session: it can read cookies not marked HttpOnly and act with the victim's privileges

2. Stored XSS

Malicious input is saved into the DB → delivered to all users.

Attacker → POSTs comment.body = "<script>steal()</script>"
DB stores it
Other users → visit the page → it executes for everyone

→ Usually worse than reflected: one injection hits every visitor, and no crafted link is needed

3. DOM-based XSS

The payload never has to appear in the server's HTML; client-side JavaScript writes it into the DOM.

// Page JS
const name = new URLSearchParams(location.search).get("name");
document.getElementById("greeting").innerHTML = "Hello, " + name;
//                                  ↑↑↑↑↑↑↑↑↑
//                                  the bomb

URL: /?name=<img src=x onerror=alert(1)>
→ innerHTML parses <img> into DOM → onerror fires

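The example above uses the query string, which the server still sees. If the page reads location.hash instead, the payload is invisible in server logs, because URL fragments are never sent to the server.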
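
A minimal sketch of a safer version of the greeting snippet above. The helper name renderGreeting and the "guest" fallback are assumptions for illustration; the element and the query string are passed in as parameters so the logic can also be exercised outside a browser.

```javascript
// Hypothetical helper: writes the greeting as plain text instead of HTML.
function renderGreeting(element, search) {
  // URLSearchParams decodes the value; a missing parameter yields null.
  const name = new URLSearchParams(search).get("name") ?? "guest";
  // textContent never invokes the HTML parser, so markup stays inert text.
  element.textContent = "Hello, " + name;
  return element;
}

// In a page:
//   renderGreeting(document.getElementById("greeting"), location.search);
```

With this version the <img ... onerror=...> payload is displayed as literal characters and no element is created from it.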

Context-Sensitive Escaping — The Biggest Trap

The same user input needs different escapes depending on where it lands.

1. HTML body — escape & < > " '

<div>{user_input}</div>

input: <script>x</script>
escape: &lt;script&gt;x&lt;/script&gt;
→ shows as text, doesn't execute
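
The escape above can be sketched as a small function. This is an illustrative helper (escapeHtml and HTML_ESCAPES are made-up names), not a replacement for a template engine's built-in escaping, and it is only correct for the HTML body context.

```javascript
// Minimal HTML-body escaper (hypothetical helper name).
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(value) {
  // One pass over the input, so the "&" of an entity produced here
  // is never escaped a second time.
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}
```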

2. HTML attribute — same plus quote escape

<input value="{user_input}">

input: " onmouseover="alert(1)
Insufficient escape: <input value="" onmouseover="alert(1)">
                              ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑
                              parsed as a new attribute

Right way: always quote the attribute value, and escape quotes plus every other attribute-breaking character.
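
One conservative way to cover "all attribute-breaking characters" is to encode everything except ASCII letters and digits as a numeric character reference. The sketch below assumes that approach; escapeAttr and inputTag are hypothetical names, and it does not make on* handler attributes or URL attributes safe.

```javascript
// Hypothetical helper: encodes every character except ASCII letters and
// digits as a hexadecimal character reference.
function escapeAttr(value) {
  return Array.from(String(value), (ch) =>
    /[A-Za-z0-9]/.test(ch) ? ch : "&#x" + ch.codePointAt(0).toString(16) + ";"
  ).join("");
}

function inputTag(value) {
  // The attribute value is always wrapped in double quotes.
  return '<input value="' + escapeAttr(value) + '">';
}
```

Applied to the payload above, the quote, space, and equals sign all become character references, so the browser sees one long attribute value instead of a new onmouseover attribute.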

3. JavaScript context — JSON encoding

<script>
  const data = "{user_input}";
</script>

input: ";alert(1);//
Insufficient escape: const data = "";alert(1);//";
                                  ↑↑↑↑↑↑↑↑↑↑↑↑
                                  executes

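Right way: serialize with JSON.stringify and escape < as \u003c (JSON.stringify alone leaves "</script>" intact). Better still, keep the data out of executable code: emit that JSON into a <script type="application/json" id="data"> block and read it back: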
  const data = JSON.parse(document.getElementById("data").textContent);
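
A sketch of the serialization step, assuming the value is JSON-serializable. The name serializeForScript is hypothetical. Escaping < matters because the HTML parser ends a script block at "</script>" regardless of JavaScript string quoting.

```javascript
// Hypothetical helper: serializes a value for embedding inside a <script> block.
const SCRIPT_UNSAFE = {
  "<": "\\u003c",
  ">": "\\u003e",
  "&": "\\u0026",
  "\u2028": "\\u2028",
  "\u2029": "\\u2029",
};

function serializeForScript(value) {
  // JSON.stringify handles quotes and backslashes; it returns undefined for
  // values it cannot serialize, so fall back to the literal "null".
  const json = JSON.stringify(value) ?? "null";
  // Replace the characters that could close the script block or confuse the
  // HTML parser with \uXXXX escapes, which JSON.parse and JS both decode.
  return json.replace(/[<>&\u2028\u2029]/g, (ch) => SCRIPT_UNSAFE[ch]);
}
```

The output is still valid JSON, so JSON.parse on the client returns the original value unchanged.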

4. URL context

<a href="{user_input}">click</a>

input: javascript:alert(1)
→ click runs JavaScript (javascript: protocol)

Right way: protocol whitelist (only http:, https:, mailto:).
encodeURIComponent alone does NOT block javascript:.
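
A minimal sketch of a protocol allowlist built on the standard URL parser. The helper name safeHref, the "#" fallback, and the https://example.com/ base are assumptions for illustration.

```javascript
const ALLOWED_PROTOCOLS = new Set(["http:", "https:", "mailto:"]);

// Hypothetical helper: returns a normalized URL if its scheme is allowed,
// otherwise a harmless fallback.
function safeHref(input, base = "https://example.com/") {
  let url;
  try {
    // Parsing against a base lets relative links through and normalizes
    // tricks such as mixed case or embedded tabs in the scheme.
    url = new URL(input, base);
  } catch {
    return "#";
  }
  return ALLOWED_PROTOCOLS.has(url.protocol) ? url.href : "#";
}
```

Checking the parsed protocol, rather than searching the raw string for "javascript:", avoids the usual bypasses through casing and whitespace.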

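→ No single escape function works across all contexts. Rely on your framework's context-aware rendering (React JSX, Vue templates, and Angular bindings all escape interpolated text and attribute values by default), and still validate URL schemes yourself where the framework does not.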

The innerHTML Danger

element.innerHTML = userInput;  // ❌ XSS risk

Alternatives:
  element.textContent = userInput;  // ✓ no HTML parsing
  element.setAttribute("class", userInput);  // ✓ set as a plain value, no HTML parsing (never use for on* handlers, href, or src)

Danger comparison:
  // React
  <div>{userInput}</div>            ✓ auto-escaped
  <div dangerouslySetInnerHTML={{__html: userInput}} />  ❌ raw HTML

  // Vue
  <div>{{ userInput }}</div>        ✓ auto-escaped
  <div v-html="userInput"></div>    ❌ raw HTML

  // Every *HTML / *raw* / dangerous* name is a warning sign.

DOMPurify — Letting HTML Through Safely

When you must allow user-supplied HTML (markdown → HTML, rich editor, ...):

const clean = DOMPurify.sanitize(userInput);
element.innerHTML = clean;

DOMPurify:
- removes dangerous tags (<script>, <iframe>, <object>, ...)
- strips event handlers (onclick, onerror, ...)
- removes javascript: URLs
- whitelist-based (only known-safe allowed)

→ Hand-rolled regex to strip <script> always has bypass patterns. Use a library.
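
When only a small subset of HTML is needed, DOMPurify's ALLOWED_TAGS and ALLOWED_ATTR options narrow the allowlist further. The sketch below is one possible setup: the tag list, COMMENT_CONFIG, and renderUserHtml are assumptions for illustration, and the purifier is passed in as a parameter rather than read from a global.

```javascript
// Hypothetical allowlist for comment bodies; adjust to what your editor emits.
const COMMENT_CONFIG = {
  ALLOWED_TAGS: ["p", "br", "strong", "em", "a", "ul", "ol", "li", "code", "pre"],
  ALLOWED_ATTR: ["href"],
};

// `purifier` is expected to be the DOMPurify object.
function renderUserHtml(element, dirtyHtml, purifier) {
  // Sanitize immediately before the write, so nothing can modify the
  // string between sanitizing and inserting it.
  const clean = purifier.sanitize(dirtyHtml, COMMENT_CONFIG);
  element.innerHTML = clean;
  return clean;
}

// In a page:
//   renderUserHtml(document.getElementById("comment"), userInput, DOMPurify);
```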

Content Security Policy (CSP)

An HTTP header declaring "scripts / styles / images can only come from these origins".

Content-Security-Policy:
  default-src 'self';
  script-src 'self' https://apis.google.com;
  style-src 'self' 'unsafe-inline';
  img-src 'self' data: https:;
  frame-ancestors 'none';

Meaning:
- scripts only from self + Google APIs
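- inline <script>, inline event handlers, and javascript: URLs are blocked, because script-src has no 'unsafe-inline' (this neutralizes most injected payloads)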
- can't be framed (clickjacking defense)

What matters most for XSS (CSP is a second layer behind escaping, not a replacement):
- remove 'unsafe-inline' from script-src (biggest impact)
- allow inline via nonce or hash
- 'strict-dynamic' mode (modern, framework-friendly)
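
A nonce-based policy could be assembled per response roughly as follows. This is a sketch for Node.js: newNonce and buildCsp are hypothetical names, and the object-src and base-uri directives are additions commonly paired with a strict policy, not part of the header shown above.

```javascript
const { randomBytes } = require("node:crypto");

// A fresh, unpredictable nonce must be generated for every response.
function newNonce() {
  return randomBytes(16).toString("base64");
}

// Builds the header value; the same nonce goes on each trusted
// <script nonce="..."> tag in that response.
function buildCsp(nonce) {
  return [
    "default-src 'self'",
    `script-src 'nonce-${nonce}' 'strict-dynamic'`,
    "object-src 'none'",
    "base-uri 'none'",
    "frame-ancestors 'none'",
  ].join("; ");
}
```

An injected <script> tag does not carry the nonce, so the browser refuses to run it even though it sits inside the page's HTML.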

Trusted Types — Modern Browser API

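Shipped first in Chromium-based browsers (Chrome and Edge 83+). Support in Firefox and Safari lagged for years, so check current compatibility data before relying on it.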

CSP header: require-trusted-types-for 'script'

→ innerHTML / eval / setTimeout(string) etc. reject plain strings
→ Only TrustedHTML / TrustedScript objects allowed
→ Objects require a policy (sanitize explicitly)

// A policy named "default" is applied automatically when a plain string reaches a sink.
const policy = trustedTypes.createPolicy("default", {
  createHTML: (input) => DOMPurify.sanitize(input),
});

→ Runtime enforcement: HTML that hasn't been through DOMPurify
   cannot reach the DOM.
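
A sketch of a named (non-default) policy with feature detection, so browsers without Trusted Types still work. The helper createHtmlPolicy and the policy name "app-html" are assumptions; if the CSP also carries a trusted-types directive, the policy name would have to be listed there.

```javascript
// Hypothetical helper: creates a named policy when Trusted Types is available.
// `api` is window.trustedTypes in a browser; `sanitize` wraps DOMPurify.sanitize.
function createHtmlPolicy(api, sanitize) {
  if (!api || typeof api.createPolicy !== "function") {
    // Unsupported browser: the caller sanitizes before each innerHTML write.
    return null;
  }
  return api.createPolicy("app-html", {
    createHTML: (input) => sanitize(input),
  });
}

// In a page:
//   const policy = createHtmlPolicy(window.trustedTypes, (s) => DOMPurify.sanitize(s));
//   element.innerHTML = policy ? policy.createHTML(dirty) : DOMPurify.sanitize(dirty);
```

Unlike the default policy, a named policy is only used where code calls it explicitly, which makes every HTML sink easy to find in review.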

Related Tools

HTML Entity Encode / Decode: the basic XSS escape utility

Common Pitfalls

"We use React, XSS is solved": dangerouslySetInnerHTML, direct DOM writes through refs (ref.current.innerHTML), and third-party libraries all bypass JSX escaping. Combine with CSP.
Hand-rolled regex escape: every context needs a different one. Trust your framework's context-aware rendering.
XSS in iframes: XSS in a same-origin iframe can reach the parent page. Put untrusted content in a sandboxed iframe (sandbox without allow-same-origin); frame-ancestors only controls who may embed your page.
SVG / Markdown XSS: <script> inside SVG and raw HTML inside Markdown both execute. Run them through DOMPurify too.
Trusting URL parameters: a value is not trustworthy just because your own page read it from its own URL. Validate on the server and encode on the client.

Wrap-up

At its core, XSS means user input runs as code. The defense is layered: context-aware escaping first, then CSP, then Trusted Types. Don't rely on a single layer.

Practical: a modern framework's (React/Vue/Angular) default escape + DOMPurify (where HTML is allowed) + strict CSP (no unsafe-inline) + security headers. Force review whenever an escape hatch like dangerouslySetInnerHTML is used.