yutils
Example

Input (PPTX + mode)

File: deck-q2.pptx (24 slides)
Mode: markdown (heading per slide)

Output (Markdown)

# Slide 1

## yutils Q2 2026 Roadmap
Presenter: jade · 2026-05-13

# Slide 2

## Key Metrics
- MAU: 12k → 25k
- Tools: 55 → 70
- Avg. session: 3m 12s
...

Note

A PPTX file is XML packed inside a ZIP archive, so text extraction is reasonably accurate. Text that exists only inside images, charts, or complex shapes may be lost, because it is not always stored as plain slide text.

Usage / FAQ

When to use

  • Convert presentation decks to Markdown (blog posts, retrospectives)
  • Strip plan / report PPTs down to searchable text
  • Excerpt PPTX content as AI prompt input
  • Move slide text into Notion or a wiki
  • Pull source text from decks for multi-language translation

FAQ

Q. Does it work with Keynote (.key)?
A. No, only PPTX (.pptx) is supported. Keynote uses its own format, which needs a completely different extraction approach. Export to PPTX from Keynote first.
Q. What about chart data and table text?
A. Native tables export cell by cell. Chart data labels (embedded Excel) may be partially missing. Text inside shapes and diagrams usually extracts fine.
Q. What about speaker notes?
A. Only slide bodies are extracted today; speaker notes are under consideration as an option. If you need them now, parse the PPTX's `notesSlide` XML directly.
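
For readers who need speaker notes right away, the following is a minimal sketch of parsing the `notesSlide` XML directly. It is not how yutils works internally; the function name `extract_notes` is hypothetical, and it assumes the notes live under `ppt/notesSlides/` inside the archive, which is the usual layout. Results are keyed by part name because a notes file number does not necessarily match the slide number, and placeholder fields such as the slide number may show up as extra lines.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace that qualifies <a:p> and <a:t>
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
# Assumed location of speaker-notes parts inside the archive
NOTES_RE = re.compile(r"ppt/notesSlides/notesSlide(\d+)\.xml$")


def extract_notes(pptx):
    """Map each notes part name to its non-empty paragraphs.

    `pptx` may be a file path or a binary file-like object.
    """
    notes = {}
    with zipfile.ZipFile(pptx) as zf:
        for name in zf.namelist():
            if not NOTES_RE.match(name):
                continue
            root = ET.fromstring(zf.read(name))
            lines = []
            for p in root.iter(A_NS + "p"):
                # Join every text run in the paragraph
                text = "".join(t.text or "" for t in p.iter(A_NS + "t"))
                if text.strip():
                    lines.append(text)
            notes[name] = lines
    return notes


# Usage (the file name is taken from the example above):
# notes = extract_notes("deck-q2.pptx")
```

To pair each notes part with its slide reliably, you would also need to read the relationship files under `ppt/notesSlides/_rels/`, which this sketch leaves out.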
Fun facts
  • A .pptx is actually a ZIP archive: rename it to .zip and you'll see ppt/slides/slide1.xml, slide2.xml, and so on inside. It is not a custom binary format but a ZIP container holding OOXML (Office Open XML) parts.

    Wikipedia — OOXML
  • Office Open XML, the standard that defines PPTX (ECMA-376), was published in 2006 and adopted as ISO/IEC 29500 in 2008. Microsoft Office 2007 and later use it by default, a complete break from the old .ppt binary format.

    ECMA-376
  • PPTX slide XML nests <a:p> (paragraph) > <a:r> (run) > <a:t> (text). A single paragraph commonly splits into multiple runs, because every font or color change starts a new run. So basic text extraction means concatenating every <a:t> within a paragraph.

    Microsoft — OpenXml Drawing
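
The facts above (ZIP container, slide parts under ppt/slides/, runs concatenated per paragraph) can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the yutils implementation: the function names `slide_paragraphs` and `extract_slides` are hypothetical, and slides are ordered by file number, which usually but not always matches presentation order (the authoritative order is listed in ppt/presentation.xml).

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace that qualifies <a:p>, <a:r>, and <a:t>
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
# Matches slide parts only, not the _rels files stored alongside them
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml$")


def slide_paragraphs(xml_bytes):
    """Return one string per non-empty <a:p>, joining all of its <a:t> runs."""
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for p in root.iter(A_NS + "p"):
        # A font or color change splits a paragraph into several runs,
        # so the runs are joined back together with no separator.
        text = "".join(t.text or "" for t in p.iter(A_NS + "t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs


def extract_slides(pptx):
    """Map slide file number to its paragraphs.

    `pptx` may be a file path or a binary file-like object.
    """
    slides = {}
    with zipfile.ZipFile(pptx) as zf:
        for name in zf.namelist():
            m = SLIDE_RE.match(name)
            if m:
                slides[int(m.group(1))] = slide_paragraphs(zf.read(name))
    return dict(sorted(slides.items()))


# Usage (the file name is taken from the example above):
# for number, paragraphs in extract_slides("deck-q2.pptx").items():
#     print(f"# Slide {number}")
#     print("\n".join(paragraphs))
```

Joining runs per paragraph rather than across the whole slide keeps line breaks where the author put them, which matters when the result is turned into Markdown bullets.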