yutils
Example

Input (PDF + options)

File: report-2026.pdf (12 pages)
Mode: per-page (separated by page)
Format: Markdown

Output (Markdown)

## Page 1

yutils Usage Analysis Report
May 13, 2026

## Page 2

Summary
- Tool entry path: search 65%, favorites 22%
- Most used tools: Base64, JSON Formatter, JWT
...

Note

Text is read from the PDF's text layer, so scanned or image-only PDFs return empty results (there is no OCR). Everything runs locally in the browser via pdfjs-dist.
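
The per-page output in the example above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `toPerPageMarkdown` is a hypothetical helper, and it assumes the text of each page has already been extracted into a plain string.

```typescript
// Hypothetical helper: joins already-extracted page texts into per-page Markdown.
// pageTexts[i] is assumed to hold the plain text of page i + 1.
function toPerPageMarkdown(pageTexts: string[]): string {
  return pageTexts
    .map((text, index) => `## Page ${index + 1}\n\n${text.trim()}`)
    .join("\n\n");
}

// Sample input taken from the example on this page.
const markdown = toPerPageMarkdown([
  "yutils Usage Analysis Report\nMay 13, 2026",
  "Summary\n- Tool entry path: search 65%, favorites 22%",
]);
console.log(markdown);
```

Each page becomes a `## Page N` heading followed by its text, which keeps page boundaries visible when only some pages are needed.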

Usage / FAQ

When to use

  • Convert PDF reports, papers, or specs to Markdown quickly
  • Pull PDF text into a searchable / greppable form
  • Excerpt PDF content as AI prompt input
  • Extract just the pages you need from a long PDF
  • Share PDF excerpts in email or Slack

FAQ

Q. Does it handle scanned or image-only PDFs?
A. No. Only the PDF text layer is extracted, so scanned or photographed PDFs need OCR first. An empty result usually means the PDF is image-based.
Q. Is my file uploaded?
A. No. The file is parsed locally via pdfjs-dist, so both the file and the extracted text stay in your browser. Safe for confidential documents.
Q. How are tables and figures handled?
A. Tables flatten into cell-order text (structure isn't preserved). Text inside images can't be extracted without OCR. Complex tables may need manual cleanup.
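
The "empty result usually means image-based" rule from the first answer can be expressed as a small check. This is a sketch under assumptions: `looksImageOnly` is a hypothetical name, and it takes the per-page text that extraction produced.

```typescript
// Hypothetical check: true when no page yielded any non-whitespace text,
// which usually means the PDF has no text layer and would need OCR.
// Note: an empty page list also returns true.
function looksImageOnly(pageTexts: string[]): boolean {
  return pageTexts.every((text) => text.trim().length === 0);
}

console.log(looksImageOnly(["", "  \n"])); // true
console.log(looksImageOnly(["", "Summary"])); // false
```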

Fun facts

  • Text extraction from PDF is hard because a PDF page is a sequence of rendering commands ('draw this glyph at this coordinate'), not a paragraph structure. Line breaks, paragraphs, and table structure are heuristic guesses from coordinates, so different tools give different results from the same PDF.

    ISO 32000-1 §7.8 Content Streams
  • More PDFs than you'd expect are essentially un-extractable without OCR because their pages are scanned images embedded in a PDF (old docs, scanner output, re-printed PDFs). Quick check: try to select text in a PDF viewer. If you can't, you need OCR.

    Wikipedia — OCR
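  • pdfjs-dist (used here) is the prebuilt distribution of PDF.js, Mozilla's PDF renderer written in JavaScript. Firefox's built-in PDF viewer is built on PDF.js, which makes it the de facto standard for PDF rendering on the web. First released in 2011, it is still actively maintained.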

    Mozilla pdf.js
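
The coordinate heuristic from the first fun fact can be illustrated with a small line-grouping sketch. It is one possible approach, not necessarily what this tool does. The `PositionedText` shape is a simplified stand-in: in pdfjs-dist, text items carry a `str` and a `transform` array whose last two entries give the x and y position, and here those are assumed to have been copied into `x` and `y`. The `tolerance` value is an arbitrary assumption.

```typescript
// Simplified stand-in for a positioned text run (not the library's own type).
interface PositionedText {
  str: string;
  x: number;
  y: number;
}

// Groups runs into lines: a run joins the current line when its y value is
// within `tolerance` of the line's first run; otherwise it starts a new line.
function groupIntoLines(items: PositionedText[], tolerance = 2): string[] {
  // PDF y coordinates grow upward, so descending y reads top to bottom.
  const sorted = [...items].sort((a, b) => b.y - a.y);
  const lines: { y: number; runs: PositionedText[] }[] = [];
  for (const item of sorted) {
    const last = lines[lines.length - 1];
    if (last !== undefined && Math.abs(last.y - item.y) <= tolerance) {
      last.runs.push(item);
    } else {
      lines.push({ y: item.y, runs: [item] });
    }
  }
  // Within each line, order runs left to right and join them with spaces.
  return lines.map((line) =>
    [...line.runs]
      .sort((a, b) => a.x - b.x)
      .map((run) => run.str)
      .join(" "),
  );
}

const lines = groupIntoLines([
  { str: "favorites 22%", x: 200, y: 699.5 },
  { str: "Summary", x: 50, y: 720 },
  { str: "search 65%,", x: 50, y: 700 },
]);
console.log(lines); // [ "Summary", "search 65%, favorites 22%" ]
```

Changing `tolerance` changes which runs are merged into one line, which is one reason different tools produce different text from the same PDF.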