ai-tools/pdf-to-text/SKILL.md
---
name: pdf-to-text
description: "Convert PDF files to clean text. Handles both embedded-text PDFs and scanned/image PDFs via OCR. Use when the user wants to import, extract, or convert a PDF to text."
allowed-tools: Bash, Read, Write
argument-hint: [pdf-path] [output-path]
---

# PDF to Text Import

## Workflow
1. **Check if PDF has embedded text**
pdftotext <input.pdf> - | head -20
- If output contains readable text → use `pdftotext` (fast, accurate)

2. **Test scan first two pages and evaluate quality**
pdftotext -f 1 -l 2 <input.pdf> -
Read the output and check for:
If quality is poor with pdftotext, try OCR on the same two pages:
pdftoppm -f 1 -l 2 -png <input.pdf> /tmp/test-scan
for img in /tmp/test-scan-*.png; do tesseract "$img" - 2>/dev/null; done
rm /tmp/test-scan-*.png
Compare both outputs. Report findings to user before proceeding with full extraction.
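One rough way to compare the two test extractions is to count the words in each. The sketch below assumes both outputs were saved to text files first; `richer_output` is a hypothetical helper, not part of this skill. Word count is a weak signal (OCR noise can inflate it), so the text still needs to be read before reporting to the user.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the path of whichever text file has more words.
# Word count is only a rough proxy for extraction quality.
richer_output() {
  local a_words b_words
  # Arithmetic expansion strips the padding some wc implementations add.
  a_words=$(( $(wc -w < "$1") ))
  b_words=$(( $(wc -w < "$2") ))
  if [ "$a_words" -ge "$b_words" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n' "$2"
  fi
}
```

For example, after saving the `pdftotext` sample and the OCR sample to two files of your choosing, `richer_output <embedded.txt> <ocr.txt>` prints the fuller of the two.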
If neither produces clean output, visually inspect the pages:
pdftoppm -f 1 -l 2 -png -r 200 <input.pdf> /tmp/visual-check
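Use the Read tool to view the generated /tmp/visual-check-*.png images (Claude vision will render them). For PDFs with ten or more pages, pdftoppm zero-pads the page number, so the files may be named /tmp/visual-check-01.png and /tmp/visual-check-02.png rather than /tmp/visual-check-1.png and /tmp/visual-check-2.png. Compare what you see on the page to what the text extraction produced. Identify the specific issue (columns, watermarks, unusual fonts, embedded images of text, etc.) and recommend an extraction strategy before proceeding.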
rm /tmp/visual-check-*.png
3. **Extract text**
Embedded text (preferred):
pdftotext <input.pdf> <output.txt>
Scanned/image PDF (OCR fallback):
ocrmypdf --force-ocr <input.pdf> /tmp/ocr-temp.pdf
pdftotext /tmp/ocr-temp.pdf <output.txt>
rm /tmp/ocr-temp.pdf
If ocrmypdf is not installed:
# Convert pages to images, then OCR
pdftoppm <input.pdf> /tmp/pdf-page -png
for img in /tmp/pdf-page-*.png; do tesseract "$img" - 2>/dev/null; done > <output.txt>
rm /tmp/pdf-page-*.png
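The choice between the three extraction routes above can be sketched as a small decision function. This is a minimal sketch under stated assumptions: `choose_extractor` is a hypothetical helper, and "contains at least one alphanumeric character" is a crude stand-in for "has readable embedded text" that garbled output would also pass, so it does not replace the manual quality check.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pick an extraction route from a sample of pdftotext output.
#   $1: file holding the sample text (e.g. the first two pages)
#   $2: "yes" if ocrmypdf is installed, anything else otherwise
choose_extractor() {
  if grep -q '[[:alnum:]]' "$1"; then
    echo "pdftotext"   # sample has text: treat the PDF as embedded-text
  elif [ "$2" = "yes" ]; then
    echo "ocrmypdf"    # no text found: preferred OCR route
  else
    echo "tesseract"   # no text found: pdftoppm + tesseract fallback
  fi
}
```

Passing tool availability in as an argument keeps the function easy to exercise without any PDF tools installed.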
4. **Report results**
wc -l <output.txt>
pdfinfo <input.pdf> | grep Pages
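The two commands above can be combined into a one-line report. The sketch below assumes `pdfinfo` prints a `Pages:` line followed by the count; `page_count` and `extraction_summary` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read pdfinfo output on stdin and print the page count.
page_count() {
  awk '/^Pages:/ { print $2 }'
}

# Hypothetical helper: summarize a text file ($1) against a page count ($2).
extraction_summary() {
  local lines
  # Arithmetic expansion strips the padding some wc implementations add.
  lines=$(( $(wc -l < "$1") ))
  printf '%s lines extracted from %s pages\n' "$lines" "$2"
}
```

A typical call would be `extraction_summary <output.txt> "$(pdfinfo <input.pdf> | page_count)"`. A very low line count for a long document suggests the extraction missed most of the text.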
- `$ARGUMENTS[0]`: path to input PDF (required)
- `$ARGUMENTS[1]`: path to output txt file (optional, defaults to same name with .txt extension)

Requires at least one of:

- `pdftotext` (from poppler-utils): for embedded text PDFs
- `ocrmypdf` + `pdftotext`: for scanned PDFs
- `tesseract` + `pdftoppm`: OCR fallback if `ocrmypdf` is unavailable

Check availability before starting:
which pdftotext ocrmypdf tesseract pdftoppm 2>/dev/null
If missing tools, tell the user what to install:
- `sudo apt install poppler-utils`: pdftotext and pdftoppm
- `sudo apt install tesseract-ocr`: tesseract
- `pip install ocrmypdf`: ocrmypdf
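The argument default and the availability check described above can be sketched as two small functions. Both names are hypothetical, and the default-path helper assumes the input name ends in a lowercase `.pdf`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: default the output path to the input name with .txt.
# Assumes a lowercase .pdf suffix; other names simply get .txt appended.
default_output_path() {
  printf '%s\n' "${1%.pdf}.txt"
}

# Hypothetical helper: print each named tool that is not on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Running `missing_tools pdftotext ocrmypdf tesseract pdftoppm` prints nothing when everything is installed; otherwise each printed name maps to one of the install commands listed above.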