skills/document-understanding/SKILL.md
Extract text from images and PDFs using local OCR. Trigger words - OCR, scan, extract text, read document, parse PDF, W-2, tax form, receipt, invoice.
npx skillsauth add svenflow/dispatch document-understanding
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Local OCR and document parsing using Apple's Vision Framework. 100% on-device, no cloud APIs.
# OCR an image (returns text with bounding boxes + confidence)
uvx ocrmac path/to/image.png
# OCR with structured JSON output
~/.claude/skills/document-understanding/scripts/ocr path/to/file.png
# OCR a PDF (converts pages to images first)
~/.claude/skills/document-understanding/scripts/ocr path/to/file.pdf
ocr
The ocr script wraps ocrmac with:
# Single image
~/.claude/skills/document-understanding/scripts/ocr receipt.png
# PDF document
~/.claude/skills/document-understanding/scripts/ocr tax-form.pdf
# Specific PDF page
~/.claude/skills/document-understanding/scripts/ocr document.pdf --page 1
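When the script is driven from Python rather than typed by hand, the invocations above could be wrapped roughly as follows. This is a minimal sketch, not part of the skill: `build_ocr_command` and `run_ocr` are hypothetical helper names, and `run_ocr` assumes the script writes its JSON to stdout and nothing else.

```python
import json
import subprocess
from pathlib import Path

# Script location as given in this document.
OCR_SCRIPT = Path("~/.claude/skills/document-understanding/scripts/ocr").expanduser()


def build_ocr_command(path, page=None):
    """Return the argument list for one ocr invocation (hypothetical helper)."""
    cmd = [str(OCR_SCRIPT), str(path)]
    if page is not None:
        # --page is the only flag this document shows.
        cmd += ["--page", str(page)]
    return cmd


def run_ocr(path, page=None):
    """Run the script and parse its output.

    Assumption: the JSON document is printed to stdout on success.
    """
    result = subprocess.run(
        build_ocr_command(path, page),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.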
{
  "file": "document.png",
  "pages": [
    {
      "page": 1,
      "text": "full extracted text...",
      "blocks": [
        {
          "text": "Box 1 Wages",
          "confidence": 0.98,
          "bbox": [x, y, width, height]
        }
      ]
    }
  ]
}
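Because every block carries a confidence score, a caller can flag uncertain text for manual review before trusting it on a tax form or invoice. The sketch below walks a result in the shape shown above. The sample values (including the bbox numbers and the misread amount) are invented for illustration, and the 0.9 threshold is an arbitrary assumption rather than a recommendation from the skill.

```python
import json

# Made-up sample in the output shape shown above; all values are placeholders.
SAMPLE = """
{
  "file": "document.png",
  "pages": [
    {
      "page": 1,
      "text": "Box 1 Wages 5O,OOO.OO",
      "blocks": [
        {"text": "Box 1 Wages", "confidence": 0.98, "bbox": [10, 20, 120, 14]},
        {"text": "5O,OOO.OO", "confidence": 0.41, "bbox": [140, 20, 80, 14]}
      ]
    }
  ]
}
"""


def iter_blocks(result):
    """Yield (page_number, block) pairs across all pages."""
    for page in result["pages"]:
        for block in page["blocks"]:
            yield page["page"], block


def low_confidence(result, threshold=0.9):
    """Return (page_number, text) for blocks scoring below the threshold."""
    return [
        (page_no, block["text"])
        for page_no, block in iter_blocks(result)
        if block["confidence"] < threshold
    ]


result = json.loads(SAMPLE)
flagged = low_confidence(result)
print(flagged)
```

Blocks that land in `flagged` are candidates for a second look, for example digits misread as letters in a currency amount.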
ocrmac - Python wrapper for Apple Vision Framework
sips - macOS built-in image conversion (for PDFs)