packages/sc-docling-pdf/skills/docling-pdf-extraction/SKILL.md
Convert PDF documents to markdown, extract images and tables using the docling CLI. Use when asked to convert a PDF, extract a datasheet, get images from a PDF, or process any document into structured output. Triggers: 'convert pdf', 'pdf to markdown', 'extract images from pdf', 'datasheet', 'get tables from pdf', 'extract diagrams'. No MCP required — uses docling CLI only.
npx skillsauth add randlee/synaptic-canvas docling-pdf-extraction
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Convert PDFs to markdown and structured output using the docling CLI. Selects the optimal conversion profile based on document content.
which docling && docling --version
If not found on PATH, also check common install locations — Claude Code's bash PATH may differ from the interactive shell:
for p in "$HOME/.local/bin/docling" "$HOME/.venvs/docling/bin/docling" \
"$(python3 -m site --user-base 2>/dev/null)/bin/docling" \
"/opt/homebrew/bin/docling"; do
[ -x "$p" ] && echo "Found at: $p" && break
done
If found at a non-PATH location, use the full path for all docling commands,
or export that directory to PATH for the session.
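A minimal sketch of the PATH export option, assuming docling was found at $HOME/.local/bin/docling (a hypothetical location; substitute whatever path the loop above printed):

```shell
# Hypothetical install location; replace with the path reported by the search loop.
DOCLING_BIN="$HOME/.local/bin/docling"

# Prepend its directory so a plain `docling` resolves for the rest of the session.
DOCLING_DIR="$(dirname "$DOCLING_BIN")"
case ":$PATH:" in
  *":$DOCLING_DIR:"*) ;;                    # already on PATH, nothing to do
  *) export PATH="$DOCLING_DIR:$PATH" ;;    # otherwise put it first
esac
```

The case guard only avoids adding the same directory twice when the snippet is run more than once in a session.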
If not installed: read references/installation.md before proceeding.
On first use in a session, and after any Docling / transformers / peft upgrade,
run the advanced-runtime validation block from references/installation.md.
If validation fails:
- avoid vlm or enrichment-heavy commands
- fall back to text, scan, or baseline rich without enrichment flags

If the document may need OCR:
- read references/profile-scan.md before running the command
- set --ocr-lang for the document language; for English-only scans, use --ocr-lang en

Before choosing a profile, inspect the document to understand its content type.
Read references/document-analysis.md to determine:
| Profile | Document Type | Speed / Quality | Reference |
|---------|--------------|-----------------|-----------|
| text | Digital PDF, prose only, no images needed | Fastest, lowest overhead | references/profile-text.md |
| scan | Scanned or photographed, bitmapped text | Slower; OCR-first | references/profile-scan.md |
| rich | Datasheet, spec sheet, tables + photographs + diagrams ⭐ | Best default: quick and thorough | references/profile-rich.md |
| vlm | Complex layout, dense mixed content, poor standard results | Slowest, highest layout recovery | references/profile-vlm.md |
| code | Technical docs with code blocks or math formulas | Moderate cost, structure-focused | references/profile-code.md |
Tie-breaking rules:
- rich over text: it's a superset with minimal overhead
- rich over vlm: VLM is 3–10× slower; only escalate when standard output is poor
- scan + rich flags are additive

Use this rule of thumb when choosing between "usable now" and "best quality later":
| Need | Recommended path |
|------|------------------|
| Clean text fast | text |
| Engineering PDF with tables / images, good enough for most agents | baseline rich |
| Scan / photo, readable text first | scan |
| Code / math fidelity matters | code |
| Layout is wrong or content is missing after baseline rich / scan | vlm with smoldocling first |
| Final pass for hardest layouts when runtime is acceptable | vlm with granite_docling |
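The last two rows of the table describe an escalation order, which can be sketched as a small lookup. The function name next_profile and labels such as vlm:smoldocling are hypothetical, used only for illustration; they are not docling options.

```shell
# next_profile CURRENT: print the next, slower path to try when the current
# output has wrong layout or missing content. Labels are illustrative only.
next_profile() {
  case "$1" in
    rich|scan)          echo "vlm:smoldocling" ;;      # first rescue attempt
    vlm:smoldocling)    echo "vlm:granite_docling" ;;  # final pass, longest runtime
    vlm:granite_docling) echo "none" ;;                # nothing slower to try
    *) echo "no escalation defined for: $1" >&2; return 1 ;;
  esac
}
```

For example, next_profile rich prints vlm:smoldocling. Profiles the table does not escalate from, such as text or code, are rejected with a non-zero status rather than guessed at.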
Practical timing guidance from local runs:
- text: seconds
- rich: tens of seconds
- scan: tens of seconds to a few minutes
- smoldocling: minutes
- granite_docling: usually the longest path; reserve for cases where the faster paths are not good enough

Output format is independent of conversion profile. Multiple formats can be generated in one run.
| Need | Reference |
|------|-----------|
| LLM consumption, reading in editor | references/output-markdown.md |
| Viewing extracted photographs and diagrams | references/output-images.md |
| Working with tables or chart data | references/output-tables.md |
| Structured access, metadata, bounding boxes | references/output-json.md |
# Fastest — clean digital PDF
docling INPUT.pdf --to md --output ./out --device mps
# Quick, thorough default for engineering PDFs
docling INPUT.pdf --to md --to json --output ./out \
--image-export-mode referenced \
--table-mode accurate --device mps
# Slower, richer variant after Step 1 validation passes
# --enrich-picture-classes --enrich-picture-description --enrich-chart-extraction
# Scanned document: usable OCR text without bloated Markdown
docling INPUT.pdf --to md --output ./out \
--force-ocr --ocr-engine easyocr --ocr-lang en \
--image-export-mode placeholder --device mps
# Complex layout rescue: try Smol first, Granite only if needed
# docling INPUT.pdf --pipeline vlm --vlm-model smoldocling --to md --to json ...
# docling INPUT.pdf --pipeline vlm --vlm-model granite_docling --to md --to json ...
For all other cases, follow Steps 1–4 above.
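The three uncommented commands above could be wrapped in a helper that prints the command line for a given profile. This is a sketch, not part of the skill: the function name docling_cmd and the DOCLING_DEVICE variable are hypothetical, and the flags are copied verbatim from the commands above. The device defaults to mps as in those commands; on machines without Apple silicon a different --device value would likely be needed.

```shell
# docling_cmd PROFILE INPUT [OUTDIR]: print (do not run) a docling command line.
# Flags mirror the quick-reference commands; names here are hypothetical.
docling_cmd() {
  profile="$1"; input="$2"; out="${3:-./out}"
  device="${DOCLING_DEVICE:-mps}"   # override via the environment if needed
  case "$profile" in
    text) flags="--to md" ;;
    rich) flags="--to md --to json --image-export-mode referenced --table-mode accurate" ;;
    scan) flags="--to md --force-ocr --ocr-engine easyocr --ocr-lang en --image-export-mode placeholder" ;;
    *) echo "unknown profile: $profile" >&2; return 1 ;;
  esac
  # Quote the paths so the printed command survives spaces in file names.
  printf 'docling "%s" %s --output "%s" --device %s\n' "$input" "$flags" "$out" "$device"
}
```

Printing instead of executing keeps the helper safe to call before the installation check has passed; the output can be reviewed and then pasted into the shell.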