An AI-Native skill for parsing PDF / Office / image files into clean Markdown with MinerU — a fast, zero-config document parser for AI agents. Works with NO token via the lightweight Agent API and auto-upgrades to the Standard API (token) for large files, batches, and DOCX/HTML/LaTeX export. Use when: (1) Converting PDF/Word/PPT/Excel/image to Markdown, (2) Extracting text, tables, formulas, or running OCR on scanned docs, (3) Batch-parsing a folder in parallel, (4) Piping parsed Markdown straight back to an agent or into Obsidian.
npx skillsauth add nebutra/mineru-skill mineru
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Parse PDF, Office (Word/PPT/Excel), and image files into clean Markdown — with LaTeX formulas, tables, images, and OCR. One zero-dependency script, two backends, automatic routing.
# Parse a local file or URL — the Agent API needs no login
python3 scripts/mineru.py paper.pdf
# Pipe the Markdown straight back to an agent
python3 scripts/mineru.py paper.pdf --stdout
# Machine-readable status for tool pipelines
python3 scripts/mineru.py paper.pdf --json
No pip install, no API key. The free Agent API handles files ≤ 10 MB / ≤ 20 pages.
scripts/mineru.py carries PEP 723 inline metadata, so uv can run it directly on a uv-managed interpreter, with no venv and no pip install:
uv run scripts/mineru.py paper.pdf --stdout # zero-install run
uv run --no-project --with pytest pytest -q # dev suite via uv
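For readers unfamiliar with PEP 723, inline metadata is a specially delimited comment block near the top of a script. The sketch below shows what such a header could look like and extracts it with the reference regular expression from the PEP. The `requires-python` value and the empty dependency list are assumptions for illustration, not copied from scripts/mineru.py, and `read_inline_metadata` is a hypothetical helper name.

```python
import re
from typing import Optional

# Hypothetical script header; the real values in scripts/mineru.py may differ.
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
print("hello")
'''

# Reference pattern from PEP 723 for locating metadata blocks.
BLOCK_RE = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def read_inline_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the first `script` block, or None if absent."""
    for match in BLOCK_RE.finditer(source):
        if match.group("type") != "script":
            continue
        lines = match.group("content").splitlines(keepends=True)
        # Drop the leading "# " (or a bare "#") from every comment line.
        return "".join(
            line[2:] if line.startswith("# ") else line[1:] for line in lines
        )
    return None


if __name__ == "__main__":
    print(read_inline_metadata(EXAMPLE_SCRIPT))
```

A runner such as uv reads this block to pick an interpreter and install any listed dependencies before executing the script, which is why no separate environment setup is needed.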
export MINERU_TOKEN="..." # https://mineru.net/apiManage/token
# Parallel batch a directory, resume on re-run
python3 scripts/mineru.py ./pdfs/ --output ./out/ --workers 8 --resume
# Export DOCX/HTML/LaTeX alongside Markdown (auto-routes to the Standard API)
python3 scripts/mineru.py report.pdf --format docx --format latex
When a token is set, the tool auto-routes: small single files still use the free Agent API; anything large (> 10 MB / > 20 pages), batched, or needing extra export formats uses the Standard API (≤ 200 MB / ≤ 200 pages). If the Agent API hits a size/page limit, it auto-escalates to the Standard API.
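The routing rule above can be sketched as a small decision function. This is only an illustration under stated assumptions: `Job` and `choose_backend` are hypothetical names, not the script's real internals, and the behavior when the Standard API is needed but no token is set is not specified in this document, so the sketch simply raises an error in that case.

```python
from dataclasses import dataclass

# Limits quoted in the text above.
AGENT_MAX_MB, AGENT_MAX_PAGES = 10, 20
STANDARD_MAX_MB, STANDARD_MAX_PAGES = 200, 200


@dataclass
class Job:
    """Hypothetical description of one parse request."""
    size_mb: float
    pages: int
    batch: bool = False
    extra_formats: tuple = ()  # e.g. ("docx", "latex")


def choose_backend(job: Job, has_token: bool) -> str:
    """Return "agent" or "standard" following the documented routing rule."""
    needs_standard = (
        job.size_mb > AGENT_MAX_MB
        or job.pages > AGENT_MAX_PAGES
        or job.batch
        or bool(job.extra_formats)
    )
    if not needs_standard:
        # Small single files use the free Agent API, with or without a token.
        return "agent"
    if not has_token:
        # Assumption: without a token there is nothing to escalate to.
        raise ValueError("this job needs the Standard API; set MINERU_TOKEN")
    if job.size_mb > STANDARD_MAX_MB or job.pages > STANDARD_MAX_PAGES:
        raise ValueError("input exceeds the Standard API limits")
    return "standard"


if __name__ == "__main__":
    print(choose_backend(Job(size_mb=2, pages=5), has_token=False))
    print(choose_backend(Job(size_mb=50, pages=120), has_token=True))
```

The sketch decides up front from known size and page counts. The text also describes a reactive path, where the Agent API rejects a file on a limit and the tool escalates afterwards; that path is not modeled here.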
| Modality | Extensions | OCR |
|----------|-----------|-----|
| PDF | .pdf | --ocr |
| Image | .png .jpg .jpeg .jp2 .webp .gif .bmp | built-in |
| Word | .doc .docx | — |
| Slides | .ppt .pptx | — |
| Sheet | .xls .xlsx | — |
| HTML | .html (Standard API, MinerU-HTML model) | — |
INPUT... One or more files, a directory, or a URL
--output, -o Output directory (default: ./output)
--api auto | agent | standard (default: auto)
--model pipeline | vlm | MinerU-HTML (default: vlm)
--format docx | html | latex (repeatable; forces Standard API)
--lang OCR/document language (default: ch)
--ocr Enable OCR for scanned documents
--pages Page range, e.g. "1-10" or "2,4-6"
--workers, -w Concurrent submit/upload/download slots (default: 8)
--resume Skip inputs already parsed
--stdout Print Markdown to stdout
--json Print machine-readable status to stdout
--to SINK Deliver into a content tool (repeatable); --list-sinks to enumerate
--obsidian PATH Shortcut for --to obsidian with this vault
--engine cloud | local | auto (local/auto parse born-digital PDFs offline)
--split Split oversized PDFs past the page caps, parse parts, merge (needs pypdf)
--chunk Emit heading-aware RAG chunks (.chunks.json + --json)
--doctor Environment self-check and exit
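To make the --pages syntax concrete, here is a minimal sketch that expands a spec such as "1-10" or "2,4-6" into explicit page numbers. It assumes 1-based, inclusive ranges; `parse_pages` is a hypothetical helper, and the actual tool may forward the string to the API unchanged rather than expanding it locally.

```python
def parse_pages(spec):
    """Expand a page spec like "2,4-6" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))  # ranges are inclusive
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)


if __name__ == "__main__":
    print(parse_pages("2,4-6"))
```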
Expose MinerU over MCP (zero-dependency stdio JSON-RPC) so an MCP host can call it:
python3 scripts/mineru_mcp.py
Tools: mineru_parse, mineru_parse_to (parse + deliver to sinks), mineru_list_sinks.
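As a rough illustration of what an MCP host sends over stdio, the sketch below builds a JSON-RPC 2.0 `tools/call` request for mineru_parse as a single newline-terminated line. The argument name `input` is an assumption: the tool's real input schema is not given here, and a host would normally discover it with `tools/list` after the `initialize` handshake. `tool_call_request` is a hypothetical helper.

```python
import json


def tool_call_request(request_id, tool, arguments):
    """Serialize one MCP tools/call request as a newline-delimited JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes embedded newlines, so the message stays on one line.
    return json.dumps(message) + "\n"


if __name__ == "__main__":
    # "input" is a hypothetical argument name, used only for illustration.
    print(tool_call_request(1, "mineru_parse", {"input": "paper.pdf"}), end="")
```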
Parse once and push the Markdown into content tools with --to, via each tool's official path:
python3 scripts/mineru.py paper.pdf --to obsidian --to notion --to feishu
Targets: obsidian, logseq, siyuan, notion, linear, yuque, coda, slack, feishu, confluence, onenote, ticktick, dingtalk, airtable, wecom (all zero-dependency), plus roam and wps via optional extras. Each target reads its configuration from environment variables (run --list-sinks to list the available sinks). Per-target auth, fidelity, and image notes are in references/integrations.md.
output/
└── document-name/
    ├── document-name.md    # clean Markdown
    └── images/             # extracted figures (Standard API)
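Given this layout, a pipeline can predict where a document's Markdown will land and skip inputs that are already done. The sketch below is one way to do that, assuming document-name is the input file's stem. How --resume actually detects finished inputs is not documented here, and both helper names are hypothetical.

```python
from pathlib import Path


def expected_markdown(input_path, output_dir="./output"):
    """Return the Markdown path implied by the layout above."""
    stem = Path(input_path).stem  # assumption: document-name == file stem
    return Path(output_dir) / stem / f"{stem}.md"


def already_parsed(input_path, output_dir="./output"):
    """True if a non-empty Markdown file already exists for this input."""
    target = expected_markdown(input_path, output_dir)
    return target.is_file() and target.stat().st_size > 0


if __name__ == "__main__":
    print(expected_markdown("pdfs/paper.pdf", "out"))
```

Checking for a non-empty file, not just an existing one, avoids treating an interrupted write as a finished parse.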
End-to-end latency for the official demo PDF via the free Agent API: cold ≈ 14 s, warm ≈ 13 s (submit → poll → download). Batch throughput scales with --workers. These numbers come from the no-mock live benchmark in tests/test_live.py.
python3 -m pytest # fast unit suite (offline)
MINERU_LIVE=1 python3 -m pytest -m live -s # real API + benchmark (no mocks)
See references/api_reference.md. Official docs: https://mineru.net/apiManage/docs · Token: https://mineru.net/apiManage/token