skills/research/markdown-converter/SKILL.md
Convert PDF, Office, HTML, data, media, ZIP to Markdown.
npx skillsauth add notque/claude-code-toolkit markdown-converter
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Convert a file to Markdown with markitdown, zero install:
uvx 'markitdown[all]' input.pdf -o output.md # to file
uvx 'markitdown[all]' input.docx # to stdout
cat blob | uvx 'markitdown[all]' -x .pdf # stdin, with extension hint
When uvx is missing, run pipx run 'markitdown[all]' … with the same arguments. First run downloads dependencies; later runs hit the cache. Output preserves headings, tables, lists, and links.
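The uvx-then-pipx fallback described above can be wrapped in a small shell function. This is a minimal sketch: `md_convert` is a hypothetical name, not part of markitdown, and it assumes both runners accept the same arguments, as stated above.

```shell
# Hypothetical wrapper: prefer uvx, fall back to pipx, pass all arguments through.
md_convert() {
  if command -v uvx >/dev/null 2>&1; then
    uvx 'markitdown[all]' "$@"
  elif command -v pipx >/dev/null 2>&1; then
    pipx run 'markitdown[all]' "$@"
  else
    # Neither runner is available; report it and use the shell's "not found" status.
    echo "md_convert: neither uvx nor pipx found on PATH" >&2
    return 127
  fi
}
```

With the function defined, `md_convert input.pdf -o output.md` behaves like the first command above on machines that have either runner.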
For video transcripts, use the video-transcript skill.
| Input | Notes |
|---|---|
| PDF, .docx, .pptx, .xlsx, .xls | Document structure preserved |
| HTML, CSV, JSON, XML | Structured Markdown |
| Images | EXIF metadata + OCR text |
| Audio | EXIF metadata + speech transcription |
| ZIP, EPub | Iterates contents, converts each |

| Flag | Effect |
|---|---|
| -o FILE | Write output to FILE |
| -x .EXT | Extension hint for stdin input |
| -m MIME | MIME-type hint |
| -c CHARSET | Charset hint, e.g. UTF-8 |
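As an illustration of combining these flags, the sketch below converts HTML arriving on stdin while declaring its MIME type and charset. `html_stdin_to_md` is a hypothetical helper name, and the `text/html` and `UTF-8` values are assumptions about the input.

```shell
# Hypothetical helper: read HTML from stdin and write Markdown to the file named in $1.
html_stdin_to_md() {
  # -m and -c are hints for stdin input, which has no filename to infer a type from.
  uvx 'markitdown[all]' -m text/html -c UTF-8 -o "$1"
}
```

A typical call pipes a saved page into it: `cat page.html | html_stdin_to_md page.md`.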
Cause: the PDF page is a scanned image with no text layer; the base extractor reads text layers only.
Solution: render the pages to images with pdftoppm, then convert the images so OCR runs.
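The render-then-convert workaround might look like the sketch below. `pdf_ocr_to_md` is a hypothetical helper, the 200 DPI resolution is an arbitrary assumption, and the approach relies on the image converter producing OCR text as the input table states.

```shell
# Hypothetical helper: rasterize a scanned PDF ($1) and write per-page Markdown to $2.
pdf_ocr_to_md() {
  pdf=$1
  out=$2
  tmpdir=$(mktemp -d) || return 1
  # -r sets the render resolution in DPI; -png writes page-<n>.png files under the prefix.
  pdftoppm -r 200 -png "$pdf" "$tmpdir/page" || { rm -rf "$tmpdir"; return 1; }
  : > "$out"  # start with an empty output file
  for img in "$tmpdir"/page-*.png; do
    [ -e "$img" ] || continue  # skip the unexpanded glob when no pages were rendered
    uvx 'markitdown[all]' "$img" >> "$out"
    printf '\n' >> "$out"      # blank line between pages
  done
  rm -rf "$tmpdir"
}
```

Calling `pdf_ocr_to_md scanned.pdf scanned.md` appends each page's text in page order, since pdftoppm zero-pads page numbers in the file names.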