lirrensi/document-extractor

Name: document-extractor
Author: lirrensi

skills/document-extractor/SKILL.md

npx skillsauth add lirrensi/agent-cli-helpers document-extractor


DocumentExtractor

Use markitdown to convert supported inputs into Markdown that is easier to inspect, search, and feed into LLM workflows.

Check And Install

Check whether the CLI already exists:

Get-Command markitdown -ErrorAction SilentlyContinue
markitdown --version
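The same check can be scripted. The following is a minimal sketch, assuming Python 3 is available; the helper name `cli_version` is hypothetical and is not part of MarkItDown.

```python
import shutil
import subprocess
from typing import Optional


def cli_version(name: str = "markitdown") -> Optional[str]:
    """Return the CLI's --version output, or None if it is not on PATH."""
    path = shutil.which(name)  # cross-platform analogue of Get-Command
    if path is None:
        return None
    # Run `<name> --version` and capture whatever it prints.
    done = subprocess.run([path, "--version"], capture_output=True, text=True)
    return (done.stdout or done.stderr).strip()


if __name__ == "__main__":
    print(cli_version() or "markitdown not found; install it first")
```

A `None` result means the install step that follows is needed.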

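Install with uv first. The three commands are alternatives (base package, selected extras, everything), so run only the one you need: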

uv tool install markitdown
uv tool install 'markitdown[pdf,docx,pptx]'
uv tool install 'markitdown[all]'

Fall back to pipx if uv is unavailable:

pipx install markitdown
pipx install 'markitdown[pdf,docx,pptx]'
pipx install 'markitdown[all]'

If the tool is installed without the feature group you need, reinstall it with the narrower or broader extras set you actually want.

Read references/feature-groups.md before installing extras when the user only needs a subset of formats.
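The "uv first, pipx as fallback" rule above could be sketched as follows. The function name `install_command` and its `which` parameter are hypothetical, added only so the logic can be exercised without either tool installed.

```python
import shutil
from typing import Callable, List, Optional


def install_command(
    spec: str = "markitdown",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Build the install command: uv first, pipx as the fallback."""
    if which("uv"):
        return ["uv", "tool", "install", spec]
    if which("pipx"):
        return ["pipx", "install", spec]
    raise RuntimeError("neither uv nor pipx is on PATH")


if __name__ == "__main__":
    # Prints the command that would be run for the PDF/Word/PowerPoint extras.
    try:
        print(" ".join(install_command("markitdown[pdf,docx,pptx]")))
    except RuntimeError as err:
        print(err)
```

The returned list can be handed to `subprocess.run(..., check=True)` once the user has confirmed which extras they want.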

Use The CLI

Convert a file and print Markdown to stdout:

markitdown .\report.pdf

Write to a file:

markitdown .\report.pdf -o .\report.md

Pipe binary input and give MarkItDown an extension hint:

Get-Content .\report.pdf -AsByteStream | markitdown -x .pdf -o .\report.md

Use MIME or charset hints when the input source is ambiguous:

markitdown -x .html -m text/html .\page.bin
markitdown -x .csv -c utf-8 .\data.txt

Keep inline data: URIs instead of truncating them:

markitdown --keep-data-uris .\page.html -o .\page.md
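When the CLI is driven from a script, the flags shown in this section can be assembled programmatically. This is a sketch only: `markitdown_args` is a hypothetical helper, and it uses just the flags that appear above (`-o`, `-x`, `-m`, `-c`, `--keep-data-uris`).

```python
from typing import List, Optional


def markitdown_args(
    source: Optional[str] = None,  # None means "read from stdin"
    output: Optional[str] = None,
    extension: Optional[str] = None,
    mime: Optional[str] = None,
    charset: Optional[str] = None,
    keep_data_uris: bool = False,
) -> List[str]:
    """Assemble a markitdown command line from optional hints."""
    args = ["markitdown"]
    if extension:
        args += ["-x", extension]  # extension hint, e.g. ".pdf"
    if mime:
        args += ["-m", mime]  # MIME type hint, e.g. "text/html"
    if charset:
        args += ["-c", charset]  # charset hint, e.g. "utf-8"
    if keep_data_uris:
        args.append("--keep-data-uris")
    if source:
        args.append(source)
    if output:
        args += ["-o", output]
    return args


if __name__ == "__main__":
    print(markitdown_args("page.bin", extension=".html", mime="text/html"))
```

For stdin conversion, leave `source` unset and pass the file bytes through `subprocess.run(args, input=data)`.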

Choose Extras Deliberately

Install base markitdown for lightweight text-like inputs and general conversion.
Install targeted extras when the user only needs specific formats.
Install [all] only when broad coverage matters more than dependency size.
Reinstall with az-doc-intel when using Azure Document Intelligence.
Reinstall with audio-transcription or youtube-transcription only for transcription workflows.

The current extras list and format mapping live in references/feature-groups.md.
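One way to pick targeted extras is to derive them from the files the user actually has. The sketch below assumes each extra is named after the file suffix it covers, and it lists only the three extras that appear in the install commands in this document; `SUFFIX_TO_EXTRA` and `package_spec` are hypothetical names, and references/feature-groups.md remains the authority on the real mapping.

```python
from pathlib import Path
from typing import Iterable

# Assumption: extra name == file suffix. Only extras named in this document.
SUFFIX_TO_EXTRA = {".pdf": "pdf", ".docx": "docx", ".pptx": "pptx"}


def package_spec(paths: Iterable[str]) -> str:
    """Return a pip-style spec with only the extras the inputs call for."""
    suffixes = (Path(p).suffix.lower() for p in paths)
    extras = sorted({SUFFIX_TO_EXTRA[s] for s in suffixes if s in SUFFIX_TO_EXTRA})
    # Inputs with no known extra fall back to the base package.
    return f"markitdown[{','.join(extras)}]" if extras else "markitdown"


if __name__ == "__main__":
    print(package_spec(["slides.pptx", "scan.pdf", "notes.txt"]))
```

Suffixes outside the table are ignored here rather than guessed at, which keeps the install small by default.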

Common Workflows

Convert Office or PDF documents:

markitdown .\slides.pptx -o .\slides.md
markitdown .\notes.docx -o .\notes.md
markitdown .\table.xlsx -o .\table.md
markitdown .\scan.pdf -o .\scan.md
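The commands above all follow the same pattern: the output keeps the input's name and swaps the suffix for `.md`. For a batch of files, that pairing might be sketched like this (`plan_conversions` is a hypothetical helper name):

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def plan_conversions(paths: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair each input with an output path that swaps the suffix for .md."""
    return [(str(Path(p)), str(Path(p).with_suffix(".md"))) for p in paths]


if __name__ == "__main__":
    # Print one markitdown command per input file.
    for src, dst in plan_conversions(["slides.pptx", "notes.docx", "table.xlsx"]):
        print("markitdown", src, "-o", dst)
```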

Convert a YouTube URL or archive when the relevant support is installed:

markitdown "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -o .\video.md
markitdown .\bundle.zip -o .\bundle.md

Use Azure Document Intelligence for extraction:

markitdown .\scan.pdf -d -e "https://<resource>.cognitiveservices.azure.com/" -o .\scan.md

List installed third-party plugins, then enable them for a conversion:

markitdown --list-plugins
markitdown --use-plugins .\input.pdf -o .\input.md

Use The Python API

Use the Python API when the user needs MarkItDown inside a script instead of as a standalone command:

from markitdown import MarkItDown

md = MarkItDown(enable_plugins=False)
result = md.convert("report.pdf")
print(result.markdown)

Use a configured endpoint for Document Intelligence:

from markitdown import MarkItDown

md = MarkItDown(docintel_endpoint="https://<resource>.cognitiveservices.azure.com/")
result = md.convert("scan.pdf")
print(result.markdown)
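In a script, the result is usually written to disk rather than printed. The sketch below takes the converter as a parameter so it can be tried without the package installed; `convert_to_file` is a hypothetical helper, and it relies only on the `.convert(...)` call and `.markdown` attribute used in the snippets above.

```python
from pathlib import Path
from typing import Any


def convert_to_file(converter: Any, source: str, destination: str) -> int:
    """Convert `source` and write the Markdown to `destination`.

    `converter` is any object with a .convert(path) method whose result
    exposes a .markdown string. Returns the number of characters written.
    """
    result = converter.convert(source)
    return Path(destination).write_text(result.markdown, encoding="utf-8")
```

With the real package this would be called as `convert_to_file(MarkItDown(enable_plugins=False), "report.pdf", "report.md")`.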

Troubleshoot Quickly

If markitdown is missing, install it with uv tool install ... or pipx install ....
If a format is unsupported, check whether the right extra was installed first.
If stdin conversion looks wrong, add -x, -m, or -c hints.
If -d fails, verify the endpoint and that az-doc-intel support is installed.
If you expect plugin behavior but do not see it, run markitdown --list-plugins and then add --use-plugins.
If output details are unclear, run markitdown --help and then check the upstream docs.

Last Resort

Use these sources when local behavior is unclear or the package changes:

CLI help: markitdown --help
Main docs: https://github.com/microsoft/markitdown/tree/main
README: https://raw.githubusercontent.com/microsoft/markitdown/main/README.md


Convert hard-to-read files into clear Markdown with MarkItDown. Use for PDF, Word, PowerPoint, Excel, images, audio, HTML, CSV, JSON, XML, ZIP, EPUB, Outlook, and YouTube inputs. Check whether `markitdown` is installed first, prefer `uv tool install`, fall back to `pipx`, and use the upstream docs when needed.

2 stars

tools

Updated Apr 6, 2026


npx skillsauth add lirrensi/agent-cli-helpers document-extractor

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 6, 2026, 9:48 PM (106.1s, 2 files scanned)




Install manually


# Clone the repo
git clone https://github.com/lirrensi/agent-cli-helpers.git

# Copy into Claude Code skills folder (global)
cp -r agent-cli-helpers/skills/document-extractor ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository: lirrensi/agent-cli-helpers (2 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT