letta-ai/extracting-pdf-text

Name: extracting-pdf-text
Author: letta-ai

tools/extracting-pdf-text/SKILL.md

npx skillsauth add letta-ai/skills extracting-pdf-text

Extracting PDF Text for LLMs

This skill provides tools and guidance for extracting text from PDFs in formats suitable for language model consumption.

Quick Decision Guide

| PDF Type | Best Approach | Script |
|----------|--------------|--------|
| Simple text PDF | PyMuPDF | scripts/extract_pymupdf.py |
| PDF with tables | pdfplumber | scripts/extract_pdfplumber.py |
| Scanned/image PDF (local) | pytesseract | scripts/extract_with_ocr.py |
| Complex layout, highest accuracy | Mistral OCR API | scripts/extract_mistral_ocr.py |
| End-to-end RAG pipeline | marker-pdf | pip install marker-pdf |

Recommended Workflow

1. Try PyMuPDF first - fastest, handles most text-based PDFs well
2. If tables are mangled - switch to pdfplumber
3. If scanned/image-based - use Mistral OCR API (best accuracy) or local OCR (free but slower)
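The workflow above can be sketched as a small selection function. This is only an illustration: `PdfTraits` and `choose_script` are hypothetical names, not part of the skill, and how the flags are detected is left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdfTraits:
    """Hypothetical summary of a PDF; detecting these flags is up to the caller."""
    has_text_layer: bool        # False for scanned/image-only PDFs
    has_tables: bool = False    # True when tables matter or came out mangled
    api_allowed: bool = False   # True when an API key is set and uploads are acceptable


def choose_script(traits: PdfTraits) -> str:
    """Map the traits onto the script paths listed in the decision guide."""
    if not traits.has_text_layer:
        # Scanned input: prefer the API for accuracy, fall back to local OCR.
        if traits.api_allowed:
            return "scripts/extract_mistral_ocr.py"
        return "scripts/extract_with_ocr.py"
    if traits.has_tables:
        return "scripts/extract_pdfplumber.py"
    return "scripts/extract_pymupdf.py"
```

For example, `choose_script(PdfTraits(has_text_layer=True, has_tables=True))` selects the pdfplumber script.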

Local Extraction (No API Required)

PyMuPDF - Fast General Extraction

Best for: Text-heavy PDFs, speed-critical workflows, basic structure preservation.

uv run scripts/extract_pymupdf.py input.pdf output.md

The script outputs markdown with preserved headings and paragraphs. For LLM-optimized output, it uses pymupdf4llm, which formats text for RAG systems.
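For readers who want to call pymupdf4llm directly, a minimal sketch follows. It is not the contents of scripts/extract_pymupdf.py; the function name `pdf_to_markdown` and the injectable `convert` parameter are assumptions made here so the logic can be exercised without the library installed.

```python
from pathlib import Path
from typing import Callable, Optional


def pdf_to_markdown(
    pdf_path: str,
    out_path: str,
    convert: Optional[Callable[[str], str]] = None,
) -> int:
    """Convert a PDF to markdown, write it to out_path, return characters written."""
    if convert is None:
        # Imported lazily so this module loads even without pymupdf4llm installed.
        import pymupdf4llm

        convert = pymupdf4llm.to_markdown
    markdown = convert(pdf_path)
    Path(out_path).write_text(markdown, encoding="utf-8")
    return len(markdown)
```

Calling `pdf_to_markdown("input.pdf", "output.md")` uses `pymupdf4llm.to_markdown` by default; passing a different `convert` callable swaps the backend.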

pdfplumber - Table Extraction

Best for: PDFs with tables, financial documents, structured data.

uv run scripts/extract_pdfplumber.py input.pdf output.md

Tables are converted to markdown format. Note: pdfplumber works best on machine-generated PDFs, not scanned documents.
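The table-to-markdown step might look roughly like the sketch below. It assumes pdfplumber's `page.extract_tables()` output, where each table is a list of rows and empty cells are `None`; the helper names are hypothetical and the bundled script may differ.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[str]]


def table_to_markdown(rows: Sequence[Row]) -> str:
    """Render one table (first row treated as the header) as a markdown table."""
    if not rows:
        return ""

    def clean(cell: Optional[str]) -> str:
        # Empty cells arrive as None; newlines and pipes would break the row.
        return (cell or "").replace("\n", " ").replace("|", "\\|").strip()

    width = max(len(row) for row in rows)
    padded = [[clean(c) for c in row] + [""] * (width - len(row)) for row in rows]
    lines = [
        "| " + " | ".join(padded[0]) + " |",
        "| " + " | ".join(["---"] * width) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in padded[1:]]
    return "\n".join(lines)


def extract_tables_as_markdown(pdf_path: str) -> List[str]:
    """Collect every table pdfplumber finds, page by page."""
    import pdfplumber  # lazy import: only needed when a real PDF is processed

    out: List[str] = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                out.append(table_to_markdown(table))
    return out
```

Escaping pipes and flattening newlines keeps each source row on one markdown row, which matters for cells that wrap in the PDF.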

Local OCR - Scanned Documents

Best for: Scanned PDFs when API access is unavailable.

uv run scripts/extract_with_ocr.py input.pdf output.txt

Requires: pytesseract, pdf2image, and Tesseract installed (brew install tesseract on macOS). pdf2image also depends on Poppler (brew install poppler on macOS).
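A minimal sketch of the local OCR path, assuming the usual pdf2image plus pytesseract pairing. The function names, the page-marker format, and the 300 DPI default are illustrative choices, not taken from scripts/extract_with_ocr.py.

```python
from typing import Callable, Iterable, List


def ocr_pages(images: Iterable[object], recognize: Callable[[object], str]) -> str:
    """OCR each page image and join the results with page markers."""
    parts: List[str] = []
    for number, image in enumerate(images, start=1):
        text = recognize(image).strip()
        parts.append(f"--- Page {number} ---\n{text}")
    return "\n\n".join(parts)


def ocr_pdf(pdf_path: str, dpi: int = 300, lang: str = "eng") -> str:
    """Rasterize a PDF with pdf2image, then run Tesseract on each page."""
    # Lazy imports: both packages also need system binaries (Poppler, Tesseract).
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path(pdf_path, dpi=dpi)
    return ocr_pages(images, lambda img: pytesseract.image_to_string(img, lang=lang))
```

Keeping `ocr_pages` separate from the library calls makes it possible to swap in another OCR engine by passing a different `recognize` callable.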

API-Based Extraction

Mistral OCR API

Best for: Complex layouts, scanned documents, highest accuracy, multilingual content, math formulas.

Pricing: ~1000 pages per dollar (very cost-effective)

export MISTRAL_API_KEY="your-key"
uv run scripts/extract_mistral_ocr.py input.pdf output.md
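Before sending a large batch to the API, it can help to estimate the cost and confirm the key is set. The sketch below uses the approximate figure quoted above (about 1000 pages per dollar), which may not match current pricing; both function names are hypothetical.

```python
import math
import os

PAGES_PER_DOLLAR = 1000  # approximate figure quoted in this document; verify before relying on it


def estimate_cost_usd(page_count: int, pages_per_dollar: int = PAGES_PER_DOLLAR) -> float:
    """Rough OCR cost for page_count pages, rounded up to the nearest cent."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return math.ceil(page_count * 100 / pages_per_dollar) / 100


def require_api_key(env=os.environ) -> str:
    """Return MISTRAL_API_KEY or fail early with a readable message."""
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("MISTRAL_API_KEY is not set; export it before running the script")
    return key
```

At the quoted rate, a 2500-page archive comes to about $2.50.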

Features:

- Outputs clean markdown
- Preserves document structure (headings, lists, tables)
- Handles images, math equations, multilingual text
- 95%+ accuracy on complex documents

For detailed API options and other services, see references/api-services.md.

Output Format Recommendations

For LLM consumption, markdown is preferred:

- Preserves semantic structure (headings become context boundaries)
- Tables remain readable
- Compatible with most RAG chunking strategies
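To illustrate how headings can act as context boundaries, here is a minimal heading-based chunker. It is one possible strategy, not something shipped with the skill; `chunk_by_heading` is a hypothetical name, and real pipelines usually add size limits and overlap.

```python
import re
from typing import List, Tuple

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example readable


def chunk_by_heading(markdown: str) -> List[Tuple[str, str]]:
    """Split markdown into (heading, body) chunks; text before the first heading gets ''."""
    chunks: List[Tuple[str, str]] = []
    title = ""
    body: List[str] = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # a '#' inside a code fence is not a heading
        match = None if in_fence else HEADING.match(line)
        if match:
            if title or any(part.strip() for part in body):
                chunks.append((title, "\n".join(body).strip()))
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    if title or any(part.strip() for part in body):
        chunks.append((title, "\n".join(body).strip()))
    return chunks
```

Each chunk keeps its heading, so a retriever can show where a passage came from, and tables stay intact because splits only happen at heading lines.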

For detailed comparisons of local tools, see references/local-tools.md.

Extract text from PDFs for LLM consumption. Use when processing PDFs for RAG, document analysis, or text extraction. Supports API services (Mistral OCR) and local tools (PyMuPDF, pdfplumber). Handles text-based PDFs, tables, and scanned documents with OCR.

85 stars

tools

Updated Apr 6, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add letta-ai/skills extracting-pdf-text

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | Percentage |
|--------|---------|-------------|------------|
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: Apr 25, 2026, 12:01 AM (1.8s, 1 file scanned)

Related Skills

letta-ai/remote-desktop-testing-windows

tools

Test any GUI app or change on a Daytona Windows remote desktop sandbox. Use to launch a GUI program, sync a local project, take a screenshot, record a video, or share a clickable live-desktop link with a teammate. Generic — the only dependency is Daytona. For Linux, use remote-desktop-testing-linux.

123 · SKILL.md · Updated Jul 4, 2026

letta-ai/remote-desktop-testing-linux

tools

Test any GUI app or change on a Daytona Linux (Ubuntu xfce4 + noVNC) remote desktop sandbox. Use to launch a GUI program, sync a local project, take a screenshot, record a video, or share a clickable live-desktop link with a teammate. Generic — the only dependency is Daytona. For Windows, use remote-desktop-testing-windows.

123 · SKILL.md · Updated Jul 4, 2026

letta-ai/self-configuration

testing

Configures Letta agents' own runtime behavior, including model, context window, system prompt, reasoning, conversation overrides, compaction settings, and compaction prompts. Use when an agent or user asks to self-modify, tune summarization/compaction, change identity/system instructions, adjust model settings, or test conversation-scoped overrides.

121 · SKILL.md · Updated Jun 17, 2026

letta-ai/setting-profile-images

development

Sets Letta Desktop and Letta Code agent profile images by writing profile.png into an agent MemFS repository. Use when the user asks to add, change, generate, or fix an agent avatar, profile picture, profile image, or Desktop agent photo.

119 · SKILL.md · Updated Jun 16, 2026

Install manually

# Clone the repo
git clone https://github.com/letta-ai/skills.git

# Create the Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy into Claude Code skills folder (global)
cp -r skills/tools/extracting-pdf-text ~/.claude/skills/

Repository

letta-ai/skills

85 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT