skills/ocrmypdf-skills/ocrmypdf-optimize/SKILL.md
OCRmyPDF optimization skill — compress PDFs, configure PDF/A output, JBIG2 encoding, and lossless optimization. Use when the user needs to reduce PDF file size, create archival PDF/A files, or optimize OCR output.
OCRmyPDF provides extensive optimization options to reduce file size, create PDF/A archival documents, and configure output quality.
For core OCR functionality, see the ocrmypdf skill. For image processing (deskew, rotate, clean), see ocrmypdf-image. For batch/Docker/scripting, see ocrmypdf-batch.
# Level 0: no optimization (fastest)
ocrmypdf --optimize 0 input.pdf output.pdf
# Level 1: lossless optimizations (default)
ocrmypdf --optimize 1 input.pdf output.pdf
# Level 2: lossy optimizations (JPEG and PNG recompression)
ocrmypdf --optimize 2 input.pdf output.pdf
# Level 3: lossy, more aggressive than level 2
ocrmypdf --optimize 3 input.pdf output.pdf
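How much each level saves depends heavily on the document, so it can help to run all four on a sample and compare sizes. The following is a minimal sketch, assuming `ocrmypdf` is on `PATH` and the input has no existing text layer; `percent_saved` and `compare_levels` are hypothetical helper names, and the `out-level-N.pdf` file names are placeholders.

```shell
# percent_saved ORIGINAL_BYTES NEW_BYTES
# Prints the whole-number percentage by which the file shrank
# (negative if it grew). Integer arithmetic, truncated toward zero.
percent_saved() {
    local orig="$1" new="$2"
    if [ "$orig" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( (orig - new) * 100 / orig ))
}

# compare_levels INPUT.pdf
# Runs ocrmypdf once per optimization level and reports each output size.
compare_levels() {
    local input="$1" level out orig_size new_size
    orig_size=$(( $(wc -c < "$input") ))
    for level in 0 1 2 3; do
        out="out-level-${level}.pdf"   # placeholder output name
        ocrmypdf --optimize "$level" "$input" "$out" || return 1
        new_size=$(( $(wc -c < "$out") ))
        echo "level ${level}: ${new_size} bytes ($(percent_saved "$orig_size" "$new_size")% saved)"
    done
}

# Usage (not run automatically):
#   compare_levels input.pdf
```

Nothing runs until `compare_levels` is called. Checking the lossy outputs by eye is still worthwhile, since a size figure says nothing about image quality.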
PDF/A is an archival format with embedded fonts and colorspaces:
# PDF/A-2b (default)
ocrmypdf --output-type pdfa input.pdf output.pdf
# PDF/A-1b (oldest and most restrictive; no transparency)
ocrmypdf --output-type pdfa-1 input.pdf output.pdf
# PDF/A-2b, requested explicitly
ocrmypdf --output-type pdfa-2 input.pdf output.pdf
# PDF/A-3b
ocrmypdf --output-type pdfa-3 input.pdf output.pdf
# Standard PDF (no archival)
ocrmypdf --output-type pdf input.pdf output.pdf
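After conversion it can be useful to see which PDF/A flavour the output actually declares. The sketch below is a heuristic only: it greps the XMP metadata for the `pdfaid:part` and `pdfaid:conformance` properties and assumes the XMP packet is stored uncompressed, which is common but not guaranteed. `pdfa_claim` is a hypothetical helper name, and a declared level is not proof of conformance; a dedicated validator is needed for that.

```shell
# pdfa_claim FILE
# Prints the PDF/A part and conformance level declared in the file's
# XMP metadata (for example "2B"). Prints nothing and returns 1 if no
# plain-text declaration is found. This does NOT validate the file.
pdfa_claim() {
    local file="$1" part conf
    # Matches both element form (<pdfaid:part>2) and attribute form (pdfaid:part="2").
    part=$(grep -a -o 'pdfaid:part[>=]"\{0,1\}[0-9]' "$file" | head -n 1 | tr -cd '0-9')
    # Keep only the last character of the match, which is the level letter.
    conf=$(grep -a -o 'pdfaid:conformance[>=]"\{0,1\}[A-Za-z]' "$file" | head -n 1 | sed 's/.*\(.\)$/\1/')
    [ -n "$part" ] || return 1
    echo "${part}${conf}"
}

# Usage (not run automatically):
#   pdfa_claim output.pdf
```

An empty result for a file produced with `--output-type pdf` is expected, since a standard PDF carries no PDF/A declaration.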
JBIG2 provides excellent compression for monochrome (1-bit) images:
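# Lossless JBIG2 is applied automatically at --optimize 1 or higher when jbig2enc is installed
ocrmypdf --optimize 1 input.pdf output.pdf
# Lossy JBIG2: smaller files, but it can substitute similar-looking characters, so avoid it where text fidelity matters
ocrmypdf --optimize 1 --jbig2-lossy input.pdf output.pdf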
Requirements:
# Debian/Ubuntu
apt install jbig2enc
# macOS
brew install jbig2enc
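Because the optimizer quietly skips encoders that are not installed, it may be worth checking for them before expecting smaller files. This sketch assumes jbig2enc installs an executable named `jbig2` and that pngquant installs `pngquant`; `have_tool` and `report_optimizers` are hypothetical helper names.

```shell
# have_tool NAME
# Succeeds if NAME resolves to a command on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# report_optimizers
# Prints one status line for each optional external optimizer.
report_optimizers() {
    local tool
    for tool in jbig2 pngquant; do
        if have_tool "$tool"; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

# Usage (not run automatically):
#   report_optimizers
```

A "missing" line does not make OCRmyPDF fail; it only means that particular compression step is unavailable.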
Optimize embedded PNG images:
# Lossy PNG quantization with pngquant (runs at --optimize 2 or 3; --png-quality tunes it)
ocrmypdf --optimize 2 --png-quality 50 input.pdf output.pdf
# Lossless PNG optimization (part of --optimize 1)
ocrmypdf --optimize 1 input.pdf output.pdf
Choose how the OCR text layer is rendered into the PDF (the default is auto):
# hOCR renderer
ocrmypdf --pdf-renderer hocr input.pdf output.pdf
# Sandwich renderer (Tesseract produces the text-only layer)
ocrmypdf --pdf-renderer sandwich input.pdf output.pdf
Write the recognized text to a separate sidecar text file:
# Generate sidecar only
ocrmypdf --output-type none --sidecar text.txt input.pdf output.pdf
# Typical sidecar workflow
ocrmypdf --sidecar text.txt --force-ocr input.pdf output.pdf
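A sidecar file that exists but contains almost no words usually means OCR found nothing usable. The sketch below checks for that after a run; `sidecar_word_count` and `sidecar_has_text` are hypothetical helper names, and the minimum word count is an arbitrary threshold to adjust per document.

```shell
# sidecar_word_count FILE
# Prints the number of whitespace-separated words in FILE.
sidecar_word_count() {
    echo $(( $(wc -w < "$1") ))
}

# sidecar_has_text FILE [MIN_WORDS]
# Succeeds if FILE exists and holds at least MIN_WORDS words (default 1).
sidecar_has_text() {
    local file="$1" min="${2:-1}"
    [ -f "$file" ] && [ "$(sidecar_word_count "$file")" -ge "$min" ]
}

# Usage (not run automatically):
#   ocrmypdf --sidecar text.txt input.pdf output.pdf
#   sidecar_has_text text.txt 20 || echo "sidecar looks empty" >&2
```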
# Smallest file (lossy)
ocrmypdf --optimize 3 --jbig2-lossy input.pdf small.pdf
# Archival PDF/A with lossy optimization
ocrmypdf --output-type pdfa --optimize 2 input.pdf archival.pdf
# Standard PDF, lossless optimization only
ocrmypdf --output-type pdf --optimize 1 input.pdf lossless.pdf
| Task | Command |
|------|---------|
| No optimization | --optimize 0 |
| Lossless default | --optimize 1 |
| Lossy | --optimize 2 |
| Most aggressive lossy | --optimize 3 |
| PDF/A-2b (default) | --output-type pdfa |
| PDF/A-1b | --output-type pdfa-1 |
| JBIG2 lossy | --jbig2-lossy |
| PNG quality (lossy, needs --optimize 2 or 3) | --png-quality 50 |
| Sidecar text | --sidecar text.txt |
Use --optimize 2 or --optimize 3 when smaller files matter more than image fidelity.
Use --output-type pdfa-2 for better compatibility.