skills/ocrmypdf-skills/ocrmypdf-image/SKILL.md
OCRmyPDF image processing skill — deskew, rotate, clean, despeckle, remove border from scanned documents. Use when the user needs to improve scanned PDF quality, fix skewed pages, remove noise, or clean up scanned documents before OCR.
OCRmyPDF includes powerful image processing capabilities to improve scan quality before OCR. These tools help fix skewed pages, remove noise, clean borders, and enhance readability.
For core OCR functionality, see the ocrmypdf skill. For optimization and PDF/A options, see ocrmypdf-optimize. For batch/Docker/scripting, see ocrmypdf-batch.
Deskew straightens pages that were scanned slightly crooked, for example when a sheet-fed scanner pulls the paper in at an angle.
# Auto deskew (recommended)
ocrmypdf --deskew input.pdf output.pdf
# Deskew a PDF that already contains text: --force-ocr rasterizes every page and redoes OCR
ocrmypdf --deskew --force-ocr input.pdf output.pdf
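As a minimal sketch, the deskew call can be wrapped in a small shell function that derives the output name from the input. The function name `deskew_pdf`, the `-deskewed` suffix, and the `OCRMYPDF` override variable are illustrative assumptions, not part of OCRmyPDF.

```shell
# Hypothetical helper: deskew one PDF and write "<name>-deskewed.pdf" next to it.
# OCRMYPDF is an assumed override so the binary can be swapped, e.g. for a dry run.
deskew_pdf() {
    src="$1"
    if [ -z "$src" ]; then
        echo "usage: deskew_pdf input.pdf" >&2
        return 2
    fi
    # Strip a trailing ".pdf" (if present) and append the suffix.
    dst="${src%.pdf}-deskewed.pdf"
    "${OCRMYPDF:-ocrmypdf}" --deskew "$src" "$dst"
}
```

Running `OCRMYPDF=echo deskew_pdf scan.pdf` prints the arguments that would be passed instead of processing the file, which is a cheap way to check the derived output name.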
Rotate pages to correct upside-down or sideways scans:
# Auto-rotate based on text orientation
ocrmypdf --rotate-pages input.pdf output.pdf
# Auto-rotate a PDF that already contains text: --force-ocr rasterizes every page and redoes OCR
ocrmypdf --rotate-pages --force-ocr input.pdf output.pdf
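Auto-rotation depends on how confident the orientation detection is. The sketch below adds OCRmyPDF's `--rotate-pages-threshold` option when a value is supplied; a lower value should make rotation more eager. The function name `rotate_pdf`, its argument order, and the `OCRMYPDF` override variable are assumptions made for illustration, and a suitable threshold depends on the scans.

```shell
# Hypothetical helper: auto-rotate with an optional confidence threshold.
# Usage: rotate_pdf input.pdf output.pdf [threshold]
rotate_pdf() {
    src="$1"
    dst="$2"
    threshold="$3"
    # Rebuild the positional parameters as the option list.
    set -- --rotate-pages
    if [ -n "$threshold" ]; then
        set -- "$@" --rotate-pages-threshold "$threshold"
    fi
    # OCRMYPDF is an assumed override for the binary (e.g. "echo" for a dry run).
    "${OCRMYPDF:-ocrmypdf}" "$@" "$src" "$dst"
}
```

For example, `rotate_pdf input.pdf output.pdf 5` passes a threshold of 5, while omitting the third argument leaves OCRmyPDF's default in place.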
Remove unwanted borders, speckles, and other scanning artifacts. OCRmyPDF exposes no separate border-removal or despeckle flags; this cleanup goes through --clean, which runs unpaper on each page before OCR:
# Clean pages for OCR only; the output PDF keeps the original page images
ocrmypdf --clean input.pdf output.pdf
# Also write the cleaned images to the output PDF
ocrmypdf --clean-final input.pdf output.pdf
unpaper does the work behind --clean and --clean-final, so it must be installed for either flag to work:
# Apply unpaper with OCRmyPDF's default settings
ocrmypdf --clean input.pdf output.pdf
# Pass custom options through to unpaper
ocrmypdf --clean --unpaper-args '--layout double' input.pdf output.pdf
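Because unpaper is a separate program, a script may want to check for it before requesting cleanup. The following is a sketch under that assumption: the function name `clean_flag_if_available` is hypothetical, and it emits `--clean` only when an `unpaper` executable is found on `PATH`.

```shell
# Hypothetical helper: print "--clean" only when unpaper is on PATH.
# A warning goes to stderr so the printed flag stays safe to substitute.
clean_flag_if_available() {
    if command -v unpaper >/dev/null 2>&1; then
        echo "--clean"
    else
        echo "unpaper not found; skipping --clean" >&2
    fi
}
```

It can be used as `ocrmypdf --deskew $(clean_flag_if_available) input.pdf output.pdf`, leaving the substitution unquoted so that an empty result adds no argument.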
Increase image resolution before OCR for better accuracy:
# Oversample to 300 DPI before OCR
ocrmypdf --oversample 300 input.pdf output.pdf
# Higher target for very low-resolution scans
ocrmypdf --oversample 400 input.pdf output.pdf
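Whether to oversample depends on the resolution of the scan. Assuming the current DPI is already known (for instance read off the output of poppler's `pdfimages -list`), a sketch like the one below can decide whether to add the flag. The function name `oversample_flag` and the default target of 300 are illustrative choices.

```shell
# Hypothetical helper: print an --oversample flag when the scan DPI is below a target.
# Arguments: current DPI (integer), optional target DPI (assumed default: 300).
oversample_flag() {
    current="$1"
    target="${2:-300}"
    if [ "$current" -lt "$target" ]; then
        echo "--oversample $target"
    fi
}
```

For a 150 DPI scan, `ocrmypdf $(oversample_flag 150) input.pdf output.pdf` expands to a call with `--oversample 300`; for a scan already at or above the target, no flag is added.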
# Typical scanned document: straighten and clean
ocrmypdf --deskew --clean scanned.pdf fixed.pdf
# Noisy, low-resolution scan with mixed page orientation
ocrmypdf --deskew --rotate-pages --clean --oversample 300 noisy.pdf clean.pdf
# Dirty originals: keep the cleaned images in the output PDF
ocrmypdf --clean-final dirty.pdf clean.pdf
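Combinations like these can be captured as named presets so that scripts pick a cleanup level rather than repeating flag lists. This is only a sketch: the preset names `light`, `normal`, and `heavy` and the function name `cleanup_flags` are hypothetical, and the flag sets are one possible grouping.

```shell
# Hypothetical presets mapping a scan-quality label to OCRmyPDF flags.
cleanup_flags() {
    case "$1" in
        light)  echo "--deskew" ;;
        normal) echo "--deskew --clean" ;;
        heavy)  echo "--deskew --rotate-pages --clean --oversample 300" ;;
        *)
            echo "unknown preset: $1" >&2
            return 1
            ;;
    esac
}
```

A call such as `ocrmypdf $(cleanup_flags heavy) noisy.pdf clean.pdf` then expands to the full flag list; the substitution is left unquoted on purpose so the flags split into separate arguments.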
| Task | Flag |
|------|------|
| Auto deskew | --deskew |
| Auto rotate | --rotate-pages |
| Clean borders and speckles before OCR | --clean |
| Keep cleaned pages in the output | --clean-final |
| Custom unpaper options | --unpaper-args "..." |
| Oversample DPI | --oversample N |
--oversample 300 to increase input quality.
--clean-final for aggressive cleanup.