partme-ai/ocrmypdf

Name: ocrmypdf
Author: partme-ai

skills/ocrmypdf-skills/ocrmypdf/SKILL.md

npx skillsauth add partme-ai/full-stack-skills ocrmypdf

OCRmyPDF — Core OCR Guide

Overview

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. It uses Tesseract OCR, supports 100+ languages, produces PDF/A by default, and distributes work across all CPU cores.

For image processing (deskew, rotate, clean), see the ocrmypdf-image skill. For optimization and PDF/A options, see ocrmypdf-optimize. For batch/Docker/scripting, see ocrmypdf-batch. For Python API and plugins, see ocrmypdf-api.

Installation

One-liner installs (recommended)

| OS | Command |
|----|---------|
| Debian / Ubuntu | apt install ocrmypdf |
| Fedora | dnf install ocrmypdf tesseract-osd |
| macOS (Homebrew) | brew install ocrmypdf |
| macOS (MacPorts) | port install ocrmypdf |
| FreeBSD | pkg install py-ocrmypdf |
| Snap | snap install ocrmypdf |

pip install (latest version)

# After installing system dependencies (Tesseract, Ghostscript)
pip install ocrmypdf

Verify

ocrmypdf --version
ocrmypdf --help

Requirements

Python 3.11+
Tesseract 4.1.1+ (OCR engine)
Ghostscript 9.54+ or pypdfium2 (PDF rasterization)
Optional: jbig2enc (compression), pngquant (image optimization), unpaper (cleaning)
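
A quick way to confirm these requirements before the first run is to probe PATH for each executable. The sketch below is an illustration, not part of OCRmyPDF: `missing_tools` is a hypothetical helper, and the executable names (`tesseract`, `gs` for Ghostscript, `jbig2` for jbig2enc, `pngquant`, `unpaper`) are the names these packages commonly install, which may differ on some systems.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not on PATH.
# Returns 0 when everything is present, 1 when at least one is missing.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Required tools (assumed executable names; gs is Ghostscript and is
# not needed when pypdfium2 handles rasterization instead).
missing_tools ocrmypdf tesseract gs ||
    echo "Install the required tools listed above." >&2

# Optional tools (jbig2 is the executable that jbig2enc provides).
missing_tools jbig2 pngquant unpaper ||
    echo "Optional tools listed above are not installed." >&2
```

The check only shows that a command exists; it does not compare versions against the minimums listed above.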

Quick Start

# Basic OCR — input scanned PDF, output searchable PDF/A
ocrmypdf input.pdf output.pdf

# OCR an image file directly
ocrmypdf --image-dpi 300 scan.png output.pdf

# OCR in place (only overwrites on success)
ocrmypdf myfile.pdf myfile.pdf

Language Support

OCRmyPDF uses Tesseract language packs. Install them for your OS:

# Debian / Ubuntu
apt-cache search tesseract-ocr          # List all language packs
apt install tesseract-ocr-chi-sim       # Chinese Simplified
apt install tesseract-ocr-fra           # French

# macOS (Homebrew)
brew install tesseract-lang             # All languages

# Fedora
dnf search tesseract-langpack
dnf install tesseract-langpack-ita      # Italian

Using languages

# Single language
ocrmypdf -l fra document.pdf output.pdf

# Multiple languages
ocrmypdf -l eng+fra bilingual.pdf output.pdf

# Chinese Simplified + English
ocrmypdf -l chi_sim+eng chinese-doc.pdf output.pdf

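Note: Use three-letter ISO 639-3 codes as named by the Tesseract language packs (some carry a script suffix, such as chi_sim). Run tesseract --list-langs to see which packs are installed.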
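
When the language list comes from a script rather than being typed by hand, the `+`-separated form that `-l` takes can be built from separate codes. This is a minimal sketch; `join_langs` is a hypothetical helper name, not something OCRmyPDF provides.

```shell
#!/bin/sh
# Hypothetical helper: join language codes with "+" for the -l option.
# The body runs in a subshell (parentheses) so the IFS change stays local.
join_langs() (
    IFS='+'
    printf '%s\n' "$*"
)

langs=$(join_langs eng deu fra)
echo "ocrmypdf -l $langs multilingual.pdf output.pdf"
# prints: ocrmypdf -l eng+deu+fra multilingual.pdf output.pdf
```

The example echoes the command instead of running it, so it can be tried without any input files.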

OCR Modes

Default mode (skip existing text)

# Skip pages that already have text — only OCR pages without text
ocrmypdf input.pdf output.pdf

Force OCR (`--force-ocr` or `-m force`)

# Rasterize and OCR all pages, even those with existing text
ocrmypdf --force-ocr input.pdf output.pdf
# v17+ short form:
ocrmypdf -m force input.pdf output.pdf

Redo OCR (`--redo-ocr` or `-m redo`)

# Replace existing OCR without rasterizing (preserves quality)
ocrmypdf --redo-ocr input.pdf output.pdf
# v17+ short form:
ocrmypdf -m redo input.pdf output.pdf

Skip text (`--skip-text` or `-m skip`)

# Skip pages with any text, only OCR blank/image pages
ocrmypdf --skip-text input.pdf output.pdf
# v17+ short form:
ocrmypdf -m skip input.pdf output.pdf

No OCR (image processing only)

# Apply image processing / PDF/A conversion without OCR
ocrmypdf --ocr-engine none input.pdf output.pdf

Page Selection

# OCR only specific pages
ocrmypdf --pages 1,3,5-10 input.pdf output.pdf

# OCR only the first page, minimal changes elsewhere
ocrmypdf --pages 1 --output-type pdf --optimize 0 input.pdf output.pdf

Output Types

# PDF/A (default) — for archival
ocrmypdf --output-type pdfa input.pdf output.pdf

# Standard PDF
ocrmypdf --output-type pdf input.pdf output.pdf

# Auto (v17+) — speculative PDF/A, falls back to standard PDF
ocrmypdf --output-type auto input.pdf output.pdf

# No output PDF — only produce sidecar text
ocrmypdf --output-type none --sidecar text.txt input.pdf -

Sidecar Text File

# Produce a companion text file with OCR text
ocrmypdf --sidecar output.txt input.pdf output.pdf

Metadata

# Set output PDF metadata
ocrmypdf --title "My Document" --author "Author Name" --subject "Subject" input.pdf output.pdf

Parallel Processing

# Use 4 CPU cores (default: all available)
ocrmypdf --jobs 4 input.pdf output.pdf

# Single-threaded
ocrmypdf --jobs 1 input.pdf output.pdf
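
On a shared machine it can be useful to leave one core free rather than hard-coding a number. The sketch below assumes `getconf _NPROCESSORS_ONLN` is available (it is common on Linux and macOS but not guaranteed everywhere); `suggest_jobs` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: suggest a --jobs value that leaves one core free.
# Falls back to 1 when the core count cannot be determined.
suggest_jobs() {
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cores=1
    case "$cores" in
        ''|*[!0-9]*) cores=1 ;;   # not a plain number: fall back
    esac
    if [ "$cores" -gt 1 ]; then
        echo $((cores - 1))
    else
        echo 1
    fi
}

# Echo the resulting command line rather than running it.
echo "ocrmypdf --jobs $(suggest_jobs) input.pdf output.pdf"
```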

Common Recipes

Make a scanned PDF searchable

ocrmypdf scanned.pdf searchable.pdf

Convert image to searchable PDF

ocrmypdf --image-dpi 300 scan.jpg output.pdf

OCR a multilingual document

ocrmypdf -l eng+deu+fra multilingual.pdf output.pdf

Re-OCR with newer Tesseract

ocrmypdf --redo-ocr old-ocr.pdf updated.pdf

Strip all text/OCR from a PDF

ocrmypdf --ocr-engine none --force-ocr input.pdf stripped.pdf

Quick Reference

| Task | Command |
|------|---------|
| Basic OCR | ocrmypdf input.pdf output.pdf |
| Specify language | ocrmypdf -l fra input.pdf output.pdf |
| Multiple languages | ocrmypdf -l eng+fra input.pdf output.pdf |
| Force re-OCR all pages | ocrmypdf --force-ocr input.pdf output.pdf |
| Replace existing OCR | ocrmypdf --redo-ocr input.pdf output.pdf |
| Skip pages with text | ocrmypdf --skip-text input.pdf output.pdf |
| Specific pages only | ocrmypdf --pages 1,3,5-10 input.pdf output.pdf |
| Output standard PDF | ocrmypdf --output-type pdf input.pdf output.pdf |
| Extract text sidecar | ocrmypdf --sidecar text.txt input.pdf output.pdf |
| Image to PDF | ocrmypdf --image-dpi 300 image.png output.pdf |
| In-place OCR | ocrmypdf myfile.pdf myfile.pdf |
| Set metadata | ocrmypdf --title "Title" input.pdf output.pdf |
| Parallel jobs | ocrmypdf --jobs 4 input.pdf output.pdf |

Troubleshooting

"Tesseract not found": Install Tesseract and ensure it's on PATH.
Poor OCR quality: Check language packs (-l), try --deskew (see ocrmypdf-image), or --oversample 300.
"Input file has text": Use --force-ocr, --redo-ocr, or --skip-text as appropriate.
Large output files: See ocrmypdf-optimize for --optimize levels and JBIG2.
Signed PDFs: Use --invalidate-digital-signatures to override (signatures will be invalidated).
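
In scripts, the same problems can be recognized from the exit status instead of the message text. The sketch below is illustrative only: `explain_exit` is a hypothetical helper, and the numeric codes are assumptions based on the exit codes published in the OCRmyPDF documentation, so verify them against the installed version before relying on them.

```shell
#!/bin/sh
# Hypothetical helper: turn an ocrmypdf exit status into a short hint.
# The numeric codes are assumptions to verify against your version.
explain_exit() {
    case "$1" in
        0) echo "ok" ;;
        2) echo "input file could not be read as a PDF or image" ;;
        3) echo "a required program such as Tesseract is missing" ;;
        6) echo "input already has text: pick --force-ocr, --redo-ocr, or --skip-text" ;;
        8) echo "input PDF is encrypted" ;;
        *) echo "other failure (exit status $1)" ;;
    esac
}

# Typical use after a run:
#   ocrmypdf input.pdf output.pdf; explain_exit "$?"
explain_exit 6
```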


OCRmyPDF core skill — add searchable OCR text layer to scanned PDFs, convert images to searchable PDFs, support 100+ languages via Tesseract. Use when the user needs to OCR a PDF, make a scanned PDF searchable, or extract text from scanned documents.

270 stars

documentation

Updated Apr 10, 2026

Install this skill globally with the npx skillsauth command shown above. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status.

| Scanner | Purpose | Status |
|---------|---------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: Apr 20, 2026, 11:05 AM (9.5s, 1 file scanned)

Install manually

# Clone the repo
git clone https://github.com/partme-ai/full-stack-skills.git

# Copy into Claude Code skills folder (global)
cp -r full-stack-skills/skills/ocrmypdf-skills/ocrmypdf ~/.claude/skills/

Repository: partme-ai/full-stack-skills

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT
