aidotnet/image-ocr

Name: image-ocr
Author: aidotnet

resources/skills/image-ocr/SKILL.md

npx skillsauth add aidotnet/opencowork image-ocr

Image OCR

Extract text from images using Tesseract OCR via Python.

When to use this skill

User asks to read or extract text from an image
User has a screenshot with text they want to process
User has scanned documents that need text extraction
User wants to digitize text from photos

Scripts overview

| Script | Purpose | Dependencies |
| ---------------- | ---------------------------------------------- | ----------------------- |
| ocr_extract.py | Extract text from images with multiple options | pytesseract, Pillow |

Steps

1. Install dependencies (first time only)

Install the Python packages:

pip install pytesseract Pillow

Install Tesseract OCR engine:

Windows: Download installer from https://github.com/UB-Mannheim/tesseract/wiki
macOS: brew install tesseract
Linux (Ubuntu/Debian): sudo apt install tesseract-ocr
Linux (Fedora): sudo dnf install tesseract

For additional language support:

Windows: Select languages during installation
Linux: sudo apt install tesseract-ocr-chi-sim (Chinese Simplified), tesseract-ocr-jpn (Japanese), etc.

CRITICAL — Dependency Error Recovery: If the script fails with an ImportError or "tesseract not found" error, install the missing dependencies using the commands above, then re-run the EXACT SAME script command that failed.
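
The recovery rule above depends on knowing which dependency is missing. A minimal sketch of a pre-flight check follows, using only the standard library. The function name `missing_dependencies` is hypothetical and not part of the skill; note that Pillow is imported under the module name `PIL`, and that on Windows the Tesseract installer may not add the executable to PATH, in which case this check reports it as missing even though it is installed.

```python
import importlib.util
import shutil


def missing_dependencies(modules=("pytesseract", "PIL"), binary="tesseract"):
    """Return human-readable names of dependencies that cannot be found."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            missing.append(f"Python module '{name}'")
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(binary) is None:
        missing.append(f"executable '{binary}'")
    return missing


# An empty list means the script should be able to start.
print(missing_dependencies())
```

If the list is non-empty, install the named items with the commands above and re-run the original script command unchanged.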

2. Extract text from an image

python scripts/ocr_extract.py "IMAGE_PATH"

Options:

--lang LANG — OCR language (default: eng). Use chi_sim for Chinese, jpn for Japanese, eng+chi_sim for multiple.
--save OUTPUT_PATH — Save extracted text to a file
--preprocess MODE — Image preprocessing: none (default), grayscale, threshold, blur
--dpi DPI — Set image DPI for better accuracy (default: auto-detect)
--psm MODE — Tesseract page segmentation mode (0-13, default: 3 = auto)
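
To make the option set concrete, here is a sketch of how these flags could be declared with `argparse`. The source of `ocr_extract.py` is not shown on this page, so `build_parser` and its help strings are assumptions that only mirror the documented names, defaults, and value ranges.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options of ocr_extract.py."""
    parser = argparse.ArgumentParser(
        description="Extract text from an image with Tesseract OCR."
    )
    parser.add_argument("image_path", help="Path to the input image")
    parser.add_argument("--lang", default="eng",
                        help="Language code(s), joined with '+'")
    parser.add_argument("--save", metavar="OUTPUT_PATH",
                        help="Write extracted text to this file")
    parser.add_argument("--preprocess", default="none",
                        choices=["none", "grayscale", "threshold", "blur"])
    parser.add_argument("--dpi", type=int, default=None,
                        help="Image DPI; None means auto-detect")
    parser.add_argument("--psm", type=int, default=3, metavar="MODE",
                        choices=range(0, 14))
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["plate.jpg", "--lang", "eng+chi_sim", "--psm", "7"])
print(args.lang, args.psm, args.preprocess, args.dpi)  # eng+chi_sim 7 none None
```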

Examples:

# Basic text extraction
python scripts/ocr_extract.py "screenshot.png"

# Chinese text extraction
python scripts/ocr_extract.py "document.jpg" --lang chi_sim

# Mixed English and Chinese
python scripts/ocr_extract.py "mixed.png" --lang eng+chi_sim

# Preprocess noisy image for better accuracy
python scripts/ocr_extract.py "noisy_scan.png" --preprocess threshold

# Save output to file
python scripts/ocr_extract.py "scan.tiff" --save output.txt

# Single line of text (e.g., license plate, serial number)
python scripts/ocr_extract.py "plate.jpg" --psm 7
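
For readers who want to call Tesseract from their own Python code, the commands above roughly correspond to a `pytesseract.image_to_string` call. The sketch below is an assumption about how such a wrapper might look, not the skill's actual implementation; `build_tesseract_config` and `extract_text` are hypothetical names. The `--psm` and `--dpi` flags are standard Tesseract command-line options passed through the `config` string.

```python
def build_tesseract_config(psm=3, dpi=None):
    """Assemble the extra command-line flags passed through to Tesseract."""
    parts = [f"--psm {psm}"]
    if dpi is not None:
        parts.append(f"--dpi {dpi}")
    return " ".join(parts)


def extract_text(image_path, lang="eng", psm=3, dpi=None):
    """Run OCR on one image and return the recognized text."""
    # Imported lazily so the helper above works without the OCR stack installed.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(
            img, lang=lang, config=build_tesseract_config(psm, dpi)
        )


print(build_tesseract_config(psm=7))           # --psm 7
print(build_tesseract_config(psm=6, dpi=300))  # --psm 6 --dpi 300
```

With the dependencies installed, `extract_text("plate.jpg", psm=7)` would be the rough equivalent of the single-line example.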

Page Segmentation Modes (PSM)

| Mode | Description | Use Case |
| ---- | ------------------------- | ----------------------- |
| 3 | Fully automatic (default) | General documents |
| 4 | Assume single column | Single-column text |
| 6 | Assume single block | Uniform text block |
| 7 | Single line | One line of text |
| 8 | Single word | One word |
| 11 | Sparse text | Text scattered on image |
| 13 | Raw line | Single line, bypassing Tesseract-specific processing |
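
When the layout of an image is not known in advance, one option is to try several modes from the table in order and keep the first one that yields text. The sketch below illustrates that idea with an injected OCR callable so it runs without Tesseract; `first_nonempty_result`, the candidate order, and `fake_ocr` are all hypothetical and not part of the skill.

```python
def first_nonempty_result(run_ocr, psm_candidates=(3, 6, 11, 7)):
    """Try each PSM in order; return (psm, text) for the first non-blank output.

    run_ocr is any callable that takes a PSM integer and returns a string.
    """
    for psm in psm_candidates:
        text = run_ocr(psm).strip()
        if text:
            return psm, text
    return None, ""


# Stand-in OCR function for demonstration: only sparse-text mode "finds" anything.
def fake_ocr(psm):
    return "SERIAL 0042\n" if psm == 11 else "  \n"


print(first_nonempty_result(fake_ocr))  # (11, 'SERIAL 0042')
```

In real use, `run_ocr` would wrap whatever call performs the OCR for a given mode.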

Edge cases

Low quality images: Use --preprocess threshold or --preprocess blur to improve results
Rotated text: Tesseract handles slight rotation; for heavily rotated images, rotate first
Very small text: Increase DPI with --dpi 300 or higher
Mixed languages: Combine with +, e.g., --lang eng+chi_sim+jpn
Empty results: Try different PSM modes or preprocessing options
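
As an illustration of what the preprocessing modes might do to a low-quality image, here is a sketch using Pillow. The actual behavior of `--preprocess` in `ocr_extract.py` is not documented on this page, so the cutoff of 128, the use of a median filter for `blur`, and the function names are assumptions.

```python
def threshold_table(cutoff=128):
    """256-entry lookup table mapping each gray level to pure black or white."""
    return [255 if level >= cutoff else 0 for level in range(256)]


def preprocess(img, mode="none", cutoff=128):
    """Apply one of the documented preprocessing modes to a PIL image."""
    # Imported lazily so threshold_table can be used without Pillow installed.
    from PIL import ImageFilter, ImageOps

    if mode == "none":
        return img
    gray = ImageOps.grayscale(img)  # 8-bit, single-channel ("L") image
    if mode == "grayscale":
        return gray
    if mode == "threshold":
        # Image.point accepts a 256-entry table for single-band images.
        return gray.point(threshold_table(cutoff))
    if mode == "blur":
        # A small median filter suppresses speckle noise while keeping edges.
        return gray.filter(ImageFilter.MedianFilter(size=3))
    raise ValueError(f"unknown preprocess mode: {mode!r}")
```

Thresholding tends to help with uneven backgrounds, while a light blur tends to help with speckled scans; which one works better depends on the image.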

Scripts

ocr_extract.py — Extract text from images using Tesseract OCR

Extract text from images using Python OCR. Use when the user wants to read text from screenshots, photos of documents, scanned pages, or any image containing text. Supports PNG, JPEG, TIFF, BMP, and WebP formats.

284 stars

development

Updated Mar 27, 2026

Install this skill globally with one command:

npx skillsauth add aidotnet/opencowork image-ocr

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | Percentage shown |
| ------- | --------------- | ---------------------------------------------- | ---------------- |
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 70% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: Apr 4, 2026, 12:33 AM (264.3s, 1 file scanned)

SKILL.md

name: image-ocr
description: Extract text from images using Python OCR. Use when the user wants to read text from screenshots, photos of documents, scanned pages, or any image containing text. Supports PNG, JPEG, TIFF, BMP, and WebP formats.
compatibility: Requires Python 3 and pytesseract + Pillow (pip install pytesseract Pillow). Also requires Tesseract OCR engine installed on the system.

Install manually

# Clone the repo
git clone https://github.com/aidotnet/opencowork.git

# Copy into Claude Code skills folder (global); create it first if it does not exist
mkdir -p ~/.claude/skills
cp -r opencowork/resources/skills/image-ocr ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository: aidotnet/opencowork (284 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT