Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

malue-ai/multi-lang-ocr

Name: multi-lang-ocr
Author: malue-ai

instances/xiaodazi/skills/multi-lang-ocr/SKILL.md

npx skillsauth add malue-ai/dazee-small multi-lang-ocr


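Multi-language OCR: Image Text Extraction

Extracts text from images, screenshots, and scanned documents. Supports Chinese, English, mixed Chinese-English text, Japanese, and Korean. Runs 100% locally to protect privacy.

Use cases

The user says "Extract the text from this image for me" or "Convert this screenshot to text"
The user says "Recognize the contents of this scanned document" or "Extract the information from this business card"
The user says "Extract the table in this photo as text"
Processing scanned pages in a PDF whose text cannot be extracted directly

Engine selection (tiered strategy)

macOS preferred path (zero install)

macOS ships with the Vision Framework, which recognizes mixed Chinese-English text with excellent quality and needs no additional dependencies.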

import os
import subprocess
import tempfile

# The Swift source is static. The image path is passed as a command-line
# argument, so it is never interpolated into the script text.
SWIFT_OCR_SCRIPT = '''
import Foundation
import Vision

let url = URL(fileURLWithPath: CommandLine.arguments[1])

let request = VNRecognizeTextRequest()
request.recognitionLevel = .accurate
request.recognitionLanguages = ["zh-Hans", "zh-Hant", "en-US", "ja", "ko"]
request.usesLanguageCorrection = true

// Let Vision load the image from the file URL directly.
let handler = VNImageRequestHandler(url: url, options: [:])
try handler.perform([request])

for obs in request.results ?? [] {
    if let candidate = obs.topCandidates(1).first {
        print(candidate.string)
    }
}
'''

def ocr_macos_vision(image_path: str) -> str:
    """Use macOS Vision Framework for OCR (zero install, best quality on Mac)."""
    # Write the Swift script to a temporary file, run it, then delete it.
    with tempfile.NamedTemporaryFile('w', suffix='.swift', delete=False) as f:
        f.write(SWIFT_OCR_SCRIPT)
        script_path = f.name
    try:
        # check=True raises CalledProcessError if the Swift script fails,
        # instead of silently returning an empty string.
        result = subprocess.run(
            ['swift', script_path, image_path],
            capture_output=True, text=True, timeout=30, check=True
        )
        return result.stdout.strip()
    finally:
        os.unlink(script_path)

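Requirements: macOS 13+, with no dependencies to install. Just run it through nodes. The script is invoked with the `swift` command, so that command must be available on the machine.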
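The requirement above can be checked before choosing an engine. The following is a minimal sketch, not part of the original skill: the function names `vision_requirements_met` and `can_use_macos_vision` are hypothetical, and it assumes that `platform.mac_ver()` reports the real macOS version (some older Python builds are known to report a compatibility version instead).

```python
import platform
import shutil
from typing import Optional


def vision_requirements_met(system: str, mac_version: str,
                            swift_path: Optional[str]) -> bool:
    """Return True when the host looks able to run the Vision-based path.

    Hypothetical helper: checks for macOS ("Darwin"), a major version of
    13 or later, and a `swift` executable on PATH.
    """
    if system != "Darwin" or not swift_path:
        return False
    try:
        major = int(mac_version.split(".")[0])
    except ValueError:
        # mac_ver() returns an empty string on non-macOS systems.
        return False
    return major >= 13


def can_use_macos_vision() -> bool:
    """Apply the check to the current machine."""
    return vision_requirements_met(
        platform.system(),
        platform.mac_ver()[0],
        shutil.which("swift"),
    )
```

Keeping the decision logic in a pure function makes it easy to exercise on any platform; only `can_use_macos_vision` touches the real environment.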

Cross-platform path (pip install, ~50MB)

Uses rapidocr-onnxruntime, an ONNX inference build based on the PaddleOCR v4 models.

# First-time install (about 50MB, finishes within 30 seconds)
pip install rapidocr-onnxruntime

from rapidocr_onnxruntime import RapidOCR

engine = RapidOCR()

# Basic recognition (Chinese and English are detected automatically; no language setting needed)
result, elapse = engine("/path/to/image.png")

# result is a list: [[box, text, confidence], ...]
if result:
    for line in result:
        box, text, confidence = line
        print(f"{text}  (confidence: {confidence:.2f})")
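Given the `[[box, text, confidence], ...]` shape described above, the recognized lines can be joined into one string and the average confidence used as a rough quality signal. This is only a sketch: `summarize_ocr_result` and the `0.6` threshold are assumptions made for illustration, not values from rapidocr or from the skill. Confidence values are passed through `float()` in case a version of the library returns them as strings.

```python
from typing import Optional, Sequence, Tuple


def summarize_ocr_result(
    result: Optional[Sequence[Sequence]],
    min_confidence: float = 0.6,  # assumed threshold, tune for your data
) -> Tuple[str, float, bool]:
    """Return (text, mean_confidence, is_low_quality) for an OCR result.

    `result` is expected to look like [[box, text, confidence], ...].
    An empty or missing result is treated as low quality.
    """
    if not result:
        return "", 0.0, True

    texts = []
    scores = []
    for _box, text, confidence in result:
        texts.append(text)
        scores.append(float(confidence))

    mean_confidence = sum(scores) / len(scores)
    return "\n".join(texts), mean_confidence, mean_confidence < min_confidence
```

A caller could use the third value to decide whether to suggest that the user supply a clearer image.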

How to run

Write a Python script through nodes to run OCR. Try macOS Vision first, and fall back to rapidocr when it is unavailable.
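The try-first-then-fall-back strategy can be sketched as a small helper that walks an ordered list of engines. The name `ocr_with_fallback` and the `(name, callable)` pairing are hypothetical; this is one possible shape for such a dispatcher, not the skill's own implementation.

```python
from typing import Callable, List, Sequence, Tuple

# Each engine is a (label, function) pair; the function maps an image path to text.
OcrEngine = Tuple[str, Callable[[str], str]]


def ocr_with_fallback(image_path: str, engines: Sequence[OcrEngine]) -> str:
    """Try each engine in order and return the first non-empty text.

    An engine that raises (missing tool, crash, timeout) or returns only
    whitespace is skipped. If every engine raised, a RuntimeError is raised;
    if at least one ran but found nothing, an empty string is returned.
    """
    errors: List[str] = []
    for name, engine in engines:
        try:
            text = engine(image_path).strip()
        except Exception as exc:
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return text
    if engines and len(errors) == len(engines):
        raise RuntimeError("All OCR engines failed: " + "; ".join(errors))
    return ""
```

With the two paths described in this document, the engine list would be something like `[("vision", ocr_macos_vision), ("rapidocr", some_rapidocr_wrapper)]`, where `some_rapidocr_wrapper` is a placeholder for a function that calls `RapidOCR` and joins the recognized lines.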

Batch processing (all images in a directory)

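import glob
import os

image_dir = "/path/to/scanned_pages"
output_file = "/path/to/extracted_text.md"

# glob does not support brace expansion, so match each extension separately.
extensions = ("png", "jpg", "jpeg", "tiff", "bmp")
images = sorted(
    path
    for ext in extensions
    for path in glob.glob(os.path.join(image_dir, f"*.{ext}"))
)
all_text = []

for i, img_path in enumerate(images, 1):
    print(f"Processing page {i}/{len(images)}: {os.path.basename(img_path)}")
    # ocr_with_best_engine is not defined in this document; it is expected to
    # try macOS Vision first and fall back to rapidocr.
    text = ocr_with_best_engine(img_path)
    all_text.append(f"## Page {i}\n\n{text}")

# Write one Markdown section per page, separated by horizontal rules.
with open(output_file, "w", encoding="utf-8") as f:
    f.write("\n\n---\n\n".join(all_text))

print(f"Done: {len(images)} pages -> {output_file}")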

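Language support

| Language | rapidocr | macOS Vision | Notes |
|------|----------|-------------|------|
| Simplified Chinese | Supported by default | Supported by default | No extra configuration needed |
| English | Supported by default | Supported by default | No extra configuration needed |
| Mixed Chinese-English | Supported by default | Supported by default | A single model recognizes both; no switching needed |
| Traditional Chinese | Supported by default | Supported by default | Recognized automatically |
| Japanese | Requires model download | Supported by default | rapidocr needs an extra step |
| Korean | Requires model download | Supported by default | rapidocr needs an extra step |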

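Security rules

100% local processing: all OCR runs locally, and images are never uploaded to any cloud service
Temporary file cleanup: intermediate temporary files are deleted once processing is complete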
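One way to honor the cleanup rule is to keep every intermediate file inside a scratch directory that is removed automatically. The sketch below uses the standard library's `tempfile.TemporaryDirectory`; the helper name `process_with_scratch_dir` and the `ocr_` prefix are assumptions for illustration.

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def process_with_scratch_dir(work: Callable[[Path], T]) -> T:
    """Run work(scratch_dir) and delete the directory afterwards.

    The directory and everything written into it are removed when the
    `with` block exits, whether `work` returns normally or raises.
    """
    with tempfile.TemporaryDirectory(prefix="ocr_") as scratch:
        return work(Path(scratch))
```

Any cropped, converted, or rendered page images written under the scratch directory disappear with it, so no separate cleanup step is needed.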

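Output conventions

After extraction, show the recognized text (plain-text format)
For multiple pages, present the text in sections by page number
If recognition quality is poor (blurry text or handwriting), tell the user and suggest providing a clearer image
Keep table content in a structured format where possible (use Markdown tables)
Write the results to a file and tell the user the path for later use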
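The rule about keeping tables structured can be illustrated with a rough sketch that groups OCR boxes into rows by vertical position and renders them as a Markdown table. It assumes each item looks like `[box, text, confidence]` with `box` given as four `[x, y]` corner points; `group_into_rows`, `rows_to_markdown`, and the `row_tolerance` default are hypothetical. Real tables with merged cells or skewed scans would need more robust layout analysis.

```python
from typing import List, Sequence


def group_into_rows(result: Sequence[Sequence], row_tolerance: float = 10.0) -> List[List[str]]:
    """Cluster OCR items into rows using the vertical centre of each box."""
    items = []
    for box, text, _confidence in result:
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        items.append((sum(ys) / len(ys), min(xs), text))
    items.sort()  # top to bottom, then left to right

    rows: List[List[str]] = []
    current = []          # (x, text) pairs of the row being built
    current_y = None      # vertical centre of the row's first item
    for y, x, text in items:
        if current and abs(y - current_y) > row_tolerance:
            rows.append([t for _, t in sorted(current)])
            current = []
        if not current:
            current_y = y
        current.append((x, text))
    if current:
        rows.append([t for _, t in sorted(current)])
    return rows


def rows_to_markdown(rows: List[List[str]]) -> str:
    """Render rows as a Markdown table, treating the first row as the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    padded = [row + [""] * (width - len(row)) for row in rows]

    def fmt(cells: List[str]) -> str:
        # Escape pipes so cell text cannot break the table layout.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(padded[0]), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(row) for row in padded[1:])
    return "\n".join(lines)
```

Grouping by the first item's vertical centre keeps the logic simple; a smaller `row_tolerance` suits dense tables, a larger one suits loosely spaced scans.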

Extract text from images, screenshots, and scanned documents using local OCR. Supports Chinese, English, Japanese, Korean and mixed-language text. Runs 100% locally for privacy.

32 stars

documentation

Updated Apr 6, 2026

npx skillsauth add malue-ai/dazee-small multi-lang-ocr

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Clean, 95%
Semgrep: Static code analysis for vulnerabilities. Clean, 95%
mcp-scan (Snyk): Model Context Protocol security validation. Clean, 95%
Snyk (dep): Open source security scanning. Skipped, 50%
Socket.dev: Supply chain security analysis. Skipped, 50%
VirusTotal: Multi-engine malware detection. Skipped, 50%
CrowdStrike: Advanced threat intelligence. Skipped, 50%
OSV-Scanner: Open Source Vulnerability database check. Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 6, 2026, 10:58 PM; 98.4s; 1 file scanned

SKILL.md

name: multi-lang-ocr
description: Optional output file path (default: print to stdout)
version: 2.0.0
- name: output_path
type: string
required: false
dependency_level: builtin
os: [common]
backend_type: local
user_facing: true
# NOTE: Dependency management is handled per OS by skills.yaml:
# - macOS: darwin/builtin (Vision Framework built in, zero install)
# - Windows/Linux: lightweight (auto_install: pip install rapidocr-onnxruntime)


Related Skills

malue-ai/web-search

development

Verified / Trusted / Community

Local web search (Tavily/Exa, requires API Key). For quick searches. If no Key configured or deep research needed, use cloud_agent instead.

32 | SKILL.md | Updated Apr 6, 2026

malue-ai/weather

development

Verified / Trusted / Community

Get current weather and forecasts (no API key required).

32 | SKILL.md | Updated Apr 6, 2026

malue-ai/wacli

tools

Verified / Trusted / Community

Send WhatsApp messages to other people or search/sync WhatsApp history via the wacli CLI (not for normal user chats).

32 | SKILL.md | Updated Apr 6, 2026

malue-ai/voice-call

tools

Verified / Trusted / Community

Start voice calls via the Moltbot voice-call plugin.

32 | SKILL.md | Updated Apr 6, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

# Clone the repo
git clone https://github.com/malue-ai/dazee-small.git

# Copy into Claude Code skills folder (global)
cp -r dazee-small/instances/xiaodazi/skills/multi-lang-ocr ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

malue-ai/dazee-small

32 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
