skills/.trash/openclaw-search-skills/mineru-extract/SKILL.md
Use the official MinerU (mineru.net) parsing API to convert a URL (HTML pages like WeChat articles, or direct PDF/Office/image links) into clean Markdown + structured outputs. Use when web_fetch/browser can’t access or extracts messy content, and you want higher-fidelity parsing (layout/table/formula/OCR).
npx skillsauth add aaaaqwq/claude-code-skills mineru-extractInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Use MinerU as an upstream “content normalizer”: submit a URL to MinerU, poll for completion, download the result zip, and extract the main Markdown.
We align to the MinerU MCP mental model, but we do not run an MCP server.
scripts/mineru_parse_documents.py
--file-sources (comma/newline-separated){ ok, items, errors }scripts/mineru_extract.pyAuth:
MINERU_TOKEN (Bearer token from mineru.net)Default model heuristic:
.pdf/.doc/.ppt/.png/.jpg → pipelineMinerU-HTML (best for HTML pages like WeChat articles)Put secrets in skill root .env (do not paste into chat outputs):
# In the mineru-extract skill directory: .env
MINERU_TOKEN=your_token_here
MINERU_API_BASE=https://mineru.net
MCP-style wrapper (returns JSON, optionally includes markdown text):
python3 mineru-extract/scripts/mineru_parse_documents.py \
--file-sources "<URL1>\n<URL2>" \
--language ch \
--enable-ocr \
--model-version MinerU-HTML
If you want the markdown content inline in the JSON (can be large):
python3 mineru-extract/scripts/mineru_parse_documents.py \
--file-sources "<URL>" \
--model-version MinerU-HTML \
--emit-markdown --max-chars 20000
Low-level (single URL, print markdown to stdout):
python3 mineru-extract/scripts/mineru_extract.py "<URL>" --model MinerU-HTML --print > /tmp/out.md
The script always downloads + extracts the MinerU result zip to:
~/.openclaw/workspace/mineru/<task_id>/
It writes:
result.zipIt prints a JSON summary to stderr with paths:
task_id, full_zip_url, out_dir, markdown_path--model: pipeline | vlm | MinerU-HTML (HTML requires MinerU-HTML)--ocr/--no-ocr: enable OCR (effective for pipeline/vlm)--table/--no-table: table recognition--formula/--no-formula: formula recognition--language ch|en|...--page-ranges "2,4-6" (non-HTML)--timeout 600 / --poll-interval 2err_msg and keep an original-source link in outputs.testing
通用自媒体文章自动发布工具。支持百家号、搜狐号、知乎、微信公众号、小红书、抖音号六个平台的自动化发布流程。使用Playwright自动化实现平台导航和发布,支持通过storageState管理Cookie实现账号切换。
development
# SKILL.md - Model Configuration Status (mcstatus) ## 触发条件 - `/mcstatus` 命令 - 用户询问模型配备、模型配置、model status、模型列表等 ## 功能 实时生成 Agent + Cron 的模型配置报告,展示当前所有 agent 的主模型/fallback链和所有 cron 任务的模型分配。 ## 执行步骤 ### Step 1: 收集 Agent 模型配置 读取各 agent 的 models.json 获取主模型和 fallback 链: ```bash for agent in main ops code quant data research content market finance pm law product sales batch; do config=$(cat ~/.openclaw/agents/$agent/agent/models.json 2>/dev/null) if [ -n "$config" ]; then echo "=== $agent
tools
MCP 服务器智能管理助手。自动检测 MCP 可用性、智能开关、功能问答,提供人性化的 MCP 管理体验。
tools
从GitHub搜索并自动安装配置MCP(Model Context Protocol)服务器工具到Claude配置文件。当用户需要安装MCP工具时触发此技能。工作流程:搜索GitHub上的MCP项目 -> 提取npx配置 -> 添加到~/.claude.json -> 处理API密钥(如有)。