438061781/pdf-figure-extractor/SKILL.md
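Extracts figures from PDF papers precisely: it analyzes the PDF structure, locates each caption, crops a clean figure region, and verifies image quality. Suited to automated image processing for academic press releases, paper writing, and similar tasks.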
npx skillsauth add openclaw/skills pdf-figure-extractor
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
import fitz  # PyMuPDF

# pdf_path, page_num, and fig_num are supplied by the caller
doc = fitz.open(pdf_path)
page = doc[page_num]

# Get all text blocks on the page
blocks = page.get_text("blocks")
for block in blocks:
    x0, y0, x1, y1, text, block_no, block_type = block
    if "Fig." in text or "Figure" in text:
        print(f"Figure-related block: y={y0:.0f}-{y1:.0f}, {text[:50]}...")

# Search for the exact position of "Fig. X"
text_instances = page.search_for(f"Fig. {fig_num}")
for inst in text_instances:
    print(f"Fig. {fig_num} position: y={inst.y0:.0f}-{inst.y1:.0f}")
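The substring checks above also match in-text references such as "as shown in Fig. 2". A minimal sketch of one way to keep only blocks that begin with the caption label is given here; the helper name `find_caption_blocks`, the regular expression, and the sample data are assumptions for illustration, not part of the skill.

```python
import re

# Matches a caption label at the start of a block, e.g. "Fig. 3" or "Figure 3".
CAPTION_RE = re.compile(r"^\s*(?:Fig\.|Figure)\s*(\d+)")

def find_caption_blocks(blocks, fig_num):
    """Return (y0, y1) for text blocks whose text starts with the caption label.

    `blocks` has the shape returned by page.get_text("blocks").
    """
    hits = []
    for x0, y0, x1, y1, text, block_no, block_type in blocks:
        if block_type != 0:  # 0 = text block, 1 = image block
            continue
        m = CAPTION_RE.match(text)
        if m and int(m.group(1)) == fig_num:
            hits.append((y0, y1))
    return hits

# Hypothetical sample data, not taken from a real PDF.
sample_blocks = [
    (50, 120, 540, 160, "as shown in Fig. 2, the signal ...", 0, 0),
    (50, 400, 540, 430, "Fig. 2 Overview of the pipeline.", 1, 0),
    (50, 440, 540, 470, "Fig. 21 Unrelated figure.", 2, 0),
]
print(find_caption_blocks(sample_blocks, 2))  # [(400, 430)]
```

Comparing the captured number as an integer keeps "Fig. 2" from matching "Fig. 21".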
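Determine the figure region from the caption position:

| Caption position | Figure region |
|------------|---------|
| y=400 (middle of page) | y=100-395 (above the caption) |
| y=666 (bottom of page) | y=350-660 (above the caption) |
| y=326 (bottom of page) | y=100-320 (above the caption) |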
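The rows above all place the figure directly above its caption, ending a few points before the caption starts. That rule can be sketched as a small function; `figure_region`, `region_top`, and `gap` are hypothetical names, and the top of the figure still has to be chosen per page.

```python
def figure_region(caption_y0, region_top=100.0, gap=5.0):
    """Return (y_start, y_end) for a figure that sits above its caption.

    caption_y0: top y-coordinate of the caption, in PDF points.
    region_top: assumed top of the figure (page-specific; adjust by eye).
    gap:        space left between the crop and the caption.
    """
    y_end = caption_y0 - gap
    if y_end <= region_top:
        raise ValueError("caption lies above the assumed top of the figure")
    return region_top, y_end

print(figure_region(400))                         # (100.0, 395.0)
print(figure_region(666, region_top=350, gap=6))  # (350, 660)
```

The two calls reproduce the first two rows of the table; a figure placed below its caption would need the opposite rule.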
rect = fitz.Rect(50, y_start, page.rect.width - 50, y_end)
pix = page.get_pixmap(matrix=fitz.Matrix(2, 2), clip=rect)
pix.save(f"fig{fig_num}.png")
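PDF coordinates are measured in points (72 per inch), so the zoom factor passed to `fitz.Matrix` sets the resolution of the rendered crop. The sketch below estimates that resolution and the approximate pixel size; `render_info` is a hypothetical helper, and the exact pixmap dimensions may differ by a pixel because of rounding inside the library.

```python
PDF_POINTS_PER_INCH = 72

def render_info(width_pt, height_pt, zoom=2.0):
    """Return (dpi, approx_width_px, approx_height_px) for a clip rectangle."""
    dpi = PDF_POINTS_PER_INCH * zoom
    return dpi, round(width_pt * zoom), round(height_pt * zoom)

# Example: a clip 495 pt wide (595 pt A4 page minus 50 pt margins)
# spanning y=100 to y=395, rendered with fitz.Matrix(2, 2).
print(render_info(495, 295, zoom=2))  # (144, 990, 590)
```

A zoom of 2 therefore gives roughly 144 DPI; a larger factor trades file size for sharper output.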
Checklist:
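Cause: the crop range is too large. Fix: reduce y_end so that the crop ends before the caption.
Cause: the crop range is too small. Fix: widen y_start/y_end so that the crop contains the whole figure.
Cause: the crop range includes the caption region. Fix: adjust the crop boundary precisely using the caption's y-coordinate.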
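The caption-related fixes above amount to keeping `y_end` above the caption's top edge. A minimal sketch, assuming the figure sits above its caption, is shown here; `clamp_to_caption` and `gap` are hypothetical names.

```python
def clamp_to_caption(y_start, y_end, caption_y0, gap=5.0):
    """Pull y_end up so the crop ends before the caption begins."""
    limit = caption_y0 - gap
    new_end = min(y_end, limit)
    if new_end <= y_start:
        raise ValueError("crop would be empty: y_start is not above the caption")
    return y_start, new_end

print(clamp_to_caption(100, 420, caption_y0=400))  # (100, 395.0)
print(clamp_to_caption(100, 380, caption_y0=400))  # (100, 380)
```

A crop that already ends above the caption is returned unchanged, so the function can be applied to every crop without checking first.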
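matrix=fitz.Matrix(2, 2)
"提取PDF图片" (extract PDF images), "从PDF提取Figure" (extract Figure from PDF), "PDF图片裁剪" (PDF image cropping), "学术论文图片提取" (academic paper figure extraction)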