openclaw-skills/capture-screen/SKILL.md
Programmatic screenshot capture on macOS. Find window IDs with Swift CGWindowListCopyWindowInfo, control application windows via AppleScript (zoom, scroll, select), and capture with screencapture. Use when automating screenshots, capturing application windows for documentation, or building multi-shot visual workflows.
Programmatic screenshot capture on macOS: find windows, control views, capture images.
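Use this skill when the user wants to automate screenshots, capture application windows for documentation, or build multi-shot visual workflows.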
Recommended flow:
find window id
-> control or position the app
-> wait for UI to settle
-> capture with screencapture
-> verify output file
# Find Excel window ID
swift scripts/get_window_id.swift Excel
# Capture that window (replace 12345 with actual WID)
screencapture -x -l 12345 output.png
Three-step workflow:
1. Find Window → Swift CGWindowListCopyWindowInfo → get numeric Window ID
2. Control View → AppleScript (osascript) → zoom, scroll, select
3. Capture → screencapture -l <WID> → PNG/JPEG output
Use Swift with CoreGraphics to enumerate windows. Of the approaches tested for this skill, this is the only one that reliably returns the CGWindowID that screencapture -l expects.
swift -e '
import CoreGraphics
let keyword = "Excel"
let list = CGWindowListCopyWindowInfo(.optionOnScreenOnly, kCGNullWindowID) as? [[String: Any]] ?? []
for w in list {
let owner = w[kCGWindowOwnerName as String] as? String ?? ""
let name = w[kCGWindowName as String] as? String ?? ""
let wid = w[kCGWindowNumber as String] as? Int ?? 0
if owner.localizedCaseInsensitiveContains(keyword) || name.localizedCaseInsensitiveContains(keyword) {
print("WID=\(wid) | App=\(owner) | Title=\(name)")
}
}
'
swift scripts/get_window_id.swift Excel
swift scripts/get_window_id.swift Chrome
swift scripts/get_window_id.swift # List all windows
Output format: WID=12345 | App=Microsoft Excel | Title=workbook.xlsx
Parse the WID number for use with screencapture -l.
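A minimal sketch of that parsing step, assuming the output format shown above. `parse_wid` is a hypothetical helper name, not part of the skill's scripts. It prints the first window ID found in its argument and prints nothing when no line matches.

```shell
# Hypothetical helper: extract the numeric window ID from get_window_id.swift output.
# Expects lines shaped like: WID=12345 | App=Microsoft Excel | Title=workbook.xlsx
parse_wid() {
  # Keep only the digits that follow "WID=" and stop at the first matching line.
  printf '%s\n' "$1" | sed -n 's/^WID=\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

One possible use is `WID=$(parse_wid "$(swift scripts/get_window_id.swift Excel)")`, followed by a check that `$WID` is not empty.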
Verified commands for controlling application windows before capture.
# Activate (bring to front)
osascript -e 'tell application "Microsoft Excel" to activate'
# Set zoom level (percentage)
osascript -e 'tell application "Microsoft Excel"
set zoom of active window to 120
end tell'
# Scroll to specific row
osascript -e 'tell application "Microsoft Excel"
set scroll row of active window to 45
end tell'
# Scroll to specific column
osascript -e 'tell application "Microsoft Excel"
set scroll column of active window to 3
end tell'
# Select a cell range
osascript -e 'tell application "Microsoft Excel"
select range "A1" of active sheet
end tell'
# Select a specific sheet
osascript -e 'tell application "Microsoft Excel"
activate object sheet "DCF" of active workbook
end tell'
# Open a file
osascript -e 'tell application "Microsoft Excel"
open POSIX file "/path/to/file.xlsx"
end tell'
# Activate any app
osascript -e 'tell application "Google Chrome" to activate'
# Bring specific window to front (by index)
osascript -e 'tell application "System Events"
tell process "Google Chrome"
perform action "AXRaise" of window 1
end tell
end tell'
Always add sleep 1 after AppleScript commands before capturing, to allow UI rendering to complete.
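IMPORTANT: osascript can hang if the target application is not responding, and tell application launches an application that is not already running. Always wrap calls with timeout. Note that timeout is a GNU coreutils command and may not be present on a stock macOS install: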
timeout 5 osascript -e 'tell application "Microsoft Excel" to activate'
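Because `timeout` comes from GNU coreutils, it may be missing on some Macs or available only under the name `gtimeout`. The sketch below is one way to cope with that, not part of the skill itself. `run_with_timeout` is a hypothetical name, and the perl fallback assumes `perl` is on the PATH.

```shell
# Hypothetical wrapper: run a command with a time limit, using whichever tool exists.
# Usage: run_with_timeout SECONDS command [args...]
run_with_timeout() {
  secs=$1
  shift
  if command -v timeout >/dev/null 2>&1; then
    timeout "$secs" "$@"
  elif command -v gtimeout >/dev/null 2>&1; then
    gtimeout "$secs" "$@"
  else
    # Fallback: perl arms an alarm, then replaces itself with the command.
    # The pending SIGALRM terminates the command when the limit is reached.
    perl -e 'alarm shift; exec @ARGV' "$secs" "$@"
  fi
}
```

For example, `run_with_timeout 5 osascript -e 'tell application "Microsoft Excel" to activate'` returns a non-zero status if the call does not finish within five seconds.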
# Capture specific window by ID
screencapture -l <WID> output.png
# Silent capture (no camera shutter sound)
screencapture -x -l <WID> output.png
# Capture as JPEG
screencapture -l <WID> -t jpg output.jpg
# Capture with delay (seconds)
screencapture -l <WID> -T 2 output.png
# Capture a fixed screen region by coordinates (non-interactive)
screencapture -R x,y,width,height output.png
On Retina Macs, screencapture outputs 2x resolution by default (e.g., a 2032x1238 window produces a 4064x2476 PNG). This is normal. To get 1x resolution, resize after capture:
sips --resampleWidth 2032 output.png --out output_1x.png
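When the window size is not known in advance, the target width can be derived from the captured file instead of being hard-coded. This is a minimal sketch that assumes a 2x scale factor by default. `logical_width` and `downscale_to_1x` are hypothetical helper names.

```shell
# Hypothetical helper: logical (1x) width for a pixel width and scale factor.
# The scale factor defaults to 2, the Retina case described above.
logical_width() {
  echo $(( $1 / ${2:-2} ))
}

# Hypothetical helper: read the pixel width with sips, then resample to 1x.
# Usage: downscale_to_1x input.png output_1x.png
downscale_to_1x() {
  src=$1
  dst=$2
  # sips prints the file path, then a line such as "  pixelWidth: 4064".
  px=$(sips -g pixelWidth "$src" | awk '/pixelWidth/ {print $2}')
  sips --resampleWidth "$(logical_width "$px")" "$src" --out "$dst"
}
```

With the example above, `logical_width 4064` prints `2032`, the same value passed to `--resampleWidth`.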
# Check file was created and has content
ls -la output.png
file output.png # Should show "PNG image data, ..."
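For scripted workflows, the same check can be made without reading command output by eye. This is a minimal sketch that tests for a non-empty file beginning with the 8-byte PNG signature. `verify_png` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the file exists, is non-empty,
# and starts with the PNG signature (89 50 4e 47 0d 0a 1a 0a).
verify_png() {
  f=$1
  if [ ! -s "$f" ]; then
    echo "missing or empty: $f" >&2
    return 1
  fi
  # Dump the first 8 bytes as hex and strip the spacing od adds.
  sig=$(head -c 8 "$f" | od -An -tx1 | tr -d ' \n')
  if [ "$sig" != "89504e470d0a1a0a" ]; then
    echo "not a PNG: $f" >&2
    return 1
  fi
}
```

A capture step could then read `screencapture -x -l "$WID" output.png && verify_png output.png`.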
Complete example: capture multiple sections of an Excel workbook.
# 1. Open file and activate Excel
osascript -e 'tell application "Microsoft Excel"
open POSIX file "/path/to/model.xlsx"
activate
end tell'
sleep 2
# 2. Set up view
osascript -e 'tell application "Microsoft Excel"
set zoom of active window to 130
activate object sheet "Summary" of active workbook
end tell'
sleep 1
# 3. Get window ID
# IMPORTANT: Always re-fetch before capturing. CGWindowID is invalidated
# when an app restarts or a window is closed and reopened.
WID=$(swift -e '
import CoreGraphics
let list = CGWindowListCopyWindowInfo(.optionOnScreenOnly, kCGNullWindowID) as? [[String: Any]] ?? []
for w in list {
let owner = w[kCGWindowOwnerName as String] as? String ?? ""
let wid = w[kCGWindowNumber as String] as? Int ?? 0
if owner == "Microsoft Excel" { print(wid); break }
}
')
echo "Window ID: $WID"
# 4. Capture Section A (top of sheet)
osascript -e 'tell application "Microsoft Excel"
set scroll row of active window to 1
end tell'
sleep 1
screencapture -x -l $WID section_a.png
# 5. Capture Section B (further down)
osascript -e 'tell application "Microsoft Excel"
set scroll row of active window to 45
end tell'
sleep 1
screencapture -x -l $WID section_b.png
# 6. Switch sheet and capture
osascript -e 'tell application "Microsoft Excel"
activate object sheet "DCF" of active workbook
set scroll row of active window to 1
end tell'
sleep 1
screencapture -x -l $WID dcf_overview.png
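The workflow above passes `$WID` straight to screencapture. If Excel has no on-screen window, the Swift snippet prints nothing and `$WID` is empty, so a guard before the first capture can help. This is a minimal sketch, and `require_wid` is a hypothetical helper name.

```shell
# Hypothetical guard: succeed only if the argument is a positive integer.
require_wid() {
  case $1 in
    '' | *[!0-9]*)
      # Empty, or contains a character that is not a digit.
      echo "invalid window ID: '$1'" >&2
      return 1
      ;;
  esac
  [ "$1" -gt 0 ]
}
```

Placing `require_wid "$WID" || exit 1` right after step 3 stops the script before any capture is attempted with a missing ID.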
These methods were tested and confirmed to fail on macOS:
| Method | Error | Why It Fails |
|--------|-------|-------------|
| System Events → id of window | Error -1728 | System Events cannot access window IDs in the format screencapture needs |
| Python import Quartz (PyObjC) | ModuleNotFoundError | PyObjC not installed in system Python; don't attempt to install it — use Swift instead |
| osascript window id | Wrong format | Returns AppleScript window index, not CGWindowID needed by screencapture -l |

Supported applications:

| Application | Window ID | AppleScript Control | Notes |
|------------|-----------|-------------------|-------|
| Microsoft Excel | Swift | Full (zoom, scroll, select, activate sheet) | Best supported |
| Google Chrome | Swift | Basic (activate, window management) | No scroll/zoom via AppleScript |
| Any macOS app | Swift | Basic (activate via tell application) | screencapture works universally |
AppleScript control depth varies by application. Excel has the richest AppleScript dictionary. For apps with limited AppleScript, use keyboard simulation via System Events as a fallback.
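The keyboard fallback can be sketched as follows. `key_code_script` is a hypothetical helper name. The sketch assumes the target application has already been activated and that the calling terminal has been granted Accessibility permission, which System Events needs in order to send keys.

```shell
# Hypothetical helper: build a one-line AppleScript that sends a key code
# through System Events to whichever application is frontmost.
key_code_script() {
  printf 'tell application "System Events" to key code %d' "$1"
}
```

For example, after activating Google Chrome, `osascript -e "$(key_code_script 121)"` sends key code 121, which is Page Down on standard Apple keyboards. A `sleep 1` before capturing still applies.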