skills/computer-use/SKILL.md
Full desktop computer use for headless Linux servers: an Xvfb + XFCE virtual desktop with xdotool automation. Provides 17 actions (click, type, scroll, screenshot, drag, etc.). Unlike OpenClaw's browser tool, it operates at the X11 level, so websites cannot detect the automation. Includes VNC for live viewing.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add pr-e/openclaw-master-skills computer-use
Full desktop GUI control for headless Linux servers. Creates a virtual display (Xvfb + XFCE) so you can run and control desktop applications on VPS/cloud instances without a physical monitor.
Virtual display: :99
Run the setup script to install everything (systemd services, flicker-free VNC):
./scripts/setup-vnc.sh
This installs systemd services for the Xvfb virtual display (:99), the minimal XFCE session, x11vnc, and noVNC.
All services auto-start on boot and auto-restart on crash.
| Action | Script | Arguments | Description |
|--------|--------|-----------|-------------|
| screenshot | screenshot.sh | — | Capture screen → base64 PNG |
| cursor_position | cursor_position.sh | — | Get current mouse X,Y |
| mouse_move | mouse_move.sh | x y | Move mouse to coordinates |
| left_click | click.sh | x y left | Left click at coordinates |
| right_click | click.sh | x y right | Right click |
| middle_click | click.sh | x y middle | Middle click |
| double_click | click.sh | x y double | Double click |
| triple_click | click.sh | x y triple | Triple click (select line) |
| left_click_drag | drag.sh | x1 y1 x2 y2 | Drag from start to end |
| left_mouse_down | mouse_down.sh | — | Press mouse button |
| left_mouse_up | mouse_up.sh | — | Release mouse button |
| type | type_text.sh | "text" | Type text (50 char chunks, 12ms delay) |
| key | key.sh | "combo" | Press key (Return, ctrl+c, alt+F4) |
| hold_key | hold_key.sh | "key" secs | Hold key for duration |
| scroll | scroll.sh | dir amt [x y] | Scroll up/down/left/right |
| wait | wait.sh | seconds | Wait then screenshot |
| zoom | zoom.sh | x1 y1 x2 y2 | Cropped region screenshot |
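The table maps each action name to a script and an argument order. A minimal sketch of that mapping as a shell function follows. The function name `action_command` is hypothetical and not part of the skill; it covers only a subset of the actions and prints the command line instead of running it.

```bash
# Hypothetical helper (not part of the skill): print the script invocation
# that corresponds to an action name from the table above.
action_command() {
    local action="$1"; shift
    case "$action" in
        screenshot)      echo "./scripts/screenshot.sh" ;;
        mouse_move)      echo "./scripts/mouse_move.sh $1 $2" ;;
        left_click)      echo "./scripts/click.sh $1 $2 left" ;;
        right_click)     echo "./scripts/click.sh $1 $2 right" ;;
        middle_click)    echo "./scripts/click.sh $1 $2 middle" ;;
        double_click)    echo "./scripts/click.sh $1 $2 double" ;;
        triple_click)    echo "./scripts/click.sh $1 $2 triple" ;;
        left_click_drag) echo "./scripts/drag.sh $1 $2 $3 $4" ;;
        scroll)          echo "./scripts/scroll.sh $*" ;;
        *)               echo "unknown action: $action" >&2; return 1 ;;
    esac
}

# Usage: print the command for a left click at (512, 384)
#   action_command left_click 512 384
```

Printing the command keeps the sketch safe to run on a machine without a display; a real dispatcher would execute the script instead.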
export DISPLAY=:99
# Take screenshot
./scripts/screenshot.sh
# Click at coordinates
./scripts/click.sh 512 384 left
# Type text
./scripts/type_text.sh "Hello world"
# Press key combo
./scripts/key.sh "ctrl+s"
# Scroll down
./scripts/scroll.sh down 5
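The screenshot action returns a base64-encoded PNG, so saving it to disk needs a decode step. The sketch below assumes that screenshot.sh writes the base64 text to standard output, which the action table suggests but does not state. `save_screenshot` is a hypothetical helper name.

```bash
# Hypothetical helper: decode base64 text from stdin into a file.
# Assumes GNU coreutils base64, which is standard on Linux servers.
save_screenshot() {
    local out="${1:-screenshot.png}"
    base64 -d > "$out"
}

# Usage (assumes screenshot.sh prints base64 to stdout):
#   export DISPLAY=:99
#   ./scripts/screenshot.sh | save_screenshot /tmp/screen.png
```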
Use ctrl+End to jump to the page bottom in browsers.
Watch the desktop in real time via a browser or a VNC client.
# SSH tunnel (run on your local machine)
ssh -L 6080:localhost:6080 your-server
# Open in browser
http://localhost:6080/vnc.html
# SSH tunnel
ssh -L 5900:localhost:5900 your-server
# Connect VNC client to localhost:5900
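Both tunnels above can share one SSH session. The sketch below builds such a command for any list of ports. `tunnel_command` is a hypothetical helper, and adding `-N` (no remote command, tunnels only) is a choice made here, not something the skill prescribes.

```bash
# Hypothetical helper: print one ssh command that forwards every given port.
# -N tells ssh not to run a remote command, so the session only carries tunnels.
tunnel_command() {
    local host="$1"; shift
    local cmd="ssh -N"
    local port
    for port in "$@"; do
        cmd="$cmd -L $port:localhost:$port"
    done
    echo "$cmd $host"
}

# Usage: forward the browser viewer (6080) and the VNC port (5900) together
#   tunnel_command your-server 6080 5900
```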
Add to ~/.ssh/config for automatic tunneling:
Host your-server
    HostName your.server.ip
    User your-user
    LocalForward 6080 127.0.0.1:6080
    LocalForward 5900 127.0.0.1:5900
Then just ssh your-server and VNC is available.
# Check status
systemctl status xvfb xfce-minimal x11vnc novnc
# Restart if needed
sudo systemctl restart xvfb xfce-minimal x11vnc novnc
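For scripted health checks, `systemctl is-active --quiet` is easier to test than the human-readable status output. A minimal sketch follows; `inactive_services` is a hypothetical helper, not something the skill ships.

```bash
# Hypothetical helper: print the name of each listed service that is not active.
# `systemctl is-active --quiet` exits 0 only when the unit is active.
inactive_services() {
    local svc
    for svc in "$@"; do
        systemctl is-active --quiet "$svc" || echo "$svc"
    done
}

# Usage:
#   inactive_services xvfb xfce-minimal x11vnc novnc
```

Empty output means all four services are running.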
xvfb → xfce-minimal → x11vnc → novnc
x11vnc runs with -noxdamage for stability.
export DISPLAY=:99
# Chrome: only use --no-sandbox if the kernel lacks user namespace support.
# Check: cat /proc/sys/kernel/unprivileged_userns_clone
# 1 = sandbox works, do NOT use --no-sandbox
# 0 = sandbox fails, --no-sandbox required as fallback
# If the file does not exist, the check below keeps the sandbox enabled.
# Using --no-sandbox when unnecessary weakens security and can cause instability and crashes.
if [ "$(cat /proc/sys/kernel/unprivileged_userns_clone 2>/dev/null)" = "0" ]; then
    google-chrome --no-sandbox &
else
    google-chrome &
fi
xfce4-terminal &   # Terminal
thunar &           # File manager
Note: Snap browsers (Firefox, Chromium) have sandbox issues on headless servers. Use Chrome .deb instead:
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb
sudo apt-get install -f
To install the packages yourself and then run setup-vnc.sh:
# Install packages
sudo apt install -y xvfb xfce4 xfce4-terminal xdotool scrot imagemagick dbus-x11 x11vnc novnc websockify
# Run the setup script (generates systemd services, masks xfdesktop, starts everything)
./scripts/setup-vnc.sh
For a fully manual setup, read setup-vnc.sh: it generates all systemd service files inline, so it contains the exact service definitions.
pgrep xfwm4
sudo systemctl restart xfce-minimal
(/usr/bin/xfdesktop)
--heartbeat 30 flag
-noxdamage flag
-noxdamage -noxfixes flags
Installed by setup-vnc.sh:
xvfb xfce4 xfce4-terminal xdotool scrot imagemagick dbus-x11 x11vnc novnc websockify
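To confirm that the install succeeded, you can check that the main binaries are on PATH. This is a minimal sketch: `missing_commands` is a hypothetical helper, and the binary names in the usage line are assumptions based on what these packages normally ship (for example, the xvfb package provides `Xvfb`).

```bash
# Hypothetical helper: print each command name that is not found on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Usage (binary names are assumptions, not taken from setup-vnc.sh):
#   missing_commands Xvfb xdotool scrot x11vnc websockify
```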