plugins/chrome/skills/cdp-chrome/SKILL.md
Optional per-OS-user headed Chrome provider for browser automation. Use when an environment chooses cdp-chrome for visible GUI Chrome: social media, JS-rendered pages, logged-in sites, anti-bot pages, forms, screenshots, and live site inspection. Not required when an equivalent provider exists.
npx skillsauth add kanlac/agent-steroids cdp-chrome
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Scope: Optional implementation of the abstract headed-browser capability. Do not force this plugin when the user already has an equivalent provider.
chrome-devtools-mcp can launch Chrome with automation flags such as --enable-automation, which sets navigator.webdriver = true. This plugin instead connects MCP to a normal GUI Chrome process with a persistent profile.
The process is shared across agents for the same OS user only. On multi-user machines, every OS user should configure a different port/profile so agents fail fast instead of connecting to another user's Chrome.
steroids config file:
~/.config/steroids.json (macOS/Linux)
%APPDATA%\steroids.json (Windows)
${APPDATA:-$HOME/.config}/steroids.json (shell expression covering both)
Default config:
{ "cdp-chrome": { "port": 9224, "profile_dir": "~/.config/cdp-chrome/profile" } }
Existing configs with only port still work; profile_dir defaults to ~/.config/cdp-chrome/profile. Do not configure a shared download directory; Chrome default downloads are left alone.
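The config lookup and default rules above can be sketched in Python. This is a minimal illustration, not the plugin's own code: the function names `config_path` and `load_cdp_chrome_config` are hypothetical, and only the path expression, the `cdp-chrome` key, and the two default values come from the text above.

```python
import json
import os
from pathlib import Path

# Defaults as stated in the document.
DEFAULTS = {"port": 9224, "profile_dir": "~/.config/cdp-chrome/profile"}


def config_path(env=None):
    """Mirror ${APPDATA:-$HOME/.config}/steroids.json (hypothetical helper)."""
    env = os.environ if env is None else env
    # ":-" falls back when APPDATA is unset OR empty, hence the `or`.
    base = env.get("APPDATA") or os.path.join(env.get("HOME", str(Path.home())), ".config")
    return Path(base) / "steroids.json"


def load_cdp_chrome_config(path):
    """Return the cdp-chrome section with missing keys filled from DEFAULTS."""
    try:
        with open(path, encoding="utf-8") as fh:
            section = json.load(fh).get("cdp-chrome", {})
    except FileNotFoundError:
        section = {}
    merged = {**DEFAULTS, **section}
    merged["profile_dir"] = os.path.expanduser(merged["profile_dir"])
    return merged
```

With a config file that contains only `port`, the returned dictionary keeps that port and falls back to the default profile directory, which matches the backward-compatibility note above.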
Choose a unique cdp-chrome.port and profile_dir for this OS user in the steroids config file.
Run the plugin script:
plugins/chrome/skills/cdp-chrome/scripts/doctor.sh
It verifies current-user config, port ownership, and Chrome profile consistency. If it reports another OS user or another profile on the port, choose a different cdp-chrome.port and rerun it.
Start Chrome:
plugins/chrome/skills/cdp-chrome/scripts/start.sh
The script creates profile_dir, refuses occupied/wrong-user/wrong-profile listeners on macOS, and starts normal GUI Chrome without --enable-automation.
Log in manually to needed sites. Sessions persist in the configured profile.
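Before starting Chrome it can help to know whether anything is already listening on the configured port. The sketch below is an assumption-laden illustration and not what doctor.sh or start.sh actually run: the helper name `port_is_listening` is hypothetical, and it only detects a listener. It cannot tell which OS user or profile owns it, which is the part the scripts above are described as verifying.

```python
import socket


def port_is_listening(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
```

A `True` result for the configured port means a listener exists; ownership and profile still need to be confirmed with doctor.sh.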
Claude Code and Codex: installing the chrome plugin provides cdp-chrome through plugin-local .mcp.json. Claude Code documents ${CLAUDE_PLUGIN_ROOT} for plugin MCP paths; current Codex plugin loading has been verified to start plugin MCP entries with cwd: "." at the installed plugin root. The shared .mcp.json uses a small shell launcher to support both cases, then runs skills/cdp-chrome/scripts/mcp-launcher.sh; that launcher reads the current user's config, validates an existing listener when possible, then execs:
npx -y chrome-devtools-mcp@latest --browserUrl http://127.0.0.1:<port> --no-usage-statistics \
--no-category-performance --no-category-emulation --no-category-network
Hermes: plugin-local MCP config is not auto-loaded. Register an equivalent mcp_servers.cdp-chrome manually and point it at this plugin's mcp-launcher.sh or at the same chrome-devtools-mcp command with your configured port. Restart/reload MCP after config changes.
Do not create a duplicate project-level ./.mcp.json for the same server; duplicate MCP definitions can connect to different ports.
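For a manual registration such as the Hermes case, the command line shown above can be assembled from the configured port. This is a small sketch; the function name `mcp_command` is hypothetical, and the flags are copied from the command in this document rather than checked against any particular chrome-devtools-mcp release.

```python
def mcp_command(port):
    """Build the chrome-devtools-mcp argv for a given CDP port (hypothetical helper)."""
    port = int(port)
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    return [
        "npx", "-y", "chrome-devtools-mcp@latest",
        "--browserUrl", f"http://127.0.0.1:{port}",
        "--no-usage-statistics",
        "--no-category-performance",
        "--no-category-emulation",
        "--no-category-network",
    ]
```

Returning a list instead of a single string keeps the arguments safe to pass to a process launcher without shell quoting.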
This skill is only satisfied when the agent is operating on the configured shared CDP endpoint. Similar-looking tools such as mcp__chrome_devtools__*, Playwright, Puppeteer, or browser-use are not substitutes unless they are explicitly registered as the cdp-chrome server for this plugin and proven to use the configured http://127.0.0.1:<port> endpoint. They can silently attach to a different browser or target, even when their API is based on Chrome DevTools.
If the expected cdp-chrome MCP namespace is not exposed, do not guess with another browser tool. First run doctor.sh, then use the configured endpoint directly:
curl -s "http://127.0.0.1:<port>/json/list"
Pick the intended target from /json/list and operate through its webSocketDebuggerUrl, or report that the cdp-chrome MCP namespace is missing. A page list from any other tool is not proof that this skill is attached to the shared Chrome.
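Choosing a target from the /json/list response can be sketched as follows. The helper name `pick_target` and the matching rule (a URL substring, exactly one match) are assumptions for illustration; the `type`, `url`, and `webSocketDebuggerUrl` fields are the ones Chrome returns for each entry.

```python
def pick_target(targets, url_contains):
    """Return the webSocketDebuggerUrl of the single page whose URL matches.

    `targets` is the parsed JSON array from /json/list (hypothetical helper).
    """
    pages = [
        t for t in targets
        if t.get("type") == "page" and url_contains in t.get("url", "")
    ]
    if len(pages) != 1:
        # Refuse to guess: zero or several matches both need a human decision.
        raise LookupError(f"expected exactly one page matching {url_contains!r}, found {len(pages)}")
    return pages[0]["webSocketDebuggerUrl"]
```

Raising on zero or multiple matches follows the rule above: report the problem instead of attaching to an unintended target.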
- Use only cdp-chrome (mcp__cdp-chrome__* in Claude/Codex, mcp_cdp_chrome_* style in Hermes), or the direct configured CDP endpoint fallback above. Do not fall back to other Chrome/Playwright/Puppeteer MCP tools; they may launch automated Chrome or attach to a different Chrome target.
- Run start.sh if the configured instance is not running.
- Run doctor.sh when setup changed or when connection errors occur.
- Use take_screenshot (~800–1,600 vision tokens) to see the layout. Do NOT call take_snapshot for this purpose: its A11Y text tree costs 10K–540K chars (2.5K–135K text tokens) on complex pages and often exceeds tool limits. After the screenshot gives you spatial understanding, use evaluate_script for precise extraction/action.
- Use take_snapshot for simple pages only: login forms, settings panels, confirmation dialogs, that is, pages where the A11Y tree is expected to be < 5K chars. For anything else, screenshot + evaluate_script is both cheaper and more effective.
- Bound evaluate_script results. When writing extraction JS, truncate or paginate output in-script (e.g. .slice(0, 100) for arrays, .slice(0, 8000) for text). Do not return unbounded DOM content or full page text.
Check the configured endpoint:
PORT=$(python3 - <<'PY'
import json, os
# Config path: $APPDATA/steroids.json if APPDATA is set, else $HOME/.config/steroids.json
p=os.path.join(os.environ.get('APPDATA', os.path.join(os.environ['HOME'], '.config')), 'steroids.json')
try:
    print(json.load(open(os.path.expanduser(os.path.expandvars(p)))).get('cdp-chrome', {}).get('port', 9224))
except FileNotFoundError:
    # No config file: fall back to the default port
    print(9224)
PY
)
curl -s "http://127.0.0.1:$PORT/json/version"
curl -s "http://127.0.0.1:$PORT/json/list"
Red flags: another OS user owns the port; the process args lack the configured --user-data-dir; the process args include --enable-automation or --remote-debugging-pipe; a temporary puppeteer_dev_chrome_profile-* profile is in use; or sites log out unexpectedly. If any of these appear, stop and fix the config or MCP registration.
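The argument-based red flags can be checked mechanically once the Chrome process arguments are known. The sketch below assumes the arguments are already available as a list of strings (how to obtain them is platform-specific and not shown), and the function name `red_flags` is hypothetical. Port ownership and unexpected logouts are not covered.

```python
def red_flags(args, profile_dir):
    """Return human-readable warnings for a Chrome argv (hypothetical helper)."""
    found = []
    if f"--user-data-dir={profile_dir}" not in args:
        found.append("missing configured --user-data-dir")
    for bad in ("--enable-automation", "--remote-debugging-pipe"):
        # Match the bare flag or a flag=value form.
        if any(a == bad or a.startswith(bad + "=") for a in args):
            found.append(f"unexpected {bad}")
    if any("puppeteer_dev_chrome_profile-" in a for a in args):
        found.append("temporary puppeteer profile in use")
    return found
```

An empty list means none of the argument-based red flags were seen; any entry is a reason to stop and fix the setup.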
See references/page-interaction.md for detailed patterns and examples.
Tool selection guide:
| Purpose | Tool | Token cost | Notes |
|------|------|-----------|------|
| Understand page layout | take_screenshot | ~800–1,600 vision tokens | First choice for complex/unknown pages |
| Precise extraction/action | evaluate_script | ~650 text tokens (controllable) | Primary tool |
| Get element UIDs | take_snapshot | 2.5K–135K text tokens | Simple pages only |
| Navigation | navigate_page | ~190 text tokens | - |
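The snapshot figures in this guide imply roughly four characters per text token (10K chars ≈ 2.5K tokens, 540K chars ≈ 135K tokens). The sketch below turns that ratio and the 5K-character threshold into a simple decision helper. The ratio is inferred from the guide's own numbers, not a measured tokenizer value, and both function names are hypothetical.

```python
CHARS_PER_TOKEN = 4          # assumption inferred from the guide's figures
SNAPSHOT_CHAR_LIMIT = 5_000  # "simple page" threshold stated in the guide


def estimate_text_tokens(chars):
    """Rough text-token estimate for a given character count."""
    return chars // CHARS_PER_TOKEN


def choose_inspection_tool(expected_a11y_chars):
    """Pick a tool based on the expected size of the A11Y tree."""
    if expected_a11y_chars < SNAPSHOT_CHAR_LIMIT:
        return "take_snapshot"
    return "take_screenshot + evaluate_script"
```

The expected A11Y size is usually a judgment call made before calling any tool, so the helper mainly documents the threshold rather than automating the choice.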