skills/agent-browser/SKILL.md
--- Skill name: agent-browser Skill description: Debug visual bugs and interact with web apps using agent-browser CLI. Use when debugging, inspecting, navigating, filling forms, clicking buttons, taking screenshots, scraping data, testing web apps, or automating browser tasks. Supports desktop browsers and iOS Simulator (Mobile Safari). 93% less context than Playwright MCP. roles: [frontend] allowed-tools: Bash(npx agent-browser:*), Bash(agent-browser:*) --- # Agent Browser Skill Fast, token-e
npx skillsauth add abhiroopb/synthetic-mind skills/agent-browserInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
4 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Fast, token-efficient browser automation for debugging and interaction.
STOP if agent-browser is not installed. See SETUP.md for installation.
agent-browser open <url>agent-browser snapshot -i --json — returns interactive elements with refs (@e1, @e2)Commands can be chained with && when you don't need intermediate output: agent-browser open <url> && agent-browser wait --load networkidle && agent-browser snapshot -i
Run commands separately when you need to parse output first (e.g., snapshot to discover refs, then interact).
# Navigation
agent-browser open <url> # Navigate (aliases: goto, navigate)
agent-browser back / forward / reload
agent-browser close
# Snapshot
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -i -C # Include cursor-interactive elements
agent-browser snapshot -i --json # Structured output (best for agents)
agent-browser snapshot -c -d 3 # Compact, depth-limited
agent-browser snapshot -s "#main" # Scope to CSS selector
# Interact (use @refs from snapshot)
agent-browser click @e1 # Click (--new-tab to open in new tab)
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key (Enter, Tab, Control+a)
agent-browser hover / check / uncheck / select @e1 "value"
agent-browser scroll down 500
agent-browser upload @e1 file.png # Upload files
# Get information
agent-browser get text @e1 / get value @e1
agent-browser get title / get url
# Screenshots
agent-browser screenshot # Base64 PNG to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full # Full page
agent-browser screenshot --annotate # Numbered labels on elements (vision mode)
agent-browser pdf output.pdf
# Wait
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/dash" # Wait for URL pattern
agent-browser wait --fn "expr" # Wait for JS condition
Refs (@e1, @e2) are invalidated when the page changes. Always re-snapshot after clicking links, form submissions, or dynamic content loading.
agent-browser click @e5 # Navigates to new page
agent-browser snapshot -i # MUST re-snapshot — old refs are gone
agent-browser click @e1 # Now use new refs
Investigate visual bug:
agent-browser open http://localhost:3000
agent-browser snapshot -i --json
agent-browser screenshot bug.png
agent-browser close
Form submission:
agent-browser open https://example.com/form
agent-browser snapshot -i --json
# e1=Email, e2=Password, e3=Submit
agent-browser fill @e1 "[email protected]" && agent-browser fill @e2 "secret123"
agent-browser click @e3
agent-browser wait --load networkidle && agent-browser snapshot -i --json
Save and restore auth state:
agent-browser open https://app.example.com/login && agent-browser snapshot -i --json
agent-browser fill @e1 "username" && agent-browser fill @e2 "secret" && agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Later: agent-browser state load auth.json
iOS Simulator (Mobile Safari):
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
agent-browser -p ios snapshot -i
agent-browser -p ios tap @e1 # Tap (alias for click)
agent-browser -p ios swipe up # Mobile gesture
agent-browser -p ios close
| File | Load when | |------|-----------| | SETUP.md | First-time install, iOS Simulator setup | | references/annotated-screenshots.md | Visual debugging, unlabeled buttons, canvas/chart elements | | references/advanced-interactions.md | Tabs, frames, semantic locators, dialogs, network mocking, browser settings | | references/javascript-eval.md | Running JS in the browser, extracting data via scripts | | references/sessions.md | Persisting auth, parallel sessions, connecting to existing Chrome | | references/advanced-features.md | Local files, slow pages, profiling, configuration |
playwright — Use when running Playwright e2e test suites, not ad-hoc browser interactiontesting
Track TV shows and movies with Trakt.tv. Search, get watchlist, history, up-next, recommendations, trending, calendar, ratings, stats, add/remove from watchlist, mark watched, rate, and check in. Use when asked about what to watch, TV shows, movies, watch history, or Trakt.
development
Send and receive SMS messages via Twilio API. Used for text message notifications, forwarding important alerts, and two-way SMS communication.
documentation
Organizes files in the local Downloads folder into proper folders. Use when asked to organize, sort, or file downloaded documents.
tools
Book and manage appointments on Sutter Health MyHealth Online portal. Uses browser automation via Playwright MCP to interact with the patient portal.