.claude/skills/browser-testing-with-screenshots/SKILL.md
Use when testing web applications with visual verification - automates Chrome browser interactions, element selection, and screenshot capture for confirming UI functionality
npx skillsauth add agentworkforce/relay browser-testing-with-screenshotsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Automate Chrome browser testing with visual verification using browser-tools. Connect to Chrome DevTools Protocol for navigation, interaction, and screenshot capture to confirm application functionality.
REQUIRED: Install agent-tools from https://github.com/badlogic/agent-tools
# Clone and install agent-tools
git clone https://github.com/badlogic/agent-tools.git
cd agent-tools
# Follow installation instructions in the repository
# Ensure all executables (browser-start.js, browser-nav.js, etc.) are in your PATH
Verify installation:
# Check that browser tools are available
which browser-start.js
which browser-nav.js
which browser-screenshot.js
All browser-* commands referenced in this skill come from the agent-tools repository and must be properly installed and accessible in your system PATH.
Use this skill when:
Don't use for:
| Task | Command | Purpose |
|------|---------|---------|
| Start browser | browser-start.js | Launch Chrome with debugging |
| Navigate | browser-nav.js http://localhost:5172/dashboard | Go to specific URL |
| Take screenshot | browser-screenshot.js | Capture current viewport |
| Pick elements | browser-pick.js "Select the login button" | Interactive element selection |
| Run JavaScript | browser-eval.js 'document.title' | Execute code in page context |
| Extract content | browser-content.js | Get readable page content |
| View cookies | browser-cookies.js | List session cookies |
# Launch Chrome with debugging enabled (preserves user profile)
browser-start.js
# Or start fresh (no cookies, clean state)
browser-start.js --fresh
Expected Result: Chrome opens on port 9222 with DevTools Protocol enabled
# Go to your application starting point
browser-nav.js http://localhost:5172/dashboard
Verify: Browser navigates to dashboard page
# Take initial screenshot to confirm page loaded
browser-screenshot.js
Output: Returns path to screenshot file (e.g., screenshot_20231203_141532.png)
#!/bin/bash
# Test login and dashboard functionality
echo "🚀 Starting browser test..."
# 1. Launch browser
browser-start.js --fresh
# 2. Navigate to login page
browser-nav.js http://localhost:5172/login
sleep 2
# 3. Take screenshot of login page
LOGIN_SHOT=$(browser-screenshot.js)
echo "📸 Login page: $LOGIN_SHOT"
# 4. Fill login form (interactive element picking)
browser-pick.js "Click the username field"
browser-eval.js 'document.activeElement.value = "testuser"'
browser-pick.js "Click the password field"
browser-eval.js 'document.activeElement.value = "password123"'
# 5. Screenshot filled form
FORM_SHOT=$(browser-screenshot.js)
echo "📸 Filled form: $FORM_SHOT"
# 6. Submit form
browser-pick.js "Click the login button"
sleep 3
# 7. Verify dashboard loaded
browser-nav.js http://localhost:5172/dashboard
DASHBOARD_SHOT=$(browser-screenshot.js)
echo "📸 Dashboard: $DASHBOARD_SHOT"
# 8. Verify specific dashboard elements
browser-pick.js "Select the navigation menu"
browser-eval.js 'console.log("Navigation found:", !!document.querySelector(".nav"))'
echo "✅ Test complete. Screenshots saved."
# Interactive element selection (best for dynamic content)
browser-pick.js "Select the submit button"
# User clicks element in browser → returns CSS selector
# Use returned selector for automation
SELECTOR=$(browser-pick.js "Select the submit button" | grep "selector:")
browser-eval.js "document.querySelector('$SELECTOR').click()"
# Take screenshot to verify action
browser-screenshot.js
# Check if element exists before interaction
browser-eval.js 'document.querySelector("#login-form") !== null'
# Wait for dynamic content
browser-eval.js '
new Promise(resolve => {
const check = () => {
if (document.querySelector(".loaded")) resolve(true);
else setTimeout(check, 100);
};
check();
})
'
# Extract form data
browser-eval.js 'JSON.stringify(Object.fromEntries(new FormData(document.querySelector("form"))))'
# Navigate and wait before screenshot
browser-nav.js http://localhost:5172/slow-page
sleep 5 # Wait for animations/loading
browser-screenshot.js
# Get page title
PAGE_TITLE=$(browser-eval.js 'document.title')
echo "Current page: $PAGE_TITLE"
# Extract readable content
browser-content.js > page_content.md
# Check for specific text
browser-eval.js 'document.body.textContent.includes("Welcome to Dashboard")'
| Mistake | Problem | Solution |
|---------|---------|----------|
| No sleep after navigation | Screenshots of loading page | Add sleep 2-5 after nav |
| Hardcoded selectors | Breaks when UI changes | Use browser-pick.js for selection |
| Missing Chrome setup | "Connection refused" errors | Run browser-start.js first |
| Wrong localhost port | Navigation fails | Verify application is running on correct port |
| Screenshot timing | Captures before content loads | Wait for page load or specific elements |
| Not preserving state | Login lost between commands | Use default profile, not --fresh |
# Check if Chrome is running
if ! browser-eval.js 'true' 2>/dev/null; then
echo "❌ Chrome not connected. Running browser-start.js..."
browser-start.js
fi
# Verify navigation succeeded
if browser-eval.js 'location.href.includes("dashboard")'; then
echo "✅ Navigation successful"
else
echo "❌ Navigation failed"
exit 1
fi
screenshot_YYYYMMDD_HHMMSS.png in current directorybrowser-content.jsbrowser-pick.js interactionbrowser-eval.js# Create test evidence directory
mkdir -p test-results/$(date +%Y%m%d_%H%M%S)
cd test-results/$(date +%Y%m%d_%H%M%S)
# Run tests with organized screenshots
browser-start.js
for page in login dashboard profile; do
browser-nav.js "http://localhost:5172/$page"
sleep 2
screenshot=$(browser-screenshot.js)
mv "$screenshot" "${page}_page.png"
echo "✅ $page page tested"
done
Benefits:
Results:
Key principle: Combine automated navigation with human element selection for robust, maintainable browser testing.
development
Run headless multi-agent orchestration sessions via Agent Relay. Use when spawning teams of agents, creating channels for coordination, managing agent lifecycle, and running parallel workloads across Claude/Codex/Gemini/Pi/Droid agents.
development
Use when you need Codex to coordinate multiple agents through Relaycast for peer-to-peer messaging, lead/worker handoffs, or shared status tracking across sub-agents and terminals.
development
Real-time messaging across OpenClaw instances (channels, DMs, threads, reactions, search).
development
Use when building multi-agent workflows with the relay broker-sdk - covers the WorkflowBuilder API, DAG step dependencies, agent definitions, step output chaining via {{steps.X.output}}, verification gates, evidence-based completion, owner decisions, dedicated channels, dynamic channel management (subscribe/unsubscribe/mute/unmute), swarm patterns, error handling, event listeners, step sizing rules, authoring best practices, and the lead+workers team pattern for complex steps