toolkit/packages/skills/webapp-testing/SKILL.md
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.
npx skillsauth add stevengonsalvez/agents-in-a-box webapp-testing
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
To test local web applications, write native Python Playwright scripts.
Helper Scripts Available:
scripts/with_server.py - Manages server lifecycle (supports multiple servers)
Always run scripts with --help first to see usage. DO NOT read the source until you have tried running the script and found that a customized solution is absolutely necessary. These scripts can be very large and can pollute your context window. They exist to be called directly as black-box scripts rather than ingested into your context window.
The {{HOME_TOOL_DIR}}/skills/webapp-testing/bin/browser-tools utility provides lightweight, context-rot-proof browser automation using the Chrome DevTools Protocol directly (no MCP overhead).
Available Commands:
# Launch browser and get connection details
/.claude/skills/webapp-testing/bin/browser-tools start [--port PORT] [--headless]
# Navigate to URL
/.claude/skills/webapp-testing/bin/browser-tools nav <url> [--wait-for {load|networkidle|domcontentloaded}]
# Evaluate JavaScript
/.claude/skills/webapp-testing/bin/browser-tools eval "<javascript code>"
# Take screenshot
/.claude/skills/webapp-testing/bin/browser-tools screenshot <output.png> [--full-page] [--selector CSS_SELECTOR]
# Interactive element picker (returns selectors)
/.claude/skills/webapp-testing/bin/browser-tools pick
# Get console logs
/.claude/skills/webapp-testing/bin/browser-tools console [--level {log|warn|error|all}]
# Search page content
/.claude/skills/webapp-testing/bin/browser-tools search "<query>" [--case-sensitive]
# Extract page content (markdown, links, text)
/.claude/skills/webapp-testing/bin/browser-tools content [--format {markdown|links|text}]
# Get/set cookies
/.claude/skills/webapp-testing/bin/browser-tools cookies [--set NAME=VALUE] [--domain DOMAIN]
# Inspect element details
/.claude/skills/webapp-testing/bin/browser-tools inspect <css-selector>
# Terminate browser session
/.claude/skills/webapp-testing/bin/browser-tools kill
When to Use Browser-Tools vs Playwright:
✅ Use browser-tools when:
pick command)
✅ Use Playwright when:
Example Workflow:
# 1. Start browser pointed at your app
/.claude/skills/webapp-testing/bin/browser-tools start --port 9222
# 2. Navigate to the page
/.claude/skills/webapp-testing/bin/browser-tools nav http://localhost:3000
# 3. Use interactive picker to find selectors
/.claude/skills/webapp-testing/bin/browser-tools pick
# Click on elements in the browser, get their selectors
# 4. Inspect specific elements
/.claude/skills/webapp-testing/bin/browser-tools inspect "button.submit"
# 5. Take screenshot for documentation
/.claude/skills/webapp-testing/bin/browser-tools screenshot /tmp/page.png --full-page
# 6. Check console for errors
/.claude/skills/webapp-testing/bin/browser-tools console --level error
# 7. Clean up
/.claude/skills/webapp-testing/bin/browser-tools kill
Benefits:
User task → Is it static HTML?
    ├─ Yes → Read HTML file directly to identify selectors
    │         ├─ Success → Write Playwright script using selectors
    │         └─ Fails/Incomplete → Treat as dynamic (below)
    │
    └─ No (dynamic webapp) → Is the server already running?
        ├─ No → Run: python scripts/with_server.py --help
        │        Then use the helper + write simplified Playwright script
        │
        └─ Yes → Reconnaissance-then-action:
            1. Navigate and wait for networkidle
            2. Take screenshot or inspect DOM
            3. Identify selectors from rendered state
            4. Execute actions with discovered selectors
To start a server, run --help first, then use the helper:
Single server:
python scripts/with_server.py --server "npm run dev" --port 3000 -- python your_automation.py
Multiple servers (e.g., backend + frontend):
python scripts/with_server.py \
--server "cd backend && python server.py" --port 8000 \
--server "cd frontend && npm run dev" --port 3000 \
-- python your_automation.py
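The helper is meant to be used as a black box, so the following is not its implementation. It is only a minimal sketch of the readiness check that any server-lifecycle wrapper needs before handing control to the automation script. The name `wait_for_port` and its parameters are assumptions made for illustration.

```python
import socket
import time


def wait_for_port(port: int, host: str = "localhost", timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; not part of scripts/with_server.py.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.1)  # Not listening yet; retry shortly.
    return False
```

A listening port only shows that the process accepts connections, so waiting for networkidle inside the Playwright script remains necessary.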
To create an automation script, include only Playwright logic (servers are managed automatically):
from playwright.sync_api import sync_playwright

APP_PORT = 3000  # Match the port from --port argument

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)  # Always launch chromium in headless mode
    page = browser.new_page()
    page.goto(f'http://localhost:{APP_PORT}')  # Server already running and ready
    page.wait_for_load_state('networkidle')  # CRITICAL: Wait for JS to execute
    # ... your automation logic
    browser.close()
Inspect rendered DOM:
page.screenshot(path='/tmp/inspect.png', full_page=True)
content = page.content()
page.locator('button').all()
Identify selectors from inspection results
Execute actions using discovered selectors
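As one possible way to carry out the "identify selectors" step, the string returned by `page.content()` can be scanned offline with the standard library. This is a rough sketch under that assumption; `ButtonCollector` and `candidate_button_selectors` are hypothetical names, and the id-then-class preference is an illustrative choice rather than anything the skill prescribes.

```python
from html.parser import HTMLParser


class ButtonCollector(HTMLParser):
    """Collect one candidate CSS selector per <button> start tag."""

    def __init__(self) -> None:
        super().__init__()
        self.selectors: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "button":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if attr.get("id"):
            # An id is usually the most specific handle available.
            self.selectors.append(f"button#{attr['id']}")
        elif classes:
            # Fall back to the first class name only.
            self.selectors.append(f"button.{classes[0]}")
        else:
            self.selectors.append("button")


def candidate_button_selectors(html: str) -> list[str]:
    """Return candidate selectors for all buttons in an HTML string."""
    collector = ButtonCollector()
    collector.feed(html)
    return collector.selectors
```

In a script this would be called as `candidate_button_selectors(page.content())` after the networkidle wait, and the results checked against the screenshot before acting on them.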
❌ Don't inspect the DOM before waiting for networkidle on dynamic apps
✅ Do wait for page.wait_for_load_state('networkidle') before inspection
scripts/ can help. These scripts handle common, complex workflows reliably without cluttering the context window. Use --help to see usage, then invoke directly.
sync_playwright() for synchronous scripts
text=, role=, CSS selectors, or IDs
page.wait_for_selector() or page.wait_for_timeout()
The skill now includes comprehensive utilities for common testing patterns:
utils/ui_interactions.py
Handle common UI patterns automatically:
from utils.ui_interactions import (
    dismiss_cookie_banner,
    dismiss_modal,
    click_with_header_offset,
    force_click_if_needed,
    wait_for_no_overlay,
    wait_for_stable_dom
)
# Dismiss cookie consent
dismiss_cookie_banner(page)
# Close welcome modal
dismiss_modal(page, modal_identifier="Welcome")
# Click button behind fixed header
click_with_header_offset(page, 'button#submit', header_height=100)
# Try click with force fallback
force_click_if_needed(page, 'button#action')
# Wait for loading overlays to disappear
wait_for_no_overlay(page)
# Wait for DOM to stabilize
wait_for_stable_dom(page)
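The source of `force_click_if_needed` is not shown here, so the following is only a sketch of the general "try a normal click, then fall back to a forced click" idea. It assumes a Playwright `page` object, whose `click()` accepts `force` and `timeout` keyword arguments; the function name `click_with_force_fallback` is hypothetical.

```python
def click_with_force_fallback(page, selector: str, timeout_ms: int = 5000) -> bool:
    """Click selector; if the normal click fails, retry with force=True.

    Returns True when the forced fallback was needed. Illustrative only;
    not the implementation in utils/ui_interactions.py.
    """
    try:
        page.click(selector, timeout=timeout_ms)
        return False
    except Exception:
        # A normal click can fail when actionability checks do not pass,
        # for example when an overlay intercepts pointer events.
        page.click(selector, force=True, timeout=timeout_ms)
        return True
```

A forced click skips the checks that protect against clicking hidden or covered elements, so it is best kept as a last resort and the returned flag logged.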
utils/form_helpers.py
Intelligently handle form variations:
from utils.form_helpers import (
    SmartFormFiller,
    handle_multi_step_form,
    auto_fill_form
)
# Works with both "Full Name" and "First/Last Name" fields
filler = SmartFormFiller()
filler.fill_name_field(page, "Jane Doe")
filler.fill_email_field(page, "[email protected]")
filler.fill_password_fields(page, "SecurePass123!")
filler.fill_phone_field(page, "+447700900123")
filler.fill_date_field(page, "1990-01-15", field_hint="birth")
# Auto-fill entire form
results = auto_fill_form(page, {
    'email': '[email protected]',
    'password': 'Pass123!',
    'full_name': 'Test User',
    'phone': '+447700900123',
    'date_of_birth': '1990-01-15'
})
# Handle multi-step forms
steps = [
    {'fields': {'email': '[email protected]', 'password': 'Pass123!'}, 'checkbox': True},
    {'fields': {'full_name': 'Test User', 'date_of_birth': '1990-01-15'}},
    {'complete': True}
]
handle_multi_step_form(page, steps)
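Supporting both a single "Full Name" field and separate "First/Last Name" fields implies splitting one name string when the form has two inputs. How `SmartFormFiller` does this is not shown, so the sketch below is an assumption: a hypothetical `split_full_name` helper that treats the first word as the first name and everything after it as the last name.

```python
def split_full_name(full_name: str) -> tuple[str, str]:
    """Split a full name into (first, last) on whitespace.

    Hypothetical helper; a single-word name yields an empty last name.
    """
    parts = full_name.strip().split()
    if not parts:
        return "", ""
    if len(parts) == 1:
        return parts[0], ""
    # Everything after the first word is treated as the last name.
    return parts[0], " ".join(parts[1:])


first, last = split_full_name("Jane Doe")
```

This simple rule mishandles names where the given name has several words, so test data with a plain two-word name keeps results predictable.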
utils/supabase.py
Database operations for Supabase-based apps:
from utils.supabase import SupabaseTestClient, quick_cleanup
# Initialize client
client = SupabaseTestClient(
    url="https://project.supabase.co",
    service_key="your-service-role-key",
    db_password="your-db-password"
)
# Create test user
user_id = client.create_user("[email protected]", "password123")
# Create invite code
client.create_invite_code("TEST2024", code_type="general")
# Bypass email verification
client.confirm_email(user_id)
# Cleanup after test
client.cleanup_related_records(user_id)
client.delete_user(user_id)
# Quick cleanup helper
quick_cleanup("[email protected]", "db_password", "https://project.supabase.co")
utils/wait_strategies.py
Better alternatives to simple sleep():
from utils.wait_strategies import (
    wait_for_api_call,
    wait_for_element_stable,
    smart_navigation_wait,
    combined_wait
)
# Wait for specific API response
response = wait_for_api_call(page, '**/api/profile**')
# Wait for element to stop moving
wait_for_element_stable(page, '.dropdown-menu', stability_ms=1000)
# Smart navigation with URL check
page.click('button#login')
smart_navigation_wait(page, expected_url_pattern='**/dashboard**')
# Comprehensive wait (network + DOM + overlays)
combined_wait(page)
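The common thread in these helpers is waiting on a condition instead of sleeping for a fixed time. A generic version of that idea, independent of the helpers above, might look like the sketch below. `wait_until` is a hypothetical name, and the default timeout and polling interval are arbitrary.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Poll condition() until it returns True or the timeout elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # Short pause between checks instead of one long sleep.
```

With a Playwright page, a call such as `wait_until(lambda: page.locator('.spinner').count() == 0)` would return as soon as the spinner is gone; the `.spinner` selector is just an example.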
utils/smart_selectors.py ⭐ NEW
Automatically try multiple selector strategies to find elements, reducing test brittleness:
from utils.smart_selectors import SelectorStrategies
# Find and fill email field (tries 7 different selector strategies)
success = SelectorStrategies.smart_fill(page, 'email', '[email protected]')
# Output: ✓ Found field via placeholder: input[placeholder*="email" i]
# ✓ Filled 'email' with value
# Find and click button (tries 8 different strategies)
success = SelectorStrategies.smart_click(page, 'Sign In')
# Output: ✓ Found button via case-insensitive text: button:text-matches("Sign In", "i")
# ✓ Clicked 'Sign In' button
# Manual control - find input field selector
selector = SelectorStrategies.find_input_field(page, 'password')
if selector:
    page.fill(selector, 'my-password')
# Manual control - find button selector
selector = SelectorStrategies.find_button(page, 'Submit')
if selector:
    page.click(selector)
# Try custom selectors with fallback
selectors = ['button#submit', 'button.submit-btn', 'input[type="submit"]']
selector = SelectorStrategies.find_any_element(page, selectors)
Selector Strategies Tried (in order):
For input fields:
1. [data-testid*="email"]
2. input[aria-label*="email"]
3. input[placeholder*="email"]
4. input[name*="email"]
5. input[type="email"]
6. #email
7. input[id*="email"]
For buttons:
1. [data-testid*="sign-in"]
2. button[name*="Sign In"]
3. button:has-text("Sign In")
4. button:text-matches("Sign In", "i")
5. a:has-text("Sign In")
6. input[type="submit"][value*="Sign In"]
7. [role="button"]:has-text("Sign In")
When to use:
Performance:
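The ordered fallback described in this section can be sketched without a browser. The code below is not the `SelectorStrategies` implementation. It builds the input-field candidates in the listed order and returns the first one for which a caller-supplied `exists` check succeeds. Both function names are hypothetical.

```python
from typing import Callable, Iterable, Optional


def input_selector_candidates(field: str) -> list[str]:
    """Candidate selectors for an input field, in the documented order."""
    return [
        f'[data-testid*="{field}"]',
        f'input[aria-label*="{field}"]',
        f'input[placeholder*="{field}"]',
        f'input[name*="{field}"]',
        f'input[type="{field}"]',
        f'#{field}',
        f'input[id*="{field}"]',
    ]


def first_matching(candidates: Iterable[str], exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first selector for which exists(selector) is True, else None."""
    for selector in candidates:
        if exists(selector):
            return selector
    return None
```

With Playwright, `exists` could be `lambda s: page.locator(s).count() > 0`, which keeps the ordering logic testable without launching a browser.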
utils/browser_config.py ⭐ NEW
Auto-configure browser context for testing environments:
from utils.browser_config import BrowserConfig
# Auto-detect CSP bypass for localhost
context = BrowserConfig.create_test_context(
    browser,
    'http://localhost:3000'
)
# Output:
# ============================================================
# Browser Context Configuration
# ============================================================
# Base URL: http://localhost:3000
# 🔓 CSP bypass: ENABLED (testing on localhost)
# ⚠️ HTTPS errors: IGNORED (self-signed certs OK)
# 📐 Viewport: 1280x720
# ============================================================
# Production testing (no CSP bypass)
context = BrowserConfig.create_test_context(
    browser,
    'https://production.example.com'
)
# Output: 🔒 CSP bypass: DISABLED (production mode)
# Mobile device emulation
context = BrowserConfig.create_mobile_context(
    browser,
    device='iPhone 12',
    base_url='http://localhost:3000'
)
# Output: 📱 Mobile context: iPhone 12
# Viewport: {'width': 390, 'height': 844}
# 🔓 CSP bypass: ENABLED
# Manual override
context = BrowserConfig.create_test_context(
    browser,
    'http://localhost:3000',
    bypass_csp=False,  # Force disable even for localhost
    record_video=True,  # Record test session
    extra_http_headers={'Authorization': 'Bearer token'}
)
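The localhost auto-detection shown in the outputs above could be approximated as follows. This is a guess at the decision rule, not the code in `utils/browser_config.py`: the host set and the function name `should_bypass_csp` are assumptions.

```python
from urllib.parse import urlparse

# Hosts treated as local for this sketch (an assumption).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def should_bypass_csp(base_url: str) -> bool:
    """Return True when base_url points at a local development host."""
    host = urlparse(base_url).hostname or ""
    return host in LOCAL_HOSTS
```

The result can be passed straight to Playwright, for example `browser.new_context(bypass_csp=should_bypass_csp(url))`, so that remote targets keep their Content Security Policy enforced.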
Features:
When to use:
browser.new_context())
utils/ui_interactions.py ⭐ NEW
Auto-detect and suggest fixes for Content Security Policy violations:
from utils.ui_interactions import setup_page_with_csp_handling
from utils.browser_config import BrowserConfig
context = BrowserConfig.create_test_context(browser, 'http://localhost:7160')
page = context.new_page()
# Enable CSP violation monitoring
setup_page_with_csp_handling(page)
page.goto('http://localhost:7160')
# If CSP violation occurs, you'll see:
# ======================================================================
# ⚠️ CSP VIOLATION DETECTED
# ======================================================================
# Message: Refused to execute inline script because it violates...
#
# 💡 SUGGESTION:
# For localhost testing, use:
#
# from utils.browser_config import BrowserConfig
# context = BrowserConfig.create_test_context(
# browser, 'http://localhost:3000'
# )
# # Auto-enables CSP bypass for localhost
#
# Or manually:
# context = browser.new_context(bypass_csp=True)
# ======================================================================
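One plausible way to detect such violations is to watch console messages for the wording Chromium uses when it blocks a resource. The sketch below is an assumption about the approach, not the code behind `setup_page_with_csp_handling`; the marker strings and the name `is_csp_violation` are illustrative.

```python
# Substrings that typically appear in Chromium CSP error messages (assumed markers).
CSP_MARKERS = ("content security policy", "violates the following")


def is_csp_violation(message_text: str) -> bool:
    """Return True if a console message looks like a CSP violation report."""
    text = message_text.lower()
    return any(marker in text for marker in CSP_MARKERS)


def report_if_csp(message_text: str) -> None:
    """Print a short hint when the message looks like a CSP violation."""
    if is_csp_violation(message_text):
        print("CSP violation detected:", message_text)
```

With Playwright this could be attached as `page.on("console", lambda msg: report_if_csp(msg.text))` before calling `page.goto()`.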
When to use:
See examples/multi_step_registration.py for a complete example showing:
Run it:
python examples/multi_step_registration.py
A specialized subagent is available for testing automation. Use it to keep your main conversation focused on development:
You: "Use webapp-testing agent to register [email protected] and verify the parent role switch works"
Main Agent: [Launches webapp-testing subagent]
Webapp-Testing Agent: [Runs complete automation, returns results]
Benefits:
examples/ - Examples showing common patterns:
element_discovery.py - Discovering buttons, links, and inputs on a page
static_html_automation.py - Using file:// URLs for local HTML
console_logging.py - Capturing console logs during automation
multi_step_registration.py - Complete registration flow example (NEW)
utils/ - Reusable utility modules (NEW):
ui_interactions.py - Cookie banners, modals, overlays, stable waits
form_helpers.py - Smart form filling, multi-step automation
supabase.py - Database operations for Supabase apps
wait_strategies.py - Advanced waiting patterns