indigoai-us/agent-browser

Name: agent-browser
Author: indigoai-us

workers/public/qa-tester/skills/agent-browser/SKILL.md

npx skillsauth add indigoai-us/hq-core agent-browser

Browser Automation with agent-browser

Architecture (v0.20+)

100% native Rust, with no Node.js or Playwright dependency. 7 MB install, 8 MB memory. Connects directly to Chromium over the Chrome DevTools Protocol (CDP).

Core Workflow

Every browser automation follows this pattern:

1. Navigate: agent-browser open <url>
2. Snapshot: agent-browser snapshot -i (get element refs like @e1, @e2)
3. Interact: Use refs to click, fill, select
4. Re-snapshot: After navigation or DOM changes, get fresh refs

agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result
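
A script often needs the refs from a snapshot as a plain list. The sketch below pulls them out of snapshot text on stdin. It assumes refs always look like @e followed by digits, as in the sample output above; the function name extract_refs is hypothetical and not part of agent-browser.

```shell
# Hypothetical helper (not part of agent-browser): list the unique element
# refs found in snapshot text read from stdin, one per line, in first-seen order.
# Assumption: refs have the form "@e" followed by one or more digits.
extract_refs() {
  grep -oE '@e[0-9]+' | awk '!seen[$0]++'
}
```

Typical use would be agent-browser snapshot -i | extract_refs, so later steps can iterate over the refs instead of hard-coding them.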

Essential Commands

# Navigation
agent-browser open <url>              # Navigate (aliases: goto, navigate)
agent-browser close                   # Close browser

# Snapshot
agent-browser snapshot -i             # Interactive elements with refs (recommended)
agent-browser snapshot -i -C          # Include cursor-interactive elements (divs with onclick, cursor:pointer)
agent-browser snapshot -s "#selector" # Scope to CSS selector

# Interaction (use @refs from snapshot)
agent-browser click @e1               # Click element
agent-browser click @e1 --new-tab     # Click and open in new tab
agent-browser fill @e2 "text"         # Clear and type text
agent-browser type @e2 "text"         # Type without clearing
agent-browser select @e1 "option"     # Select dropdown option
agent-browser check @e1               # Check checkbox
agent-browser press Enter             # Press key
agent-browser scroll down 500         # Scroll page

# Get information
agent-browser get text @e1            # Get element text
agent-browser get url                 # Get current URL
agent-browser get title               # Get page title

# Wait
agent-browser wait @e1                # Wait for element
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/page"    # Wait for URL pattern
agent-browser wait 2000               # Wait milliseconds

# Capture
agent-browser screenshot              # Screenshot to temp dir
agent-browser screenshot --full       # Full page screenshot
agent-browser pdf output.pdf          # Save as PDF
agent-browser download @e1 ./file     # Download file by clicking element

New in 0.18–0.20

# Chrome DevTools inspector (0.18) — opens DevTools for active page
agent-browser inspect

# CDP WebSocket URL (0.18) — for external debugging tools
agent-browser get cdp-url

# Clipboard access (0.19)
agent-browser clipboard read          # Read clipboard
agent-browser clipboard write "text"  # Write to clipboard
agent-browser clipboard copy          # Ctrl+C
agent-browser clipboard paste         # Ctrl+V

# Screenshot configuration (0.19)
agent-browser screenshot --screenshot-dir ./captures
agent-browser screenshot --screenshot-quality 80
agent-browser screenshot --screenshot-format jpeg
# Also via env vars: AGENT_BROWSER_SCREENSHOT_DIR, AGENT_BROWSER_SCREENSHOT_QUALITY, AGENT_BROWSER_SCREENSHOT_FORMAT

# Browserless.io provider (0.19)
agent-browser --provider browserless open https://example.com
# Env: BROWSERLESS_API_KEY, BROWSERLESS_API_URL

# Brave browser auto-discovery (0.20.7)
# Automatically finds Brave for CDP connections on macOS/Linux/Windows

# Annotated screenshots — bounding boxes + ref labels for vision models
agent-browser screenshot --annotate

# Lightpanda browser engine
agent-browser --engine lightpanda open https://example.com
# or: AGENT_BROWSER_ENGINE=lightpanda

# Dialog handling
agent-browser dialog dismiss

# Retina / device scale factor
agent-browser viewport 1440 900 --scale 2

# Headed mode via env var (no --headed flag needed)
# AGENT_BROWSER_HEADED=1

# Electron webview support — pages list shows target type
agent-browser pages                   # Shows webview targets

Common Patterns

Form Submission

agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle

Authentication with State Persistence

# Login once and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
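
The two halves of this pattern can be folded into one helper that logs in only when no saved state exists. This is a minimal sketch: AB, with_saved_auth, and do_login are hypothetical names, and the refs inside do_login are the ones from the example above, which will differ on a real page.

```shell
# AB is a hypothetical override for the binary name; defaults to agent-browser.
AB="${AB:-agent-browser}"

# Example login steps, copied from the pattern above. The refs (@e1..@e3)
# are assumptions and must match a fresh snapshot of the real login page.
do_login() {
  "$AB" open https://app.example.com/login &&
    "$AB" snapshot -i &&
    "$AB" fill @e1 "$USERNAME" &&
    "$AB" fill @e2 "$PASSWORD" &&
    "$AB" click @e3 &&
    "$AB" wait --url "**/dashboard"
}

# Load saved state if the file exists; otherwise log in and save the state.
with_saved_auth() {
  state_file=$1
  if [ -f "$state_file" ]; then
    "$AB" state load "$state_file"
  else
    do_login && "$AB" state save "$state_file"
  fi
}
```

Calling with_saved_auth auth.json before opening the dashboard covers both cases. The check only tests that the file exists; it does not detect an expired session stored in it.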

Data Extraction

agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5           # Get specific element text
agent-browser get text body > page.txt  # Get all page text

# JSON output for parsing
agent-browser snapshot -i --json
agent-browser get text @e1 --json

Parallel Sessions

agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com

agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i

agent-browser session list
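
With more than a couple of sites, the same commands can be driven from a loop. The sketch below takes name=url pairs and opens each in its own named session. AB and open_sessions are hypothetical names; only the --session, open, and snapshot -i usage comes from the commands above.

```shell
# AB is a hypothetical override for the binary name; defaults to agent-browser.
AB="${AB:-agent-browser}"

# Open each "name=url" argument in its own named session, then snapshot it.
# Sessions are started one after another; each stays isolated by its name.
open_sessions() {
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    url=${pair#*=}     # text after the first "=" (later "=" signs are kept)
    "$AB" --session "$name" open "$url" || return 1
    "$AB" --session "$name" snapshot -i || return 1
  done
}
```

For example, open_sessions site1=https://site-a.com site2=https://site-b.com issues the same four commands as the block above.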

Visual Browser (Debugging)

agent-browser --headed open https://example.com
agent-browser highlight @e1          # Highlight element
agent-browser record start demo.webm # Record session

Local Files (PDFs, HTML)

# Open local files with file:// URLs
agent-browser --allow-file-access open file:///path/to/document.pdf
agent-browser --allow-file-access open file:///path/to/page.html
agent-browser screenshot output.png

iOS Simulator (Mobile Safari)

# List available iOS simulators
agent-browser device list

# Launch Safari on a specific device
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com

# Same workflow as desktop - snapshot, interact, re-snapshot
agent-browser -p ios snapshot -i
agent-browser -p ios tap @e1          # Tap (alias for click)
agent-browser -p ios fill @e2 "text"
agent-browser -p ios swipe up         # Mobile-specific gesture

# Take screenshot
agent-browser -p ios screenshot mobile.png

# Close session (shuts down simulator)
agent-browser -p ios close

Requirements: macOS with Xcode, Appium (npm install -g appium && appium driver install xcuitest)

Real devices: Works with physical iOS devices if pre-configured. Use --device "<UDID>" where UDID is from xcrun xctrace list devices.

Ref Lifecycle (Important)

Refs (@e1, @e2, etc.) are invalidated when the page changes. Always re-snapshot after:

- Clicking links or buttons that navigate
- Form submissions
- Dynamic content loading (dropdowns, modals)

agent-browser click @e5              # Navigates to new page
agent-browser snapshot -i            # MUST re-snapshot
agent-browser click @e1              # Use new refs
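
One way to make the re-snapshot hard to forget is to bundle it with the click. The helper below is a sketch under assumptions: AB and click_and_resnapshot are hypothetical names, and waiting for networkidle is a choice borrowed from the earlier examples, not a requirement stated by the tool.

```shell
# AB is a hypothetical override for the binary name; defaults to agent-browser.
AB="${AB:-agent-browser}"

# Click a ref, wait for the page to settle, then print a fresh snapshot.
# Stops at the first failing step, so stale refs are never reused silently.
click_and_resnapshot() {
  # $1: a ref from the most recent snapshot, e.g. @e5
  "$AB" click "$1" &&
    "$AB" wait --load networkidle &&
    "$AB" snapshot -i
}
```

After click_and_resnapshot @e5, the refs to use next are the ones in the snapshot it just printed.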

Semantic Locators (Alternative to Refs)

When refs are unavailable or unreliable, use semantic locators:

agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@example.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click
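
Refs and semantic locators can also be combined as a fallback chain. The sketch below tries a ref first and retries by visible text if that fails. It assumes agent-browser exits with a non-zero status when a ref cannot be resolved, which the text above does not confirm; AB and click_ref_or_text are hypothetical names.

```shell
# AB is a hypothetical override for the binary name; defaults to agent-browser.
AB="${AB:-agent-browser}"

# Try to click by ref; if that command fails, click by visible text instead.
# Assumption: a failed ref lookup makes agent-browser exit non-zero.
click_ref_or_text() {
  ref=$1
  text=$2
  "$AB" click "$ref" 2>/dev/null || "$AB" find text "$text" click
}
```

For example, click_ref_or_text @e3 "Sign In" uses @e3 when it is still valid and falls back to the text locator otherwise.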

Deep-Dive Documentation

| Reference | When to Use |
|-----------|-------------|
| references/commands.md | Full command reference with all options |
| references/snapshot-refs.md | Ref lifecycle, invalidation rules, troubleshooting |
| references/session-management.md | Parallel sessions, state persistence, concurrent scraping |
| references/authentication.md | Login flows, OAuth, 2FA handling, state reuse |
| references/video-recording.md | Recording workflows for debugging and documentation |
| references/proxy-support.md | Proxy configuration, geo-testing, rotating proxies |

Ready-to-Use Templates

| Template | Description |
|----------|-------------|
| templates/form-automation.sh | Form filling with validation |
| templates/authenticated-session.sh | Login once, reuse state |
| templates/capture-workflow.sh | Content extraction with screenshots |

./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output

Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.

tools

Updated Apr 23, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add indigoai-us/hq-core agent-browser

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percent |
|---------|-------------|--------|---------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 23, 2026, 3:27 PM · 88.3s · 11 files scanned

SKILL.md

name: agent-browser
description: Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.
allowed-tools: Bash(agent-browser:*)

Install manually

# Clone the repo
git clone https://github.com/indigoai-us/hq-core.git

# Copy into Claude Code skills folder (global)
cp -r hq-core/workers/public/qa-tester/skills/agent-browser ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

indigoai-us/hq-core

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT