
infquest/browser

Name: browser
Author: infquest

skills/browser/SKILL.md

npx skillsauth add infquest/vibe-ops-plugin browser


Browser Automation

Browser automation that maintains page state across command executions. Write small, focused commands to accomplish tasks incrementally.

Choosing Your Approach

Local/source-available sites: Read the source code first to write selectors directly
Unknown page layouts: Use snapshot to discover elements, then select-ref to interact
Visual debugging: Take screenshot to see current page state

Prerequisites

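# Check that the browser server is running (Max must be open).
# grep . fails when curl returns nothing, so the fallback message fires;
# without it the pipeline would report head's exit status, which is always 0.
curl -s http://localhost:9222/ | head -1 | grep . || echo "SERVER_NOT_RUNNING"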
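The same check can be sketched in Python for scripts that need a boolean rather than a printed marker. This is a minimal sketch using only the standard library; the function name is_server_running and the 2-second timeout are illustrative choices, not part of the skill.

```python
import http.client
import urllib.error
import urllib.request


def is_server_running(url: str = "http://localhost:9222/", timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP at `url`, False if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status, so it is still running.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, malformed reply, and similar failures.
        return False


if __name__ == "__main__":
    print("SERVER_RUNNING" if is_server_running() else "SERVER_NOT_RUNNING")
```

HTTPError is caught before URLError because it is a subclass of it; reversing the order would report a reachable server as down.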

Running Commands

All commands use client.py from the skill directory:

uv run skills/browser/client.py <command> [arguments]

⚠️ IMPORTANT: Always run the script directly with uv run (for example, uv run skills/browser/client.py list), NOT uv run python client.py. The uv run command automatically handles Python and the dependencies declared in pyproject.toml. Adding python breaks dependency resolution.

Workflow Loop

Follow this pattern for complex tasks:

1. Run a command to perform one action.
2. Observe the output.
3. Evaluate: did it work? What is the current state?
4. Decide: is the task complete, or is another command needed?
5. Repeat until the task is done.
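The loop above could be driven from a script roughly as follows. This is a sketch under assumptions: build_command, run_command, and run_steps are hypothetical helper names, and the is_done callback stands in for whatever check the task needs. Only the uv run skills/browser/client.py invocation comes from this document.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence

CLIENT = "skills/browser/client.py"  # script path used throughout this document


def build_command(command: str, *arguments: str) -> List[str]:
    """Build the argv for one client.py invocation (no `python` after `uv run`)."""
    return ["uv", "run", CLIENT, command, *arguments]


def run_command(argv: Sequence[str]) -> str:
    """Run one command and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def run_steps(
    steps: Iterable[Sequence[str]],
    is_done: Callable[[str], bool],
    execute: Callable[[Sequence[str]], str] = run_command,
) -> List[str]:
    """Run one action at a time, observe its output, and stop once is_done says so."""
    outputs: List[str] = []
    for argv in steps:
        output = execute(argv)   # run one action
        outputs.append(output)   # observe the output
        if is_done(output):      # evaluate and decide
            break
    return outputs
```

Passing execute as a parameter keeps the loop testable without a running browser server: a fake executor can return canned output.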

No TypeScript in Browser Context

Code passed to page.evaluate() runs in the browser, which doesn't understand TypeScript:

// ✅ Correct: plain JavaScript
const text = await page.evaluate(() => {
  return document.body.innerText;
});

// ❌ Wrong: TypeScript syntax will fail at runtime
const text = await page.evaluate(() => {
  const el: HTMLElement = document.body; // Type annotation breaks in browser!
  return el.innerText;
});

Waiting

uv run skills/browser/client.py wait-load main                # After navigation
uv run skills/browser/client.py wait-selector main ".results" # For specific elements
uv run skills/browser/client.py wait-url main "**/success"    # For specific URL

Scraping Data

For large datasets, intercept and replay API requests rather than scrolling the DOM. See refs/scraping.md for the complete guide covering request capture, schema discovery, and paginated API replay.
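The referenced guide is not reproduced here, so the following is only a generic sketch of what paginated replay can look like. The fetch_page callback, the numeric page parameter, and the "items" response field are all hypothetical; a real API discovered through request capture may paginate by cursor or offset instead.

```python
from typing import Any, Callable, Dict, Iterator, List


def replay_paginated(
    fetch_page: Callable[[int], Dict[str, Any]],
    items_key: str = "items",  # hypothetical response field
    first_page: int = 1,
    max_pages: int = 1000,     # safety cap against endless pagination
) -> Iterator[Any]:
    """Yield items page by page until a page comes back empty."""
    for page_number in range(first_page, first_page + max_pages):
        payload = fetch_page(page_number)
        items: List[Any] = payload.get(items_key, [])
        if not items:
            return
        yield from items
```

In practice fetch_page would wrap one captured request, reusing the URL and headers observed in the browser and varying only the page number.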

Inspecting Page State

Screenshots

uv run skills/browser/client.py screenshot main screenshot.png
uv run skills/browser/client.py screenshot main full.png --full-page  # Capture entire scrollable page

ARIA Snapshot (Element Discovery)

Use snapshot to discover page elements. It returns a YAML-formatted accessibility tree:

- banner:
  - link "Hacker News" [ref=e1]
  - navigation:
    - link "new" [ref=e2]
- main:
  - heading "Products" [ref=e3] [level=1]
  - list:
    - listitem:
      - link "Article Title" [ref=e4]
      - button "Add to Cart" [ref=e5]
    - listitem:
      - link "Another Article" [ref=e6]
      - button "Add to Cart" [ref=e7] [nth=1]
- contentinfo:
  - textbox [ref=e8]
    - /placeholder: "Search"

Interpreting refs:

[ref=eN] - Element reference for interaction
[nth=N] - Index among elements that share the same role and name (0-indexed; the marker is omitted on the first one)
[checked], [disabled], [expanded] - Element states
[level=N] - Heading level
/url:, /placeholder: - Element properties
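To pick a ref programmatically, the snapshot text can be scanned line by line. The sketch below assumes the line layout shown in the example snapshot (role, optional quoted name, bracketed markers); parse_snapshot and find_ref are hypothetical helpers, not part of client.py, and real snapshots may contain shapes this regex does not cover.

```python
import re
from typing import Dict, List, Optional

# Matches lines such as: - button "Add to Cart" [ref=e7] [nth=1]
LINE_RE = re.compile(
    r'^\s*-\s+(?P<role>[\w-]+)(?:\s+"(?P<name>[^"]*)")?(?P<attrs>(?:\s+\[[^\]]+\])*)\s*:?\s*$'
)
ATTR_RE = re.compile(r"\[([^\]=]+)(?:=([^\]]*))?\]")


def parse_snapshot(snapshot: str) -> List[Dict[str, Optional[str]]]:
    """Return one dict per element that carries a [ref=...] marker."""
    elements: List[Dict[str, Optional[str]]] = []
    for line in snapshot.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # property lines such as "- /placeholder: ..." are skipped
        attrs = dict(ATTR_RE.findall(match.group("attrs")))
        if "ref" not in attrs:
            continue  # containers like "- banner:" have no ref
        elements.append({
            "role": match.group("role"),
            "name": match.group("name"),
            "ref": attrs["ref"],
            "nth": attrs.get("nth", "0"),  # marker is omitted on the first match
        })
    return elements


def find_ref(elements, role: str, name: str, nth: int = 0) -> Optional[str]:
    """Return the ref of the nth element with this role and name, or None."""
    for element in elements:
        if element["role"] == role and element["name"] == name and element["nth"] == str(nth):
            return element["ref"]
    return None
```

The returned ref can then be passed to select-ref, for example the second "Add to Cart" button via find_ref(elements, "button", "Add to Cart", nth=1).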

Interacting with refs:

# Get snapshot to find refs
uv run skills/browser/client.py snapshot main

# Only show interactive elements (buttons, links, inputs, etc.)
uv run skills/browser/client.py snapshot main -i

# Use ref to interact
uv run skills/browser/client.py select-ref main e2 click
uv run skills/browser/client.py select-ref main e7 click   # Click second "Add to Cart"
uv run skills/browser/client.py select-ref main e8 fill "search term"

Error Recovery

Page state persists after failures. Debug with:

# Take screenshot to see current state
uv run skills/browser/client.py screenshot main debug.png

# Get page info
uv run skills/browser/client.py info main

# Get text content
uv run skills/browser/client.py text main "body"

Command Reference

Page Management

uv run skills/browser/client.py list                          # List all pages
uv run skills/browser/client.py create main                   # Create a new page
uv run skills/browser/client.py create main "https://..."     # Create and navigate
uv run skills/browser/client.py goto main "https://..."       # Navigate existing page
uv run skills/browser/client.py close main                    # Close a page
uv run skills/browser/client.py info main                     # Get page URL and title

Element Interaction

uv run skills/browser/client.py click main "button.submit"    # Click element
uv run skills/browser/client.py fill main "input#email" "user@example.com"  # Fill input
uv run skills/browser/client.py hover main ".dropdown"        # Hover over element
uv run skills/browser/client.py keyboard main "Enter"         # Press key
uv run skills/browser/client.py text main "h1"                # Get element text

JavaScript Execution

uv run skills/browser/client.py evaluate main "document.title"
uv run skills/browser/client.py evaluate main "document.querySelectorAll('.item').length"

Python Script (Advanced)

For complex tasks requiring loops or page.on() event handlers, use a heredoc with BrowserClient:

cd skills/browser && uv run python <<'EOF'
from client import BrowserClient

client = BrowserClient()
page = client.get_playwright_page("main")

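# Register event handlers before navigating so no responses are missed.
# page.on("response", ...) observes responses; it does not intercept them.
page.on("response", lambda r: print(r.url))

# Full Playwright API available
page.goto("https://example.com")
page.click("button")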
EOF

The page object is a standard Playwright Page.
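When the goal is to find API endpoints rather than print every URL, the response handler can collect matches instead. This is a small sketch: make_url_collector is a hypothetical helper, and it relies only on the response object exposing a url attribute, as the lambda in the script above does.

```python
from typing import Callable, List, Tuple


def make_url_collector(fragment: str) -> Tuple[Callable[[object], None], List[str]]:
    """Return (handler, urls); handler records response URLs containing `fragment`."""
    urls: List[str] = []

    def handler(response) -> None:
        url = response.url  # same attribute the script above prints
        if fragment in url and url not in urls:
            urls.append(url)

    return handler, urls


# Intended use inside the heredoc script, registered before navigation:
#   handler, urls = make_url_collector("/api/")
#   page.on("response", handler)
```

After the page has loaded, urls holds each matching endpoint once, in the order it was first seen.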

Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.

tools

Updated Apr 6, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status; review each row below.

Trivy - Container and dependency vulnerability scanner - Clean (95%)
Semgrep - Static code analysis for vulnerabilities - Clean (95%)
mcp-scan (Snyk) - Model Context Protocol security validation - Clean (95%)
Snyk (dep) - Open source security scanning - Skipped (50%)
Socket.dev - Supply chain security analysis - Skipped (50%)
VirusTotal - Multi-engine malware detection - Skipped (50%)
CrowdStrike - Advanced threat intelligence - Skipped (50%)
OSV-Scanner - Open Source Vulnerability database check - Skipped (50%)
OWASP Dep-Check - Skipped (50%)

Last scanned: Apr 6, 2026, 2:38 AM. Duration: 12.7s. 4 files scanned.


Related Skills

infquest/youtube-download

content-media

Download YouTube videos, audio, or subtitles with yt-dlp. Use when the user wants to download a video, download from YouTube ("download youtube", "download video"), or download from Bilibili.

SKILL.md, updated Apr 6, 2026

infquest/video-trim

tools

Trim video clips, with options such as compression and audio control. Use when the user wants to edit, trim, cut, or clip a video, or extract a video segment ("trim video", "cut video", "clip video", "extract video segment").

SKILL.md, updated Apr 6, 2026

infquest/video-gen

data-ai

Generate videos with AI; supports the Veo and Sora models. Use when the user wants to generate or create a video, including text-to-video and image-to-video ("generate video", "create video", "text to video", "image to video", "make a video").

SKILL.md, updated Apr 6, 2026

infquest/video-concat

content-media

Merge multiple video files into a single video. Use when the user wants to join, merge, combine, or concatenate videos ("join videos", "merge videos", "combine videos", "concatenate videos").

SKILL.md, updated Apr 6, 2026

Install manually

# Clone the repo
git clone https://github.com/infquest/vibe-ops-plugin.git

# Copy into Claude Code skills folder (global)
cp -r vibe-ops-plugin/skills/browser ~/.claude/skills/

Repository

infquest/vibe-ops-plugin

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT