Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

wallacedobbs428/agent-evaluation

Name: agent-evaluation
Author: wallacedobbs428

.claude/skills/agent-evaluation/SKILL.md

npx skillsauth add wallacedobbs428/thecalltaker agent-evaluation


agent-browser - Browser Automation for AI Agents

When to use this skill

Open websites and automate UI actions
Fill forms, click controls, and verify outcomes
Capture screenshots/PDFs or extract content
Run deterministic web checks with accessibility refs
Execute parallel browser tasks via isolated sessions

Core workflow

Always use the deterministic ref loop:

1. agent-browser open <url>
2. agent-browser snapshot -i
3. Interact using refs (@e1, @e2, ...)
4. Run agent-browser snapshot -i again after page/DOM changes

agent-browser open https://example.com/form
agent-browser wait --load networkidle
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser click @e2
agent-browser snapshot -i
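The loop above can be wrapped in two small helpers. This is a minimal sketch, not part of the skill: the function names and the AB_CMD override are hypothetical (AB_CMD only exists so the sketch can be dry-run with echo), and only commands already shown above are used.

```shell
# AB_CMD is a hypothetical override for dry runs; it is not an agent-browser setting.
ab() {
  ${AB_CMD:-agent-browser} "$@"
}

# First half of the ref loop: open -> wait -> snapshot.
open_and_snapshot() {
  url="$1"
  ab open "$url" &&
    ab wait --load networkidle &&
    ab snapshot -i
}

# Second half: run one interaction, then re-snapshot so the
# next step works from fresh refs.
act_and_resnapshot() {
  ab "$@" && ab snapshot -i
}
```

For example, open_and_snapshot https://example.com/form followed by act_and_resnapshot click @e2 reproduces the sequence above while making it hard to forget the second snapshot.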

Command patterns

Use && chaining when intermediate output is not needed.

# Good chaining: open -> wait -> snapshot
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i

# Separate calls when output is needed first
agent-browser snapshot -i
# parse refs
agent-browser click @e2
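The "parse refs" step could look something like the sketch below. It assumes snapshot lines carry a marker of the form [ref=e2] next to the element label; that output format is an assumption here, so check it against real snapshot output first. The helper name ref_for is hypothetical.

```shell
# Print the ref (as @eN) from the first snapshot line containing the given text.
# Assumes lines look like:  - button "Submit" [ref=e2]   (format is an assumption)
ref_for() {
  label="$1"
  grep -F -- "$label" |
    sed -n 's/.*\[ref=\(e[0-9][0-9]*\)].*/@\1/p' |
    head -n 1
}

# Usage (left commented out so the block has no side effects):
#   ref="$(agent-browser snapshot -i | ref_for 'Submit')"
#   [ -n "$ref" ] && agent-browser click "$ref"
```

Matching on a fixed string with grep -F keeps labels containing regex characters from being misread as patterns.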

High-value commands:

Navigation: open, close
Snapshot: snapshot -i, snapshot -i -C, snapshot -s "#selector"
Interaction: click, fill, type, select, check, press
Verification: diff snapshot, diff screenshot --baseline <file>
Capture: screenshot, screenshot --annotate, pdf
Wait: wait --load networkidle, wait <selector|@ref|ms>

Verification patterns

Use explicit evidence after actions.

# Baseline -> action -> verify structure
agent-browser snapshot -i
agent-browser click @e3
agent-browser diff snapshot

# Visual regression
agent-browser screenshot baseline.png
agent-browser click @e5
agent-browser diff screenshot --baseline baseline.png
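Both patterns can be folded into helpers so every action leaves evidence behind. A minimal sketch, using only the commands shown above; verify_action, visual_check, and the AB_CMD dry-run override are hypothetical names.

```shell
# AB_CMD is a hypothetical override for dry runs; it is not an agent-browser setting.
ab() { ${AB_CMD:-agent-browser} "$@"; }

# Structural check: baseline snapshot (output discarded), one action,
# then print the snapshot diff.
verify_action() {
  ab snapshot -i >/dev/null && ab "$@" && ab diff snapshot
}

# Visual check: save a baseline screenshot, run one action,
# then diff the current page against that baseline file.
visual_check() {
  baseline="$1"
  shift
  ab screenshot "$baseline" && ab "$@" && ab diff screenshot --baseline "$baseline"
}
```

For instance, verify_action click @e3 and visual_check baseline.png click @e5 correspond to the two sequences above.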

Safety and reliability

Refs become invalid after navigation or significant DOM updates; re-snapshot before the next action.
Prefer wait --load networkidle or selector/ref waits over fixed sleeps.
For multi-step JS, use eval --stdin (or base64) to avoid shell-escaping breakage.
For concurrent tasks, isolate each one with --session <name>.
Use output controls on long pages to reduce context flooding.
Optional hardening in sensitive flows: domain allowlist and action policies.
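As an illustration of the eval --stdin advice, multi-line JavaScript can be passed through a quoted here-document so the shell never touches its quotes or dollar signs. This is a sketch under assumptions: page_summary and the AB_CMD dry-run override are hypothetical, and what eval prints back is not shown here.

```shell
# AB_CMD is a hypothetical override for dry runs; it is not an agent-browser setting.
ab() { ${AB_CMD:-agent-browser} "$@"; }

# Send multi-line JS on stdin. The quoted 'JS' delimiter disables shell
# expansion, so quotes, $ and backticks inside the script arrive unchanged.
page_summary() {
  ab eval --stdin <<'JS'
const links = [...document.querySelectorAll("a")].map(a => a.href);
JSON.stringify({ title: document.title, links: links.length });
JS
}
```

The same script passed as a single command-line argument would need every double quote escaped, which is the breakage the here-document avoids.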

Optional hardening examples:

# Wrap page content with boundaries to reduce prompt-injection risk
export AGENT_BROWSER_CONTENT_BOUNDARIES=1

# Limit output volume for long pages
export AGENT_BROWSER_MAX_OUTPUT=50000

# Restrict navigation and network to trusted domains
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"

# Restrict allowed action types
export AGENT_BROWSER_ACTION_POLICY=./policy.json

Example policy.json:

{
  "default": "deny",
  "allow": ["navigate", "snapshot", "click", "fill", "scroll", "wait", "get"],
  "deny": ["eval", "download", "upload", "network", "state"]
}

CLI-flag equivalent:

agent-browser --content-boundaries --max-output 50000 --allowed-domains "example.com,*.example.com" --action-policy ./policy.json open https://example.com
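Where exporting these settings for the whole shell is too broad, they can be scoped to a single invocation. A minimal sketch reusing the environment variables and values listed above; hardened_ab and the AB_CMD dry-run override are hypothetical.

```shell
# Apply the hardening settings to one invocation only, instead of
# exporting them for the rest of the shell session.
# AB_CMD is a hypothetical override for dry runs; it is not an agent-browser setting.
hardened_ab() {
  AGENT_BROWSER_CONTENT_BOUNDARIES=1 \
  AGENT_BROWSER_MAX_OUTPUT=50000 \
  AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com" \
  AGENT_BROWSER_ACTION_POLICY=./policy.json \
  ${AB_CMD:-agent-browser} "$@"
}
```

Calling hardened_ab open https://example.com then runs with the same settings as the CLI-flag form, while plain agent-browser calls in the same script stay unrestricted.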

Troubleshooting

command not found: install the agent-browser CLI, then run agent-browser install.
Wrong element clicked: run snapshot -i again and use fresh refs.
Dynamic SPA content missing: wait with wait --load networkidle, or wait on a specific selector.
Session collisions: assign unique --session names and close each session.
Large output pressure: narrow snapshots (-i, -c, -d, -s) and extract only needed text.
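Two of the items above (command not found, session collisions) can be guarded against in a script. This is a hedged sketch: require_cli, with_session, and the AB_CMD dry-run override are hypothetical, and placing --session before the subcommand is an assumption based on how the other global flags are used in this document.

```shell
# AB_CMD is a hypothetical override for dry runs; it is not an agent-browser setting.
ab() { ${AB_CMD:-agent-browser} "$@"; }

# Fail early with a hint when the CLI (default: agent-browser) is not on PATH.
require_cli() {
  cli="${1:-agent-browser}"
  if ! command -v "$cli" >/dev/null 2>&1; then
    echo "$cli not found on PATH; install agent-browser, then run: agent-browser install" >&2
    return 1
  fi
}

# Run one command in a session named <prefix>-<shell pid>, then close that
# session even if the command failed, and return the command's exit status.
with_session() {
  name="$1-$$"
  shift
  ab --session "$name" "$@"
  rc=$?
  ab --session "$name" close
  return "$rc"
}
```

Using a distinct prefix per task (for example with_session checkout open https://example.com) keeps concurrent tasks started from the same script from sharing a session name.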

References

Deep-dive docs in this skill:

commands
snapshot-refs
session-management
authentication

Related resources:

https://github.com/vercel-labs/agent-browser
https://agent-browser.dev

Ready templates:

./templates/form-automation.sh
./templates/capture-workflow.sh

Metadata

Version: 1.1.0
Last updated: 2026-02-26
Scope: deterministic browser automation for agent workflows


Browser automation CLI for AI agents. Use for website interaction, form automation, screenshots, scraping, and web app verification. Prefer snapshot refs (@e1, @e2) for deterministic actions.

tools

Updated Apr 16, 2026

Install globally with skillsauth:

npx skillsauth add wallacedobbs428/thecalltaker agent-evaluation

This installs the skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: container and dependency vulnerability scanner. Clean, 95%.
Semgrep: static code analysis for vulnerabilities. Clean, 95%.
mcp-scan (Snyk): Model Context Protocol security validation. Clean, 95%.
Snyk (dep): open source security scanning. Skipped, 50%.
Socket.dev: supply chain security analysis. Skipped, 50%.
VirusTotal: multi-engine malware detection. Skipped, 50%.
CrowdStrike: advanced threat intelligence. Skipped, 50%.
OSV-Scanner: Open Source Vulnerability database check. Skipped, 50%.
OWASP Dep-Check: Skipped, 50%.

Last scanned: Apr 17, 2026, 1:28 AM (12.4s, 1 file scanned)

SKILL.md

name: agent-evaluation
description: Browser automation CLI for AI agents. Use for website interaction, form automation, screenshots, scraping, and web app verification. Prefer snapshot refs (@e1, @e2) for deterministic actions.
allowed-tools: Read Write Bash Grep Glob
tags: browser-automation, headless-browser, ai-agent, web-testing, web-scraping, verification
platforms: Claude, Gemini, Codex, ChatGPT
version: 1.1.0
source: vercel-labs/agent-browser


Related Skills

wallacedobbs428/writer-memory (documentation)
Verified, Trusted, Community
Agentic memory system for writers: track characters, relationships, scenes, and themes.
SKILL.md, updated Apr 17, 2026

wallacedobbs428/workflow-automation (tools)
Verified, Trusted, Community
Automate repetitive development tasks and workflows. Use when creating build scripts, automating deployments, or setting up development workflows. Handles npm scripts, Makefile, GitHub Actions workflows, and task automation.
SKILL.md, updated Apr 17, 2026

wallacedobbs428/web-design-guidelines (development)
Verified, Trusted, Community
Review UI code for Web Interface Guidelines compliance. Use when asked to "review my UI", "check accessibility", "audit design", "review UX", or "check my site against best practices". Fetches latest Vercel guidelines and checks files against all rules.
SKILL.md, updated Apr 17, 2026

wallacedobbs428/web-accessibility (development)
Verified, Trusted, Community
Implement web accessibility (a11y) standards following WCAG 2.1 guidelines. Use when building accessible UIs, fixing accessibility issues, or ensuring compliance with disability standards. Handles ARIA attributes, keyboard navigation, screen readers, semantic HTML, and accessibility testing.
SKILL.md, updated Apr 17, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


# Clone the repo
git clone https://github.com/wallacedobbs428/thecalltaker.git

# Copy into Claude Code skills folder (global)
cp -r thecalltaker/.claude/skills/agent-evaluation ~/.claude/skills/


Repository

wallacedobbs428/thecalltaker

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT