skills/dogfood/SKILL.md
Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with full reproduction evidence -- step-by-step screenshots, repro videos, and detailed repro steps for every issue -- so findings can be handed directly to the responsible teams.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf): `npx skillsauth add petekp/claude-code-setup dogfood`
Systematically explore a web application, find issues, and produce a report with full reproduction evidence for every finding.
Only the Target URL is required. Everything else has sensible defaults -- use them unless the user explicitly provides an override.
| Parameter | Default | Example override |
|-----------|---------|-----------------|
| Target URL | (required) | vercel.com, http://localhost:3000 |
| Session name | Slugified domain (e.g., vercel.com -> vercel-com) | --session my-session |
| Output directory | ./dogfood-output/ | Output directory: /tmp/qa |
| Scope | Full app | Focus on the billing page |
| Authentication | None | Sign in to [email protected] |
If the user says something like "dogfood vercel.com", start immediately with defaults. Do not ask clarifying questions unless authentication is mentioned but credentials are missing.
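The default session name in the table above is a slugified domain. A minimal sketch of one way to derive it follows; `slugify_domain` is a hypothetical helper name, not part of agent-browser, and the exact slug rules (lowercasing, collapsing punctuation into dashes) are an assumption based on the `vercel.com -> vercel-com` example.

```shell
# Hypothetical helper: derive a default session name from a target URL.
slugify_domain() {
  # Drop the scheme (http://, https://) if present, then any path.
  _host=${1#*://}
  _host=${_host%%/*}
  # Lowercase, collapse runs of non-alphanumerics into "-", trim edge dashes.
  printf '%s\n' "$_host" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}
```

For example, `SESSION=$(slugify_domain "vercel.com")` yields `vercel-com`, and `http://localhost:3000` yields `localhost-3000`.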
Always use agent-browser directly -- never npx agent-browser. The direct binary uses the fast Rust client. npx routes through Node.js and is significantly slower.
1. Initialize: set up session, output dirs, report file
2. Authenticate: sign in if needed, save state
3. Orient: navigate to starting point, take initial snapshot
4. Explore: systematically visit pages and test features
5. Document: screenshot + record each issue as found
6. Wrap up: update summary counts, close session
```shell
mkdir -p {OUTPUT_DIR}/screenshots {OUTPUT_DIR}/videos
```
Copy the report template into the output directory and fill in the header fields:
```shell
cp {SKILL_DIR}/templates/dogfood-report-template.md {OUTPUT_DIR}/report.md
```
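The directory setup and template copy can be combined into one guarded step. The sketch below is illustrative only: `init_output_dir` is a hypothetical helper, and the choice to leave an existing `report.md` untouched is an assumption made so that a re-run never discards issues already written.

```shell
# Hypothetical helper: prepare the output directory for a session.
# $1 = skill directory, $2 = output directory (defaults to ./dogfood-output)
init_output_dir() {
  _skill_dir=$1
  _out_dir=${2:-./dogfood-output}
  _template="$_skill_dir/templates/dogfood-report-template.md"

  mkdir -p "$_out_dir/screenshots" "$_out_dir/videos" || return 1

  if [ ! -f "$_template" ]; then
    echo "report template not found: $_template" >&2
    return 1
  fi

  # Keep an existing report so documented issues are never overwritten.
  if [ ! -e "$_out_dir/report.md" ]; then
    cp "$_template" "$_out_dir/report.md"
  fi
}
```

Calling `init_output_dir "$SKILL_DIR" /tmp/qa` would create `/tmp/qa/screenshots`, `/tmp/qa/videos`, and `/tmp/qa/report.md`.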
Start a named session:
```shell
agent-browser --session {SESSION} open {TARGET_URL}
agent-browser --session {SESSION} wait --load networkidle
```
If the app requires login:
```shell
agent-browser --session {SESSION} snapshot -i
# Identify login form refs, fill credentials
agent-browser --session {SESSION} fill @e1 "{EMAIL}"
agent-browser --session {SESSION} fill @e2 "{PASSWORD}"
agent-browser --session {SESSION} click @e3
agent-browser --session {SESSION} wait --load networkidle
```
For OTP/email codes: ask the user, wait for their response, then enter the code.
After successful login, save state for potential reuse:
```shell
agent-browser --session {SESSION} state save {OUTPUT_DIR}/auth-state.json
```
Take an initial annotated screenshot and snapshot to understand the app structure:
```shell
agent-browser --session {SESSION} screenshot --annotate {OUTPUT_DIR}/screenshots/initial.png
agent-browser --session {SESSION} snapshot -i
```
Identify the main navigation elements and map out the sections to visit.
Read references/issue-taxonomy.md for the full list of what to look for and the exploration checklist.
Strategy -- work through the app systematically:
At each page:
```shell
agent-browser --session {SESSION} snapshot -i
agent-browser --session {SESSION} screenshot --annotate {OUTPUT_DIR}/screenshots/{page-name}.png
agent-browser --session {SESSION} errors
agent-browser --session {SESSION} console
```
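Since the same four commands run on every page, they could be wrapped in a small function. This is a sketch under assumptions: `inspect_page` is a hypothetical name, and `AGENT_BROWSER` is an assumed override variable (not an agent-browser feature) that defaults to the real binary and allows a dry run with a stand-in command.

```shell
# Hypothetical wrapper around the four per-page commands.
# $1 = session name, $2 = page name, $3 = output directory
inspect_page() {
  _session=$1
  _page=$2
  _out_dir=${3:-./dogfood-output}
  # Assumed override hook; defaults to the direct agent-browser binary.
  _ab=${AGENT_BROWSER:-agent-browser}

  "$_ab" --session "$_session" snapshot -i
  "$_ab" --session "$_session" screenshot --annotate "$_out_dir/screenshots/$_page.png"
  "$_ab" --session "$_session" errors
  "$_ab" --session "$_session" console
}
```

A call such as `inspect_page "$SESSION" billing "$OUTPUT_DIR"` would run all four commands for the billing page. Setting `AGENT_BROWSER=echo` prints the commands instead of running them.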
Use your judgment on how deep to go. Spend more time on core features and less on peripheral pages. If you find a cluster of issues in one area, investigate deeper.
Steps 4 and 5 happen together -- explore and document in a single pass. When you find an issue, stop exploring and document it immediately before moving on. Do not explore the whole app first and document later.
Every issue must be reproducible. When you find something wrong, do not just note it -- prove it with evidence. The goal is that someone reading the report can see exactly what happened and replay it.
Choose the right level of evidence for the issue:
Issues that require user interaction to reproduce need a full repro with video and step-by-step screenshots:
```shell
agent-browser --session {SESSION} record start {OUTPUT_DIR}/videos/issue-{NNN}-repro.webm
agent-browser --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-1.png
sleep 1
# Perform action (click, fill, etc.)
sleep 1
agent-browser --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-2.png
sleep 1
# ...continue until the issue manifests
sleep 2
agent-browser --session {SESSION} screenshot --annotate {OUTPUT_DIR}/screenshots/issue-{NNN}-result.png
agent-browser --session {SESSION} record stop
```
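The pause, act, pause, screenshot rhythm in the middle of that sequence repeats for every step, so it may be convenient to factor it out. The following is a sketch, not part of the skill: `repro_step` is a hypothetical helper, and `AGENT_BROWSER` and `PACE_SECONDS` are assumed override variables. `record start`, the step-1 screenshot, the final annotated screenshot, and `record stop` stay outside the helper.

```shell
# Hypothetical helper: run one paced action, then capture a numbered step screenshot.
# Usage: repro_step SESSION OUT_DIR ISSUE_NO STEP_NO ACTION [ARGS...]
repro_step() {
  _session=$1
  _out_dir=$2
  _issue=$3
  _step=$4
  shift 4
  # Assumed override hooks; defaults follow the pacing used in the sequence above.
  _ab=${AGENT_BROWSER:-agent-browser}
  _pace=${PACE_SECONDS:-1}

  sleep "$_pace"
  "$_ab" --session "$_session" "$@" || return 1
  sleep "$_pace"
  "$_ab" --session "$_session" screenshot \
    "${_out_dir}/screenshots/issue-${_issue}-step-${_step}.png"
}
```

For instance, `repro_step "$SESSION" "$OUTPUT_DIR" 001 2 click @e3` would click `@e3` and save `issue-001-step-2.png`, with a one-second pause on either side of the click.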
Issues that are visible without interaction need only a single annotated screenshot. No video, no multi-step repro:
```shell
agent-browser --session {SESSION} screenshot --annotate {OUTPUT_DIR}/screenshots/issue-{NNN}.png
```
Write a brief description and reference the screenshot in the report. Set Repro Video to N/A.
For all issues:
- Append to the report immediately. Do not batch issues for later. Write each one as you find it so nothing is lost if the session is interrupted.
- Increment the issue counter (ISSUE-001, ISSUE-002, ...).
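One way to keep the counter consistent with what is already in the report is to derive the next ID from the report itself. This sketch assumes each issue is recorded under a heading that starts with `### ISSUE-` and that the copied template contains no placeholder heading of that form. `next_issue_id` is a hypothetical helper name.

```shell
# Hypothetical helper: print the next zero-padded issue ID for a report file.
next_issue_id() {
  _report=$1
  # grep -c prints 0 (and exits non-zero) when nothing matches;
  # a missing file leaves the count empty, which is treated as 0 below.
  _count=$(grep -c '^### ISSUE-' "$_report" 2>/dev/null || true)
  printf 'ISSUE-%03d\n' "$(( ${_count:-0} + 1 ))"
}
```

With two issues already recorded in `report.md`, `next_issue_id report.md` prints `ISSUE-003`.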
Aim to find 5-10 well-documented issues, then wrap up. Depth of evidence matters more than total count -- 5 issues with full repro beats 20 with vague descriptions.
After exploring:
1. Update the summary counts in the report. Every `### ISSUE-` block must be reflected in the totals.
2. Close the session: `agent-browser --session {SESSION} close`
- Use `snapshot -i` to find clickable and fillable elements (buttons, inputs, links).
- Use `snapshot` (no flag) to read page content (text, headings, data lists).
- Do not `rm` screenshots, videos, or the report mid-session. Do not close the session and restart. Work forward, not backward.
- While recording a repro video, use `type` instead of `fill` -- it types character-by-character. Use `fill` only outside of video recording when speed matters.
- Pace recorded actions with `sleep 1` between actions and `sleep 2` before the final result screenshot. Videos should be watchable at 1x speed -- a human reviewing the report needs to see what happened, not a blur of instant state changes.
- Combine multiple `agent-browser` commands in a single shell call when they are independent (e.g., `agent-browser ... screenshot ... && agent-browser ... console`).
- Use `agent-browser --session {SESSION} scroll down 300` for scrolling -- do not use `key` or `evaluate` to scroll.

| Reference | When to Read |
|-----------|--------------|
| references/issue-taxonomy.md | Start of session -- calibrate what to look for, severity levels, exploration checklist |

| Template | Purpose |
|----------|---------|
| templates/dogfood-report-template.md | Copy into output directory as the report file |