skills/qa/SKILL.md
Systematically QA test a web application and fix bugs found. Runs browser-based testing, iteratively fixes bugs in source code, commits each fix atomically, and re-verifies. Use when asked to 'qa', 'QA', 'test this site', 'find bugs', 'test and fix', or 'fix what's broken'. Three tiers: Quick (critical/high only), Standard (+medium, default), Exhaustive (+cosmetic). Produces before/after health scores, fix evidence, and a ship-readiness summary. For report-only mode, use /qa-only.
You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence.
Browser: always use agent-browser. See references/browser-api.md for snippets.
Parse the user's request:
| Parameter | Default | Override |
|-----------|---------|---------|
| Target URL | auto-detect or required | https://myapp.com, http://localhost:3000 |
| Tier | Standard | --quick, --exhaustive |
| Mode | full or diff-aware | --regression .context/qa-reports/baseline.json |
| Output dir | .context/qa-reports/ | Output to /tmp/qa |
| Scope | Full app | Focus on the billing page |
| Auth | None | Sign in to [email protected] |
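Tiers:
Quick (--quick): critical and high severity only
Standard (default): adds medium severity
Exhaustive (--exhaustive): adds cosmetic issues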
If no URL is given and on a feature branch: auto-enter diff-aware mode (see Modes).
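The parameter table above could be parsed along the following lines. This is a minimal sketch, not the skill's actual implementation: `parseQaRequest` and its result shape are hypothetical, and it only handles the flag-style overrides (`--quick`, `--exhaustive`, `--regression <path>`) plus a bare URL. Free-form overrides such as scope or auth would still need to be read from the request text.

```javascript
// Hypothetical helper: extract QA parameters from a free-form request string.
// Defaults follow the parameter table; the name and result shape are assumptions.
function parseQaRequest(text) {
  const tokens = text.trim().split(/\s+/).filter(Boolean);
  const params = {
    url: null,                         // null means "auto-detect later"
    tier: "standard",                  // default tier
    baseline: null,                    // set by --regression <path>
    outputDir: ".context/qa-reports/", // default output dir
  };

  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i];
    if (token === "--quick") params.tier = "quick";
    else if (token === "--exhaustive") params.tier = "exhaustive";
    else if (token === "--regression" && i + 1 < tokens.length) params.baseline = tokens[++i];
    else if (/^https?:\/\//.test(token) && params.url === null) params.url = token;
  }
  return params;
}

console.log(parseQaRequest("qa http://localhost:3000 --quick"));
```

A request with no URL leaves `url` as `null`, which is the case where diff-aware auto-detection would take over.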
Check for clean working tree:
git status --porcelain
If non-empty, STOP — ask user: commit/stash/abort before QA adds its own fix commits. Format: A) Commit my changes B) Stash C) Abort. RECOMMENDATION: A.
Verify agent-browser:
which agent-browser && agent-browser --version 2>/dev/null || echo "NEEDS_INSTALL"
If NEEDS_INSTALL: tell user to install agent-browser and stop.
Create output directories:
mkdir -p .context/qa-reports/screenshots
Copy references/qa-report-template.md to .context/qa-reports/qa-report-{domain}-{YYYY-MM-DD}.md.
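The report filename pattern can be sketched as a small function. It assumes `{domain}` means the hostname of the target URL and that the date is taken in UTC; `reportPath` is a hypothetical name, not part of the skill.

```javascript
// Hypothetical helper: build the report path from the target URL and a date.
// Assumes {domain} is the URL's hostname and the date is formatted in UTC.
function reportPath(targetUrl, date = new Date(), outputDir = ".context/qa-reports") {
  const domain = new URL(targetUrl).hostname;
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD
  return `${outputDir}/qa-report-${domain}-${day}.md`;
}

console.log(reportPath("https://myapp.com/billing", new Date("2024-03-05T12:00:00Z")));
// prints ".context/qa-reports/qa-report-myapp.com-2024-03-05.md"
```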
Diff-aware mode: the primary mode for developers verifying their work.
git diff main...HEAD --name-only
git log main..HEAD --oneline
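agent-browser <<'EOF'
// Probe common dev-server ports and report the first one that responds.
const page = await browser.getPage("qa-probe");
let found = false;
for (const port of [3000, 4000, 8080, 5173, 5000]) {
  try {
    await page.goto(`http://localhost:${port}`, { timeout: 3000 });
    console.log(JSON.stringify({ found: true, url: page.url(), port }));
    found = true;
    break;
  } catch {
    // Nothing answered on this port within the timeout; try the next one.
  }
}
if (!found) console.log(JSON.stringify({ found: false }));
EOF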
If no local app found, check for staging URL in PR. If nothing, ask user for URL.
Check TODOS.md for known issues related to changed files.
Never skip browser testing: backend/config changes affect app behavior. Always verify.
Full mode: Systematic exploration. Every reachable page. 5-10 well-evidenced issues. Health score.
Quick mode (--quick): 30-second smoke test of homepage + top 5 nav targets. Loads? Console errors? Broken links?
Regression mode (--regression <baseline>): Full mode + diff against baseline.json. Score delta, fixed vs. new issues.
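The regression diff (score delta, fixed vs. new issues) might look like the sketch below. It assumes both runs use the baseline.json shape described later in this document and that issues are matched by title, since IDs are assigned per run. `compareToBaseline` and the sample data are illustrative only.

```javascript
// Hypothetical helper: compare a current run against a stored baseline.
// Assumption: issues are matched by title because IDs are assigned per run.
function compareToBaseline(baseline, current) {
  const baselineTitles = new Set(baseline.issues.map((issue) => issue.title));
  const currentTitles = new Set(current.issues.map((issue) => issue.title));
  return {
    scoreDelta: current.healthScore - baseline.healthScore,
    fixed: baseline.issues.filter((issue) => !currentTitles.has(issue.title)),
    added: current.issues.filter((issue) => !baselineTitles.has(issue.title)),
  };
}

// Made-up sample data for illustration.
const baselineRun = {
  healthScore: 70,
  issues: [
    { id: "ISSUE-001", title: "Login button does nothing", severity: "high", category: "functional" },
    { id: "ISSUE-002", title: "Broken footer link", severity: "medium", category: "links" },
  ],
};
const currentRun = {
  healthScore: 82,
  issues: [
    { id: "ISSUE-001", title: "Broken footer link", severity: "medium", category: "links" },
    { id: "ISSUE-002", title: "Console error on /search", severity: "low", category: "console" },
  ],
};

const diff = compareToBaseline(baselineRun, currentRun);
console.log(diff.scoreDelta, diff.fixed.map((i) => i.title), diff.added.map((i) => i.title));
```

Matching by title is fragile if titles are reworded between runs; a stable key such as category plus URL would be another reasonable choice.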
REPORT_DIR=".context/qa-reports"
START_TIME=$(date +%s)
Start timer. Create report file from template. Page name convention: use "qa-main" for the primary test page.
agent-browser <<'EOF'
const page = await browser.getPage("qa-main");
await page.goto("https://yourapp.com/login");
const snap = await page.snapshotForAI();
console.log(snap.full);
EOF
Then fill form:
agent-browser <<'EOF'
const page = await browser.getPage("qa-main");
await page.fill('input[type="email"]', '[email protected]');
await page.fill('input[type="password"]', '[REDACTED]');
await page.click('button[type="submit"]');
await page.waitForURL('**/dashboard', { timeout: 5000 });
console.log(JSON.stringify({ url: page.url(), title: await page.title() }));
EOF
Never include real passwords in the report. Write [REDACTED].
If 2FA required: ask user for code and wait.
If CAPTCHA: tell user to complete it and tell you to continue.
agent-browser <<'EOF'
const page = await browser.getPage("qa-main");
await page.goto("TARGET_URL");
// Inject error capture (run once, early)
await page.evaluate(() => {
if (window.__qaErrors) return;
window.__qaErrors = [];
window.onerror = (msg, src, line) => window.__qaErrors.push({ msg, src, line });
window.addEventListener('unhandledrejection', e =>
window.__qaErrors.push({ msg: String(e.reason) }));
});
const snap = await page.snapshotForAI();
const buf = await page.screenshot();
const screenshotPath = await saveScreenshot(buf, "initial.png");
// Map navigation
const links = await page.$$eval('a[href]', els =>
els.map(e => ({ text: e.textContent.trim().slice(0, 60), href: e.href }))
.filter(l => l.href && !l.href.startsWith('javascript:') && !l.href.startsWith('mailto:')));
const errors = await page.evaluate(() => window.__qaErrors || []);
console.log(JSON.stringify({ url: page.url(), title: await page.title(), screenshotPath, links, errors }));
console.log(snap.full);
EOF
Read the screenshot file so the user can see it.
Detect framework: Look for __next / _next/data (Next.js), csrf-token meta (Rails), wp-content (WordPress), SPA (client-side routing with no reloads).
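The markers listed above could be checked against the page's HTML source roughly as follows. This is a sketch under assumptions: `detectFramework` and its return labels are hypothetical, and SPA detection is left out because it depends on observing navigation at runtime rather than on static markup.

```javascript
// Hypothetical helper: guess the framework from a page's HTML source.
// Markers come from the list above; the name and return labels are assumptions.
function detectFramework(html) {
  if (html.includes("__next") || html.includes("_next/data")) return "nextjs";
  if (/<meta[^>]+name=["']csrf-token["']/i.test(html)) return "rails";
  if (html.includes("wp-content")) return "wordpress";
  return "unknown"; // SPA detection needs runtime observation (navigation without reloads)
}

console.log(detectFramework('<div id="__next"></div>'));
console.log(detectFramework('<meta name="csrf-token" content="abc">'));
```

Inside an agent-browser script, the input would presumably come from something like `await page.content()`, if that call is available on the page object.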
For each page, visit and check:
agent-browser <<'EOF'
const page = await browser.getPage("qa-main");
await page.goto("PAGE_URL");
const snap = await page.snapshotForAI();
const buf = await page.screenshot();
const path = await saveScreenshot(buf, "page-NAME.png");
const errors = await page.evaluate(() => window.__qaErrors || []);
console.log(JSON.stringify({ url: page.url(), title: await page.title(), screenshotPath: path, errors }));
console.log(snap.full);
EOF
After each page, read the screenshot inline. Per-page checklist (see references/issue-taxonomy.md):
window.__qaErrors after interactions
agent-browser --browser mobile <<'EOF'
const page = await browser.getPage("qa-mobile");
await page.goto("PAGE_URL");
const buf = await page.screenshot();
console.log(await saveScreenshot(buf, "page-NAME-mobile.png"));
EOF
Depth: Spend more time on core features (dashboard, checkout, search), less on static pages (about, terms).
Quick mode: Homepage + top 5 nav targets only. Check: loads? Errors? Broken links?
Document each issue immediately when found — don't batch.
Interactive bugs (broken flows, dead buttons, form failures):
# Before
agent-browser <<'EOF'
const page = await browser.getPage("qa-main");
const buf = await page.screenshot();
console.log(await saveScreenshot(buf, "issue-001-before.png"));
EOF
# Perform the action
agent-browser <<'EOF'
const page = await browser.getPage("qa-main");
await page.click('button#submit');
const buf = await page.screenshot();
const path = await saveScreenshot(buf, "issue-001-result.png");
const snap = await page.snapshotForAI({ track: "issue-001" });
console.log(JSON.stringify({ screenshotPath: path, errors: await page.evaluate(() => window.__qaErrors || []) }));
console.log(snap.incremental || snap.full);
EOF
Static bugs (typos, layout, missing images): single annotated snapshot, describe what's wrong.
Read every screenshot inline. Write each issue to the report using the template format.
Compute the health score (see references/health-score.md) and save baseline.json:
{
"date": "YYYY-MM-DD",
"url": "<target>",
"healthScore": N,
"issues": [{ "id": "ISSUE-001", "title": "...", "severity": "...", "category": "..." }],
"categoryScores": { "console": N, "links": N, "visual": N, "functional": N, "ux": N, "performance": N, "content": N, "accessibility": N }
}
Record baseline health score.
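Sort issues by severity. Decide which to fix by tier:
Quick: critical and high only
Standard: critical, high, and medium
Exhaustive: all of the above plus cosmetic
Mark issues that can't be fixed from source code (third-party, infrastructure) as "deferred" regardless of tier.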
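The triage step can be sketched as a function that sorts issues and splits them into buckets. The severity names (`critical`, `high`, `medium`, `low`, with low standing in for cosmetic), the `fixableFromSource` flag, and the `planFixes` name are all assumptions made for illustration.

```javascript
// Hypothetical helper: decide which issues to fix for a given tier, in severity order.
// Assumptions: severities are "critical" | "high" | "medium" | "low" (low = cosmetic),
// and each issue carries a `fixableFromSource` boolean set during triage.
const SEVERITY_ORDER = ["critical", "high", "medium", "low"];
const TIER_SEVERITIES = {
  quick: ["critical", "high"],
  standard: ["critical", "high", "medium"],
  exhaustive: ["critical", "high", "medium", "low"],
};

function planFixes(issues, tier = "standard") {
  const allowed = TIER_SEVERITIES[tier];
  if (!allowed) throw new Error(`Unknown tier: ${tier}`);
  const bySeverity = (a, b) =>
    SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity);

  const plan = { toFix: [], deferred: [], skipped: [] };
  for (const issue of [...issues].sort(bySeverity)) {
    if (!issue.fixableFromSource) plan.deferred.push(issue); // deferred regardless of tier
    else if (allowed.includes(issue.severity)) plan.toFix.push(issue);
    else plan.skipped.push(issue); // outside this tier's severity range
  }
  return plan;
}

const sampleIssues = [
  { id: "ISSUE-001", severity: "medium", fixableFromSource: true },
  { id: "ISSUE-002", severity: "critical", fixableFromSource: true },
  { id: "ISSUE-003", severity: "high", fixableFromSource: false },
  { id: "ISSUE-004", severity: "low", fixableFromSource: true },
];
console.log(planFixes(sampleIssues, "quick").toFix.map((i) => i.id));
```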
For each fixable issue (severity order):
grep -r "error message or component name" --include="*.ts" --include="*.tsx" --include="*.js" .
Read the source file, understand context, make the minimal fix. Do NOT refactor surrounding code.
git add <only-changed-files>
git commit -m "fix(qa): ISSUE-NNN — short description"
One commit per fix. Never bundle multiple fixes.
agent-browser <<'EOF'
const page = await browser.getPage("qa-main");
await page.goto("AFFECTED_URL");
const buf = await page.screenshot();
const path = await saveScreenshot(buf, "issue-NNN-after.png");
const snap = await page.snapshotForAI({ track: "fix-NNN" });
const errors = await page.evaluate(() => window.__qaErrors || []);
console.log(JSON.stringify({ screenshotPath: path, errors }));
console.log(snap.incremental || snap.full);
EOF
Read before/after screenshots inline.
If a fix makes things worse: git revert HEAD → mark issue as "deferred"
WTF-likelihood:
Start at 0%
Each revert: +15%
Each fix touching >3 files: +5%
After fix 15: +1% per additional fix
All remaining Low severity: +10%
Touching unrelated files: +20%
If WTF > 20%: STOP. Show user what's been done. Ask whether to continue. Hard cap: 50 fixes. Stop regardless.
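The WTF-likelihood rules can be written as a running score. The sketch below is one interpretation, with several assumptions: each fix record has `filesTouched`, `reverted`, and `touchedUnrelatedFiles` fields; the unrelated-files penalty applies once per offending fix; and the low-severity penalty applies once when every remaining issue is low severity. The function names are hypothetical.

```javascript
// Hypothetical helper: compute the WTF-likelihood (in percent) from the fix log so far.
// Assumptions: fix records carry `filesTouched`, `reverted`, `touchedUnrelatedFiles`;
// the unrelated-files penalty is per offending fix; `remainingSeverities` lists the
// severities of issues not yet attempted.
function wtfLikelihood(fixes, remainingSeverities = []) {
  let score = 0; // start at 0%
  for (const fix of fixes) {
    if (fix.reverted) score += 15;
    if (fix.filesTouched > 3) score += 5;
    if (fix.touchedUnrelatedFiles) score += 20;
  }
  score += Math.max(0, fixes.length - 15); // +1% per fix after the 15th
  const onlyLowLeft =
    remainingSeverities.length > 0 && remainingSeverities.every((s) => s === "low");
  if (onlyLowLeft) score += 10;
  return score;
}

const HARD_CAP = 50; // stop after 50 fixes regardless of score

function shouldStop(fixes, remainingSeverities = []) {
  return fixes.length >= HARD_CAP || wtfLikelihood(fixes, remainingSeverities) > 20;
}

const fixLog = [
  { filesTouched: 1, reverted: false, touchedUnrelatedFiles: false },
  { filesTouched: 4, reverted: true, touchedUnrelatedFiles: false },
];
console.log(wtfLikelihood(fixLog), shouldStop(fixLog));
```

With this sample log the score is exactly 20, which does not exceed the threshold, so the loop would continue; one more penalty of any kind would trigger the stop.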
Re-run QA on all affected pages. Compute final health score. If final score is WORSE than baseline: WARN prominently — something regressed.
Write report to:
.context/qa-reports/qa-report-{domain}-{YYYY-MM-DD}.md
Per-issue additions beyond template:
Summary: total found, fixes applied (verified/best-effort/reverted), deferred, health score delta.
PR Summary line: "QA found N issues, fixed M, health score X → Y."
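The PR summary line, together with the rule that a final score below baseline must be flagged, could be produced by a helper like this. `summarizeRun` and its field names are hypothetical.

```javascript
// Hypothetical helper: format the PR summary line and flag a score regression.
function summarizeRun({ found, fixed, scoreBefore, scoreAfter }) {
  return {
    line: `QA found ${found} issues, fixed ${fixed}, health score ${scoreBefore} → ${scoreAfter}.`,
    regressed: scoreAfter < scoreBefore, // true means: warn prominently
  };
}

console.log(summarizeRun({ found: 8, fixed: 6, scoreBefore: 62, scoreAfter: 91 }).line);
// prints "QA found 8 issues, fixed 6, health score 62 → 91."
```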
If repo has TODOS.md:
[REDACTED] for passwords.
git revert HEAD if a fix makes things worse.
.context/qa-reports/
├── qa-report-{domain}-{YYYY-MM-DD}.md
├── screenshots/
│ ├── initial.png
│ ├── issue-001-step-1.png
│ ├── issue-001-result.png
│ ├── issue-001-before.png
│ ├── issue-001-after.png # after fix
│ └── ...
└── baseline.json