skills/avad-qa/SKILL.md
Systematically QA test a web application and fix bugs found. Runs QA testing, then iteratively fixes bugs in source code, committing each fix atomically and re-verifying. Use when asked to "qa", "QA", "test this site", "find bugs", "test and fix", or "fix what's broken". Four modes: diff-aware (automatic on feature branches), full (systematic exploration), quick (30-second smoke test), regression (compare against baseline). Three tiers: Quick (critical/high only), Standard (+medium), Exhaustive (+cosmetic). Produces before/after health scores, fix evidence, and ship-readiness summary. Supports report-only mode — asks whether to fix or just report.
npx skillsauth add agwacom/avadbot avad-qaInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps.
Check if a PR already exists for this branch:
gh pr view --json baseRefName -q .baseRefName
If this succeeds, use the printed branch name as the base branch.
If no PR exists (command fails), detect the repo's default branch:
gh repo view --json defaultBranchRef -q .defaultBranchRef.name
If both commands fail, fall back to main.
Print the detected base branch name. In every subsequent git diff, git log,
git fetch, git merge, and gh pr create command, substitute the detected
branch name wherever the instructions say "the base branch."
You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence.
Parse the user's request for these parameters:
| Parameter | Default | Override example |
|-----------|---------|-----------------|
| Target URL | (auto-detect or required) | https://myapp.com, http://localhost:3000 |
| Mode | full | --quick, --regression .avadbot/qa-reports/baseline.json |
| Tier | Standard | --quick-tier, --exhaustive |
| Output dir | .avadbot/qa-reports/ | Output to /tmp/qa |
| Scope | Full app (or diff-scoped) | Focus on the billing page |
| Auth | None | Sign in to [email protected], Import cookies from cookies.json |
| Report only | No (asks after testing) | --report-only (skip fix decision, go straight to report) |
Tiers determine which issues get fixed:
If no URL is given and you're on a feature branch: Automatically enter diff-aware mode (see Modes below). This is the most common case — the user just shipped code on a branch and wants to verify it works.
Check working tree status:
DIRTY_TREE=""
if [ -n "$(git status --porcelain)" ]; then
DIRTY_TREE=1
fi
If the tree is dirty AND the user did NOT pass --report-only, warn but continue to Phase 6.5 — the check will be enforced before entering the fix path (Phase 7). If --report-only, proceed without warning.
Find the browse binary:
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
B=""
# Primary tiers — new name (7 tiers)
[ -n "$_ROOT" ] && [ -x "$_ROOT/skills/avad-browse/dist/avad-browse" ] && B="$_ROOT/skills/avad-browse/dist/avad-browse"
[ -z "$B" ] && [ -n "$_ROOT" ] && for _D in "$_ROOT"/*/skills/avad-browse/dist/avad-browse; do [ -x "$_D" ] && B="$_D" && break; done 2>/dev/null
[ -z "$B" ] && [ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/avad-browse/dist/avad-browse" ] && B="$_ROOT/.claude/skills/avad-browse/dist/avad-browse"
[ -z "$B" ] && [ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/avadbot/avad-browse/dist/avad-browse" ] && B="$_ROOT/.claude/skills/avadbot/avad-browse/dist/avad-browse"
[ -z "$B" ] && [ -x ~/.claude/skills/avad-browse/dist/avad-browse ] && B=~/.claude/skills/avad-browse/dist/avad-browse
[ -z "$B" ] && [ -x ~/.claude/skills/avadbot/avad-browse/dist/avad-browse ] && B=~/.claude/skills/avadbot/avad-browse/dist/avad-browse
[ -z "$B" ] && for _P in ~/.claude/plugins/*/skills/avad-browse/dist/avad-browse; do [ -x "$_P" ] && B="$_P" && break; done 2>/dev/null
# Legacy fallbacks — old name (backward compat, 6 tiers)
[ -z "$B" ] && [ -n "$_ROOT" ] && [ -x "$_ROOT/skills/browse/dist/browse" ] && B="$_ROOT/skills/browse/dist/browse"
[ -z "$B" ] && [ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/browse/dist/browse" ] && B="$_ROOT/.claude/skills/browse/dist/browse"
[ -z "$B" ] && [ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/avadbot/browse/dist/browse" ] && B="$_ROOT/.claude/skills/avadbot/browse/dist/browse"
[ -z "$B" ] && [ -x ~/.claude/skills/browse/dist/browse ] && B=~/.claude/skills/browse/dist/browse
[ -z "$B" ] && [ -x ~/.claude/skills/avadbot/browse/dist/browse ] && B=~/.claude/skills/avadbot/browse/dist/browse
[ -z "$B" ] && for _P in ~/.claude/plugins/*/skills/browse/dist/browse; do [ -x "$_P" ] && B="$_P" && break; done 2>/dev/null
if [ -x "$B" ]; then
echo "READY: $B"
else
echo "NEEDS_SETUP"
fi
If NEEDS_SETUP:
package.json with "name": "avadbot"): AVADBOT_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)/avadbot; [ -f "$AVADBOT_ROOT/package.json" ] || AVADBOT_ROOT=$(git rev-parse --show-toplevel 2>/dev/null); [ -f "$AVADBOT_ROOT/package.json" ] || echo "Cannot find avadbot root"cd "$AVADBOT_ROOT" && bun install && bun run buildbun is not installed: curl -fsSL https://bun.sh/install | bashCreate output directories:
REPORT_DIR=".avadbot/qa-reports"
mkdir -p "$REPORT_DIR/screenshots"
Before falling back to git diff heuristics, check for richer test plan sources:
~/.avadbot/projects/ for recent *-test-plan-*.md files for this repo
SLUG=$(git remote get-url origin 2>/dev/null | sed 's|.*[:/]\([^/]*/[^/]*\)\.git$|\1|;s|.*[:/]\([^/]*/[^/]*\)$|\1|' | tr '/' '-')
ls -t ~/.avadbot/projects/$SLUG/*-test-plan-*.md 2>/dev/null | head -1
/avad-plan-eng-review or /avad-plan-ceo-review produced test plan output in this conversationThis is the primary mode for developers verifying their work. When the user invokes QA without a URL and the repo is on a feature branch, automatically:
Analyze the branch diff to understand what changed:
git diff main...HEAD --name-only
git log main..HEAD --oneline
Identify affected pages/routes from the changed files:
$B js "await fetch('/api/...')"Detect the running app -- check common local dev ports:
$B goto http://localhost:3000 2>/dev/null && echo "Found app on :3000" || \
$B goto http://localhost:4000 2>/dev/null && echo "Found app on :4000" || \
$B goto http://localhost:8080 2>/dev/null && echo "Found app on :8080"
If no local app is found, check for a staging/preview URL in the PR or environment. If nothing works, ask the user for the URL.
Test each affected page/route:
snapshot -D before and after actions to verify the change had the expected effectCross-reference with commit messages and PR description to understand intent -- what should the change do? Verify it actually does that.
Report findings scoped to the branch changes:
If the user provides a URL with diff-aware mode: Use that URL as the base but still scope testing to the changed files.
Systematic exploration. Visit every reachable page. Document 5-10 well-evidenced issues. Produce health score. Takes 5-15 minutes depending on app size.
--quick)30-second smoke test. Visit homepage + top 5 navigation targets. Check: page loads? Console errors? Broken links? Produce health score. No detailed issue documentation.
--regression <baseline>)Run full mode, then load baseline.json from a previous run. Diff: which issues are fixed? Which are new? What's the score delta? Append regression section to report.
If the user specified auth credentials:
$B goto <login-url>
$B snapshot -i # find the login form
$B fill @e3 "[email protected]"
$B fill @e4 "[REDACTED]" # NEVER include real passwords in report
$B click @e5 # submit
$B snapshot -D # verify login succeeded
If the user provided a cookie file:
$B cookie-import cookies.json
$B goto <target-url>
If 2FA/OTP is required: Ask the user for the code and wait.
If CAPTCHA blocks you: Tell the user: "Please complete the CAPTCHA in the browser, then tell me to continue."
Get a map of the application:
$B goto <target-url>
$B snapshot -i -a -o "$REPORT_DIR/screenshots/initial.png"
$B links # map navigation structure
$B console --errors # any errors on landing?
Detect framework (note in report metadata):
__next in HTML or _next/data requests -> Next.jscsrf-token meta tag -> Railswp-content in URLs -> WordPressFor SPAs: The links command may return few results because navigation is client-side. Use snapshot -i to find nav elements (buttons, menu items) instead.
Visit pages systematically. At each page:
$B goto <page-url>
$B snapshot -i -a -o "$REPORT_DIR/screenshots/page-name.png"
$B console --errors
Then follow the per-page exploration checklist:
$B viewport 375x812
$B screenshot "$REPORT_DIR/screenshots/page-mobile.png"
$B viewport 1280x720
Depth judgment: Spend more time on core features (homepage, dashboard, checkout, search) and less on secondary pages (about, terms, privacy).
Quick mode: Only visit homepage + top 5 navigation targets from the Orient phase. Skip the per-page checklist -- just check: loads? Console errors? Broken links visible?
Document each issue immediately when found -- don't batch them.
Two evidence tiers:
Interactive bugs (broken flows, dead buttons, form failures):
snapshot -D to show what changed$B screenshot "$REPORT_DIR/screenshots/issue-001-step-1.png"
$B click @e5
$B screenshot "$REPORT_DIR/screenshots/issue-001-result.png"
$B snapshot -D
Static bugs (typos, layout issues, missing images):
$B snapshot -i -a -o "$REPORT_DIR/screenshots/issue-002.png"
Write each issue to the report immediately using the format below.
baseline.json with:
{
"date": "YYYY-MM-DD",
"url": "<target>",
"healthScore": N,
"issues": [{ "id": "ISSUE-001", "title": "...", "severity": "...", "category": "..." }],
"categoryScores": { "console": N, "links": N, ... }
}
Regression mode: After writing the report, load the baseline file. Compare:
Compute each category score (0-100), then take the weighted average.
Each category starts at 100. Deduct per finding:
| Category | Weight | |----------|--------| | Console | 15% | | Links | 10% | | Visual | 10% | | Functional | 20% | | UX | 15% | | Performance | 10% | | Content | 5% | | Accessibility | 15% |
score = sum (category_score x weight)
Hydration failed, Text content did not match)_next/data requests in network -- 404s indicate broken data fetchinggoto) -- catches routing issues/wp-json/)snapshot -i for navigation -- links command misses client-side routes[REDACTED] for passwords in repro steps.snapshot -C for tricky UIs. Finds clickable divs that the accessibility tree misses..avadbot/qa-reports/
|-- qa-report-{domain}-{YYYY-MM-DD}.md # Structured report
|-- screenshots/
| |-- initial.png # Landing page annotated screenshot
| |-- issue-001-step-1.png # Per-issue evidence
| |-- issue-001-result.png
| +-- ...
+-- baseline.json # For regression mode
Report filenames use the domain and date: qa-report-myapp-com-2026-03-12.md
Record baseline health score at end of Phase 6.
After completing the QA baseline, present the findings and ask the user:
Found {N} issues: {critical} critical, {high} high, {medium} medium, {low} low.
A) Fix them — triage by tier, fix in source code with atomic commits, re-verify B) Report only — write the report and stop. No source code changes.
Use AskUserQuestion to get the user's choice.
Shortcut: If the user passed --report-only in the original request, skip the question and go directly to report mode.
If "Report only":
If "Fix them":
Sort all discovered issues by severity, then decide which to fix based on the selected tier:
Mark issues that cannot be fixed from source code (e.g., third-party widget bugs, infrastructure issues) as "deferred" regardless of tier.
For each fixable issue, in severity order:
# Grep for error messages, component names, route definitions
# Glob for file patterns matching the affected page
git add <only-changed-files>
git commit -m "fix(qa): ISSUE-NNN — short description"
fix(qa): ISSUE-NNN — short descriptionsnapshot -D to verify the change had the expected effect$B goto <affected-url>
$B screenshot "$REPORT_DIR/screenshots/issue-NNN-after.png"
$B console --errors
$B snapshot -D
git revert HEAD → mark issue as "deferred"Skip if: classification is not "verified", OR the fix is purely visual/CSS with no JS behavior, OR no test framework was detected AND user declined bootstrap.
1. Study the project's existing test patterns:
Read 2-3 test files closest to the fix (same directory, same code type). Match exactly:
2. Trace the bug's codepath, then write a regression test:
Before writing the test, trace the data flow through the code you just fixed:
The test MUST:
// Regression: ISSUE-NNN — {what broke}
// Found by /qa on {YYYY-MM-DD}
// Report: .avadbot/qa-reports/qa-report-{domain}-{date}.md
Test type decision:
Generate unit tests. Mock all external dependencies (DB, API, Redis, file system).
Use auto-incrementing names to avoid collisions: check existing {name}.regression-*.test.{ext} files, take max number + 1.
3. Run only the new test file:
{detected test command} {new-test-file}
4. Evaluate:
git commit -m "test(qa): regression test for ISSUE-NNN — {desc}"5. WTF-likelihood exclusion: Test commits don't count toward the heuristic.
Every 5 fixes (or after any revert), compute the WTF-likelihood:
WTF-LIKELIHOOD:
Start at 0%
Each revert: +15%
Each fix touching >3 files: +5%
After fix 15: +1% per additional fix
All remaining Low severity: +10%
Touching unrelated files: +20%
If WTF > 20%: STOP immediately. Show the user what you've done so far. Ask whether to continue.
Hard cap: 50 fixes. After 50 fixes, stop regardless of remaining issues.
After all fixes are applied:
Write the report to both local and project-scoped locations:
Local: .avadbot/qa-reports/qa-report-{domain}-{YYYY-MM-DD}.md
Project-scoped: Write test outcome artifact for cross-session context:
SLUG=$(git remote get-url origin 2>/dev/null | sed 's|.*[:/]\([^/]*/[^/]*\)\.git$|\1|;s|.*[:/]\([^/]*/[^/]*\)$|\1|' | tr '/' '-')
mkdir -p ~/.avadbot/projects/$SLUG
Write to ~/.avadbot/projects/{slug}/{user}-{branch}-test-outcome-{datetime}.md
Per-issue additions (beyond standard report template):
Summary section:
PR Summary: Include a one-line summary suitable for PR descriptions:
"QA found N issues, fixed M, health score X → Y."
If the repo has a TODOS.md:
git status --porcelain is non-empty, report-only mode is allowed but fixes are blocked until the tree is clean.git revert HEAD immediately.data-ai
Clear the freeze boundary set by /avad-freeze, allowing edits to all directories again. Use when you want to widen edit scope without ending the session. Use when asked to "unfreeze", "unlock edits", "remove freeze", or "allow all edits".
testing
Ship workflow: validate branch state, sync with target branch, run tests, pre-landing review, push, and create PR. Project-aware — reads target branch, test commands, and review checklist from docs/GIT_WORKFLOW.md.
development
Pre-landing code review. Analyzes diff for structural issues using a project-specific checklist. Two modes: local (review current branch) or PR (review and comment on a GitHub PR by number). Proactively suggest when the user is about to merge or land code changes.
development
Weekly engineering retrospective. Analyzes commit history, work patterns, and code quality metrics with persistent history and trend tracking. Team-aware: breaks down per-person contributions with praise and growth areas.