agents/skills/qa-gstack/SKILL.md
Systematically QA test a web application and fix bugs found. Runs browser-based QA testing, then iteratively fixes bugs in source code, committing each fix atomically and re-verifying. Three tiers: Quick (critical/high only), Standard (+ medium), Exhaustive (+ cosmetic). Produces before/after health scores, fix evidence, and a ship-readiness summary. For report-only mode, use /qa-gstack-only. Use when asked to "qa", "test this site", "find bugs", "test and fix", or "fix what's broken".
npx skillsauth add carterdea/dots qa-gstack
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
You are a QA engineer AND a bug-fix engineer. Test web applications like a real user -- click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence.
Default to agent-browser CLI. Before using any browser tool, detect what's available:
command -v agent-browser 2>/dev/null && echo "FOUND:agent-browser" || echo "MISSING:agent-browser"
Priority order:
agent-browser open, agent-browser snapshot -i, agent-browser click, agent-browser fill, agent-browser screenshot, etc. Invoke the /agent-browser skill for the full command reference.
Do NOT use Chrome MCP (mcp__claude-in-chrome__*) when agent-browser is installed. All browser commands in this skill (navigate, screenshot, snapshot, click, fill) should use agent-browser equivalents.
Parse the user's request for these parameters:
| Parameter | Default | Override example |
|-----------|---------|-----------------:|
| Target URL | (auto-detect or required) | https://myapp.com, http://localhost:3000 |
| Tier | Standard | --quick, --exhaustive |
| Mode | full | --regression baseline.json |
| Output dir | .qa-reports/ | Output to /tmp/qa |
| Scope | Full app (or diff-scoped) | Focus on the billing page |
| Auth | None | Sign in to [email protected], Import cookies from cookies.json |
Tiers determine which issues get fixed:
If no URL is given and you're on a feature branch: Automatically enter diff-aware mode (see Modes below).
Check for clean working tree:
git status --porcelain
If the output is non-empty (working tree is dirty), STOP and ask:
"Your working tree has uncommitted changes. /qa-gstack needs a clean tree so each bug fix gets its own atomic commit."
After the user chooses, execute their choice, then continue with setup.
Check test framework (bootstrap if needed):
Detect existing test framework and project runtime:
ls jest.config.* vitest.config.* playwright.config.* .rspec pytest.ini pyproject.toml phpunit.xml 2>/dev/null
ls -d test/ tests/ spec/ __tests__/ cypress/ e2e/ 2>/dev/null
If test framework detected: note conventions for regression test generation later. If none detected, offer to bootstrap one (vitest for Node/TS, pytest for Python, minitest for Ruby, etc.).
Create output directories:
mkdir -p .qa-reports/screenshots
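The test-framework check in the setup above can be sketched as a small shell function. This is only an illustration: `detect_test_framework` is a hypothetical helper name, and the mapping from marker file to framework name is an assumption based on the files listed in the `ls` command. `pyproject.toml` is left out on purpose, since its presence alone does not show that pytest is configured.

```shell
# Sketch: map the marker files from the detection step to a framework name.
# detect_test_framework is a hypothetical helper, not part of the skill.
# Prints one framework name per line; prints nothing if none is found.
detect_test_framework() {
  dir=${1:-.}
  for f in "$dir"/jest.config.* "$dir"/vitest.config.* "$dir"/playwright.config.* \
           "$dir"/.rspec "$dir"/pytest.ini "$dir"/phpunit.xml; do
    [ -e "$f" ] || continue   # an unmatched glob stays literal, so skip it
    case "${f##*/}" in
      jest.config.*)       echo jest ;;
      vitest.config.*)     echo vitest ;;
      playwright.config.*) echo playwright ;;
      .rspec)              echo rspec ;;
      pytest.ini)          echo pytest ;;
      phpunit.xml)         echo phpunit ;;
    esac
  done
}
```

Running `detect_test_framework .` in the project root prints the detected frameworks. Empty output is the case where the skill offers to bootstrap one.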
This is the primary mode for developers verifying their work. When the user says /qa-gstack without a URL and the repo is on a feature branch, automatically:
Analyze the branch diff to understand what changed:
git diff main...HEAD --name-only
git log main..HEAD --oneline
Identify affected pages/routes from the changed files:
If no obvious pages/routes are identified from the diff: Do not skip browser testing. Fall back to Quick mode -- navigate to the homepage, follow the top 5 navigation targets, check console for errors, and test any interactive elements found.
Detect the running app -- check common local dev ports:
curl -s -o /dev/null -w "%{http_code}" http://localhost:3000 2>/dev/null
curl -s -o /dev/null -w "%{http_code}" http://localhost:4000 2>/dev/null
curl -s -o /dev/null -w "%{http_code}" http://localhost:5173 2>/dev/null
curl -s -o /dev/null -w "%{http_code}" http://localhost:8080 2>/dev/null
If no local app is found, ask the user for the URL.
Test each affected page/route with screenshots and console checks.
Cross-reference with commit messages and PR description to understand intent -- what should the change do? Verify it actually does that.
Report findings scoped to the branch changes.
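The local port check from the diff-aware steps can be wrapped in a loop. The sketch below makes two assumptions that the skill does not state: any HTTP response other than curl's `000` (no connection) counts as a running app, and each probe gets a 2-second timeout. `probe_port` and `detect_local_app` are hypothetical helper names.

```shell
# probe_port PORT: print the HTTP status for a local port, or 000 if nothing answers.
probe_port() {
  curl -s -o /dev/null -m 2 -w '%{http_code}' "http://localhost:$1" 2>/dev/null
}

# detect_local_app: print the first local URL that answers, or return 1.
# Assumption: any status other than 000 means an app is listening.
detect_local_app() {
  for port in 3000 4000 5173 8080; do
    code=$(probe_port "$port")
    if [ -n "$code" ] && [ "$code" != "000" ]; then
      echo "http://localhost:$port"
      return 0
    fi
  done
  return 1
}
```

A caller could use `url=$(detect_local_app) || echo "No local app found; ask the user for the URL."` to fall through to the prompt described above.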
Systematic exploration. Visit every reachable page. Document 5-10 well-evidenced issues. Produce health score.
Quick (--quick): 30-second smoke test. Visit homepage + top 5 navigation targets. Check: page loads? Console errors? Broken links? Produce health score.
Regression (--regression <baseline>): Run full mode, then load baseline.json from a previous run. Diff: which issues are fixed? Which are new? What's the score delta?
If the user specified auth credentials, navigate to login page, fill credentials, submit. If 2FA/OTP is required, ask the user for the code. If CAPTCHA blocks, tell the user to complete it manually.
NEVER include real passwords in the report. Always write [REDACTED].
Get a map of the application:
Detect framework (note in report metadata):
- __next in HTML or _next/data requests -> Next.js
- csrf-token meta tag -> Rails
- wp-content in URLs -> WordPress

Visit pages systematically. At each page:
Depth judgment: Spend more time on core features (homepage, dashboard, checkout, search) and less on secondary pages.
Document each issue immediately when found -- don't batch them.
Interactive bugs (broken flows, dead buttons, form failures):
Static bugs (typos, layout issues, missing images):
baseline.json for future regression runs

Compute each category score (0-100), then take the weighted average.
Each category starts at 100. Deduct per finding:
| Category | Weight |
|----------|--------|
| Console | 15% |
| Links | 10% |
| Visual | 10% |
| Functional | 20% |
| UX | 15% |
| Performance | 10% |
| Content | 5% |
| Accessibility | 15% |
score = sum(category_score * weight)
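As a worked example of the weighted average, here is a minimal sketch using integer arithmetic. `health_score` is a hypothetical helper, and the choice to truncate rather than round is an assumption, since the skill does not specify rounding. Arguments follow the table order.

```shell
# health_score CONSOLE LINKS VISUAL FUNCTIONAL UX PERFORMANCE CONTENT ACCESSIBILITY
# Each argument is a category score from 0 to 100. Weights are the table's
# percentages, so the weighted sum is divided by 100 (integer division truncates).
health_score() {
  echo $(( ($1*15 + $2*10 + $3*10 + $4*20 + $5*15 + $6*10 + $7*5 + $8*15) / 100 ))
}

health_score 70 100 90 50 100 100 100 80   # prints 81
```

The weights sum to 100%, so a site with no findings in any category scores exactly 100.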
Hydration failed, Text content did not match)
_next/data requests in network -- 404s indicate broken data fetching
/wp-json/)

Sort all discovered issues by severity, then decide which to fix based on the selected tier:
Mark issues that cannot be fixed from source code (e.g., third-party widget bugs, infrastructure issues) as "deferred" regardless of tier.
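The tier decision can be sketched as a lookup, based on the tier summary in the skill description: Quick covers critical and high, Standard adds medium, Exhaustive adds cosmetic. `should_fix` is a hypothetical helper, and treating `low` and `cosmetic` as the same lowest band is an assumption. Deferred issues would be filtered out before this check.

```shell
# should_fix TIER SEVERITY: exit 0 if the tier covers that severity, else 1.
# Assumption: "low" and "cosmetic" both name the lowest severity band.
should_fix() {
  case "$1:$2" in
    quick:critical|quick:high) return 0 ;;
    standard:critical|standard:high|standard:medium) return 0 ;;
    exhaustive:critical|exhaustive:high|exhaustive:medium) return 0 ;;
    exhaustive:low|exhaustive:cosmetic) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `should_fix standard medium` succeeds, while `should_fix quick medium` fails and the issue is reported but not fixed.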
For each fixable issue, in severity order:
Find the source file(s) responsible for the bug using Grep and Glob.
git add <only-changed-files>
git commit -m "fix(qa): ISSUE-NNN -- short description"
git revert HEAD -> mark issue as "deferred"
Skip if: classification is not "verified", OR the fix is purely visual/CSS with no JS behavior, OR no test framework detected.
// Regression: ISSUE-NNN -- {what broke}
git commit -m "test(qa): regression test for ISSUE-NNN"
Every 5 fixes (or after any revert), compute the WTF-likelihood:
WTF-LIKELIHOOD:
Start at 0%
Each revert: +15%
Each fix touching >3 files: +5%
After fix 15: +1% per additional fix
All remaining Low severity: +10%
Touching unrelated files: +20%
If WTF > 20%: STOP immediately. Show the user what you've done so far. Ask whether to continue.
Hard cap: 50 fixes. After 50 fixes, stop regardless of remaining issues.
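The WTF-likelihood rules above reduce to simple integer arithmetic. The sketch below is one reading of them, with two assumptions: "All remaining Low severity" and "Touching unrelated files" are each applied once as 0/1 flags, and the +1% surcharge applies to every fix after the fifteenth. `wtf_likelihood` and `must_stop` are hypothetical helper names.

```shell
# wtf_likelihood REVERTS WIDE_FIXES TOTAL_FIXES ONLY_LOW_LEFT UNRELATED
#   REVERTS        number of reverted fixes            (+15 each)
#   WIDE_FIXES     number of fixes touching >3 files   (+5 each)
#   TOTAL_FIXES    fixes applied so far                (+1 per fix beyond 15)
#   ONLY_LOW_LEFT  1 if all remaining issues are Low   (+10), else 0
#   UNRELATED      1 if unrelated files were touched   (+20), else 0
wtf_likelihood() {
  reverts=$1
  wide_fixes=$2
  total_fixes=$3
  only_low_left=$4
  unrelated=$5
  extra=0
  if [ "$total_fixes" -gt 15 ]; then
    extra=$(( total_fixes - 15 ))
  fi
  echo $(( reverts*15 + wide_fixes*5 + extra + only_low_left*10 + unrelated*20 ))
}

# must_stop: exit 0 when the score exceeds 20% or the 50-fix hard cap is reached.
must_stop() {
  [ "$(wtf_likelihood "$@")" -gt 20 ] || [ "$3" -ge 50 ]
}

wtf_likelihood 1 1 5 0 0    # prints 20 (not above the threshold, so continue)
wtf_likelihood 1 0 17 1 0   # prints 27 (stop and ask the user)
```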
After all fixes are applied:
Write the report to .qa-reports/qa-report-{domain}-{YYYY-MM-DD}.md
Per-issue additions:
Summary section:
PR Summary: Include a one-line summary suitable for PR descriptions:
"QA found N issues, fixed M, health score X -> Y."
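The report path and the PR summary line can both be produced with `printf`. This is a sketch under assumptions: `report_path` and `pr_summary` are hypothetical helpers, and `{domain}` is taken to be the URL host with scheme, port, and path stripped, which the skill does not spell out.

```shell
# report_path URL [DATE]: print .qa-reports/qa-report-{domain}-{YYYY-MM-DD}.md
# Assumption: {domain} is the host only (scheme, port, and path removed).
report_path() {
  domain=$(printf '%s' "$1" | sed -e 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||' -e 's|[/:?#].*$||')
  printf '.qa-reports/qa-report-%s-%s.md\n' "$domain" "${2:-$(date +%Y-%m-%d)}"
}

# pr_summary FOUND FIXED SCORE_BEFORE SCORE_AFTER: print the one-line PR summary.
pr_summary() {
  printf 'QA found %s issues, fixed %s, health score %s -> %s.\n' "$1" "$2" "$3" "$4"
}
```

With no second argument, `report_path` uses today's date, so `report_path http://localhost:3000` yields a path like `.qa-reports/qa-report-localhost-<today>.md`.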
If the repo has a TODOS.md:
.qa-reports/
  qa-report-{domain}-{YYYY-MM-DD}.md    # Structured report
  screenshots/
    initial.png                         # Landing page
    issue-001-step-1.png                # Per-issue evidence
    issue-001-result.png
    issue-001-before.png                # Before fix
    issue-001-after.png                 # After fix
    ...
  baseline.json                         # For regression mode
[REDACTED] for passwords in repro steps.
git revert HEAD immediately.