skills/qa/SKILL.md
Systematically QA test any application — web apps, native macOS apps, Electron apps, CLI tools, interactive REPLs, or anything on screen. Three modes: browser (chromux/CDP, fast, DOM-level), computer (MCP computer-use, screenshot + pixel clicks, any app), and cli (tmux, send-keys + capture-pane for interactive terminals). Auto-selects mode or accepts --browser / --computer / --cli override. Use when asked to "qa", "QA", "test this site", "test this app", "find bugs", "test and fix", "fix what's broken", "dogfood", "exploratory test", "bug hunt", "QA this app", "사이트 테스트", "앱 테스트", "브라우저 QA", "화면 보고 테스트해줘", "네이티브 앱 테스트", "screen test". Three tiers: Quick (critical/high only), Standard (+ medium), Exhaustive (+ cosmetic). Produces before/after health scores, fix evidence, and a ship-readiness summary.
You are a QA engineer AND a bug-fix engineer. Test applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence.
| Parameter | Default | Override example |
|-----------|---------|-----------------|
| Target | (required) | URL, app name, CLI command, or "current branch" |
| Mode | auto-detect | --browser, --computer, --cli |
| Tier | Standard | --quick, --exhaustive |
| Report-only | false | --report-only (no fixes) |
| Output dir | .qa-reports/ | Output to /tmp/qa |
| Scope | Full app | Focus on the billing page |
| Signal | Mode | Why |
|--------|------|-----|
| URL provided (http/https/localhost) | browser | Web app, CDP gives DOM access |
| On feature branch, no URL | browser (diff-aware) | Verify branch changes locally |
| Native app name (Slack, Notes, Figma) | computer | Not a web app |
| Electron app | computer | Desktop app, even if web-based |
| CLI command, REPL, or interactive terminal | cli | Needs tmux send-keys + capture-pane |
| --browser flag | browser | User override |
| --computer flag | computer | User override |
| --cli flag | cli | User override |
| Ambiguous | AskUserQuestion | Let user decide |
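The unambiguous rows of the table above can be sketched as a small shell function. This is only an illustration: `select_mode` and the `browser-diff-aware` label are hypothetical names, not part of the skill, and native app names versus CLI commands are deliberately left to the `ask` fallback because a bare string cannot distinguish them reliably.

```shell
# Hypothetical helper: pick a QA mode from an override flag and a target string.
# usage: select_mode FLAG TARGET
#   FLAG    one of --browser, --computer, --cli, or "" for auto-detect
#   TARGET  URL, app name, CLI command, or "current branch"
select_mode() {
  # An explicit flag always wins (user override).
  case "$1" in
    --browser)  echo browser;  return 0 ;;
    --computer) echo computer; return 0 ;;
    --cli)      echo cli;      return 0 ;;
  esac

  case "$2" in
    # A URL means a web app, so browser mode applies.
    http://*|https://*|localhost|localhost:*|localhost/*)
      echo browser ;;
    # Feature branch with no URL: browser mode, scoped to the diff.
    "current branch")
      echo browser-diff-aware ;;
    # Anything else is treated as ambiguous (assumption): let the user decide.
    *)
      echo ask ;;
  esac
}
```

For example, `select_mode "" "http://localhost:3000"` prints `browser`, while `select_mode "" "Slack"` prints `ask` so the caller can fall back to AskUserQuestion.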
Browser mode: Read references/browser-mode.md for chromux setup and interaction patterns.
Computer mode: Read references/computer-mode.md for MCP computer-use setup and interaction patterns.
CLI mode: Read references/cli-mode.md for tmux setup and interaction patterns.
If NOT --report-only and source code exists:
git status --porcelain
If dirty, use AskUserQuestion: commit / stash / abort.
mkdir -p .qa-reports/screenshots
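As a rough sketch, the pre-flight steps above could be wrapped in one function. The name `preflight` and its `dirty` / `ready` outputs are assumptions made for illustration; the skill itself only specifies the two commands and the commit / stash / abort question.

```shell
# Hypothetical wrapper around the pre-flight steps.
# usage: preflight [--report-only]
preflight() {
  # The clean-tree check only matters when fixes will be committed
  # and the current directory is inside a git work tree.
  if [ "$1" != "--report-only" ] && git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    if [ -n "$(git status --porcelain)" ]; then
      # Non-empty porcelain output means uncommitted changes.
      echo dirty
      return 1
    fi
  fi

  # Report and screenshot directory used by the later phases.
  mkdir -p .qa-reports/screenshots
  echo ready
}
```

A caller would treat `dirty` as the cue to ask the user whether to commit, stash, or abort before continuing.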
Before touching the app, create a structured test plan. This ensures systematic coverage instead of random clicking.
If diff-aware (feature branch, no URL):
git diff main...HEAD --name-only
git log main..HEAD --oneline
Identify affected pages/routes from changed files.
If URL or app provided:
Create a test plan covering:
## Test Plan
### Target
- App: {name/URL}
- Mode: browser / computer / cli
- Tier: quick / standard / exhaustive
- Scope: {full app or specific area}
### Screens to Test (priority order)
1. {Screen name} — {why: core feature / changed in diff / user-specified}
2. {Screen name} — {why}
3. ...
### Test Cases per Screen
For each screen, list what to verify:
- [ ] Page loads without errors
- [ ] Interactive elements respond (buttons, links, forms)
- [ ] Form validation works (empty, invalid, edge cases)
- [ ] Navigation in/out works
- [ ] Visual layout looks correct
- [ ] Empty/loading/error states handled
### Auth / Setup Required
- {Any login, data seeding, or preconditions}
### Out of Scope
- {What we're NOT testing and why}
Present the test plan briefly. For --quick mode, skip user approval and execute immediately. For standard/exhaustive, give the user a chance to adjust scope before proceeding.
Execute the first part of the test plan — get a map of the application.
Visit screens systematically in test plan order. At each screen:
references/issue-taxonomy.md:
Evidence collection:
Write each issue to the report using the template from templates/qa-report-template.md.
Quick mode: Only test the main screen + top 3-5 navigation targets. Skip the per-screen checklist.
Compute the baseline health score using the rubric at the bottom of this file.
Sort issues by severity, then decide which to fix based on tier: Quick fixes critical and high only, Standard adds medium, and Exhaustive adds cosmetic.
If --report-only or no source code: Skip Phase 6, go to Phase 7.
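The tier filter can be expressed as a single `case` statement. This is a minimal sketch, assuming severities are spelled `critical`, `high`, `medium`, and `low`, and that `low` covers the cosmetic issues that only the Exhaustive tier fixes; `should_fix` is a hypothetical name.

```shell
# Hypothetical helper: exit status 0 means "fix this issue at this tier".
# usage: should_fix TIER SEVERITY
should_fix() {
  case "$1:$2" in
    # Quick: critical and high only.
    quick:critical|quick:high) return 0 ;;
    # Standard: adds medium.
    standard:critical|standard:high|standard:medium) return 0 ;;
    # Exhaustive: everything, including low/cosmetic (assumed equivalent).
    exhaustive:*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Used as `if should_fix standard medium; then ...; fi`, it keeps the fix loop from picking up issues below the tier's threshold.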
For each fixable issue, in severity order:
Use Grep/Glob to find the responsible source file(s).
Make the minimal fix. Do NOT refactor surrounding code.
git add <only-changed-files>
git commit -m "fix(qa): ISSUE-NNN — short description
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>"
One commit per fix. Never bundle.
Navigate back to affected screen, take before/after screenshots.
git revert HEAD -> mark as "deferred"
Every 5 fixes (or after any revert), compute WTF-likelihood:
Start at 0%
Each revert: +15%
Each fix touching >3 files: +5%
After fix 15: +1% per additional fix
All remaining Low severity: +10%
Touching unrelated files: +20%
If WTF > 20%: STOP. Show progress. Ask user whether to continue. Hard cap: 50 fixes.
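The heuristic above is simple integer arithmetic, sketched here in shell. The function and argument names are illustrative, and two readings are assumptions: "touching unrelated files" and "all remaining Low severity" are each treated as a one-time flag rather than a per-fix penalty.

```shell
# Sketch of the WTF-likelihood heuristic (result is a percentage).
# usage: wtf_likelihood REVERTS WIDE_FIXES TOTAL_FIXES ONLY_LOW_LEFT TOUCHED_UNRELATED
#   REVERTS            number of reverted fixes
#   WIDE_FIXES         number of fixes that touched more than 3 files
#   TOTAL_FIXES        fixes attempted so far
#   ONLY_LOW_LEFT      1 if every remaining issue is Low severity, else 0
#   TOUCHED_UNRELATED  1 if any fix touched unrelated files, else 0
wtf_likelihood() {
  score=$(( $1 * 15 + $2 * 5 ))
  # After fix 15, each additional fix adds 1%.
  if [ "$3" -gt 15 ]; then
    score=$(( score + $3 - 15 ))
  fi
  if [ "$4" -eq 1 ]; then
    score=$(( score + 10 ))
  fi
  if [ "$5" -eq 1 ]; then
    score=$(( score + 20 ))
  fi
  echo "$score"
}

# Exit status 0 means "stop and ask the user".
# usage: should_stop WTF_PERCENT TOTAL_FIXES
should_stop() {
  # Stop above 20%, or when the hard cap of 50 fixes is reached.
  [ "$1" -gt 20 ] || [ "$2" -ge 50 ]
}
```

One revert plus one wide fix lands exactly on 20%, which does not yet trip the threshold; a second revert would.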
Write report to .qa-reports/qa-report-{target}-{YYYY-MM-DD}.md using the template.
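One way to build that path is shown below. The slug rule (lowercase, runs of non-alphanumeric characters collapsed to a single `-`) is an assumption, since the skill does not say how `{target}` should be normalized, and `report_path` is a hypothetical name.

```shell
# Hypothetical helper: build the report path for a target.
# usage: report_path TARGET [YYYY-MM-DD]
report_path() {
  # Assumed slug rule: lowercase, non-alphanumerics become "-", trim the ends.
  slug=$(printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//')
  # Default to today's date when none is given.
  day=${2:-$(date +%Y-%m-%d)}
  echo ".qa-reports/qa-report-${slug}-${day}.md"
}
```

Passing the date explicitly keeps before/after runs on the same day pointing at the same file name.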
Include:
Each category 0-100, then weighted average.
| Category | Weight | Scoring |
|----------|--------|---------|
| Console/Errors | 15% | 0 errors=100, 1-3=70, 4-10=40, 10+=10 |
| Navigation | 10% | All works=100, each broken path -15 |
| Visual | 10% | Start 100, critical -25, high -15, med -8, low -3 |
| Functional | 20% | Same deduction scale |
| UX | 15% | Same deduction scale |
| Performance | 10% | Same deduction scale |
| Content | 5% | Same deduction scale |
| Accessibility | 15% | Same deduction scale |
score = sum(category_score * weight)
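The rubric can be worked through with integer arithmetic. The sketch below makes a few assumptions that the rubric does not state: category scores are clamped at 0, more than 10 console errors falls in the lowest band, and the final score is truncated to a whole number. All function names are illustrative.

```shell
# Console/Errors band lookup. usage: console_score ERROR_COUNT
console_score() {
  if   [ "$1" -eq 0 ];  then echo 100
  elif [ "$1" -le 3 ];  then echo 70
  elif [ "$1" -le 10 ]; then echo 40
  else                       echo 10
  fi
}

# Floor a score at 0 (assumption: scores never go negative).
clamp0() {
  if [ "$1" -lt 0 ]; then echo 0; else echo "$1"; fi
}

# Navigation: each broken path costs 15. usage: navigation_score BROKEN_PATHS
navigation_score() {
  clamp0 $(( 100 - $1 * 15 ))
}

# Shared deduction scale. usage: deduction_score CRITICAL HIGH MEDIUM LOW
deduction_score() {
  clamp0 $(( 100 - $1 * 25 - $2 * 15 - $3 * 8 - $4 * 3 ))
}

# Weighted average; weights are the percentages from the table.
# usage: health_score CONSOLE NAV VISUAL FUNCTIONAL UX PERF CONTENT A11Y
health_score() {
  echo $(( ($1 * 15 + $2 * 10 + $3 * 10 + $4 * 20 \
          + $5 * 15 + $6 * 10 + $7 * 5 + $8 * 15) / 100 ))
}
```

For instance, an app with 2 console errors and one issue of each severity in the Visual category, but otherwise clean, scores `health_score 70 100 49 100 100 100 100 100`, which works out to 90.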
[REDACTED] for passwords.
git revert HEAD immediately if a fix makes things worse.