kramme-cc-workflow/skills/kramme:qa/SKILL.md
Structured QA testing with evidence capture. Runs smoke checks, diff-aware validation, or targeted route testing against a live app. Produces QA_REPORT.md with screenshots, repro steps, severity, and recommended fixes, or replies inline with --inline. Uses browser MCP when available and falls back to code-only analysis otherwise. Not for logging multiple bugs from a manual pass (use kramme:qa:intake) or tracing one bug's root cause (use kramme:debug:investigate).
Run smoke checks, diff-aware validation, or targeted route testing against a live application. When browser MCP is available, capture screenshots, console output, and network activity; otherwise fall back to code-only analysis. Produce a QA report with findings, severity ratings, and recommended fixes.
Arguments: "$ARGUMENTS"
Extract from $ARGUMENTS:
- URL (required): a URL (e.g., http://localhost:3000, https://staging.example.com) or `auto` to discover a running local dev server
- Mode (optional; default: `quick`):
  - `quick`: landing page + 2-3 key routes
  - `diff-aware`: test routes affected by changed UI files
  - `targeted <route>`: test a specific route/page
- `--base <branch>`: explicit base branch for diff-aware mode
- `--regression`: compare results against previous QA baseline (see Step 8)
- `--inline`: reply with the QA report inline instead of writing QA_REPORT.md (still writes QA_BASELINE.json, which regression depends on; see Step 10)
- `--legacy-console`: relax the clean-console standard for legacy apps with known noisy consoles (see Step 4)

Store parsed values:

- `TARGET_URL`: the base URL to test
- `TEST_MODE`: `quick`, `diff-aware`, or `targeted`
- `TARGET_ROUTE`: specific route for targeted mode (e.g., /settings/profile)
- `BASE_OVERRIDE`: explicit base branch if provided
- `REGRESSION_MODE`: boolean (default: false)
- `INLINE_MODE`: boolean (default: false)
- `LEGACY_CONSOLE_MODE`: boolean (default: false)

URL is required. If not provided, stop with:
Error: URL is required.
Usage: /kramme:qa <url|auto> [quick|diff-aware|targeted <route>] [--base <branch>]
Examples:
/kramme:qa http://localhost:3000
/kramme:qa auto
/kramme:qa http://localhost:4200 diff-aware --base develop
/kramme:qa http://localhost:3000 targeted /settings/profile
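The parsing rules above could be sketched in shell roughly as follows. This is a minimal illustration, not the skill's actual parser: `parse_qa_args` is a hypothetical helper name, and treating the first unrecognized token as the URL is an assumption.

```shell
# Sketch only: parse /kramme:qa arguments into the documented variables.
# parse_qa_args is a hypothetical name, not part of the skill.
parse_qa_args() {
  TARGET_URL=""
  TEST_MODE="quick"          # documented default mode
  TARGET_ROUTE=""
  BASE_OVERRIDE=""
  REGRESSION_MODE=false
  INLINE_MODE=false
  LEGACY_CONSOLE_MODE=false
  while [ "$#" -gt 0 ]; do
    case "$1" in
      quick|diff-aware) TEST_MODE="$1" ;;
      targeted)
        TEST_MODE="targeted"
        TARGET_ROUTE="${2:-}"
        if [ "$#" -gt 1 ]; then shift; fi ;;
      --base)
        BASE_OVERRIDE="${2:-}"
        if [ "$#" -gt 1 ]; then shift; fi ;;
      --regression)     REGRESSION_MODE=true ;;
      --inline)         INLINE_MODE=true ;;
      --legacy-console) LEGACY_CONSOLE_MODE=true ;;
      *)
        # Assumption: the first unrecognized token is the URL (or "auto").
        if [ -z "$TARGET_URL" ]; then TARGET_URL="$1"; fi ;;
    esac
    shift
  done
  if [ -z "$TARGET_URL" ]; then
    echo "Error: URL is required." >&2
    return 1
  fi
}
```

Calling `parse_qa_args http://localhost:4200 diff-aware --base develop` would leave `TEST_MODE` set to `diff-aware` and `BASE_OVERRIDE` set to `develop`.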
If URL is auto: Resolve it with the shared dev-server detector:
${CLAUDE_PLUGIN_ROOT}/scripts/dev-server/detect-url.sh auto
- `http://...` or `https://...`: set TARGET_URL to that value and continue.
- `__MULTIPLE_URLS__`: list the candidate URLs and ask the user to pick one; if the runtime cannot ask, hard stop with the candidate list.
- `__NO_RUNNING_SERVER__`: hard stop with: Error: No running dev server detected. Start your dev server first, then re-run the command.

Validate explicit or resolved URL format. If TARGET_URL does not begin with http:// or https://, stop with: Error: TARGET_URL must be an http:// or https:// URL, or auto. Got: $TARGET_URL.
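One way to picture the detector-output handling and the format check is the sketch below. `resolve_target_url` is a hypothetical helper, and the distinct return codes are assumptions made for illustration; how the detector lists candidate URLs is not specified here, so that part is left to the caller.

```shell
# Sketch only: interpret the value produced by detect-url.sh (or an explicit URL).
# resolve_target_url is a hypothetical name; return codes are illustrative.
resolve_target_url() {
  case "$1" in
    http://*|https://*)
      printf '%s\n' "$1" ;;
    __MULTIPLE_URLS__*)
      echo "Multiple dev servers detected; ask the user to pick one." >&2
      return 2 ;;
    __NO_RUNNING_SERVER__*)
      echo "Error: No running dev server detected. Start your dev server first, then re-run the command." >&2
      return 3 ;;
    *)
      echo "Error: TARGET_URL must be an http:// or https:// URL, or auto. Got: $1" >&2
      return 1 ;;
  esac
}
```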
Verify app is reachable with a curl health check:
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" --max-time 5 "$TARGET_URL")
2xx or 3xx — proceed
Connection refused — stop with:
Error: Connection refused at $TARGET_URL. Is the server running?
Start your dev server first, then re-run the command.
Timeout — stop with:
Error: Request to $TARGET_URL timed out after 5 seconds. Is the server running?
5xx — stop with:
Error: Server error ($HTTP_STATUS) at $TARGET_URL. Fix the server error before QA testing.
4xx — warn but proceed (page may require interaction or authentication)
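The decision ladder above might be expressed as a small classifier. This is a sketch under stated assumptions: `classify_health` is a hypothetical helper, and it relies on curl exit code 7 meaning the connection failed and 28 meaning the operation timed out. The catch-all branch for other statuses is not covered by the rules above and is added only so the function always answers.

```shell
# Sketch only: map curl's exit code and HTTP status to the documented actions.
# $1 = curl exit code, $2 = value captured from -w "%{http_code}"
classify_health() {
  case "$1" in
    7)  echo "stop: connection refused"; return 0 ;;
    28) echo "stop: timeout"; return 0 ;;
  esac
  case "$2" in
    2??|3??) echo "proceed" ;;
    4??)     echo "warn" ;;               # page may need auth or interaction
    5??)     echo "stop: server error" ;;
    *)       echo "stop: unexpected status $2" ;;   # assumption, not in the rules above
  esac
}
```

In practice the two inputs would come from running the curl command shown earlier and reading `$?` immediately afterwards.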
quick mode:
Auto-detect routes from the project structure. Look for route definitions in:
- `pages/`, `app/` directories (Next.js, Nuxt, Remix file-based routing)
- `routes/`, `views/`, `screens/` directories
- Router configuration files (`router.ts`, `routes.ts`, `app-routing.module.ts`)
- `package.json` for framework hints (next, nuxt, remix, angular, vue-router, react-router)

Select the landing page (/) plus 2-3 key routes that represent core functionality. Prefer routes that are:
If route detection fails, fall back to testing only the landing page (/).
diff-aware mode:
Read references/diff-scope.md to resolve BASE_BRANCH and identify changed files, then continue with UI-relevant filtering below.
Filter for UI-relevant files:
- `*.tsx`, `*.jsx`, `*.vue`, `*.svelte`, `*.component.ts`, `*.component.html`
- `*.html`, `*.hbs`, `*.ejs`, `*.pug`
- `*.css`, `*.scss`, `*.sass`, `*.less`, `*.styled.ts`, `*.module.css`
- `pages/`, `views/`, `screens/`, `routes/`, `app/` directories

If no UI-relevant files found:
No UI-relevant changes detected in this PR or local working tree.
Changed files: {list file types}
No routes to test. Use `quick` mode to test the app without a diff scope.
Action: Stop.
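The UI-relevance filter for diff-aware mode could look something like this in shell. `is_ui_relevant` and `filter_ui_files` are hypothetical names, and matching the listed directories at any depth in the path is an assumption.

```shell
# Sketch only: decide whether a changed file counts as UI-relevant.
is_ui_relevant() {
  case "$1" in
    *.tsx|*.jsx|*.vue|*.svelte|*.component.ts|*.component.html) return 0 ;;
    *.html|*.hbs|*.ejs|*.pug) return 0 ;;
    *.css|*.scss|*.sass|*.less|*.styled.ts|*.module.css) return 0 ;;
    pages/*|views/*|screens/*|routes/*|app/*) return 0 ;;
    # Assumption: the same directories also count when nested deeper.
    */pages/*|*/views/*|*/screens/*|*/routes/*|*/app/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read changed file paths on stdin, print only the UI-relevant ones.
filter_ui_files() {
  while IFS= read -r changed_file; do
    if is_ui_relevant "$changed_file"; then
      printf '%s\n' "$changed_file"
    fi
  done
}
```

A list of changed paths, one per line, can be piped into `filter_ui_files`; empty output corresponds to the "No UI-relevant changes detected" stop condition.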
Map changed UI files to routes/pages:
- File-based routing: derive the route from the file path (e.g., pages/settings/profile.tsx maps to /settings/profile)

Create a branch-diff-to-journey matrix before building the test checklist. Read references/diff-aware-journey-matrix.md and populate one row per route/screen and meaningful user journey using the columns defined there.
Mark speculative route mappings as UNVERIFIED rather than silently treating them as known routes. If a journey would mutate shared data, send external notifications, change billing, delete records, or otherwise be destructive, ask the user before executing it; if the runtime cannot ask, mark the row blocked.
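For file-based routing, the file-to-route mapping might be sketched as below. `file_to_route` is a hypothetical helper that covers only a `pages/` directory; collapsing `index` files to their parent route is an assumption borrowed from common file-based routers, and anything outside `pages/` is reported as UNVERIFIED rather than guessed.

```shell
# Sketch only: map a changed file under pages/ to a route.
file_to_route() {
  route="$1"
  case "$route" in
    pages/*)   route="${route#pages/}" ;;
    */pages/*) route="${route##*/pages/}" ;;
    *)         echo "UNVERIFIED"; return 0 ;;   # no known mapping; do not guess
  esac
  route="${route%.*}"                  # strip the file extension
  case "$route" in
    index)   route="" ;;               # assumption: index file maps to the directory route
    */index) route="${route%/index}" ;;
  esac
  echo "/$route"
}
```

Dynamic segments and route groups are left untouched by this sketch, so those rows would still need manual confirmation in the journey matrix.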
targeted mode:
Use the user-specified route directly. The test scope is TARGET_URL + TARGET_ROUTE.
First use the shared project-type detector:
DETECTED_PROJECT_TYPE=$(${CLAUDE_PLUGIN_ROOT}/scripts/dev-server/detect-project-type.sh 2> /dev/null)
If the output is a single type (for example next, vite, or rails) or a single monorepo hit (next@apps/web), store the type as DETECTED_FRAMEWORK.
If the shared detector returns unknown or multiple, check package.json (if it exists) for framework dependencies to load framework-specific QA hints:
grep -oE '"(next|nuxt|@angular/core|react|vue|svelte|@sveltejs/kit|rails)"' package.json 2> /dev/null | head -1
Also check project structure:
- `next.config.*` → Next.js
- `angular.json` → Angular
- `nuxt.config.*` → Nuxt
- `svelte.config.*` → SvelteKit
- `config/routes.rb` or `Gemfile` with rails → Rails
- `wp-config.php` or `wp-content/` → WordPress

If a framework is detected, store as DETECTED_FRAMEWORK and read references/framework-hints.md for framework-specific checks to add to the test plan.
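The project-structure checks could be sketched as a single function. `detect_framework_from_structure` is a hypothetical name, and the precedence (first match wins, in the order listed) is an assumption.

```shell
# Sketch only: infer the framework from marker files in a project directory.
# $1 = project root (defaults to the current directory)
detect_framework_from_structure() {
  dir="${1:-.}"
  if ls "$dir"/next.config.* >/dev/null 2>&1; then
    echo "Next.js"
  elif [ -f "$dir/angular.json" ]; then
    echo "Angular"
  elif ls "$dir"/nuxt.config.* >/dev/null 2>&1; then
    echo "Nuxt"
  elif ls "$dir"/svelte.config.* >/dev/null 2>&1; then
    echo "SvelteKit"
  elif [ -f "$dir/config/routes.rb" ] || grep -qs rails "$dir/Gemfile"; then
    echo "Rails"
  elif [ -f "$dir/wp-config.php" ] || [ -d "$dir/wp-content" ]; then
    echo "WordPress"
  else
    echo "not detected"
  fi
}
```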
For each identified page/route, read the QA rubric from references/qa-rubric.md.
For diff-aware mode, build the checklist from the journey matrix rows created in Step 3. Prioritize matrix rows whose changed files are closest to user-facing behavior, then rows covering edge states. Keep uncertain rows in the plan with UNVERIFIED assumptions so the final report shows what was and was not proven.
Create a test checklist for each route:
Clean-console standard:
- `LEGACY_CONSOLE_MODE` (true): zero console errors is still required; warnings demote to Info-level findings rather than Minor/Major.

Accessibility ladder (run for every tested route):

- h1; heading levels do not skip.

Each failed a11y check becomes a finding in the Accessibility category (see references/health-score-rubric.md).
Prioritize test items by severity impact. Blockers first, then major, then minor.
If DETECTED_FRAMEWORK is set: Add framework-specific checks from references/framework-hints.md to the test plan. For example, if Next.js is detected, add hydration error checks and _next/data monitoring.
Before performing interaction checks in any mode, identify actions that could mutate shared data, submit forms, send external notifications, change billing, delete records, or otherwise be destructive/non-idempotent. Ask the user before executing those actions; if the runtime cannot ask, mark the interaction blocked and continue with read-only evidence.
For each route in the test plan, invoke /kramme:browse via the Skill tool:
skill: "kramme:browse", args: "<TARGET_URL><route> --screenshot --console --network"
This captures:
After navigation, perform basic interaction checks:
If browse fails (no browser MCP available):
Degrade to code-only analysis. First select which source files to read, by mode:
- targeted: map TARGET_ROUTE back to its source file(s) by reversing the Step 3 route-detection logic (file-based routing: route → file path; config-based routing: search the router config for the route, then read the component it references).

Then analyze the selected files for potential issues:
Report all findings as "code-only mode" with a clear warning:
Warning: No browser MCP detected. Running in code-only mode.
Findings are based on static code analysis only — no live testing performed.
For full QA with screenshots and live testing, install a browser MCP:
- Claude in Chrome extension (recommended)
- Chrome DevTools MCP
- Playwright MCP
For each tested page/route, collect:
Network triage ladder (apply to every failed or anomalous request):
| Signal | Interpretation | Action |
| --- | --- | --- |
| 4xx | Client sent wrong data (shape, auth, validation) | Capture the request payload + route; Major unless expected (e.g. 401 on a logged-out probe) |
| 5xx | Server error | Capture the response body after redacting tokens; Blocker |
| CORS failure | Origin or headers mismatch | Capture origin + Access-Control-* response headers; Major |
| Timeout | Response exceeded the time budget (> 3s default) | Capture URL + elapsed; Major unless route is known-slow |
| Missing | A request that was expected never fired | Capture route context; Major — this is often a regression signal |
Store evidence per route for inclusion in the QA report.
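The triage ladder can be read as a lookup from signal to default severity. The sketch below assumes a hypothetical `triage_request` helper that receives either an HTTP status or one of the labels `CORS` / `MISSING`, plus the elapsed time in milliseconds. It does not model the "unless expected" or "known-slow" exceptions, which still need human judgment.

```shell
# Sketch only: default severity for a request, per the triage table.
# $1 = HTTP status, or the label CORS or MISSING (labels are assumptions)
# $2 = elapsed time in milliseconds (optional, defaults to 0)
triage_request() {
  status="$1"
  elapsed_ms="${2:-0}"
  case "$status" in
    5??)     echo "Blocker (5xx)" ;;
    4??)     echo "Major (4xx)" ;;
    CORS)    echo "Major (CORS failure)" ;;
    MISSING) echo "Major (missing request)" ;;
    *)
      # 3000 ms mirrors the "> 3s default" time budget in the table.
      if [ "$elapsed_ms" -gt 3000 ]; then
        echo "Major (timeout)"
      else
        echo "OK"
      fi ;;
  esac
}
```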
Rate each issue found using severity levels:
When assessing, consider:
Read references/health-score-rubric.md and compute a weighted health score (0-100).
Clean-console rule: if any console error is present and LEGACY_CONSOLE_MODE is false, the Console category receives an automatic Blocker deduction in addition to per-finding deductions. Warnings follow the rule set in Step 4 (Minor/Major by default, Info under --legacy-console).
Store as HEALTH_SCORE and HEALTH_LABEL (Excellent/Good/Fair/Poor/Critical).
Skip if REGRESSION_MODE is false.
This runs before the report is written (Step 9) so the comparison can be included in it.
Check if QA_BASELINE.json exists from a previous run:
- If no baseline exists, report "No previous baseline found. Skipping regression comparison. This run's results will be saved as the new baseline." and continue to Step 9.

Comparison logic:

- Score delta: `current_score - baseline_score` (positive = improvement, negative = regression)

Prepare a `## Regression` section for the QA report (rendered in Step 9):
## Regression (vs. baseline from {baseline_date})
**Score delta:** {current_score} vs. {baseline_score} ({+N / -N})
### Fixed ({N})
- QA-{NNN}: {title} (was {severity})
### New ({N})
- QA-{NNN}: {title} ({severity})
### Persistent ({N})
- QA-{NNN}: {title} ({severity})
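A rough sketch of how the Fixed / New / Persistent counts and the score delta might be computed follows. It assumes each run's findings have been exported to a plain text file with one stable key per line (for example the finding title); that file format and the helper names `compare_findings` and `score_delta` are hypothetical, since the real baseline shape is defined in assets/qa-baseline-schema.md.

```shell
# Sketch only: compare two finding lists (one key per line, hypothetical format).
# $1 = baseline findings file, $2 = current findings file
compare_findings() {
  base_sorted=$(mktemp)
  cur_sorted=$(mktemp)
  sort -u "$1" > "$base_sorted"
  sort -u "$2" > "$cur_sorted"
  # comm -23: only in baseline (fixed); -13: only in current (new); -12: in both.
  echo "Fixed: $(comm -23 "$base_sorted" "$cur_sorted" | wc -l | tr -d ' ')"
  echo "New: $(comm -13 "$base_sorted" "$cur_sorted" | wc -l | tr -d ' ')"
  echo "Persistent: $(comm -12 "$base_sorted" "$cur_sorted" | wc -l | tr -d ' ')"
  rm -f "$base_sorted" "$cur_sorted"
}

# $1 = current score, $2 = baseline score; positive means improvement.
score_delta() {
  delta=$(( $1 - $2 ))
  if [ "$delta" -gt 0 ]; then echo "+$delta"; else echo "$delta"; fi
}
```

Matching findings by title is fragile if titles are reworded between runs, so a real implementation would probably want a more stable identifier.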
Use the template from assets/qa-report-template.md.
Populate all sections:
- Journey matrix for diff-aware runs, with route/screen, journey, state, expected behavior, evidence, result, and follow-up
- The `## Regression` section prepared in Step 8 when a regression comparison ran

Numbering convention: Findings are numbered QA-001, QA-002, etc.
QA does not auto-fix or auto-commit findings in the default flow. Record recommended fixes and follow-up issue references, then leave implementation to the user or a separate fix workflow.
If INLINE_MODE=true:
- Reply with the report inline; do not write QA_REPORT.md

Otherwise:

- Write QA_REPORT.md at the project root
- The report is a working artifact; it will be cleaned up by /kramme:workflow-artifacts:cleanup

After regression comparison (or if skipped), save a machine-readable baseline for future runs. This runs regardless of INLINE_MODE: --inline suppresses only QA_REPORT.md, not the baseline, so regression comparisons keep working across runs.
Write QA_BASELINE.json at the project root using the shape in assets/qa-baseline-schema.md.
This file is a working artifact. It will be cleaned up by /kramme:workflow-artifacts:cleanup.
After writing the report, display an inline summary:
## QA Summary: $TARGET_URL
**Mode:** {quick | diff-aware | targeted}
**Routes Tested:** {N}
**Journey Matrix Rows:** {N, if diff-aware}
**Browser:** {claude-in-chrome | chrome-devtools | playwright | code-only}
**Framework:** {DETECTED_FRAMEWORK or "not detected"}
**Health Score:** {HEALTH_SCORE}/100 ({HEALTH_LABEL})
### Verdict: {READY | NOT READY | READY WITH CAVEATS}
{If NOT READY: list blockers with brief description}
{If READY WITH CAVEATS: list major issues with brief description}
{If READY: confirm no blockers or major issues found}
- Blockers: {N}
- Major: {N}
- Minor: {N}
- Info: {N}
{If REGRESSION_MODE and baseline found:}
### Regression vs. {baseline_date}
Score: {baseline_score} -> {current_score} ({+N / -N})
Fixed: {N} | New: {N} | Persistent: {N}
Report output: {inline reply | QA_REPORT.md}
{If blockers found: "Fix blockers and re-run: /kramme:qa <url>"}
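The verdict line in the summary appears to follow from the blocker and major counts. A minimal sketch, assuming that reading of the template is right (`qa_verdict` is a hypothetical helper name):

```shell
# Sketch only: derive the verdict from finding counts.
# $1 = number of blockers, $2 = number of major findings
qa_verdict() {
  if [ "$1" -gt 0 ]; then
    echo "NOT READY"
  elif [ "$2" -gt 0 ]; then
    echo "READY WITH CAVEATS"
  else
    echo "READY"
  fi
}
```

Minor and Info findings do not change the verdict in this sketch; they only appear in the counts.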
Before producing the QA report, read references/addy-conventions.md and apply:
- The markers (STACK DETECTED, UNVERIFIED, NOTICED BUT NOT TOUCHING, CHANGES MADE / THINGS I DIDN'T TOUCH / POTENTIAL CONCERNS, CONFUSION, MISSING REQUIREMENT, PLAN) to section headers, summary callouts, and inline flags.
- Common Rationalizations / Red Flags — STOP / Verification epilogue as a pre-handoff checklist against this run.

| Error | Behavior |
| --- | --- |
| No URL provided | Hard stop with usage instructions |
| auto finds no running server | Hard stop with instructions to start app |
| URL unreachable (connection refused) | Hard stop with diagnostic |
| URL unreachable (timeout) | Hard stop with diagnostic |
| URL returns 5xx | Hard stop with server error diagnostic |
| URL returns 4xx | Warn and proceed |
| No browser MCP | Degrade to code-only analysis |
| Browse fails on a route | Log error, continue with remaining routes |
| No UI changes (diff-aware) | Report and stop |
| Base branch not found | Hard stop, suggest --base flag |
| Route detection fails (quick) | Fall back to landing page only |
/kramme:qa http://localhost:3000 # quick smoke test (default)
/kramme:qa auto # auto-detect a running local dev server
/kramme:qa http://localhost:4200 diff-aware --base develop # test routes affected by changes
/kramme:qa http://localhost:3000 targeted /settings/profile # one specific route
/kramme:qa https://staging.myapp.com # staging URL
/kramme:qa http://localhost:3000 --regression # compare against previous baseline
/kramme:qa http://localhost:3000 --inline # reply inline, no QA_REPORT.md