skills/manual-testing/SKILL.md
Guide users through targeted manual verification after code changes. Use when asked to "test this", "verify it works", "QA this", "walk me through testing", "smoke test", "sanity check", "regression test", "acceptance test", or after implementing a feature or bug fix that still needs human validation. Favor this skill for focused verification of the current change; use a broader exploratory-testing skill for open-ended bug hunting across an entire app.
npx skillsauth add petekp/claude-skills manual-testingInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Finish work with a tight verification loop: prove everything possible with tools first, then ask the user to verify only what requires human eyes, hands, devices, or judgment.
Do not make the user invent the test plan. Lead them through it.
Before asking the user to do anything, know:
Never ask the user to "poke around" or "let me know if it works." Give concrete actions, a specific screen or command, the expected result, and a short set of likely outcomes to reply with.
Translate the change into a short verification plan before running anything.
For each changed behavior, capture:
Use a simple internal checklist like:
| Behavior | Happy path | Edge/regression | Verified by | |----------|------------|-----------------|-------------| | Save settings | Form saves | Validation error still works | Tool + user |
Keep the matrix small and focused on the current change.
Exhaust automated verification before involving the user.
Prefer to verify these yourself:
Only hand work to the user when the result depends on:
If an automated check fails, stop and address it before asking for manual verification.
Set the user up so they can perform the check with minimal effort.
Provide:
If the user needs an already-running app, point them to the exact place to open. If you can safely prepare state, data, or fixtures first, do that yourself.
Run manual verification as a guided sequence, not a dump of vague instructions.
Prefer one atomic step at a time. For a tiny smoke test, bundle at most 2-3 closely related checks.
Use this structure:
Testing: [feature or fix]
Progress: Step N of M
Action: [exact thing to click, type, or inspect]
Expected: [what should happen]
Reply with one:
1. [expected outcome]
2. [common failure mode]
3. [second common failure mode]
4. Other
If structured question tools are available, convert those reply options into a structured prompt. Otherwise ask the question in plain text with the options inline.
Example:
Testing: profile photo upload
Progress: Step 2 of 3
Action: Open `/settings/profile`, upload a PNG under 2 MB, and wait for the save state to finish.
Expected: The new avatar appears in the header and no error message is shown.
Reply with one:
1. Upload worked and the new avatar is visible
2. Upload finished but the avatar did not update
3. I saw an error message or spinner got stuck
4. Other
Always test the changed path first, then cover the most likely place it could fail.
Use these prompts as a calibration checklist.
For UI changes:
For bug fixes:
For API or backend changes:
For CLI or local-tool changes:
When the user reports a problem:
Before changing code, generate 2-3 plausible hypotheses for the failure so the next debugging step is deliberate instead of guess-driven.
If the failure blocks confidence in the change, stop the manual test and switch into diagnosis.
Close with a short verification summary that separates what is proven from what is still assumed.
Include:
tools
Comprehensively manually test the Circuit plugin's user-facing surface in either Claude Code or Codex. Use this skill whenever the user asks to "manually test Circuit", "QA the Circuit plugin", "exercise the Circuit surface", "run the Circuit checklist", "smoke test Circuit", "find regressions in Circuit", "test the Claude Circuit plugin", "test the Codex Circuit plugin", or when preparing a Circuit release for marketplace publication. Argument is the host package to test — `claude` or `codex`. Produces a Markdown report with per-command pass/fail, exploratory findings ranked by severity, run-folder evidence links, and a concise terminal summary. Use even if the user does not say the word "test" — phrases like "go through every Circuit command" or "make sure Circuit still works end-to-end" should also trigger.
development
Turn the prompt supplied with this skill into a concise, auditable Codex Goal or explain why a Goal is not the right fit. Use when the user asks to draft, formulate, rewrite, tighten, or create a `/goal` from a plain-language task, especially for multi-step work that needs a durable objective, evidence-based completion, constraints, iteration policy, and a default adversarial review loop.
development
Give the human a fast, plain-English catch-up on what changed in the project: what the agents did, why, and what decisions need their input. Use this whenever the user asks to "catch me up", "what changed", "where are we", "recap", "brief me", "give me the rundown", "what did you do", "summarize the session", "fill me in", or otherwise signals they have been away and want to get back up to speed quickly. Built for someone steering several agent-driven projects at once who does not read the code closely but needs to grasp the core ideas, the choices made, and the open decisions well enough to steer. Trigger even if they do not use these exact words: any request to get oriented on recent progress should use this skill.
tools
Expert Unix and macOS systems engineer for shell scripting, system administration, command-line tools, launchd, Homebrew, networking, and low-level system tasks. Use when the user asks about Unix commands, shell scripts, macOS system configuration, process management, or troubleshooting system issues.