.claude/skills/pair-capability-execute-manual-tests/SKILL.md
Executes a project's manual test suite against released artifacts, producing a structured report. Resolves test variables, iterates through critical paths, records PASS/FAIL per test case, and generates the report from the manual-test-report template. Invocable independently or composed by /pair-process-review (post-merge validation).
Execute a manual test suite against released or deployed artifacts (website, CLI packages, registries). Produces a structured report following the manual-test-report-template.
Reads test case files from the project's manual test suite directory. Each test case follows the format defined in manual-test-case-template.
For the organizational context (who, when, which areas), see manual-verification.md. For test case design principles, see manual-testing.md.
| Argument | Required | Description |
| --- | --- | --- |
| $suite | No | Path to the test suite directory. Default: auto-detect qa/ at project root. |
| $version | No | Version under test. If omitted, derived from the artifact (e.g. pair-cli --version or release tag). |
| $base-url | No | Production website URL. If omitted, derived from deployment config or adoption files. |
| $scope | No | Limit execution to specific critical paths: CP1, CP2, ..., all (default: all). Comma-separated for multiple. |
| $priority | No | Minimum priority to execute: P0, P1, P2 (default: P2 — run all). Set P0 to run only blockers. |
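The way `$scope` and `$priority` narrow a run can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of the skill; the only behavior taken from the table is that `all` selects every critical path and that a minimum of `P0` runs blockers only while `P2` runs everything.

```python
from typing import Optional, Set

# Ordered from most to least severe; P0 cases are blockers.
PRIORITY_ORDER = ["P0", "P1", "P2"]


def parse_scope(scope: str = "all") -> Optional[Set[str]]:
    """Return the selected critical paths, or None when every path is in scope."""
    items = {part.strip().upper() for part in scope.split(",") if part.strip()}
    if not items or "ALL" in items:
        return None
    return items


def priority_included(case_priority: str, minimum: str = "P2") -> bool:
    """True when a case at case_priority runs under the given minimum priority."""
    return PRIORITY_ORDER.index(case_priority) <= PRIORITY_ORDER.index(minimum)


def is_selected(cp: str, case_priority: str,
                scope: str = "all", minimum: str = "P2") -> bool:
    """Combine both filters for one test case in critical path cp."""
    selected = parse_scope(scope)
    in_scope = selected is None or cp.upper() in selected
    return in_scope and priority_included(case_priority, minimum)
```

For example, `is_selected("CP2", "P1", scope="CP1,CP2", minimum="P0")` is false, because a `P1` case is below the `P0` threshold even though `CP2` is in scope.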
Execute in sequence. For every step, follow the check → skip → act → verify pattern.
- Check: does `$suite` point to a directory containing test case files (`CP*.md`)?
- No `$suite` argument and no suite auto-detected → HALT: "No manual test suite found. Run `/pair-capability-design-manual-tests` to generate one from your project's artifacts, then re-invoke `/pair-capability-execute-manual-tests`."
- `$suite` directory exists but contains zero `CP*.md` files → HALT: "Suite directory exists but contains no critical path files. Run `/pair-capability-design-manual-tests --output {$suite}` to populate it."
- Read the suite `README.md` for variable definitions and execution order. List all `CP*.md` files.
- Check: are all required variables resolved (`$VERSION`, `$BASE_URL`, `$WORKDIR`, `$RELEASE_URL`, `$REGISTRY`)?
  - `$VERSION`: extract from artifact (`--version` flag) or release tag.
  - `$BASE_URL`: read from deployment config, adoption files, or ask the user.
  - `$WORKDIR`: create isolated temp directory: `mktemp -d /tmp/manual-test.XXXXX`.
  - `$RELEASE_URL`: derive from `$VERSION` and repo URL.
  - `$REGISTRY`: read from adoption files or default.
- Resolve any additional variables listed in the `README.md` Variables table (e.g. auth tokens, config files). Follow the "How to resolve" column for each.

VARIABLES RESOLVED:
├── VERSION: [value]
├── BASE_URL: [value]
├── WORKDIR: [value]
├── RELEASE_URL: [value]
├── REGISTRY: [value]
└── [additional]: [per suite README]
Ask: "Proceed with these values?"
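The suite-location checks can be sketched in Python as below. This is an illustration under assumptions, not the skill's implementation: `SuiteNotFound`, `cp_number`, and `locate_suite` are hypothetical names, the exception stands in for a HALT, a missing `$suite` directory is treated the same as no suite at all, and files are ordered by the number after `CP` so that `CP10` follows `CP2`.

```python
import re
from pathlib import Path
from typing import List, Optional


class SuiteNotFound(Exception):
    """Stands in for a HALT in this sketch."""


def cp_number(path: Path) -> int:
    """Numeric part of a CP file name, e.g. 10 for 'CP10-search.md'."""
    match = re.match(r"CP(\d+)", path.name)
    return int(match.group(1)) if match else 10**9


def locate_suite(project_root: Path, suite: Optional[str] = None) -> List[Path]:
    """Return critical path files in execution order, or raise SuiteNotFound."""
    # Default mirrors the documented auto-detection of qa/ at the project root.
    suite_dir = project_root / (suite or "qa")
    if not suite_dir.is_dir():
        raise SuiteNotFound(
            "No manual test suite found. "
            "Run /pair-capability-design-manual-tests to generate one."
        )
    cp_files = sorted(suite_dir.glob("CP*.md"), key=cp_number)
    if not cp_files:
        raise SuiteNotFound(
            "Suite directory exists but contains no critical path files."
        )
    return cp_files
```

Sorting numerically rather than lexically is a design choice of the sketch; a suite `README.md` that defines its own execution order would take precedence.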
| Action | Preferred Tool | Fallback |
| --- | --- | --- |
| Website page load + interaction | agent-browser skill | Playwright MCP (browser_navigate, browser_snapshot) |
| Bulk HTTP status checks | WebFetch or curl -sI via Bash | agent-browser (slower) |
| CLI command execution | Bash | — |
| File existence / content | Read tool or Bash test -f | — |
| Checksum verification | Bash sha256sum / shasum -a 256 | — |
| Search UI interaction | agent-browser (fill form, click, screenshot) | Playwright MCP (browser_press_key, browser_fill_form) |
| Responsive viewport | agent-browser (resize, screenshot) | Playwright MCP (browser_resize, browser_take_screenshot) |
| Report generation | Write tool | — |
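The table lists `sha256sum` / `shasum -a 256` for checksum verification. Where a scripted comparison is more convenient, an equivalent check might look like the following Python sketch; `sha256_of` and `verify_checksum` are hypothetical helper names, and the expected digest is assumed to be a hex string taken from the release's published checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Compare the file digest with the expected value, ignoring case and padding."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A test case would record PASS when `verify_checksum` returns true and keep both digests as evidence otherwise.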
For each critical path file (in order: CP1, CP2, ..., CPN):
- Check: is the critical path within `$scope`? Are there test cases at or above `$priority`?
- For each test case `MT-{CPNN}` in the file:
a. Check preconditions: if a required precondition test failed → mark BLOCKED.
b. Execute steps: run each step using the selected tool. Capture output/evidence. Follow any setup instructions in the test case (e.g. config files to copy, auth to configure).
c. Evaluate expected result: compare actual vs expected. Determine PASS or FAIL.
d. Record result: store test ID, status, evidence (command output, HTTP status, screenshot path).
- After each critical path, output: CP{N} COMPLETE: [X pass | Y fail | Z skip | W blocked] of N total
- Write the report to `.tmp/manual-test-reports/{suite-name}-{VERSION}-{YYYY-MM-DD}.md`.
- Clean up the `$WORKDIR` temp directory.

MANUAL TEST EXECUTION:
├── Suite: [{suite path}]
├── Version: [{VERSION}]
├── Scope: [{scope}]
├── Priority: [≥{priority}]
├── Results:
│ ├── CP1: [X/Y pass]
│ ├── CP2: [X/Y pass]
│ └── ...
├── Total: [N pass | N fail | N skip | N blocked] of N
├── Result: [PASS | FAIL]
└── Report: [{report path}]
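The bookkeeping behind these summaries (BLOCKED on a failed precondition, the per-path completion line, and the report file name) can be sketched in Python. All names here are hypothetical, the test IDs in the usage note are invented for illustration, and treating a BLOCKED precondition as blocking its dependents is an assumption; the source only states the rule for failed preconditions.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, Dict, List


@dataclass
class ManualCase:
    test_id: str
    # IDs of tests that must have passed before this one can run.
    preconditions: List[str] = field(default_factory=list)


def run_cases(cases: List[ManualCase],
              execute: Callable[[ManualCase], bool]) -> Dict[str, str]:
    """Run cases in order; execute returns True when actual matches expected."""
    results: Dict[str, str] = {}
    for case in cases:
        # Assumption: a BLOCKED precondition blocks dependents just like a FAIL.
        if any(results.get(dep) in ("FAIL", "BLOCKED") for dep in case.preconditions):
            results[case.test_id] = "BLOCKED"
            continue
        results[case.test_id] = "PASS" if execute(case) else "FAIL"
    return results


def summary_line(cp: str, results: Dict[str, str]) -> str:
    """Format the per-critical-path completion line."""
    counts = Counter(results.values())
    return (
        f"{cp} COMPLETE: [{counts['PASS']} pass | {counts['FAIL']} fail | "
        f"{counts['SKIP']} skip | {counts['BLOCKED']} blocked] of {len(results)} total"
    )


def report_path(suite_name: str, version: str, day: date) -> str:
    """Build the report location from suite name, version, and ISO date."""
    return f".tmp/manual-test-reports/{suite_name}-{version}-{day.isoformat()}.md"
```

With three hypothetical cases where `MT-CP102` depends on a failing `MT-CP101`, `run_cases` marks `MT-CP102` as BLOCKED without executing it, so one failure does not produce misleading secondary failures.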
When composed by /pair-process-review (Phase 6, post-merge):
- Runs `/pair-capability-execute-manual-tests` after merge as optional post-release validation.
- Uses `$priority = P0` (blockers only) for fast post-merge validation. Full suite runs standalone.

When composed by /pair-capability-verify-done (Step 5.5, optional):
When invoked independently:
To maximize reliability when executed by AI agents:
- Wait on explicit conditions (`waitUntil: networkidle`), not `sleep`.
- Select elements by `data-testid` or semantic HTML, not CSS classes.
- `$WORKDIR` must be outside the repo to avoid workspace interference.
- Pass `--no-workspaces`, and ensure no `.npmrc` inheritance from parent dirs.
- No test suite available: run `/pair-capability-design-manual-tests` first.
- agent-browser not available: fall back to Playwright MCP. If Playwright MCP is also unavailable, fall back to WebFetch/curl for HTTP checks. Mark interactive tests (search, responsive) as BLOCKED.
- `$WORKDIR` is created once and shared across all CPs, then cleaned up at the end.
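The degradation rule for browser tooling can be expressed as an ordered fallback chain. The Python sketch below is illustrative only: the tool identifiers are plain labels chosen for the example, and the two test kinds (`page` for page loads that can degrade to an HTTP check, `interactive` for search and responsive tests) are an assumed simplification.

```python
from typing import Dict, List, Optional, Set

# Ordered preference per test kind; labels are hypothetical identifiers.
FALLBACK_CHAINS: Dict[str, List[str]] = {
    # Page loads can degrade to a plain HTTP check.
    "page": ["agent-browser", "playwright-mcp", "webfetch", "curl"],
    # Interactive tests need a real browser; no HTTP-only fallback exists.
    "interactive": ["agent-browser", "playwright-mcp"],
}


def choose_tool(kind: str, available: Set[str]) -> Optional[str]:
    """Return the first available tool for this kind, or None to mark BLOCKED."""
    for tool in FALLBACK_CHAINS[kind]:
        if tool in available:
            return tool
    return None
```

A `None` result corresponds to recording the test as BLOCKED rather than FAIL, which keeps missing tooling distinguishable from a defect in the artifact under test.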