skills/acceptance-testing/SKILL.md
Evidence-gated acceptance testing with three-agent separation of concerns. Writer designs test plans, Executor collects artifacts, Reviewer evaluates independently. Eliminates false positives from self-grading. Use when: "run acceptance tests", "verify it works", "did it pass", "test this scenario", "acceptance criteria", "validate the feature"
Install this skill globally with one command: `npx skillsauth add mikeparcewski/wicked-garden acceptance-testing`. Works with Claude Code, Cursor, and Windsurf.
Deprecated: This skill was replaced in v6 by `/wicked-garden:qe:acceptance` (three-agent pipeline) and the `test-designer` agent. Use those instead.
Three-agent pipeline that separates test writing, execution, and review for higher-fidelity acceptance testing.
When the same agent executes and grades tests, it pattern-matches "something happened" as success:
Result: 80%+ false positive rate on qualitative criteria.
Writer ──→ Test Plan ──→ Executor ──→ Evidence ──→ Reviewer ──→ Verdict
| Agent | Role | What it catches |
|-------|------|-----------------|
| Writer | Reads scenario + implementation code → structured test plan with evidence gates | Specification bugs: scenario expects X, code does Y |
| Executor | Follows plan step-by-step → collects artifacts, no judgment | Runtime bugs: crashes, missing files, timeouts |
| Reviewer | Evaluates cold evidence against assertions | Semantic bugs: everything ran but output is wrong |
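The separation of roles in the table can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's implementation: the `Step` and `Evidence` types, the function names, the scenario dictionary shape, and the `FAIL` label are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    """One test-plan step: a command plus the text its output must contain."""
    command: str
    expected_substring: str


@dataclass
class Evidence:
    """Raw artifact captured by the executor; carries no pass/fail judgment."""
    step: Step
    stdout: Optional[str]  # None means the artifact was never produced


def write_plan(scenario: dict) -> list:
    """Writer: turn a scenario into steps with explicit evidence gates."""
    return [Step(s["run"], s["expect"]) for s in scenario["steps"]]


def execute_plan(plan: list, runner: Callable[[str], Optional[str]]) -> list:
    """Executor: run each step and record the output without grading it."""
    return [Evidence(step, runner(step.command)) for step in plan]


def review(evidence: list) -> list:
    """Reviewer: grade the collected evidence only, never the execution."""
    verdicts = []
    for item in evidence:
        if item.stdout is None:
            verdicts.append("INCONCLUSIVE")  # no evidence, no verdict
        elif item.step.expected_substring in item.stdout:
            verdicts.append("PASS")
        else:
            verdicts.append("FAIL")
    return verdicts


# Hypothetical scenario and canned outputs standing in for real execution.
scenario = {
    "steps": [
        {"run": "store", "expect": "success"},
        {"run": "recall", "expect": "score"},
        {"run": "export", "expect": "ok"},
    ]
}
fake_outputs = {"store": "stored: success", "recall": "no results"}

verdicts = review(execute_plan(write_plan(scenario), fake_outputs.get))
print(verdicts)
```

Because `review` sees only `Evidence` objects, a step that ran but produced the wrong output cannot be graded as a success merely because "something happened".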
# Full pipeline on a scenario
/wicked-garden:qe:acceptance path/to/scenario.md
# Generate test plan only (inspect before running)
/wicked-garden:qe:acceptance scenario.md --phase write
# Run all scenarios for a plugin
/wicked-garden:qe:acceptance wicked-garden:mem --all
scenarios/*.md format

Every step in the test plan requires the executor to produce specific artifacts. No evidence = no verdict (INCONCLUSIVE, not auto-PASS).
| Type | Example | Auto-evaluable |
|------|---------|----------------|
| CONTAINS | stdout contains "success" | Yes |
| MATCHES | output matches score: \d+ | Yes |
| EXISTS | file at path exists | Yes |
| JSON_PATH | $.status equals "ok" | Yes |
| HUMAN_REVIEW | "Is output actionable?" | No — flagged for human |
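As a rough sketch of how the auto-evaluable assertion types above might be checked, the function below handles each row of the table. It is an assumption-laden illustration: the `evaluate` and `lookup` names, the `NEEDS_HUMAN` and `FAIL` labels, and the restriction of `JSON_PATH` to simple dotted paths such as `$.status` are not taken from the plugin.

```python
import json
import os
import re

_MISSING = object()  # sentinel for a path that does not resolve


def lookup(document, path):
    """Resolve a simple dotted path like "$.status" (no wildcards or filters)."""
    node = document
    for key in path.removeprefix("$.").split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def evaluate(kind, expected, evidence=None):
    """Return a verdict string for one assertion against collected evidence."""
    if kind == "HUMAN_REVIEW":
        return "NEEDS_HUMAN"  # hypothetical label: never auto-graded
    if kind == "EXISTS":
        # Here `expected` is the file path the executor was asked to produce.
        return "PASS" if os.path.exists(expected) else "FAIL"
    if evidence is None:
        return "INCONCLUSIVE"  # no evidence, no verdict
    if kind == "CONTAINS":
        ok = expected in evidence
    elif kind == "MATCHES":
        ok = re.search(expected, evidence) is not None
    elif kind == "JSON_PATH":
        path, wanted = expected
        ok = lookup(json.loads(evidence), path) == wanted
    else:
        raise ValueError(f"unknown assertion type: {kind}")
    return "PASS" if ok else "FAIL"


print(evaluate("CONTAINS", "success", "operation success"))
print(evaluate("MATCHES", r"score: \d+", "score: n/a"))
print(evaluate("JSON_PATH", ("$.status", "ok"), '{"status": "ok"}'))
print(evaluate("CONTAINS", "success", None))
```

Checking for missing evidence before any comparison is what keeps an absent artifact from being graded as a pass.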
| Cause | Who fixes |
|-------|-----------|
| IMPLEMENTATION_BUG | Developer |
| SPECIFICATION_BUG | Scenario author |
| ENVIRONMENT_ISSUE | DevOps/setup |
| TEST_DESIGN_ISSUE | Test writer |
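One possible way to use the cause table is to group failed steps by the party responsible for the fix. The `OWNERS` mapping mirrors the table; the `route` function and the step names are hypothetical.

```python
# Cause-to-owner mapping copied from the table above.
OWNERS = {
    "IMPLEMENTATION_BUG": "Developer",
    "SPECIFICATION_BUG": "Scenario author",
    "ENVIRONMENT_ISSUE": "DevOps/setup",
    "TEST_DESIGN_ISSUE": "Test writer",
}


def route(failures):
    """Group failed step names by the owner responsible for fixing them."""
    routed = {}
    for step, cause in failures:
        routed.setdefault(OWNERS[cause], []).append(step)
    return routed


# Hypothetical reviewer output: (step name, attributed cause) pairs.
failures = [
    ("step-1", "IMPLEMENTATION_BUG"),
    ("step-2", "ENVIRONMENT_ISSUE"),
    ("step-3", "IMPLEMENTATION_BUG"),
]
routed = route(failures)
print(routed)
```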
| Agent | Purpose |
|-------|---------|
| acceptance-test-writer | Transforms scenarios into evidence-gated test plans |
| acceptance-test-executor | Executes plans, collects artifacts, no judgment |
| acceptance-test-reviewer | Evaluates evidence against assertions independently |
When running within a crew test phase, the executor compiles an evidence package at phases/test/evidence/report.md containing:
The review phase consumes this package for informed sign-off decisions. Pass/fail alone is insufficient — reviewers need visual proof and execution context.
Use /wicked-garden:qe:report to generate structured evidence from scenario execution results.
`/wicked-garden:qe:run --json` for machine-readable execution artifacts. Writer understands E2E scenario format natively. Falls back to inline bash execution when scenarios plugin is not installed.

`/wicked-garden:qe:acceptance` as the single acceptance pipeline. QE owns Writer/Executor/Reviewer end-to-end.