skills/atdd/SKILL.md
Use to drive feature work through the Acceptance Test Driven Development workflow — Given/When/Then specs before code, a project-specific test pipeline, and two parallel test streams (acceptance + unit). Triggers — "/atdd", "build a feature", "implement a feature", "add functionality", "start development", "write acceptance tests", "write specs", "use ATDD", "use TDD with acceptance tests".
npx skillsauth add swingerman/atdd atdd
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Enforce the ATDD workflow for feature development. This methodology is adapted from Robert C. Martin's acceptance test approach.
"The two different streams of tests cause Claude to think much more deeply about the structure of the code." — Robert C. Martin
Two test streams constrain development: acceptance tests and unit tests.
Both must pass. Neither alone is sufficient.
Follow these steps strictly, in order. Do not skip steps.
Before Step 1, create one TodoWrite todo per step of this workflow (Steps 1–7),
all at once — the full list up front, as a roadmap. Flip each todo to
in_progress / completed as you go. See
${CLAUDE_PLUGIN_ROOT}/references/progress-indicator.md.
Before writing anything, understand what is being built:
Write the feature's spec.md in standard Gherkin (DAE Foundation §7):
Feature: <feature name>

  Scenario: <behavior being specified>
    Given <precondition in domain language>
    And <another precondition if needed>
    When <the action the user/system takes>
    Then <observable outcome>
    And <another observable outcome if needed>

  Scenario Outline: <a behavior with varying data>
    Given <a step with a <parameter>>
    ...
    Examples:
      | parameter | expected |
      | value     | result   |
spec.md is markdown — prose and headings around the Gherkin are fine;
the parser ignores non-Gherkin lines.
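To make "the parser ignores non-Gherkin lines" concrete, here is a minimal sketch of how Gherkin lines could be separated from the surrounding markdown. It is not the shipped dae_gherkin.py; the keyword list and the `extract_gherkin` name are assumptions made for illustration. A prose line that happens to start with a keyword such as "When " would also match, so a real parser needs more context than this.

```python
import re

# Illustrative subset of Gherkin line starters (an assumption, not the
# keyword set used by the shipped parser).
GHERKIN_LINE = re.compile(
    r"^\s*(Feature:|Scenario Outline:|Scenario:|Examples:"
    r"|Given |When |Then |And |But |\|)"
)


def extract_gherkin(markdown_text: str) -> list[str]:
    """Return the lines that look like Gherkin, with indentation stripped."""
    return [
        line.strip()
        for line in markdown_text.splitlines()
        if GHERKIN_LINE.match(line)
    ]


if __name__ == "__main__":
    sample = (
        "# Registration\n\n"
        "Some background prose.\n\n"
        "Feature: Registration\n"
        "  Scenario: First user\n"
        "    Given there are no registered users\n"
    )
    for line in extract_gherkin(sample):
        print(line)
```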
Migrating from the legacy ;=== .txt format? Run the converter: dae_gherkin_convert.py specs/feature.txt features/NNN-slug/spec.md. The .txt format is deprecated; new specs are Gherkin spec.md.
Format rules:
- Scenario: names one behavior; Scenario Outline: + Examples: for varying data
- Given sets preconditions; When the action (one per scenario, ideally); Then the observable outcome
- And continues the previous keyword

The spec-leakage rule (CRITICAL):
Specs must describe external observables only. Never reference:
BAD: Given the UserService has an empty userRepository
GOOD: Given there are no registered users
BAD: When a POST request is sent to /api/users
GOOD: When a new user registers with email "[email protected]"
BAD: Then the database contains 1 row in the users table
GOOD: Then there is 1 registered user
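The BAD/GOOD pairs above suggest a rough automated check. The sketch below flags steps that mention HTTP verbs, URL paths, storage terms, or code-style identifiers. The patterns and the `find_leaks` name are hypothetical heuristics, not part of the skill or the spec-guardian agent, and they will produce false positives (a restaurant domain legitimately talks about a "table"), so treat the result as a prompt for human review.

```python
import re

# Hypothetical heuristics for implementation details leaking into a step.
LEAK_PATTERNS = {
    "HTTP verb": re.compile(r"\b(GET|POST|PUT|PATCH|DELETE)\b"),
    "URL path": re.compile(r"(?<!\S)/\w+(/\w+)*"),
    "storage term": re.compile(r"\b(database|table|row|column|SQL)\b", re.IGNORECASE),
    # camelCase (userRepository) or PascalCase with 2+ humps (UserService)
    "code identifier": re.compile(r"\b[a-z]+[A-Z]\w*\b|\b(?:[A-Z][a-z0-9]+){2,}\b"),
}


def find_leaks(step: str) -> list[str]:
    """Return the names of every heuristic that the step text trips."""
    return [name for name, pattern in LEAK_PATTERNS.items() if pattern.search(step)]


if __name__ == "__main__":
    for step in (
        "Given the UserService has an empty userRepository",
        "Given there are no registered users",
    ):
        print(step, "->", find_leaks(step))
```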
Present specs to the user for approval before proceeding. Specs are co-authored, but the human has final approval, and that right is to be ferociously defended.
The pipeline's front end is portable and shipped; you don't generate it:
- dae_gherkin.py parses spec.md → .build/spec.json, the fixed JSON IR (see the engineer plugin's references/spec-ir.md).

Invoke the pipeline-builder agent to generate the project-specific half:
- A generator that reads .build/spec.json and produces executable test files for the project's framework (pytest, Jest, JUnit, Go testing, RSpec, etc.)

The generator must have deep knowledge of the system internals. This is NOT Cucumber: it produces complete, runnable tests that call into the system, not stubs requiring manual fixtures.
pipeline-builder also generates a runner so the user can run:
# parse spec.md → IR → generate tests → run tests
./run-acceptance-tests.sh
Run the generated acceptance tests. They should fail — this confirms the specs describe behavior that doesn't exist yet.
If they pass, either:
Now implement the feature using standard TDD:
Faster iteration with impact analysis: if the project has
acceptance.impact_analysis: on, the runner's impact-run mode (dae_impact.py select) runs only the scenarios your change affects — use it for the tight
TDD loop. The full acceptance run still gates Step 5's completion: do not
mark the feature done until every scenario passes a full run.
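One way to think about impact selection: if each scenario is associated with the source files it exercises, the impacted scenarios are those whose file set intersects the changed files. The sketch below shows only that idea. How dae_impact.py actually builds or stores such a mapping is not described here, so `coverage_map` and `select_impacted` are hypothetical names and the file paths are made up.

```python
def select_impacted(
    coverage_map: dict[str, set[str]], changed_files: set[str]
) -> list[str]:
    """Return scenario names whose exercised files overlap the changed files."""
    return sorted(
        name for name, files in coverage_map.items() if files & changed_files
    )


if __name__ == "__main__":
    # Hypothetical mapping from scenario name to the files it exercises.
    coverage_map = {
        "First user registers": {"src/users.py", "src/email.py"},
        "Duplicate email is rejected": {"src/users.py"},
        "Report is exported": {"src/reports.py"},
    }
    print(select_impacted(coverage_map, {"src/users.py"}))
```

A selection like this only narrows the inner loop; a scenario missing from the mapping would never be selected, which is one reason the full run remains the gate.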
Both streams must pass:
After implementation, invoke the spec-guardian agent to review all
spec files for implementation details that may have crept in during
development.
If leakage is found, clean the specs back to domain language.
Return to Step 1 for the next feature. Each iteration adds specs only for the current feature — never design the whole system upfront.
These rules govern how spec files and the pipeline are handled. They are non-negotiable.
- Never modify a spec.md without explicit user permission. Specs are the user's contract. Always ask before changing them.
- Never modify generated tests in .build/generated/. Only delete and regenerate them by re-running the pipeline from spec.md.
- .build/ is gitignored: the IR and generated tests are artifacts. The project-specific generator and step handlers ARE committed source.
- If spec.md is newer than .build/spec.json or the generated tests, re-parse and regenerate before running.
- On failure, report the spec.md and the failing scenario name. Traceability back to the spec is critical.

No. Specs first, always. The spec-before-code hook will warn about this.
Then the specs need to be more specific. Break the feature into smaller observable behaviors. Each spec should describe one concrete scenario.
This is the perverse incentive. Fight it. The generator should be smart enough to map domain language to system internals. If it can't, improve the generator — don't pollute the specs.
No. Two streams constrain development differently. Acceptance tests alone leave internal structure unchecked. Unit tests alone miss integration.
DAE stores specs and pipeline artifacts per feature folder:
project-root/
├── features/
│ └── NNN-slug/
│ ├── spec.md # acceptance specs (standard Gherkin) — committed
│ └── .build/ # GITIGNORED (regenerated)
│ ├── spec.json # — the IR (from dae_gherkin.py)
│ └── generated/ # — the generated acceptance tests
├── acceptance/ # project-specific pipeline — committed
│ ├── generator.* # — emits tests from the IR
│ └── handlers.* # — step handlers bound to system internals
└── run-acceptance-tests.sh # pipeline runner — committed
Commit these (source of truth):
- features/NNN-slug/spec.md: the acceptance specs
- acceptance/generator.*: the project-specific generator
- acceptance/handlers.*: the step handlers
- run-acceptance-tests.sh: the pipeline runner script

Gitignore these (regenerated from spec.md):
- features/*/.build/: the IR and generated tests

Add to the project's .gitignore:
.build/
The parser (dae_gherkin.py) is portable and shipped with the plugin —
it is not part of the project's committed source.
After setting up the pipeline, add an Acceptance Tests section to
the project's CLAUDE.md (or create one if it doesn't exist). This
ensures Claude Code understands the ATDD setup in every session:
## Acceptance Tests
Acceptance specs are `spec.md` files (standard Gherkin) under
`features/NNN-slug/`.
### Pipeline
spec.md → dae_gherkin.py → .build/spec.json (IR) → generator → tests
1. **Parse:** `dae_gherkin.py` — `spec.md` → `.build/spec.json` (portable, shipped)
2. **Generate:** [generate command] — reads the IR, produces tests in `.build/generated/`
3. **Run:** [test command] — executes the generated tests
Full pipeline: `./run-acceptance-tests.sh`
### Rules
- Never modify a `spec.md` without explicit permission.
- Never modify generated tests — only delete and regenerate via the pipeline.
- `.build/` is gitignored — do not commit the IR or generated tests.
- Before a push, run the full acceptance test pipeline.
- On failure, report the `spec.md` and the failing scenario name.
Adapt the commands and paths to match the project's language and test framework. The pipeline-builder agent generates this CLAUDE.md section automatically when creating the pipeline.