look-before-you-leap/skills/test-driven-development/SKILL.md
Test-Driven Development workflow enforcing red-green-refactor cycles. Use when writing new features, adding behavior, or implementing functions where tests should drive design. Requires explicit test-first prompting because Claude naturally writes implementation first. Integrates with writing-plans (TDD rhythm in Progress items) and engineering-discipline (verification). Do NOT use when: fixing a bug in existing tested code (use systematic-debugging), writing tests for existing untested code (characterization tests are a different workflow), refactoring without behavior change (use refactoring), or the project has no test infrastructure.
Claude naturally writes implementation first, then tests. TDD requires the inverse: tests drive design. This skill enforces the red-green-refactor cycle through explicit structure.
Announce at start: "I'm using the TDD skill to write tests before implementation."
Prerequisite: The project must have test infrastructure (a test runner, a test framework). If none exists, set it up first or ask the user.
Every feature unit follows this cycle. No shortcuts, no combining phases.
RED (test fails) -> GREEN (implement minimally) -> REFACTOR (clean up, tests stay green)
Write a test that describes the desired behavior. The test MUST fail because the implementation doesn't exist yet.
- Test files: follow the project's existing file convention (`.test.ts`, `.spec.ts`, `_test.go`, `test_*.py`, etc.).
- Test naming: Describe behavior, not implementation. Use `should_<behavior>_when_<condition>` or the project's existing convention.
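As a rough illustration of a RED test, the sketch below uses a hypothetical `applyDiscount` function and a tiny hand-rolled `runTest` helper standing in for whatever runner the project actually uses (Jest, Vitest, etc.). Both names are assumptions for this example, not part of the skill. The stub exists only so the file compiles; the test fails because there is no behavior behind it yet.

```typescript
// Hypothetical function under test. For RED it is only a stub so the test
// compiles; it has no real behavior yet.
function applyDiscount(_price: number): number {
  throw new Error("applyDiscount is not implemented yet");
}

// Minimal stand-in for a test runner (assumption: a real project would use
// its own framework instead of this helper).
function runTest(name: string, body: () => void): boolean {
  try {
    body();
    console.log(`PASS ${name}`);
    return true;
  } catch (err) {
    console.log(`FAIL ${name}: ${(err as Error).message}`);
    return false;
  }
}

// The test name describes behavior and condition, not implementation.
const redPassed = runTest("should_apply_10_percent_discount_when_price_is_100", () => {
  const actual = applyDiscount(100);
  if (actual !== 90) throw new Error(`expected 90, got ${actual}`);
});
```

Seeing the failure reported before writing any implementation is the evidence that the test actually exercises something.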
What to test:
Do NOT:
Write the dumbest possible code that makes the failing tests pass. This is the hardest phase for Claude — your instinct is to write correct, general code. Fight that instinct. The goal is to let tests drive generalization, not to anticipate it.
The Minimality Test: After writing your GREEN code, ask: "Would this implementation also pass tests I haven't written yet?" If yes, you're being too general. Scale back. Use hardcoded values, if/else chains, early returns — anything that handles only the tested cases. Generalization happens in REFACTOR or when the next RED cycle's tests demand it.
Rules:
- If a test expects `[{status: "todo", tasks: [task1]}]`, return exactly that structure with a hardcoded key. Don't write a generic loop that handles all statuses; the loop comes when the next test demands it.
- If the tests cover specific transitions, handle them with explicit `if` checks. Don't create a transition map; the map comes in REFACTOR after all transitions are tested.
- If a test passes with `return true`, write `return true`. You can generalize when more tests demand it.

Why this matters: If your Cycle 1 GREEN is already general enough to pass Cycle 2 and Cycle 3 tests, then those later cycles are hollow: they confirm existing behavior rather than driving new implementation. The feedback loop that makes TDD valuable (test reveals a gap → implementation fills the gap → next test reveals the next gap) only works when each GREEN is deliberately constrained to the current tests.
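A minimal sketch of what a deliberately constrained GREEN might look like for the `[{status: "todo", tasks: [task1]}]` case mentioned in the rules. The `Task` and `Group` types and the `groupByStatus` name are hypothetical, chosen only for illustration.

```typescript
// Hypothetical types for the example.
type Task = { id: number; status: string };
type Group = { status: string; tasks: Task[] };

// GREEN for a single test that expects [{ status: "todo", tasks: [task1] }].
// The "todo" key is hardcoded on purpose: no loop, no grouping by status.
function groupByStatus(tasks: Task[]): Group[] {
  return [{ status: "todo", tasks }];
}

const task1: Task = { id: 1, status: "todo" };
const groups = groupByStatus([task1]);
```

This version would put a `"done"` task under `"todo"` as well, so a later RED test covering a second status fails against it and forces the real grouping logic into existence.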
Improve code quality while keeping all tests green.
Common refactoring targets:
After refactoring, commit the working state before starting the next Red phase.
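One possible REFACTOR step, sketched under the assumption of a hypothetical task-status feature: the `if` checks accumulated during GREEN phases are replaced by a transition map with identical behavior. The `Status` type and both function names are invented for this example.

```typescript
// Hypothetical status type for the example.
type Status = "todo" | "doing" | "done";

// Before REFACTOR: the if checks that accumulated across GREEN phases.
function canTransitionBefore(from: Status, to: Status): boolean {
  if (from === "todo" && to === "doing") return true;
  if (from === "doing" && to === "done") return true;
  return false;
}

// After REFACTOR: the same rules expressed as a transition map.
const allowedTransitions: Record<Status, Status[]> = {
  todo: ["doing"],
  doing: ["done"],
  done: [],
};

function canTransition(from: Status, to: Status): boolean {
  return allowedTransitions[from].indexOf(to) !== -1;
}
```

The existing tests are the safety net here: they should pass unchanged before and after the rewrite, since only the structure changed and not the behavior.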
When the writing-plans skill creates a masterPlan, each step's Progress items encode the TDD rhythm:
- **Progress**:
  - [ ] Write failing test
  - [ ] Run test — verify it fails
  - [ ] Implement minimal code to pass
  - [ ] Run tests — verify they pass
  - [ ] Commit
Follow these Progress items mechanically. Update the plan after each phase (the checkpoint rule from persistent-plans still applies).
This is where TDD either works or degrades into test-first waterfall. The difference: in TDD, you write tests for one behavior at a time, implement it, then move to the next. In test-first waterfall, you write all tests upfront then implement everything at once — that defeats the purpose because you're guessing about edge cases before the implementation teaches you what matters.
The incremental cycle (minimum 3 cycles per feature):
- Example GREEN: `return price * 0.9` (hardcoded 10% discount). Don't write a generic `applyDiscount(type, amount)` yet.

Cycle quality check: After each RED phase, before moving to GREEN, verify: "Did at least one new test fail?" If all new tests passed immediately, pause and consider:
A cycle where tests pass immediately with zero implementation changes is a wasted cycle — it confirms existing behavior rather than driving design. Two consecutive hollow cycles means you've lost the TDD feedback loop.
Why this matters: Each green phase teaches you something about the implementation that makes the next red phase's tests better. Writing all 29 tests upfront means you're guessing about boundary conditions before you've written a single line of implementation. The first few cycles are easy to predict, but by cycle 3-4 you'll discover edge cases you never would have thought of from the outside. But this only works if each GREEN is constrained enough that the next RED actually breaks something.
Plan integration: When the writing-plans skill creates steps with TDD, the progress items encode these cycles explicitly (Cycle 1 RED, Cycle 1 GREEN, Cycle 2 RED, etc.). Follow them mechanically — each RED item adds tests for one behavior slice, each GREEN item extends the implementation.
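To make the cycle-to-cycle feedback concrete, here is a small sketch built on the hardcoded 10% discount example. The second behavior (a caller-supplied percentage of 25) and all function names are assumptions made up for illustration. The point is that the Cycle 2 test genuinely fails against the Cycle 1 code, and the Cycle 2 GREEN adds only what that test demands.

```typescript
// Cycle 1 GREEN: only "10% off" is tested so far, so the rate is hardcoded.
function applyDiscountCycle1(price: number): number {
  return price * 0.9;
}

// Cycle 2 RED: a new behavior (hypothetical 25% discount) expressed as a check
// that can be run against any candidate implementation.
function cycle2Test(applyDiscount: (price: number, percent?: number) => number): boolean {
  return Math.abs(applyDiscount(200, 25) - 150) < 1e-9;
}

// Cycle 2 GREEN: still minimal. One extra if check for the newly tested case,
// no general percentage formula yet.
function applyDiscountCycle2(price: number, percent?: number): number {
  if (percent === 25) return price * 0.75;
  return price * 0.9;
}

// The Cycle 2 test fails on Cycle 1 code (a real RED) and passes on Cycle 2 code.
const cycle2FailsOnCycle1 = !cycle2Test(applyDiscountCycle1);
const cycle2PassesOnCycle2 = cycle2Test(applyDiscountCycle2);
// The Cycle 1 behavior is preserved.
const cycle1StillPasses = Math.abs(applyDiscountCycle2(100) - 90) < 1e-9;
```

Had Cycle 1 already accepted an arbitrary percentage, the Cycle 2 test would have passed immediately and the cycle would have been hollow.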
When modifying untested legacy code, write characterization tests first:
This is the inverse of normal TDD: you're not driving new design, you're documenting existing behavior as a safety net.
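A characterization test might look like the sketch below. `legacyFormatName` is a hypothetical legacy function invented for this example; the expected values are whatever the code returns today, recorded by running it, not what anyone thinks it should return.

```typescript
// Hypothetical legacy function with no tests and a surprising quirk:
// an empty last name yields an upper-cased first name.
function legacyFormatName(first: string, last: string): string {
  if (!last) return first.toUpperCase();
  return last + ", " + first;
}

// Characterization: pin down what the code DOES today, quirks included.
const observed = [
  legacyFormatName("Ada", "Lovelace"),
  legacyFormatName("Ada", ""),
];

// Values copied from an actual run of the current code.
const expectedFromObservation = ["Lovelace, Ada", "ADA"];
```

If a later change alters either value, the test flags it, and the team can then decide whether the quirk was a bug or a behavior callers depend on.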
For functions with mathematical properties (sort, serialize/deserialize, encode/decode), consider property-based tests:
- Round trip: `decode(encode(x)) === x`
- Idempotence: `f(f(x)) === f(x)`

Use the project's property testing library (fast-check, hypothesis, proptest, etc.) if available.
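The two properties can be sketched without any library, as below. This is a hand-rolled stand-in, not how fast-check or similar tools work internally: a small seeded generator produces random integer arrays, and each property is checked on every sample. All names (`makeRandom`, `checkProperties`, and so on) are assumptions for this example; a real project would likely prefer its property testing library, which also shrinks failing inputs.

```typescript
// Small deterministic pseudo-random generator (an LCG) so runs are repeatable.
function makeRandom(seed: number): () => number {
  let state = seed;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Random integer array: length 0..9, values in -100..100.
function randomIntArray(rand: () => number): number[] {
  const length = Math.floor(rand() * 10);
  const result: number[] = [];
  for (let i = 0; i < length; i++) result.push(Math.floor(rand() * 201) - 100);
  return result;
}

// Functions under test (illustrative): a numeric sort and a JSON codec.
const sortNumbers = (xs: number[]): number[] => xs.slice().sort((a, b) => a - b);
const encode = (xs: number[]): string => JSON.stringify(xs);
const decode = (s: string): number[] => JSON.parse(s) as number[];

const sameArray = (a: number[], b: number[]): boolean =>
  a.length === b.length && a.every((v, i) => v === b[i]);

function checkProperties(runs: number): boolean {
  const rand = makeRandom(42);
  for (let i = 0; i < runs; i++) {
    const xs = randomIntArray(rand);
    // Round trip: decode(encode(x)) equals x.
    if (!sameArray(decode(encode(xs)), xs)) return false;
    // Idempotence: sorting twice equals sorting once.
    if (!sameArray(sortNumbers(sortNumbers(xs)), sortNumbers(xs))) return false;
  }
  return true;
}

const propertiesHold = checkProperties(200);
```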
| Anti-Pattern | Why It Fails | Correct Approach |
|---|---|---|
| Write implementation + tests together | Tests validate existing code, don't drive design | Separate Red and Green into distinct phases |
| Write all tests upfront in one batch | Speculative: you'll guess wrong about edge cases, and you lose the feedback loop that makes TDD valuable | Iterate: 1-3 tests per cycle, implement, repeat for at least 3 cycles |
| Over-general GREEN implementation | Later cycles pass immediately with zero code changes; the feedback loop is hollow, and tests confirm existing behavior instead of driving design | Hardcode first, use if/else chains, constrain to tested cases. Generalize in REFACTOR, not GREEN |
| Test implementation details | Brittle: breaks on any refactor | Test behavior (inputs -> outputs) |
| Skip the Red phase verification | You don't know if the test actually tests anything | Always run and confirm failure first |
| Skip the Refactor phase | Technical debt accumulates silently | Always refactor after Green |
| Over-mock everything | Tests pass but integration is broken | Mock at boundaries, not within units |
| One giant test per feature | Hard to diagnose failures | One behavior per test |