.skills/tdd/SKILL.md
Use before writing any code, for any reason. Enforces strict test-driven development — RED, GREEN, REFACTOR.
npx skillsauth add swissarmyhammer/swissarmyhammer tddInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Write the test first. Watch it fail. Write correct, well-designed code to pass.
Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.
Take your time and do your best work. There is no reward for speed. There is every reward for correctness.
Violating the letter of the rules is violating the spirit of the rules.
Always:
Thinking "skip TDD just this once"? Stop. That's rationalization.
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Did you write code before the test? Delete it. Start over.
No exceptions:
Write one minimal test showing what should happen.
Requirements:
MANDATORY. Never skip.
Run the test. Confirm:
Test passes immediately? You're testing existing behavior. Fix the test.
Test errors? Fix the error, re-run until it fails correctly.
Write correct, well-designed code that passes the test and follows the codebase's prevailing patterns.
MANDATORY.
Run the test. Confirm:
New test fails? Fix the code, not the test.
Other tests broke? Fix them now.
Only after green — this is where you make the code excellent:
Keep tests green throughout. Don't add new behavior, but do make the existing behavior bulletproof.
Next failing test for the next behavior.
| Excuse | Reality | |--------|---------| | "Too simple to test" | Simple code breaks. Test takes 30 seconds. | | "I'll test after" | Tests passing immediately prove nothing. | | "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" | | "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. | | "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. | | "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. | | "Need to explore first" | Fine. Throw away exploration, start with TDD. | | "Test hard = design unclear" | Listen to the test. Hard to test = hard to use. | | "TDD will slow me down" | TDD is faster than debugging. Pragmatic = test-first. | | "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. | | "Existing code has no tests" | You're improving it. Add tests for the code you touch. |
All of these mean: Delete the code. Start over with TDD.
"I'll write tests after to verify it works"
Tests written after code pass immediately. Passing immediately proves nothing:
Test-first forces you to see the test fail, proving it actually tests something.
"Deleting X hours of work is wasteful"
Sunk cost fallacy. The time is already gone. Your choice now:
The "waste" is keeping code you can't trust.
"TDD is dogmatic, being pragmatic means adapting"
TDD IS pragmatic:
"Pragmatic" shortcuts = debugging in production = slower.
| Quality | Good | Bad |
|---------|------|-----|
| Minimal | One thing. "and" in name? Split it. | test('validates email and domain and whitespace') |
| Clear | Name describes behavior | test('test1') |
| Shows intent | Demonstrates desired API | Obscures what code should do |
| Real code | Tests actual behavior | Tests mock behavior |
| Problem | Solution | |---------|----------| | Don't know how to test | Write the wished-for API. Write the assertion first. Ask the user. | | Test too complicated | Design too complicated. Simplify the interface. | | Must mock everything | Code too coupled. Use dependency injection. | | Test setup huge | Extract helpers. Still complex? Simplify design. |
Bug found? Write a failing test that reproduces it. Follow the TDD cycle. The test proves the fix and prevents regression.
Never fix bugs without a test.
Before marking work complete:
Can't check all boxes? You skipped TDD. Start over.
Production code → test exists and failed first
Otherwise → not TDD
No exceptions without the user's explicit permission.
research
Create a single, well-researched kanban task. Use when the user wants to add a task, track an idea, or capture work without entering full plan mode.
testing
Drive kanban tasks from ready to done by looping implement → test → review until each task is clean. Supports single-task mode (one task id) and scoped-batch mode (all ready tasks in a tag/project/filter). Uses ralph to prevent stopping between iterations.
tools
Use when starting any conversation - establishes how to find and use skills, requiring Skill tool invocation before ANY response including clarifying questions
testing
Run tests and analyze results. Use when the user wants to run the test suite or test specific functionality. Test runs produce verbose output — automatically delegates to a tester subagent.