plugins/agent-execution-disciplines/skills/test-driven-development/SKILL.md
Use when implementing any feature or bugfix, before writing implementation code
npx skillsauth add richfrem/agent-plugins-skills test-driven-developmentInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Source: Ported from obra/superpowers by Jesse Vincent. Adapted for the
agent-plugins-skillsecosystem. Original concepts and Iron Laws credit belongs to Jesse.
Write the test first. Watch it fail. Write minimal code to pass.
Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.
Violating the letter of the rules is violating the spirit of the rules.
Always:
Exceptions (ask your human partner):
Thinking "skip TDD just this once"? Stop. That's rationalization.
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Write code before the test? Delete it. Start over.
No exceptions:
Implement fresh from tests. Period.
digraph tdd_cycle {
rankdir=LR;
red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
verify_red [label="Verify fails\ncorrectly", shape=diamond];
green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
verify_green [label="Verify passes\nAll green", shape=diamond];
refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
next [label="Next", shape=ellipse];
red -> verify_red;
verify_red -> green [label="yes"];
verify_red -> red [label="wrong failure"];
green -> verify_green;
verify_green -> refactor [label="yes"];
verify_green -> green [label="no"];
refactor -> verify_green [label="stay green"];
verify_green -> next;
next -> red;
}
Write one minimal test showing what should happen.
One behavior. Clear name. Real code (no mocks unless unavoidable).
MANDATORY. Never skip.
Run the test suite. Confirm:
Test passes? You're testing existing behavior. Fix test.
Test errors? Fix error, re-run until it fails correctly.
Write simplest code to pass the test.
Don't add features, refactor other code, or "improve" beyond the test.
MANDATORY.
Confirm:
Test fails? Fix code, not test.
After green only:
Keep tests green. Don't add behavior.
Next failing test for next feature.
| Quality | Good | Bad |
|---------|------|-----|
| Minimal | One thing. "and" in name? Split it. | test('validates email and domain and whitespace') |
| Clear | Name describes behavior | test('test1') |
| Shows intent | Demonstrates desired API | Obscures what code should do |
"I'll write tests after to verify it works"
Tests written after code pass immediately. Passing immediately proves nothing:
Test-first forces you to see the test fail, proving it actually tests something.
"Deleting X hours of work is wasteful"
Sunk cost fallacy. The time is already gone. Working code without real tests is technical debt.
| Excuse | Reality | |--------|---------| | "Too simple to test" | Simple code breaks. Test takes 30 seconds. | | "I'll test after" | Tests passing immediately prove nothing. | | "Already manually tested" | Ad-hoc is not systematic. No record, can't re-run. | | "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. | | "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. | | "TDD will slow me down" | TDD faster than debugging. |
All of these mean: Delete code. Start over with TDD.
Before marking work complete:
Can't check all boxes? You skipped TDD. Start over.
| Problem | Solution | |---------|----------| | Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. | | Test too complicated | Design too complicated. Simplify interface. | | Must mock everything | Code too coupled. Use dependency injection. | | Test setup huge | Extract helpers. Still complex? Simplify design. |
Bug found? Write failing test reproducing it. Follow TDD cycle. Test proves fix and prevents regression.
Never fix bugs without a test.
When adding mocks or test utilities, read references/testing-anti-patterns.md to avoid common pitfalls:
Production code -> test exists and failed first
Otherwise -> not TDD
No exceptions without your human partner's permission.
tools
Ingests repository files into the ChromaDB vector store. Builds or updates the vector index from a manifest or directory scan using ingest.py. Use when new files need to be indexed or the vector store is out of date. <example> user: "Index these new plugin files into the vector database" assistant: "I'll use vector-db-ingest to add them to the vector store." </example> <example> user: "The vector store is missing recent files -- update it" assistant: "I'll use vector-db-ingest to re-index the changes." </example>
data-ai
Removes stale and orphaned chunks from the ChromaDB vector store for files that have been deleted or renamed. Use after files are removed or moved to keep the vector index in sync with the filesystem. <example> user: "Clean up the vector store after I deleted some files" assistant: "I'll use vector-db-cleanup to remove orphaned chunks." </example> <example> user: "The vector database has chunks for files that no longer exist" assistant: "I'll run vector-db-cleanup to prune them." </example>
testing
Audit Vector DB coverage -- compares the live filesystem manifest against the ChromaDB index to identify coverage gaps.
development
3-Phase Knowledge Search strategy for the RLM Factory ecosystem. Auto-invoked when tasks involve finding code, documentation, or architecture context in the repository. Enforces the optimal search order: RLM Summary Scan (O(1)) -> Vector DB Semantic Search -> Grep/Exact Match. Never skip phases.