core/capabilities/execution/tdd/SKILL.md
Enforces test-driven development: write failing test, write minimal code to pass, refactor. Mandatory for all implementation work. Use when writing any production code, implementing features, fixing bugs, refactoring, or when the user says "write code", "implement", "fix this", or "add a feature". Code written before its test is deleted.
npx skillsauth add xoai/sage tddInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Write the test first. Watch it fail. Write minimal code to pass. Refactor. Commit.
The Iron Law: If you didn't watch the test fail, you don't know if it tests the right thing. There are no exceptions. Violating the letter of these rules IS violating the spirit.
ALWAYS. Every time you write production code. Every feature. Every bug fix. Every refactor that changes behavior. The urge to skip "just this once" is a rationalization signal, not a legitimate exception.
RED → VERIFY FAIL → GREEN → VERIFY PASS → REFACTOR → VERIFY STILL PASS → COMMIT → NEXT
Write one minimal test that describes the expected behavior.
Rules:
rejects expired tokens not test authRun the test. It MUST fail. If it passes on first run, the test is suspect — it's either testing something that already works (useless) or testing the wrong thing (dangerous). Investigate before proceeding.
The failure must be the RIGHT failure — the test should fail because the feature doesn't exist yet, not because of a syntax error or wrong import.
Write the SIMPLEST code that makes the test pass. Nothing more.
Rules:
Run ALL tests. The new test must pass. All existing tests must still pass. If any test fails, fix it before proceeding. Never leave tests red.
Now — and ONLY now — you may improve the code. Remove duplication, extract helpers, rename for clarity, simplify logic.
Rules:
Commit with a semantic message referencing what behavior was added. Then start the next RED cycle for the next behavior.
If production code was written before its corresponding test existed and was observed failing: DELETE THE CODE. Implement fresh from tests only.
Delete means delete. Not "comment out." Not "keep as reference." Delete. Then write the test. Watch it fail. Then rewrite the code.
This is not punitive. Code written without tests cannot be trusted. The cost of rewriting with TDD is lower than the cost of debugging code you can't verify.
"But I already wrote 200 lines!" — Sunk cost fallacy. The time is gone. Your choice now:
The "waste" is keeping code you can't trust.
Tests written after code answer: "What does this code do?" Tests written before code answer: "What should this code do?"
Tests-after pass immediately. Passing immediately proves nothing — you never see them fail, so you don't know if they test the right thing. Tests-after also test implementation details instead of behavior, making them brittle.
Before declaring implementation complete, verify ALL of these:
Can't check all boxes? You skipped TDD. Start over from the first unchecked item.
| Rationalization | Why It's Wrong | What to Do | |----------------|---------------|------------| | "This is too simple to test" | Simple bugs cause outages. Tests are fast for simple code. | Write the test. It'll take 30 seconds. | | "I'll write tests after" | Tests-after test implementation, not behavior. They prove nothing. | Delete code. Write test first. | | "Just this once" | That's what everyone says every time. It's never just once. | Follow the process. It's faster than debugging. | | "I'm just refactoring" | If behavior changes, you need a test. If it doesn't, existing tests suffice. | Run existing tests. If they break, you're changing behavior — write a test. | | "Manual testing is fine" | You think you tested everything. You didn't. You can't remember what you tested tomorrow. | Automated tests are repeatable. Manual tests are guesses. | | "The deadline is tight" | TDD is faster than debugging untested code under deadline pressure. | Skipping TDD makes deadlines worse, not better. | | "It's just a config change" | Config bugs are the hardest to debug and the easiest to test. | Write a test that verifies the config does what you expect. |
The ONLY way to skip TDD is an explicit human override:
Human: "skip-tdd for this change"
The waiver is logged in .sage/decisions.md with reason and scope.
The resulting code is flagged as untested technical debt for follow-up.
The agent MUST NOT suggest or encourage waivers.
RED: test('login rejects expired JWT tokens', () => { ... })
→ Run → FAILS (expected: 401, got: 200) ✓ Correct failure
GREEN: Add expiry check to auth middleware
→ Run → PASSES ✓
→ All other tests → PASS ✓
REFACTOR: Extract token validation to shared utility
→ Run → All PASS ✓
COMMIT: "fix: reject expired JWT tokens"
Agent writes auth middleware with expiry check → 200 lines
Agent writes tests after → All pass immediately
PROBLEM: Tests pass because they test what the code does, not what it should do.
Edge cases missed. Expiry check has off-by-one error. Nobody will find
it until production.
CORRECT ACTION: Delete the code. Start with the test.
tools
Captures agent mistakes, corrections, and discovered gotchas so they are not repeated. Use when: (1) a command or operation fails unexpectedly, (2) the user corrects the agent, (3) the agent discovers non-obvious behavior through debugging, (4) an API or tool behaves differently than expected, (5) a better approach is found for a recurring task. Also searches past learnings before starting tasks to avoid known pitfalls. Activate alongside the sage-memory skill — they share the same MCP backend but serve different purposes (sage-memory = codebase knowledge, sage-self-learning = agent mistakes and gotchas).
development
Typed knowledge graph stored in sage-memory. Use when creating or querying structured entities (Person, Project, Task, Event, Document), linking related objects, checking dependencies, planning multi-step actions as graph transformations, or when skills need to share structured state. Trigger on "remember that X is Y", "what do I know about", "link X to Y", "show dependencies", "what blocks X", entity CRUD, cross-skill data access, or any request involving structured relationships between things.
tools
Integrates sage-memory into Sage workflows. Teaches the agent when to remember (store findings during work), when to recall (search memory at session start and task start), and how to learn (structured knowledge capture via sage learn). Use when the user mentions memory, remember, recall, learn, capture knowledge, onboard to codebase, or when starting any session where sage-memory MCP tools are available.
tools
Captures agent mistakes, corrections, and discovered gotchas so they are not repeated. Use when: (1) a command or operation fails unexpectedly, (2) the user corrects the agent, (3) the agent discovers non-obvious behavior through debugging, (4) an API or tool behaves differently than expected, (5) a better approach is found for a recurring task. Also searches past learnings before starting tasks to avoid known pitfalls. Activate alongside the sage-memory skill — they share the same MCP backend but serve different purposes (sage-memory = codebase knowledge, sage-self-learning = agent mistakes and gotchas).