plugins/cc10x/skills/test-driven-development/SKILL.md
Internal skill. Use cc10x-router for all development tasks.
Write the test first. Watch it fail. Write minimal code to pass.
Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.
Violating the letter of the rules is violating the spirit of the rules.
Read only the references needed for the current test cycle:
- references/testing-patterns.md for naming, AAA structure, near-miss negatives, behavioral focus, and anti-pattern checks
- references/test-data-and-mocks.md for factories, mock boundaries, common boundary mocks, and env/time handling
- references/integration-and-live-proof.md when unit tests are not enough, or the plan requires real APIs, seeded data, browser flows, or stress proof

Always:
Exceptions (ask your human partner):
Thinking "skip TDD just this once"? Stop. That's rationalization.
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Write code before the test? Delete it. Start over.
No exceptions:
Implement fresh from tests. Period.
Problem: Test runners (Vitest, Jest) default to watch mode, leaving processes hanging indefinitely.
Mandatory Rules:
- `npx vitest run` (NOT `npx vitest`)
- `CI=true npx jest` or `npx jest --watchAll=false`
- `CI=true npm test` or `npm test -- --run`
- `CI=true npm test`
- `pgrep -f "vitest|jest" || echo "Clean"`
- `pkill -f "vitest" 2>/dev/null || true`

┌─────────┐       ┌─────────┐       ┌───────────┐
│   RED   │──────>│  GREEN  │──────>│ REFACTOR  │
│ (Fail)  │       │ (Pass)  │       │  (Clean)  │
└─────────┘       └─────────┘       └───────────┘
     ^                                    │
     │                                    │
     └────────────────────────────────────┘
                 Next Feature
WRONG (horizontal — all tests then all code):
RED: test1, test2, test3, test4, test5
GREEN: impl1, impl2, impl3, impl4, impl5
RIGHT (vertical — one feature at a time):
RED->GREEN: test1->impl1
RED->GREEN: test2->impl2
RED->GREEN: test3->impl3
DO NOT write all tests first, then all implementation. This produces bad tests:
Correct approach: One test → one implementation → repeat. Each test responds to what you learned from the previous cycle.
Write one minimal test showing what should happen.
Good:
test('retries failed operations 3 times', async () => {
  let attempts = 0;
  // async so the operation matches retryOperation's () => Promise<T> parameter
  const operation = async () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  };

  const result = await retryOperation(operation);

  expect(result).toBe('success');
  expect(attempts).toBe(3);
});
Clear name, tests real behavior, one thing
Bad:
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(3);
});
Vague name, tests mock not code
Requirements:
MANDATORY. Never skip.
CI=true npm test path/to/test.test.ts
Confirm:
Test passes? You're testing existing behavior. Fix test.
Test errors? Fix error, re-run until it fails correctly.
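The three outcomes above (passes, fails, errors) can be told apart mechanically. The sketch below is only an illustration: `expectationError`, `expectEqual`, and `classifyRedRun` are hypothetical helpers, not part of Jest or Vitest, and real runners report the same distinction in their own output.

```typescript
// Hypothetical stand-in for the error a runner raises when an assertion fails.
function expectationError(message: string): Error {
  const err = new Error(message);
  err.name = 'ExpectationError';
  return err;
}

// Hypothetical minimal assertion, used only for this sketch.
function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw expectationError(`expected ${String(expected)}, got ${String(actual)}`);
  }
}

type RedOutcome = 'passed' | 'failed' | 'errored';

// Runs a test body once and reports which of the three RED outcomes occurred.
function classifyRedRun(testBody: () => void): RedOutcome {
  try {
    testBody();
    return 'passed'; // testing existing behavior: fix the test
  } catch (e) {
    if (e instanceof Error && e.name === 'ExpectationError') {
      return 'failed'; // the failure RED is looking for
    }
    return 'errored'; // typo, bad import, etc.: fix the error and re-run
  }
}
```

Only the `'failed'` outcome counts as a verified RED; the other two mean the test itself still needs work.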
Write simplest code to pass the test.
Good:
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === 2) throw e;
    }
  }
  throw new Error('unreachable');
}
Just enough to pass
Bad:
async function retryOperation<T>(
  fn: () => Promise<T>,
  options?: {
    maxRetries?: number;
    backoff?: 'linear' | 'exponential';
    onRetry?: (attempt: number) => void;
  }
): Promise<T> {
  // YAGNI - You Ain't Gonna Need It
}
Over-engineered
Don't add features, refactor other code, or "improve" beyond the test. Don't hard-code test values - implement general logic that works for ALL inputs.
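The difference between hard-coding a test value and writing minimal general logic may be easier to see side by side. This sketch assumes a hypothetical requirement (a percentage discount on a price) whose only test so far expects `discountedPrice(200, 10)` to be `180`; both function names are invented for illustration.

```typescript
// Hard-coded: satisfies the single existing assertion and nothing else.
function discountedPriceHardCoded(_price: number, _percent: number): number {
  return 180;
}

// General: still the simplest code that passes, but correct for other inputs too.
function discountedPrice(price: number, percent: number): number {
  return price - (price * percent) / 100;
}
```

Both versions turn the first test green, but only the second survives the next test case, so the hard-coded version postpones the real implementation without saving any work.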
MANDATORY.
CI=true npm test path/to/test.test.ts
Confirm:
Test fails? Fix code, not test.
Other tests fail? Fix now.
After green only:
Keep tests green. Don't add behavior.
Next failing test for next feature.
| Quality | Good | Bad |
|---------|------|-----|
| Minimal | One thing. "and" in name? Split it. | test('validates email and domain and whitespace') |
| Clear | Name describes behavior | test('test1') |
| Shows intent | Demonstrates desired API | Obscures what code should do |
For deeper test structure, near-miss negative tests, behavior-vs-internals, and
smell checks, read references/testing-patterns.md.
For factories, mocks, and env/time handling, read
references/test-data-and-mocks.md.
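As a sketch of the "Minimal" row, the combined test name from the table can be split into one named check per behavior. `isValidEmail` is a hypothetical validator written only for this example, and the checks are plain functions rather than `test(...)` calls so the snippet runs without a test runner.

```typescript
// Hypothetical validator used only for illustration.
function isValidEmail(email: string): boolean {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

// One behavior per name, instead of 'validates email and domain and whitespace'.
const emailChecks: { name: string; passes: () => boolean }[] = [
  { name: 'accepts a well-formed address', passes: () => isValidEmail('user@example.com') },
  { name: 'rejects an address without a domain', passes: () => !isValidEmail('user@') },
  { name: 'rejects an address containing whitespace', passes: () => !isValidEmail('us er@example.com') },
];
```

When one of these fails, its name alone says which behavior broke.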
"I'll write tests after to verify it works"
Tests written after code pass immediately. Passing immediately proves nothing:
Test-first forces you to see the test fail, proving it actually tests something.
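A small sketch of why an immediately passing test proves little. `trimAll` is a hypothetical function with a deliberate bug, and the two checks are written as plain boolean functions so the example runs on its own.

```typescript
// Hypothetical buggy implementation: it should trim every entry but does not.
function trimAll(values: string[]): string[] {
  return values.slice(); // bug: entries are copied, never trimmed
}

// Written after the code: green on the first run, yet it never checks trimming.
function testWrittenAfterPasses(): boolean {
  return trimAll([' a ']).length === 1;
}

// Written first from the requirement: red against the buggy code above.
function testWrittenFirstPasses(): boolean {
  return trimAll([' a '])[0] === 'a';
}
```

The first check was never seen failing, so nothing showed that it misses the bug. The second one fails until `trimAll` really trims.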
"I already manually tested all the edge cases"
Manual testing is ad-hoc. You think you tested everything but:
Automated tests are systematic. They run the same way every time.
"Deleting X hours of work is wasteful"
Sunk cost fallacy. The time is already gone. Your choice now:
The "waste" is keeping code you can't trust. Working code without real tests is technical debt.
If you catch yourself:
All of these mean: Delete code. Start over with TDD.
| Excuse | Reality |
|--------|---------|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. |
| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. |
| "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. |
| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |
| "Existing code has no tests" | You're improving it. Add tests for existing code. |
Bug: Empty email accepted
RED
test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});
Verify RED
$ npm test
FAIL: expected 'Email required', got undefined
GREEN
function submitForm(data: FormData) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // ...
}
Verify GREEN
$ npm test
PASS
REFACTOR Extract validation for multiple fields if needed.
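One possible shape for that refactor, as a sketch only. The `FormInput` and `FormResult` types and the `requireField` helper are assumptions, since the source does not define them. Behavior stays the same as in the GREEN step: only the empty-email check is enforced, and the helper is merely ready for reuse.

```typescript
// Hypothetical shapes; the source does not define the form's fields or result type.
type FormInput = { email?: string };
type FormResult = { error?: string };

// Extracted helper: returns an error message, or undefined when the field is filled in.
function requireField(value: string | undefined, label: string): string | undefined {
  return value?.trim() ? undefined : `${label} required`;
}

function submitForm(data: FormInput): FormResult {
  const error = requireField(data.email, 'Email');
  if (error) return { error };
  // ... rest of the submission logic, elided as in the original
  return {};
}
```

The existing 'rejects empty email' test should stay green after this change. Validation for any additional field would start with its own failing test.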
Before marking work complete:
Can't check all boxes? You skipped TDD. Start over.
When the accepted plan or risk profile goes beyond local behavior, read
references/integration-and-live-proof.md.
Unit tests are not enough when the task depends on:
In those cases, keep TDD for the inner loop and escalate verification depth for the outer proof.
Target: 80%+ code coverage across:
Verify with: npm run test:coverage or equivalent.
Below threshold? Add missing tests before claiming completion.
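If the project happens to use Jest, the 80% floor can be enforced by the runner instead of being checked by hand. The object below is a sketch under that assumption. The source names neither the runner behind `npm run test:coverage` nor the metrics it measures, so the four metrics shown are simply the ones Jest's `coverageThreshold` option accepts.

```typescript
// Sketch of a Jest configuration object (assumes Jest is the test runner).
// In a real jest.config.ts this object would be the default export.
const jestCoverageConfig = {
  collectCoverage: true,
  coverageThreshold: {
    // Jest fails the run when any global metric falls below its threshold.
    global: { branches: 80, functions: 80, lines: 80, statements: 80 },
  },
};
```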
80% coverage means deliberate choices about what to test first. Focus effort on:
Do NOT skip tests because code "looks simple" — simple code breaks too. The 80% target is a floor, not a ceiling.
| Problem | Solution |
|---------|----------|
| Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. |
| Test too complicated | Design too complicated. Simplify interface. |
| Must mock everything | Code too coupled. Use dependency injection. |
| Test setup huge | Extract helpers. Still complex? Simplify design. |
If tests are hard to write, the interface needs work:
Accept dependencies, don't create them
Good: `function processOrder(order, paymentGateway) {}`
Bad: `function processOrder(order) { const gw = new StripeGateway(); }`

Return results, don't produce side effects
Good: `function calculateDiscount(cart): Discount {}`
Bad: `function applyDiscount(cart): void { cart.total -= discount; }`

Small surface area: fewer methods = fewer tests needed, fewer params = simpler setup
Test how objects collaborate, not what they contain. If a test inspects .state, .length, or private fields, it is testing structure — and will break when internals change without behavior changing.
| Test target | Correct | Wrong |
|-------------|---------|-------|
| Function output | expect(calculate(input)).toBe(result) | expect(calculator.internalCache).toContain(...) |
| Component behavior | expect(screen.getByText('Saved')).toBeTruthy() | expect(component.state.saved).toBe(true) |
| Service interaction | expect(response.status).toBe(201) | expect(service.callCount).toBe(1) |
This is already implied by the "Testing implementation" smell in the Test Smells table. Make it the default lens: every assertion should answer "what did the user/caller observe?" not "what happened inside?"
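A short sketch of the same idea in code. `TaxCalculator`, its cache, and the 20% rate are hypothetical; what matters is that the check looks only at what a caller can observe.

```typescript
// Hypothetical class: the cache is an internal detail that may change or disappear.
class TaxCalculator {
  private cache: Record<number, number> = {};

  withTax(netCents: number): number {
    const cached = this.cache[netCents];
    if (cached !== undefined) return cached;
    const gross = Math.round(netCents * 1.2); // assumed 20% rate, for illustration only
    this.cache[netCents] = gross;
    return gross;
  }
}

// Behavioral check: asserts on the value the caller receives, never on the cache.
function withTaxReturnsGrossAmount(): boolean {
  const calculator = new TaxCalculator();
  return calculator.withTax(1000) === 1200 && calculator.withTax(1000) === 1200;
}
```

Removing or replacing the cache leaves this check green, whereas an assertion on `cache` would break even though callers see no difference.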
When CC10x routes work across multiple agents (planner writes test specs, builder implements, reviewer verifies), the test file IS the contract:
Do not duplicate the contract in prose. If the test file expresses the requirement, the test file is the requirement.
## TDD Cycle
### Requirements
[What functionality is being built]
### RED Phase
- Test: [test name]
- Command: `npm test -- --grep "test name"`
- Result: exit 1 (FAIL as expected)
- Failure reason: [function not defined / expected X got Y]
### GREEN Phase
- Implementation: [summary]
- File: [path:line]
- Command: `npm test -- --grep "test name"`
- Result: exit 0 (PASS)
### REFACTOR Phase
- Changes: [what was improved]
- Command: `npm test`
- Result: exit 0 (all tests pass)
Production code → test exists and failed first
Otherwise → not TDD
No exceptions without your human partner's permission.