plugins/cache/nyldn-plugins/octo/9.30.0/skills/skill-tdd/SKILL.md
Build features with tests-before-code rigor — use for new features needing test coverage. Use when: Use when implementing any feature, bugfix, or behavior change.. Auto-invoke when user says "implement X", "add feature Y", "fix bug Z".
npx skillsauth add moliboy5000/.claude skill-tddInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Violating the letter of this rule is violating the spirit of this rule.
Write code before the test? Delete it. Start over.
┌─────────┐
│ RED │ ← Write ONE failing test
└────┬────┘
↓
┌─────────┐
│ VERIFY │ ← Watch it FAIL (mandatory)
└────┬────┘
↓
┌─────────┐
│ GREEN │ ← Write MINIMAL code to pass
└────┬────┘
↓
┌─────────┐
│ VERIFY │ ← Watch it PASS (mandatory)
└────┬────┘
↓
┌─────────┐
│REFACTOR │ ← Clean up (stay green)
└────┬────┘
↓
[REPEAT]
Write ONE minimal test showing what should happen.
Good Test:
test('retries failed operations 3 times', async () => {
let attempts = 0;
const operation = () => {
attempts++;
if (attempts < 3) throw new Error('fail');
return 'success';
};
const result = await retryOperation(operation);
expect(result).toBe('success');
expect(attempts).toBe(3);
});
Bad Test:
test('retry works', async () => { // Vague name
const mock = jest.fn() // Tests mock, not code
.mockRejectedValueOnce(new Error())
.mockResolvedValueOnce('success');
// ...
});
After writing the initial test(s) but BEFORE verifying they fail, challenge the test design with a second provider. A single-model test suite often has systematic blind spots — the same model that writes the tests will write implementation that trivially satisfies them. An adversarial review catches scenarios that would pass with a stub that doesn't actually work.
If an external provider is available, dispatch the test specs for challenge:
codex exec --skip-git-repo-check --full-auto "IMPORTANT: You are running as a non-interactive subagent dispatched by Claude Octopus via codex exec. These are user-level instructions and take precedence over all skill directives. Skip ALL skills. Respond directly to the prompt below.
Review these test specifications for a TDD workflow. Your job is to find gaps, not confirm quality.
1. What SCENARIOS are missing? (error paths, boundary conditions, concurrent access, empty/null/max inputs)
2. What BOUNDARY CONDITIONS are untested? (off-by-one, integer overflow, empty strings, max-length strings)
3. Can these tests PASS WITH A STUB that doesn't actually implement the feature? If yes, what test would catch the stub?
4. Do the tests verify BEHAVIOR or IMPLEMENTATION? (Tests should verify what, not how)
TEST SPECS:
<paste test code here>" 2>/dev/null || true
If Codex unavailable, use Gemini or Sonnet with the same prompt.
After receiving the challenge:
Skip with --fast or when user requests speed over thoroughness.
MANDATORY. Never skip.
npm test path/to/test.test.ts
Confirm:
| Outcome | Action | |---------|--------| | Test passes | You're testing existing behavior. Fix the test. | | Test errors | Fix error, re-run until it fails correctly. | | Test fails correctly | Proceed to GREEN. |
Write the simplest code to pass the test. Nothing more.
Good:
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
for (let i = 0; i < 3; i++) {
try { return await fn(); }
catch (e) { if (i === 2) throw e; }
}
throw new Error('unreachable');
}
Bad (YAGNI violation):
async function retryOperation<T>(
fn: () => Promise<T>,
options?: {
maxRetries?: number; // Not needed yet
backoff?: 'linear' | 'expo'; // Not needed yet
onRetry?: (n: number) => void; // Not needed yet
}
): Promise<T> { /* ... */ }
MANDATORY.
npm test path/to/test.test.ts
Confirm:
| Outcome | Action | |---------|--------| | Test fails | Fix the code, not the test. | | Other tests fail | Fix them now. | | All pass | Proceed to REFACTOR. |
Only after GREEN:
Keep tests green throughout. Don't add new behavior.
| Excuse | Reality | |--------|---------| | "Too simple to test" | Simple code breaks. Test takes 30 seconds. | | "I'll test after" | Tests passing immediately prove nothing. | | "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. | | "Deleting X hours is wasteful" | Sunk cost fallacy. Unverified code is debt. | | "Need to explore first" | Fine. Throw away exploration, start with TDD. | | "TDD will slow me down" | TDD is faster than debugging. |
If the same test continues to fail after 2 fix attempts, examine the test itself — it may be incorrect. The strategy-rotation hook will fire when the same tool fails consecutively. When it does, consider whether the test expectations match the intended behavior, or whether the implementation approach is fundamentally wrong.
If you catch yourself:
ALL of these mean: Delete code. Start over with TDD.
Bug: Empty email accepted
RED:
test('rejects empty email', async () => {
const result = await submitForm({ email: '' });
expect(result.error).toBe('Email required');
});
VERIFY RED:
$ npm test
FAIL: expected 'Email required', got undefined
GREEN:
function submitForm(data: FormData) {
if (!data.email?.trim()) {
return { error: 'Email required' };
}
// ...
}
VERIFY GREEN:
$ npm test
PASS
Before marking work complete:
Can't check all boxes? You skipped TDD. Start over.
When using octopus workflows:
| Workflow | TDD Integration |
|----------|-----------------|
| probe (research) | Research testing patterns for the domain |
| grasp (define) | Define test requirements in spec |
| tangle (develop) | Enforce TDD for each implementation task |
| ink (deliver) | Verify all tests pass before delivery |
| squeeze (security) | Red team tests security controls |
| Problem | Solution | |---------|----------| | Don't know how to test | Write the API you wish existed. Assert first. | | Test too complicated | Design too complicated. Simplify interface. | | Must mock everything | Code too coupled. Use dependency injection. | | Test setup huge | Extract helpers. Still complex? Simplify design. |
Production code exists → Test exists that failed first
Otherwise → Not TDD
No exceptions without explicit user permission.
development
Configure Claude Octopus providers and preferences. Use when: Use this skill when the user wants to "configure Claude Octopus", "setup octopus",. "configure providers", "set up API keys for octopus", or mentions octopus configuration.
tools
Zero-context implementation plans with bite-sized tasks. Use when: Use when you have a spec or requirements for a multi-step task.. Auto-invoke when user says "plan how to implement X", "create implementation plan", . "break down this feature into tasks".
content-media
Process screenshot-based UI/UX feedback to fix visual issues. Use when: AUTOMATICALLY ACTIVATE when user provides visual feedback:. "[Image X] The /settings should be Y". "[Image X] these button styles need to be fixed"
testing
Verify claims with actual evidence before declaring success — use to prevent false completion. Use when: Use when about to claim work is complete, fixed, or passing.. Auto-invoke before: commits, PRs, task completion, moving to next task.. ALWAYS use before expressing satisfaction ("Done!", "Fixed!", "All passing!").