skills/generating-tdd-tests/SKILL.md
TDD with RGRC cycle and Baby Steps. Use when: TDD, テスト駆動, Red-Green-Refactor, Baby Steps.
Test behavior, not implementation. Assert on what the code does (observable output, return values, side effects), not how it does it (internal calls, private state, execution order).
A test that fails for the wrong reason (syntax error, wrong import) is not a valid Red. Fix the test first, then verify the failure matches the intended behavior gap.
Default: Classical (Detroit) style. Use real objects. Mock only at system boundaries.
| Principle | Rule |
| --- | --- |
| Behavior over implementation | Test public API output, not internal calls |
| State verification preferred | Assert on result values, not "was X called" |
| Real objects first | Use real dependencies. Mock only external I/O |
| Black-box perspective | Treat the unit as a black box via its public interface |
| Sociable tests | Let collaborators participate. Isolate only at boundaries |
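A minimal sketch of the classical style, under assumed names: `Checkout`, `PriceCalculator`, and `RateSource` are hypothetical and not part of this skill. The real `PriceCalculator` takes part in the test, only the external rate lookup is stubbed, and the check is on the returned value rather than on which calls were made. The framework assertion is shown as a comment so the snippet runs without a test runner.

```typescript
// Hypothetical system boundary: the only dependency replaced in the test.
interface RateSource {
  fetchTaxRate(region: string): number;
}

// Hypothetical real collaborator: pure logic, so the test uses it as-is.
class PriceCalculator {
  subtotal(prices: number[]): number {
    return prices.reduce((sum, price) => sum + price, 0);
  }
}

interface Quote {
  subtotal: number;
  tax: number;
  total: number;
}

// Hypothetical unit under test, exercised only through its public method.
class Checkout {
  private readonly calculator: PriceCalculator;
  private readonly rates: RateSource;

  constructor(calculator: PriceCalculator, rates: RateSource) {
    this.calculator = calculator;
    this.rates = rates;
  }

  quote(prices: number[], region: string): Quote {
    const subtotal = this.calculator.subtotal(prices);
    const tax = Math.round(subtotal * this.rates.fetchTaxRate(region));
    return { subtotal, tax, total: subtotal + tax };
  }
}

// Hand-written stub for the external rate service (assumed 10% rate).
const fixedRates: RateSource = { fetchTaxRate: () => 0.1 };

// Sociable test: real calculator, stub only at the boundary.
const quote = new Checkout(new PriceCalculator(), fixedRates).quote([1000, 500], "JP");

// State verification in Vitest or Jest would read:
//   expect(quote).toEqual({ subtotal: 1500, tax: 150, total: 1650 })
```

The test never asks whether `subtotal` or `fetchTaxRate` was called, so the internals of `Checkout` can be refactored freely as long as the returned quote stays the same.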
When to use mocks (London style exceptions):
| Phase | Goal | Rule |
| -------- | ------------ | ------------------------ |
| Red | Failing test | Verify failure reason |
| Green | Pass test | "You can sin" - dirty OK |
| Refactor | Clean code | Keep tests green |
| Commit | Save state | All checks pass |
30s: Write failing test → 1min: Make pass → 10s: Run tests → 30s: Tiny refactor → 20s: Commit if green
| Technique | Use For | Example |
| ------------------------ | --------------------- | ---------------------- |
| Equivalence Partitioning | Group same behavior | Age: <18, 18-120 |
| Boundary Value | Test edges | 17, 18, 120, 121 |
| Decision Table | Multi-condition logic | isLoggedIn × isPremium |
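The techniques above can be sketched as plain case tables. `classifyAge` and `canDownload` are hypothetical functions invented for illustration, and treating ages above 120 (or below 0) as `"invalid"` is an assumption, since the table only names the partitions `<18` and `18-120`. The loops stand in for a test runner so the snippet has no dependencies.

```typescript
type AgeClass = "minor" | "adult" | "invalid";

// Hypothetical unit under test: one equivalence class per return value.
function classifyAge(age: number): AgeClass {
  if (age < 0 || age > 120) return "invalid"; // assumed out-of-range rule
  return age < 18 ? "minor" : "adult";
}

// Boundary values: one case on each side of every partition edge.
const boundaryCases: Array<[number, AgeClass]> = [
  [17, "minor"],
  [18, "adult"],
  [120, "adult"],
  [121, "invalid"],
];

// Hypothetical rule for the decision table: only logged-in premium users qualify.
function canDownload(isLoggedIn: boolean, isPremium: boolean): boolean {
  return isLoggedIn && isPremium;
}

// Decision table: every combination of isLoggedIn x isPremium, with the expected result.
const decisionTable: Array<[boolean, boolean, boolean]> = [
  [false, false, false],
  [false, true, false],
  [true, false, false],
  [true, true, true],
];

// Collect mismatches instead of stopping at the first one.
const failedBoundaryCases = boundaryCases.filter(
  ([age, expected]) => classifyAge(age) !== expected,
);
const failedDecisionRows = decisionTable.filter(
  ([isLoggedIn, isPremium, expected]) => canDownload(isLoggedIn, isPremium) !== expected,
);
```

In Vitest or Jest, each table would typically be handed to `test.each` so that every row is reported as its own test.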
Every test must verify a specific outcome. Weak assertions alone are forbidden.
| Category | Matchers | When acceptable |
| --- | --- | --- |
| Weak (existence) | toBeTruthy, toBeDefined, toBeFalsy, toBeNull, toBeUndefined | Only with a meaningful assertion in the same test |
| Meaningful (value) | toBe, toEqual, toStrictEqual, toMatch, toContain, toThrow, toHaveLength | Always preferred |
| Meaningful (call) | toHaveBeenCalledWith, toHaveBeenCalledTimes, toHaveReturnedWith | When verifying side effects |
Bad: expect(result).toBeTruthy()
Good: expect(result).toEqual({ id: 1, name: "Alice" })
One test, one concept. If two tests exercise the same function with the same
argument pattern, merge them or parameterize with test.each.
| Rule | Threshold |
| --- | --- |
| Mock count per test | Must not exceed assertion count |
| Mock scope | External dependencies only (API, DB, file system) |
| Mock target | Never mock the module under test |
| Anti-Pattern | Problem | Instead |
| --------------------------- | ------------------------------------------- | ------------------------------------------ |
| Assert mock was called | Tests mock behavior, not component behavior | Assert on observable output or side effect |
| Test-only production method | Pollutes production API for test access | Extract to test utility or use public API |
| Mock before understanding | Hides real dependency behavior | Understand dependency first, then mock |
| Partial mock structure | Missing fields cause false passes | Mirror complete real API structure |
| Mock overuse | More mocks than assertions = testing wiring | Reduce mocks or add meaningful assertions |
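One possible guard against the "partial mock structure" row, sketched under assumed names (`UserResponse` and `displayLabel` are hypothetical): typing the fixture against the full interface lets the TypeScript compiler reject a stub that omits fields the real API returns.

```typescript
// Hypothetical shape of a real API response.
interface UserResponse {
  id: number;
  name: string;
  email: string;
  active: boolean;
}

// Hypothetical unit under test: reads several fields of the response.
function displayLabel(user: UserResponse): string {
  return user.active ? `${user.name} <${user.email}>` : `${user.name} (inactive)`;
}

// Annotating the fixture as UserResponse means a partial object such as
// { id: 1, name: "Alice" } no longer compiles, so a false pass caused by
// missing fields is caught before the test runs.
const aliceFixture: UserResponse = {
  id: 1,
  name: "Alice",
  email: "alice@example.com",
  active: true,
};

const activeLabel = displayLabel(aliceFixture);
const inactiveLabel = displayLabel({ ...aliceFixture, active: false });

// Meaningful assertions in Vitest or Jest would read:
//   expect(activeLabel).toBe("Alice <alice@example.com>")
//   expect(inactiveLabel).toBe("Alice (inactive)")
```

Deriving variants from one complete fixture with object spread keeps every test case structurally identical to the real response.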
See rules/development/THRESHOLDS.md for canonical values.
| Level | Pattern |
| ----- | ------------------------------------------------ |
| Suite | describe("[Target]", ...) |
| Group | describe("[Method]", ...) |
| Test | it("when [condition], should [expected]", ...) |
| Condition | Framework |
| ------------------ | --------- |
| vitest in deps | Vitest |
| jest in deps | Jest |
| bun as runtime | Bun test |
| No framework found | Vitest |
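The selection table above could be expressed as a small function. This is a sketch only: `detectFramework` and `PackageManifest` are hypothetical names, the top-to-bottom precedence of the rows is an assumption, and how "bun as runtime" is detected is not specified here, so it is passed in as a boolean.

```typescript
type Framework = "Vitest" | "Jest" | "Bun test";

// Hypothetical subset of package.json that matters for detection.
interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Assumes the table rows are checked in order, with Vitest as the fallback.
function detectFramework(manifest: PackageManifest, runtimeIsBun: boolean): Framework {
  const deps = { ...manifest.dependencies, ...manifest.devDependencies };
  if ("vitest" in deps) return "Vitest";
  if ("jest" in deps) return "Jest";
  if (runtimeIsBun) return "Bun test";
  return "Vitest"; // no framework found
}

// Example: a project that lists jest as a dev dependency resolves to Jest.
const detected = detectFramework({ devDependencies: { jest: "^29.0.0" } }, false);
```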
| Topic | File |
| -------------- | --------------------------------------------------------- |
| Feature-driven | ${CLAUDE_SKILL_DIR}/references/feature-driven.md |
| Bug-driven | ${CLAUDE_SKILL_DIR}/references/bug-driven.md |
| Flaky tests | ${CLAUDE_SKILL_DIR}/references/flaky-test-management.md |