open-weight/skills/tdd/SKILL.md
Use when writing any code — functions, modules, APIs, UI components, scripts, or any other implementation. Use when asked to "implement", "build", "write", "add", "create", or "refactor" anything that involves code. Also covers TypeScript/Vitest testing patterns, factories, mocking, and integration tests with testcontainers.
Write the test first. Run it. Watch it fail. Then write the code.
Step 1 — Write the test file only. The implementation file must not exist. Write tests that describe the required behavior from the outside.
Step 2 — Run the tests and show the failure output. Do not proceed until you have run the test command and shown the failure.
Step 3 — Write the minimum implementation to make the tests pass.
Step 4 — Run the tests again and show them passing.
Step 5 — Refactor if needed, keeping tests green.
Writing tests and implementation in the same step is not TDD.
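As a compressed illustration of the cycle, the sketch below keeps the spec and the implementation in one file so that it runs on its own. In a real project the spec would live in its own test file and fail first because the module does not exist yet. `slugify`, `slugifySpec`, `expectEqual`, and `runSpec` are hypothetical names, and plain thrown errors stand in for Vitest assertions.

```typescript
// Hypothetical unit under test: turns a title into a URL slug.
type Slugify = (title: string) => string

// Minimal stand-in for an assertion library (Vitest's expect in a real project).
function expectEqual(actual: string, expected: string): void {
  if (actual !== expected) {
    throw new Error(`expected "${expected}", got "${actual}"`)
  }
}

// Step 1: describe the behavior from the outside, before any implementation.
function slugifySpec(slugify: Slugify): void {
  expectEqual(slugify('Hello World'), 'hello-world')
  expectEqual(slugify('  Trim Me  '), 'trim-me')
}

// Runs the spec and reports the outcome instead of crashing the script.
function runSpec(slugify: Slugify): string {
  try {
    slugifySpec(slugify)
    return 'pass'
  } catch (err) {
    return `fail: ${(err as Error).message}`
  }
}

// Step 2 (red): run the spec against a placeholder and show the failure.
const red = runSpec(() => {
  throw new Error('slugify is not implemented')
})

// Step 3 (green): the minimum implementation that satisfies the spec.
const slugify: Slugify = (title) =>
  title.trim().toLowerCase().split(/\s+/).join('-')

// Step 4: run the spec again and show it passing.
const green = runSpec(slugify)

console.log(red)
console.log(green)
```

The spec is written against the `Slugify` type only, so it cannot be shaped by how the implementation happens to work.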
Run every test that CI will run — locally, before reporting back. No test suite is "CI only." This includes unit tests (npx vitest run), integration tests, type checking, and linting (pnpm lint or npm run lint). If the project has e2e tests, run them too. Zero errors required across all of them.
Once everything is clean, invoke @reviewer. It will run git diff main...HEAD to determine what changed. If it returns "fail", resolve all critical and major issues and re-invoke before continuing. Do not report back until the reviewer returns "pass" or "pass_with_issues" with no critical or major issues.
Step 1 — Write a regression test that exposes the bug. The test fails with an assertion error (wrong output), not "module not found".
Step 2 — Run and show the failure.
Step 3 — Fix the code.
Step 4 — Run and show all tests passing.
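The distinction in Step 1 matters: a regression test should reach the buggy code and fail on its output. A minimal sketch, assuming a hypothetical `isOverdue` function with a boundary bug (both versions are shown side by side only so the snippet runs standalone):

```typescript
// Hypothetical buggy implementation: treats an invoice due right now as overdue.
function isOverdueBuggy(dueDate: Date, now: Date): boolean {
  return now.getTime() >= dueDate.getTime()
}

// Fixed implementation: overdue only once the due moment has passed.
function isOverdueFixed(dueDate: Date, now: Date): boolean {
  return now.getTime() > dueDate.getTime()
}

// The regression test pins the exact input that exposed the bug.
function regressionTest(isOverdue: (due: Date, now: Date) => boolean): string {
  const due = new Date('2024-03-01T00:00:00Z')
  const actual = isOverdue(due, due)
  // The failure is about wrong output: the code ran and returned the wrong value.
  return actual === false ? 'pass' : `fail: expected false, received ${actual}`
}

const beforeFix = regressionTest(isOverdueBuggy)
const afterFix = regressionTest(isOverdueFixed)

console.log(beforeFix)
console.log(afterFix)
```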
Step 1 — Write tests for the new feature only. Existing code stays untouched.
Step 2 — Run: existing pass, new tests fail.
Step 3 — Add minimum implementation.
Step 4 — Run all tests: all pass.
The rule: if you create it, you test it. A new class extracted from an existing function is new code. It doesn't matter that the logic existed before — the unit is new.
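For example, if a pricing function contains a shipping rule that is then extracted into its own class, that class gets its own tests even though the arithmetic is old. A sketch with hypothetical names (`ShippingCalculator`, `orderTotal`, `check`) and made-up rates:

```typescript
// Newly extracted unit (hypothetical): this rule used to live inline in orderTotal.
class ShippingCalculator {
  // Flat rate below the threshold, free shipping at or above it.
  cost(subtotal: number): number {
    return subtotal >= 50 ? 0 : 5
  }
}

// Existing function, now delegating to the extracted class.
function orderTotal(subtotal: number, shipping = new ShippingCalculator()): number {
  return subtotal + shipping.cost(subtotal)
}

// Tiny assertion helper standing in for Vitest's expect.
function check(name: string, actual: number, expected: number): void {
  if (actual !== expected) {
    throw new Error(`${name}: expected ${expected}, received ${actual}`)
  }
}

// Tests that target the new unit directly.
const calculator = new ShippingCalculator()
check('charges the flat rate below the threshold', calculator.cost(49), 5)
check('ships free at the threshold', calculator.cost(50), 0)

// The pre-existing behavior stays covered by the pre-existing test.
check('order total includes shipping', orderTotal(20), 25)
```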
"Existing tests pass" is Step 3. It is not Step 6.
fetch, TanStack Query hooks) → integration tests with MSW. Mock nothing at the code level; MSW intercepts the network.
Do not mock Prisma in unit tests. If the code calls Prisma, it belongs in a repository with an integration test.
Do not mock fetch or stub HTTP clients with vi.fn() — if the code makes HTTP requests, use MSW to intercept them at the network level.
Install: npm install --save-dev @testcontainers/postgresql testcontainers
```typescript
import { PostgreSqlContainer, StartedPostgreSqlContainer } from '@testcontainers/postgresql'
import { PrismaClient } from '@prisma/client'
import { execSync } from 'child_process'

let container: StartedPostgreSqlContainer
let prisma: PrismaClient

beforeAll(async () => {
  // Start a disposable Postgres instance for this test file.
  container = await new PostgreSqlContainer('postgres:16-alpine').start()
  const url = container.getConnectionUri()
  // Apply migrations to the container before any test runs.
  execSync('npx prisma migrate deploy', { env: { ...process.env, DATABASE_URL: url } })
  prisma = new PrismaClient({ datasources: { db: { url } } })
  await prisma.$connect()
}, 60_000)

afterAll(async () => {
  await prisma.$disconnect()
  await container.stop()
})
```
Each test runs inside an interactive transaction that is never committed:
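```typescript
import type { Prisma } from '@prisma/client'

let tx: Prisma.TransactionClient
let rollback: (err: Error) => void
let txDone: Promise<void>

beforeEach(async () => {
  await new Promise<void>((ready) => {
    txDone = prisma
      .$transaction(async (t) => {
        tx = t
        ready()
        // Hold the transaction open until afterEach rejects this promise.
        await new Promise<never>((_, reject) => {
          rollback = reject
        })
      })
      // The rejection is the rollback signal, not a test failure.
      .catch(() => {})
  })
})

afterEach(async () => {
  // Throwing inside the transaction callback makes Prisma roll it back.
  rollback(new Error('rollback'))
  await txDone
})
```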
All queries within a test must use tx, not the global prisma.
Install: npm install --save-dev msw
```typescript
import { setupServer } from 'msw/node'
import { http, HttpResponse } from 'msw'

const server = setupServer()

beforeAll(() => server.listen({ onUnhandledRequest: 'error' }))
afterEach(() => server.resetHandlers())
afterAll(() => server.close())
```
onUnhandledRequest: 'error' makes any request without a handler fail the test — no silent network leaks.
Define handlers inside each test (or beforeEach for a shared happy path). Keep handlers close to the assertions that depend on them:
```typescript
it('returns the user profile', async () => {
  server.use(
    http.get('https://api.example.com/users/:id', ({ params }) => {
      return HttpResponse.json({
        id: params.id,
        name: 'Jane Doe',
        email: 'jane@example.com',
      })
    }),
  )

  const profile = await userService.getProfile('user-1')

  expect(profile).toEqual({
    id: 'user-1',
    name: 'Jane Doe',
    email: 'jane@example.com',
  })
})
```
Use server.use() to override the happy path for individual tests:
```typescript
it('throws on server error', async () => {
  server.use(
    http.get('https://api.example.com/users/:id', () => {
      return new HttpResponse(null, { status: 500 })
    }),
  )

  await expect(userService.getProfile('user-1')).rejects.toThrow('Server error')
})

it('handles network failure', async () => {
  server.use(
    http.get('https://api.example.com/users/:id', () => {
      return HttpResponse.error()
    }),
  )

  await expect(userService.getProfile('user-1')).rejects.toThrow()
})
```
setupServer() per test file. Do not share server instances across files.
onUnhandledRequest: 'error' is non-negotiable. Silent passthrough hides real bugs.
handlers.ts for a large API surface where every test uses the same happy path, but per-test overrides via server.use() still go in the test.
HttpResponse.json(), HttpResponse.text(), or new HttpResponse(); never return plain objects.
Every domain type has a factory in test_utils/factories/. Never define factory functions inside a test file. Always use randomUUID() for IDs.
BaseFactory:
```typescript
export abstract class BaseFactory<T> {
  abstract build(overrides?: Partial<T>): T

  buildList(count: number, overrides?: Partial<T>): T[] {
    return Array.from({ length: count }, () => this.build(overrides))
  }
}
```
Domain factory:
```typescript
import { randomUUID } from 'crypto'

class UserFactory extends BaseFactory<User> {
  build(overrides: Partial<User> = {}): User {
    return {
      id: randomUUID(),
      name: 'Test User',
      email: `test-${randomUUID()}@example.com`,
      isAdmin: false,
      tier: 'free',
      ...overrides,
    }
  }

  // Named variant: callers can still override any field.
  admin(overrides: Partial<User> = {}): User {
    return this.build({ isAdmin: true, ...overrides })
  }
}

export const userFactory = new UserFactory()
```
For integration tests, use a thin create helper that inserts via tx:
```typescript
async function createUser(overrides: Partial<User> = {}) {
  return tx.user.create({ data: userFactory.build(overrides) })
}
```
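To show how overrides and `buildList` behave from a test's point of view, here is a self-contained sketch that applies the same pattern to a hypothetical `Order` type. The `Order` shape, `OrderFactory`, and `canCancel` are illustrative assumptions and not part of this skill; `BaseFactory` is repeated only so the snippet runs on its own.

```typescript
import { randomUUID } from 'node:crypto'

// Same base class as in the BaseFactory section, repeated so this sketch is standalone.
abstract class BaseFactory<T> {
  abstract build(overrides?: Partial<T>): T
  buildList(count: number, overrides?: Partial<T>): T[] {
    return Array.from({ length: count }, () => this.build(overrides))
  }
}

// Hypothetical domain type and factory.
interface Order {
  id: string
  status: 'pending' | 'shipped'
  totalCents: number
}

class OrderFactory extends BaseFactory<Order> {
  build(overrides: Partial<Order> = {}): Order {
    return { id: randomUUID(), status: 'pending', totalCents: 1000, ...overrides }
  }
  shipped(overrides: Partial<Order> = {}): Order {
    return this.build({ status: 'shipped', ...overrides })
  }
}
const orderFactory = new OrderFactory()

// Hypothetical function under test.
function canCancel(order: Order): boolean {
  return order.status !== 'shipped'
}

// Each test states only the fields it cares about; the factory fills in the rest.
const shippedOrder = orderFactory.shipped()
const pendingOrders = orderFactory.buildList(3, { totalCents: 250 })
const uniqueIds = new Set(pendingOrders.map((order) => order.id))

console.log(canCancel(shippedOrder))
console.log(uniqueIds.size)
```

Because `buildList` calls `build` once per item, every order gets a fresh `randomUUID()` while sharing the overridden fields.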
it. Multiple assertions OK if same logical outcome.
returns false when order is shipped.
Mock at module boundaries only: external services, database clients, filesystem. Prefer dependency injection over vi.mock. Create vi.fn() mocks inside each it block. For HTTP APIs, use MSW instead of vi.fn(); see the MSW section above.
Always await async calls. Never use done callbacks.
```typescript
it.each([
  ['free', 100, 100],
  ['pro', 100, 90],
  ['enterprise', 100, 80],
] as const)('applies correct discount for %s tier', (tier, input, expected) => {
  const user = userFactory.build({ tier })
  expect(applyDiscount(input, user)).toBe(expected)
})
```
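For reference, here is one implementation that would satisfy the table above. The skill does not define `applyDiscount`, so the discount rates are read off the expected values, and the minimal `User` shape and the `DISCOUNT_PERCENT` lookup are assumptions for illustration. The cases run as a plain loop so the sketch needs no test runner.

```typescript
type Tier = 'free' | 'pro' | 'enterprise'

// Hypothetical minimal user shape; the real User type has more fields.
interface User {
  tier: Tier
}

// Discount percentage per tier, inferred from the table's expected values.
const DISCOUNT_PERCENT: Record<Tier, number> = {
  free: 0,
  pro: 10,
  enterprise: 20,
}

function applyDiscount(amount: number, user: User): number {
  // Integer percentages keep the arithmetic exact for whole-number amounts.
  return (amount * (100 - DISCOUNT_PERCENT[user.tier])) / 100
}

// The same cases as the it.each table.
const cases: ReadonlyArray<readonly [Tier, number, number]> = [
  ['free', 100, 100],
  ['pro', 100, 90],
  ['enterprise', 100, 80],
]

for (const [tier, input, expected] of cases) {
  const actual = applyDiscount(input, { tier })
  if (actual !== expected) {
    throw new Error(`${tier}: expected ${expected}, received ${actual}`)
  }
}
```

Adding a tier then means adding one row to the table and one entry to the lookup, and the row fails first.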
Use React Testing Library. Query by accessible role, label, or visible text. Never use getByTestId. Use userEvent, not fireEvent. Test all three data states: loading, error, success.
If the project has both API endpoint tests and RTL+MSW component tests, the existing layers already cover the seams e2e is meant to catch. Do not add e2e tests by default.
Before writing any e2e test, answer: does this scenario require a real browser against a real backend, and would it be caught by nothing lower in the stack?
Legitimate e2e scenarios:
Not legitimate — use RTL+MSW or endpoint tests instead:
If none of the legitimate cases apply, do not write the e2e test. Write or improve the RTL or endpoint test instead.
Do not chase numbers. Aim for tests that would catch real regressions.
This skill governs the red-green-refactor mechanics within a single test cycle. When a task involves multiple collaborating modules or services, also load the outside-in-double-loop skill, which governs the order in which you build those modules (outer test first, stub dependencies, then build each stub via its own TDD cycle).
fetch or stub an HTTP client with vi.fn() instead of using MSW
@reviewer