skills/testing-discipline/SKILL.md
Test quality standards and TDD practices. Use this skill whenever writing tests, reviewing test quality, deciding what to mock, choosing test strategy, or evaluating test coverage. Also use when you notice tests that are brittle, slow, or always passing regardless of implementation. Complements `writing-tests` (which handles test execution) with the principles that make tests actually useful.
Tests that exist but don't catch bugs are worse than no tests — they create false
confidence. This skill focuses on test quality over test quantity: TDD workflow,
test structure, mocking strategy, anti-patterns, and when coverage numbers lie.
Use alongside writing-tests for the full testing workflow.
Write the test before the implementation. This sounds simple but is frequently skipped.
| Situation | Use TDD? |
| --- | --- |
| New function with clear inputs/outputs | Yes |
| Bug fix | Yes: write the failing test that reproduces the bug first |
| Complex algorithm | Yes: build it test by test |
| UI/visual changes | Usually no: visual regression tools are better |
| Exploratory prototyping | No, but write tests before promoting to production |
| Configuration changes | No: integration test after the change |
Every test follows this pattern:
```javascript
// Arrange: set up preconditions
const user = createTestUser({ role: 'admin' });
const service = new AuthService(mockRepo);

// Act: perform the action under test
const result = service.canAccess(user, '/admin/settings');

// Assert: verify the outcome
expect(result).toBe(true);
```
- Name each test for the behavior it checks: "admin user can access admin settings", not "test auth".
- No `if`, `for`, or `switch` in test code. Tests are linear.

| Dependency | Mock? |
| --- | --- |
| External API (HTTP, database, filesystem) | Yes, always |
| Time (`Date.now`, timestamps) | Yes, for deterministic tests |
| Random values | Yes, seed or inject |
| Internal utility functions | No, test through them |
| Same-module functions | No, test the public surface |
| Configuration/environment | Depends: mock if flaky, real if fast |
```javascript
// BAD: mocking everything makes the test meaningless
const mockA = mock(ServiceA);
const mockB = mock(ServiceB);
const mockC = mock(ServiceC);
const result = handler(mockA, mockB, mockC);
// What are you even testing?
```

```javascript
// GOOD: mock only the external boundary
const mockDB = createFakeDatabase();
const service = new UserService(mockDB);
const result = service.getActiveUsers();
expect(result).toHaveLength(3);
```
Coverage is a signal, not a goal. 100% coverage with bad tests is worse than 70% coverage with good tests.
| Code type | Minimum coverage |
| --- | --- |
| Business logic | 80%+ |
| Utility functions | 90%+ |
| API handlers | 70%+ (integration tests) |
| Configuration/glue code | 50%+ |
| Generated code | 0% (don't test generated code) |
```javascript
// BAD: tests the internal method call chain
expect(service.internalHelper).toHaveBeenCalledWith(42);

// GOOD: tests the observable behavior
expect(service.calculate(42)).toBe(84);
```
```javascript
// BAD: snapshot of a complex object (breaks on any change)
expect(result).toMatchSnapshot();

// GOOD: assert on specific properties that matter
expect(result.status).toBe('active');
expect(result.items).toHaveLength(3);
```
```javascript
// BAD
it('test 1', ...)
it('should work', ...)
it('handles the case', ...)

// GOOD
it('returns 404 when user does not exist', ...)
it('retries failed payment up to 3 times', ...)
it('sends welcome email after successful registration', ...)
```
```javascript
// BAD: shared mutable state across tests
let sharedUser;
beforeEach(() => { sharedUser = createUser(); });

// GOOD: each test creates what it needs
it('deactivates user', () => {
  const user = createUser({ active: true });
  deactivate(user);
  expect(user.active).toBe(false);
});
```