skills/testing-strategy/SKILL.md
Test selection strategy and scope guidance. Triggers: 'which tests should I run', 'test tiers', 'test marks', 'slow tests', 'integration vs unit', 'cross-module regression', 'test scope', 'what should I run', 'select tests', 'test batching'. NOT for: writing tests (use test-driven-development) or fixing broken tests (use fixing-tests).
Install this skill globally with one command: `npx skillsauth add axiomantic/spellbook testing-strategy`. Works with Claude Code, Cursor, and Windsurf.
| Tier | Time | What | When |
|------|------|------|------|
| Unit | <1s each | Pure logic, no I/O, no external deps | After every change |
| Integration | 1-5s each | Real resources (DB, filesystem, network) | After completing a logical unit of work |
| E2E / Slow | >5s each | Full pipelines, large data, real services | Once per feature branch, before PR |
Targeted runs: if `src/auth/login.py` changed, run `tests/test_login.py`.

Batching: write code for task 1, run targeted tests; write code for task 2, run targeted tests; then run the full suite once at the end.
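The targeted-run and batching rules can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill: the helper names `targeted_test_path` and `batch_plan` are hypothetical, and the `tests/test_<module>.py` naming convention is an assumption generalized from the single `login.py` example.

```python
from pathlib import PurePosixPath


def targeted_test_path(changed_file: str, tests_dir: str = "tests") -> str:
    """Map a changed source file to the test file assumed to cover it.

    Assumes a flat tests/test_<module>.py layout (hypothetical convention).
    """
    stem = PurePosixPath(changed_file).stem  # "src/auth/login.py" -> "login"
    return str(PurePosixPath(tests_dir) / f"test_{stem}.py")


def batch_plan(changed_files: list[str]) -> list[str]:
    """Interleave one targeted run per task, then a single full-suite run."""
    steps = []
    for changed_file in changed_files:
        steps.append(f"edit {changed_file}")
        steps.append(f"pytest {targeted_test_path(changed_file)}")
    steps.append("pytest")  # full suite, once, at the end
    return steps
```

Projects with nested test directories or different naming would need their own mapping; the point is only that the full suite appears once, after the last task.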
Mock expensive resources in unit tests. Use smallest possible inputs. Never sleep in tests. One assertion focus per test. No fixtures heavier than the test itself.
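As one possible reading of these rules, the sketch below replaces an expensive client with a `unittest.mock.Mock`, uses a one-record input, and checks a single behavior. The function `fetch_display_name` and the `get_user` method are hypothetical names invented for the example.

```python
from unittest.mock import Mock


def fetch_display_name(client, user_id):
    """Code under test: formats a name returned by a possibly slow client."""
    record = client.get_user(user_id)
    return record["name"].strip().title()


def test_fetch_display_name_formats_name():
    # The mock stands in for a real DB or HTTP client, so nothing slow runs.
    client = Mock()
    client.get_user.return_value = {"name": "  ada lovelace "}

    # Single assertion focus: the formatting behavior.
    assert fetch_display_name(client, 1) == "Ada Lovelace"
```

The test needs no sleep, no network, and no fixture beyond the mock itself, which keeps it in the unit tier.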
Apply marks proactively when writing tests. A test that calls a GPU kernel is a GPU test even if it is fast today.
| Mark | Meaning |
|------|---------|
| slow | >5 seconds. Skip during rapid iteration. |
| gpu, hardware | Requires specific hardware. Skip on machines without it. |
| network / external | Calls external services. Skip in offline/fast modes. |
| integration | Requires multiple components working together. |
| smoke | Minimal sanity checks. Run first. |
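In pytest, marks like these are typically applied with decorators such as `@pytest.mark.slow` and selected or skipped with the `-m` option. The sketch below only builds the `-m` expression for a fast iteration run; the helper name `deselect_expression` and the choice of marks to skip are assumptions for illustration.

```python
def deselect_expression(skip_marks):
    """Build a pytest -m expression that deselects every mark in skip_marks."""
    return " and ".join(f"not {mark}" for mark in skip_marks)


# Marks to skip during rapid iteration (illustrative choice).
FAST_ITERATION = deselect_expression(["slow", "gpu", "network"])

# Intended use on the command line:
#   pytest -m smoke                                     (sanity checks first)
#   pytest -m "not slow and not gpu and not network"    (fast iteration)
```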
If a project lacks marks, infer tiers from --durations=0 (pytest) or equivalent: >5s is slow, >1s is integration, the rest are unit.
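A rough sketch of that inference follows. It assumes each line of the pytest durations report has the shape `<seconds>s <phase> <test id>`; that format, the helper names, and the decision to sum setup, call, and teardown time per test are all assumptions, so adjust the pattern to the output actually seen.

```python
import re

# Assumed line shape from `pytest --durations=0`: "<seconds>s <phase> <test id>"
DURATION_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)s\s+(setup|call|teardown)\s+(\S+)")


def tier_for(seconds: float) -> str:
    """Apply the thresholds: >5s is slow, >1s is integration, else unit."""
    if seconds > 5:
        return "slow"
    if seconds > 1:
        return "integration"
    return "unit"


def infer_tiers(report: str) -> dict[str, str]:
    """Sum the reported phases per test id, then classify each total."""
    totals: dict[str, float] = {}
    for line in report.splitlines():
        match = DURATION_LINE.match(line)
        if match:
            seconds, _phase, test_id = match.groups()
            totals[test_id] = totals.get(test_id, 0.0) + float(seconds)
    return {test_id: tier_for(total) for test_id, total in totals.items()}
```

Lines that do not match the pattern, such as report headers, are ignored.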
When the full suite fails after targeted tests passed: check failed test imports against your changed modules, then investigate shared mutable state, test ordering, or resource contention.
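The first check, comparing a failed test's imports against the changed modules, might be sketched with the standard-library `ast` module. The helper names are hypothetical, and the sketch only sees direct imports written with absolute module paths; transitive imports are not followed.

```python
import ast


def imported_modules(test_source: str) -> set[str]:
    """Collect module names imported directly by a test file's source."""
    modules = set()
    for node in ast.walk(ast.parse(test_source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


def touches_changed(test_source: str, changed_modules: set[str]) -> set[str]:
    """Return the changed modules that the failing test imports directly."""
    hits = set()
    for imported in imported_modules(test_source):
        for changed in changed_modules:
            same = imported == changed
            child = imported.startswith(changed + ".")
            parent = changed.startswith(imported + ".")
            if same or child or parent:
                hits.add(changed)
    return hits
```

An empty result suggests the failure is not a direct regression from the change, which is the cue to move on to shared mutable state, test ordering, or resource contention.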