plugins/backend-toolkit/skills/test-strategy/SKILL.md
Backend testing pyramid — unit for pure logic, integration against a real DB (Testcontainers), and consumer-driven contract testing (Pact) for service boundaries. Use before a feature, after a bug fix, or when services break each other on deploy. Not for load testing (use performance-profiling) or security testing (use backend-security-audit).
npx skillsauth add jaykim88/claude-ai-engineering test-strategyInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Test the backend where bugs actually live — integration against a real database, contract tests at service boundaries — not a wall of mocked unit tests that pass while production breaks.
Universal — the backend testing pyramid, real-DB integration testing, and consumer-driven contract testing are principles; the runner/container/Pact-binding differs by language.
Unit tests — pure logic only
Integration tests — against a REAL database (the bulk for backends)
Contract testing (Pact) — at service boundaries (the GAP)
E2E — critical journeys only
Set coverage thresholds by criticality
5b. Assert behavior, not implementation
5c. Keep tests deterministic (flake kills suites)
Date.now() / clocks via fake timers; never sleep() to "wait for" somethingMath.random / crypto), wallclock, and the network (mock outbound or use Testcontainers for downstream too)| ❌ Anti-pattern | ✅ Correct |
|---|---|
| Mocking the database in tests | Real Postgres via Testcontainers |
| All unit tests, no integration | Integration-heavy for backends (bugs live in I/O) |
| Deploy-the-world to check service compat | Consumer-driven contract tests (Pact) |
| Many flaky E2E tests | Few critical-journey E2E; integration for the rest |
| Uniform coverage target | Higher bar on money/auth/data-integrity paths |
| Asserting "function X was called with args Y" | Assert observable behavior (return value, DB row, emitted event) |
| Real time / sleep(n) / random clocks | Fake timers + seeded random + controlled DB state |
| Tests passing alone but failing in suite | No shared mutable state; pass in any order |
| Tier | Examples | Action SLA | |---|---|---| | Critical | 0% coverage on payment/auth/data-integrity logic; no integration test against real DB; services with no contract tests breaking each other | Fix immediately | | Major | DB mocked in integration tests; business logic < 50% branch coverage | Fix this sprint | | Minor | Missing contract test on a low-risk internal boundary; flaky E2E | Schedule within 2 sprints |
test(<scope>): integration tests for <feature> / test(contract): Pact for <consumer>↔<provider>@testcontainers/postgresql) → real Postgres; run migrations; test against it@pact-foundation/pact) — consumer tests generate pacts; provider verification in its pipeline--coverage with threshold)testcontainers-python; pact-python for contractstesting + testcontainers-go; pact-goschema-design — integration tests run against the real schemaapi-contract — contract tests verify the API contract holdscicd-pipeline — tests + contract verification gate the pipelinedevelopment
Design webhooks correctly on both sides — sending (HMAC signing, retries with backoff, at-least-once) and receiving (verify signature on raw body, enqueue + 200 fast, dedupe on event id). Use when adding webhook delivery or consuming a provider's webhooks. Not for internal service-to-service events (use async-messaging) or general outbound-call retry policy (use resilience-patterns).
testing
Use transactions and isolation levels correctly — keep them short, no network calls inside, explicit isolation, retry on serialization conflicts, and choose optimistic vs pessimistic locking. Use when a write spans multiple tables, when concurrent updates corrupt data, or when designing money/inventory flows. Not for cross-service event delivery (use async-messaging Outbox) or schema-level constraints (use schema-design).
data-ai
Design a relational schema — normalize to 3NF then denormalize with justification, choose the right Postgres index type per data shape, enforce constraints at the DB. Use when modeling a new domain, when queries are slow, or before a migration. Not for diagnosing slow queries (use query-optimization) or shipping the change without downtime (use migration-strategy).
development
Apply reliability primitives — capped exponential backoff with jitter, circuit breakers, timeouts, and idempotency keys — to every outbound call and mutating endpoint. Use when integrating an external service, when retries cause duplicate effects, or before shipping a payment/order flow. Not for job-runner retry config specifically (use background-jobs) or webhook-delivery specifics (use webhook-design, which reuses these primitives).