plugins/python-development/skills/python-tdd/SKILL.md
Generate focused, behavior-driven Python tests using TDD methodology with pytest. TRIGGER WHEN: writing Python tests, improving coverage, reviewing test quality, or practicing red-green-refactor workflows with pytest. DO NOT TRIGGER WHEN: the target is non-Python (use testing:tdd for language-agnostic TDD or testing:e2e-testing-patterns for browser E2E).
npx skillsauth add acaprino/anvil-toolset python-tddInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Generate focused, behavior-driven tests with pytest. Prioritize observable behavior over implementation details. For framework configuration (pyproject.toml, plugins, CI), see references/framework-config.md. For deep conftest patterns and heavy-dep mocking, see references/pytest-infrastructure.md. For full TDD discipline, see references/tdd-best-practices.md.
DO mock: External API calls / HTTP requests, database queries in unit tests, file system operations, time-dependent behavior (datetime.now, time.time), environment variables.
DON'T mock: Your own models and dataclasses, simple utility functions, the function under test itself, framework features you rely on.
| Priority | What | Target | |----------|------|--------| | P0 | Critical paths (auth, payments, data integrity) | 100% line + branch | | P1 | Core business logic + public API | 90%+ line | | P2 | Utilities, helpers, config | 80%+ line |
Overall: 80%+ line coverage; 100% on critical paths.
Class-based grouping -- nest by feature, then scenario:
class TestUserService:
class TestCreateUser:
def test_should_create_when_valid_data(self): ...
def test_should_raise_when_email_exists(self): ...
Flat function naming -- test_<action>_should_<outcome>_when_<condition>:
def test_create_user_should_succeed_when_valid_data(): ...
def test_login_should_fail_when_password_expired(): ...
def test_transfer_should_debit_sender_when_sufficient_funds():
# Arrange
sender = make_account(balance=100)
receiver = make_account(balance=50)
# Act
transfer(sender, receiver, amount=30)
# Assert
assert sender.balance == 70
assert receiver.balance == 80
Prefer factory fixtures in conftest.py over raw dicts:
# conftest.py
@pytest.fixture
def make_user():
def _make_user(name="Test User", email="[email protected]", active=True):
return User(name=name, email=email, active=active)
return _make_user
def test_deactivate_user(make_user):
user = make_user(active=True)
user.deactivate()
assert user.active is False
@pytest.fixture
def db_session():
session = SessionLocal()
yield session
session.rollback()
session.close()
@pytest.fixture(scope="module") # or scope="session" for engine
def api_client(): yield TestClient(app)
Share via conftest.py -- pytest auto-discovers.
@pytest.mark.parametrize("input_val,expected", [
pytest.param("[email protected]", True, id="valid-email"),
pytest.param("no-at-sign.com", False, id="missing-at"),
pytest.param("", False, id="empty-string"),
])
def test_is_valid_email(input_val, expected):
assert is_valid_email(input_val) == expected
@patch("myapp.services.requests.get") # patch where USED (top-level import)
def test_fetch_user(mock_get):
mock_get.return_value.json.return_value = {"id": 1}
result = fetch_user(1)
mock_get.assert_called_once_with("https://api.example.com/users/1")
@patch("myapp.services.UserRepository", autospec=True) # autospec catches signature drift
def test_repo_call(MockRepo): ...
@patch("myapp.client.requests.get", side_effect=ConnectionError("timeout"))
def test_network_error(mock_get):
with pytest.raises(ServiceUnavailableError):
fetch_data("https://api.example.com/data")
When a function imports inside another function (lazy import), it is NOT a module-level attribute at the usage site. Patch at the definition site, not the usage site.
# WRONG -- AttributeError: module has no attribute 'get_db'
monkeypatch.setattr("myapp.services.processor.get_db", mock_db)
# CORRECT -- patch where get_db is DEFINED
monkeypatch.setattr("myapp.database.get_db", mock_db)
Rule: if monkeypatch.setattr raises AttributeError, check whether the target is a lazy import. If so, patch the module where the function is defined.
@pytest.mark.asyncio
async def test_async_fetch():
result = await async_fetch("https://api.example.com")
assert result["status"] == "ok"
Set asyncio_mode = "auto" in pyproject.toml to skip the @pytest.mark.asyncio decorator.
from hypothesis import given, strategies as st
@given(st.integers(min_value=0, max_value=1000))
def test_deposit_increases_balance(amount):
account = Account(balance=0)
account.deposit(amount)
assert account.balance == amount
def test_export(tmp_path): # auto-cleaned
output = tmp_path / "report.csv"
export_report(output, [{"name": "Alice"}])
assert "Alice" in output.read_text()
def test_db_url(monkeypatch):
monkeypatch.setenv("DATABASE_URL", "postgresql://localhost/test")
assert get_database_url() == "postgresql://localhost/test"
| Anti-pattern | Why it's bad | Do this instead |
|--------------|--------------|-----------------|
| assert obj is not None alone | Asserts nothing about behavior | Assert on a specific attribute or return value |
| Mocking the function under test | Tests nothing real | Mock its dependencies instead |
| 40+ tests for a simple module | Sign of over-testing or bloated module | Split module or consolidate parametrized tests |
| Testing framework internals | Validates pytest/SQLAlchemy, not your code | Test your logic through public API |
| Copy-pasting mock setup in every test | Fragile, hard to maintain | Extract into fixtures or factory functions |
| Testing private methods directly | Couples tests to implementation | Test through the public interface |
| Catching exceptions inside test code | Swallows real failures silently | Use pytest.raises as context manager |
| No assertions in test body | Test always passes, proves nothing | Every test must assert something |
| Asserting on mock.called only | Doesn't verify correct arguments | Use assert_called_once_with(expected_args) |
| Hardcoded golden values (== 660) | Breaks when algorithm improves, not when behavior is wrong | Assert invariants, use pytest.approx, derive from inputs |
| Heavy mocks in sub-directory conftest | Root tests load real deps first; mocks too late | Place ALL heavy dependency mocks in root tests/conftest.py |
| Missing markers on heavy-dep tests | Tests break silently when deps are mocked by default | Mark @pytest.mark.slow, use --strict-markers |
| Incomplete external service mocking | One unmocked service hangs CI (e.g., google.auth.default()) | Audit ALL external calls before finalizing integration conftest |
| Patching lazy import at use site | Function not bound at module level | Patch at definition site (see Core Patterns above) |
Root tests/conftest.py runs FIRST. Sub-directory conftest files run only when their tests are collected.
Heavy mocks go in root conftest. If you mock ortools, scipy, prometheus_client, or any large native dependency, do it in root tests/conftest.py. Sub-directory conftest mocks are too late -- collection-time imports already loaded the real module.
# tests/conftest.py (ROOT -- runs first, before any test collection)
import sys
from unittest.mock import MagicMock
for _mod in ("ortools", "ortools.sat.python.cp_model",
"scipy", "scipy.optimize", "prometheus_client"):
if _mod not in sys.modules:
sys.modules[_mod] = MagicMock()
Every external service the app uses must have a mock in the integration conftest:
| Service | What to mock | Why |
|---------|--------------|-----|
| Database | connection / session | Real DB not in CI |
| Auth | token verification | No auth server in tests |
| Cloud storage | upload / download | google.auth.default() hangs |
| Email | send functions | Sends real emails |
| Payment | charge / refund | Hits real API |
Audit: grep production code for external service imports; verify each has a mock.
Tests requiring real heavy dependencies (scipy.optimize, real DB, ML models) MUST be marked:
pytestmark = pytest.mark.slow # module-level
@pytest.mark.slow # per-test or per-class
class TestPortionSolver: ...
Default addopts in pyproject.toml: -m "not slow and not e2e".
For full conftest+sys.modules race + environment-safety patterns, see references/pytest-infrastructure.md.
references/tdd-best-practices.md -- TDD discipline, red-green-refactor depth, coverage strategiesreferences/framework-config.md -- pytest config, plugins, CI/CDreferences/pytest-infrastructure.md -- conftest ordering, heavy-dep mocking, mock target decision tree, external service auditdevelopment
Unified web frontend knowledge base covering CSS architecture, UX psychology, UI components, distinctive aesthetics, and interface design generation. TRIGGER WHEN: working on web styling, design systems, component decisions, responsive strategy, distinctive frontend aesthetics, or exploring multiple interface designs. DO NOT TRIGGER WHEN: the task is purely backend or unrelated to web frontend.
development
Coordinate parallel code reviews across multiple quality dimensions with finding deduplication, severity calibration, and consolidated reporting. Use this skill when organizing multi-reviewer code reviews, calibrating finding severity, or consolidating review results.
tools
Knowledge base for the codebase-mapper plugin. Provides writing guidelines, tone rules, and diagram conventions for generating human-readable project guides. Referenced by all codebase-mapper agents during document generation. TRIGGER WHEN: referenced by codebase-mapper pipeline agents (codebase-explorer, overview-writer, tech-writer, flow-writer, onboarding-writer, ops-writer, config-writer, guide-reviewer) during document generation. DO NOT TRIGGER WHEN: outside the /map-codebase pipeline (general documentation work should use docs:readme-craft or codebase-mapper:docs-create).
tools
Progressive Web App knowledge base for 2025-2026: Web App Manifest, Service Workers (Workbox 7, Serwist), Web Push (VAPID, RFC 8030/8291/8292, Declarative Push for Safari 18.4+), install flows (beforeinstallprompt, Window Controls Overlay), OPFS storage, Project Fugu, Core Web Vitals (INP < 200ms), security (HTTPS, CSP, COOP/COEP), and distribution (Bubblewrap, PWA Builder MSIX, Capacitor). TRIGGER WHEN: building, auditing, or debugging PWAs, including manifest, service worker, Web Push, install flow, OPFS, Background Sync, Wake Lock, vite-plugin-pwa, Next.js Serwist, @angular/pwa, @vite-pwa/nuxt, Bubblewrap, TWA, PWA Builder, or Capacitor wrapping. DO NOT TRIGGER WHEN: the task is generic frontend styling (use frontend), React performance (use react-development:review-react), cross-platform security unrelated to PWA (use platform-engineering), Tauri or Electron wrappers (use tauri-development), or GA4 / analytics (use digital-marketing).