skills/testing-python/SKILL.md
ALWAYS LOAD THIS SKILL WHEN WRITING TESTS, ADDING FIXTURES, OR SETTING UP PYTEST. Do not write Python tests directly — use this skill first. Python testing with pytest: philosophy, fixtures, mock servers, containerized testing.
Tests prove features work. Coverage is secondary. E2e tests beat unit tests. Real beats mocked.
When writing new tests, plan before coding:
docs/plans/test-cases-<feature>.md

tests/
├── unit/ # Pure function tests
├── integration/ # CLI tests, component interaction
├── fixtures/ # Shared test data and helpers
└── conftest.py # Shared fixtures
[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]
asyncio_mode = "auto"
addopts = ["-n", "auto", "--dist", "worksteal"]
markers = [
"unit: unit tests",
"integration: integration tests",
]
[dependency-groups]
dev = [
"pytest>=9.0.1",
"pytest-xdist>=3.5.0",
"pytest-cov>=7.0.0",
"pytest-asyncio>=1.3.0",
# "pytest-qt>=4.5.0", # For Qt apps
# "pytest-httpserver>=1.1.0", # For HTTP mocking
]
import os
import subprocess
from pathlib import Path


def test_list_profiles_empty(tmp_path: Path) -> None:
    # Use a fresh data directory so profiles from other tests cannot leak in.
    env = {**os.environ, "APP_DATA_DIR": str(tmp_path)}
    result = subprocess.run(
        ["uv", "run", "poe", "app", "list"],
        capture_output=True, text=True, env=env,
    )
    assert result.returncode == 0
    assert "No profiles found" in result.stdout


def test_create_and_list_profile(tmp_path: Path) -> None:
    env = {**os.environ, "APP_DATA_DIR": str(tmp_path)}
    subprocess.run(
        ["uv", "run", "poe", "app", "create", "test-profile"],
        env=env, check=True,
    )
    result = subprocess.run(
        ["uv", "run", "poe", "app", "list"],
        capture_output=True, text=True, env=env,
    )
    assert "test-profile" in result.stdout
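The CLI tests above repeat the same `subprocess.run` call. A minimal sketch of a helper that runs a command against an isolated data directory follows. `run_cli` and `echo_data_dir` are hypothetical names, `APP_DATA_DIR` is taken from the test above, and the demo uses the Python interpreter as a stand-in for the real CLI command.

```python
import os
import subprocess
import sys
from pathlib import Path


def run_cli(args: list[str], data_dir: Path) -> subprocess.CompletedProcess[str]:
    """Run a command with APP_DATA_DIR pointing at an isolated directory.

    Hypothetical helper: in the project, `args` would be something like
    ["uv", "run", "poe", "app", "list"].
    """
    env = {**os.environ, "APP_DATA_DIR": str(data_dir)}
    return subprocess.run(args, capture_output=True, text=True, env=env)


def echo_data_dir(data_dir: Path) -> subprocess.CompletedProcess[str]:
    """Stand-in for a real CLI call: a child process that prints APP_DATA_DIR."""
    code = "import os; print(os.environ['APP_DATA_DIR'])"
    return run_cli([sys.executable, "-c", code], data_dir)
```

In a test, `run_cli(["uv", "run", "poe", "app", "list"], tmp_path)` would replace the inline call, and the assertions on `returncode` and `stdout` stay in the test body.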
def test_load_config_missing_file() -> None:
    result = load_config(Path("/nonexistent"))
    assert result.is_err
    assert "not found" in result.unwrap_err()


def test_load_config_valid() -> None:
    result = load_config(Path("tests/fixtures/valid_config.yaml"))
    assert result.is_ok
    config = result.unwrap()
    assert config.name == "test"
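The assertions above read `is_ok` and `is_err` as properties. That is an assumption about the project's Result type: where they are methods instead, `assert result.is_ok` checks a bound method object, which is always truthy, so the test passes regardless of the outcome. A minimal sketch of the pitfall, using a hypothetical `MethodResult` class:

```python
from dataclasses import dataclass


@dataclass
class MethodResult:
    """Hypothetical Result type whose is_ok is a method, not a property."""

    error: str = ""

    def is_ok(self) -> bool:
        return not self.error


failed = MethodResult(error="not found")

# Bug: this evaluates the bound method object itself, which is always truthy,
# so `assert failed.is_ok` would pass even though the result is an error.
method_is_truthy = bool(failed.is_ok)

# Correct: call the method to get the actual outcome.
actual_outcome = failed.is_ok()
```

Checking how the Result type in use defines these members before copying the assertion style is a cheap safeguard.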
@pytest.mark.asyncio
async def test_fetch_data() -> None:
    result = await fetch_data("https://httpbin.org/get")
    assert result.is_ok
@pytest.fixture
def app_data_dir(tmp_path: Path) -> Path:
    data_dir = tmp_path / "data"
    data_dir.mkdir()
    return data_dir


@pytest.fixture
def isolated_env(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> tuple[Path, Path]:
    config = tmp_path / "config"
    data = tmp_path / "data"
    config.mkdir()
    data.mkdir()
    monkeypatch.setenv("XDG_CONFIG_HOME", str(config))
    monkeypatch.setenv("XDG_DATA_HOME", str(data))
    return config, data


@pytest.fixture
def sample_audio_16khz() -> np.ndarray:
    return np.zeros(16000, dtype=np.float32)  # 1 second of silence
@pytest.fixture
def mock_api(httpserver: HTTPServer) -> HTTPServer:
    httpserver.expect_request("/api/data").respond_with_json({"status": "ok"})
    return httpserver


def test_fetch_from_api(mock_api: HTTPServer) -> None:
    result = fetch_data(mock_api.url_for("/api/data"))
    assert result.is_ok
Tests run in parallel by default (-n auto via addopts). Override with -n0 (sequential) or -n4 (exact count).
uv run poe test # All tests (parallel, auto workers)
uv run pytest tests/unit/ # Unit only
uv run pytest -n0 # Force sequential (debugging)
uv run pytest --cov # With coverage report
Every test must set up its own state and clean up after itself. Use tmp_path for files, monkeypatch for env vars, yield fixtures for teardown. Never rely on test ordering or shared mutable state. For heavy setup (containers, DB), isolate between groups — scope fixtures to session/module and use non-overlapping namespaces.
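A yield fixture puts setup before the `yield` and teardown after it. The sketch below shows the same shape with `contextlib.contextmanager` so that it runs without pytest; swapping the decorator for `@pytest.fixture` would turn it into a fixture. The `scratch_workspace` name and the use of `APP_DATA_DIR` are illustrative assumptions.

```python
import os
import shutil
import tempfile
from collections.abc import Iterator
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def scratch_workspace() -> Iterator[Path]:
    """Create a private directory and env var, then restore everything."""
    previous = os.environ.get("APP_DATA_DIR")
    workspace = Path(tempfile.mkdtemp(prefix="test-workspace-"))
    os.environ["APP_DATA_DIR"] = str(workspace)
    try:
        yield workspace  # the test body runs here
    finally:
        # Teardown: restore the environment and delete the directory,
        # even if the body raised.
        if previous is None:
            os.environ.pop("APP_DATA_DIR", None)
        else:
            os.environ["APP_DATA_DIR"] = previous
        shutil.rmtree(workspace, ignore_errors=True)


def demo() -> tuple[bool, bool]:
    """Return (directory existed during the body, directory exists afterwards)."""
    with scratch_workspace() as workspace:
        (workspace / "state.json").write_text("{}")
        existed = workspace.exists()
    return existed, workspace.exists()
```

In real tests, `tmp_path` and `monkeypatch` already handle this cleanup; a hand-written yield fixture is mainly useful for resources pytest does not manage, such as processes or connections.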
Tests that pass alone but fail in parallel are broken tests — fix isolation, don't disable parallelism.
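One way to keep parallel workers from colliding on a shared resource is to derive resource names from the worker id. pytest-xdist exposes the id in the `PYTEST_XDIST_WORKER` environment variable (for example `gw0`). The `worker_namespace` helper and the `main` fallback below are assumptions for illustration.

```python
import os


def worker_namespace(base: str) -> str:
    """Return a resource name unique to the current pytest-xdist worker.

    Falls back to "main" when the variable is unset, i.e. when tests
    run sequentially (the fallback name is an arbitrary choice).
    """
    worker = os.environ.get("PYTEST_XDIST_WORKER", "main")
    return f"{base}_{worker}"


def demo() -> list[str]:
    """Simulate two workers asking for a database name."""
    names = []
    original = os.environ.get("PYTEST_XDIST_WORKER")
    try:
        for worker in ("gw0", "gw1"):
            os.environ["PYTEST_XDIST_WORKER"] = worker
            names.append(worker_namespace("testdb"))
    finally:
        # Restore the caller's environment.
        if original is None:
            os.environ.pop("PYTEST_XDIST_WORKER", None)
        else:
            os.environ["PYTEST_XDIST_WORKER"] = original
    return names
```

A session-scoped database fixture could call `worker_namespace("testdb")` to pick its schema or database name, so each worker gets its own.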
A flaky test is worse than a broken one — broken tests block immediately, flaky tests erode trust silently. Never ignore a flaky test. Fix it, rewrite it, or if the root cause is complex — file a bug and report to the user. No other options.
Not targets to chase, but sanity checks:
| Area | Guideline |
|------|-----------|
| Core business logic | >70% |
| CLI commands | >70% |
| UI components | >40% |
| Utilities | As needed |
If coverage is low but e2e tests cover the workflows, that's fine.
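Since the guidelines are sanity checks rather than gates, they can be reported as warnings. A minimal sketch, assuming per-area percentages have already been extracted from a coverage report: the `GUIDELINES` values mirror the table, while the area keys and the `coverage_warnings` function are hypothetical.

```python
# Thresholds copied from the table; "Utilities" has no numeric guideline.
GUIDELINES: dict[str, float] = {
    "core": 70.0,
    "cli": 70.0,
    "ui": 40.0,
}


def coverage_warnings(measured: dict[str, float]) -> list[str]:
    """Return one warning per area whose coverage is not above its guideline."""
    warnings = []
    for area, minimum in GUIDELINES.items():
        percent = measured.get(area)
        if percent is not None and percent <= minimum:
            warnings.append(f"{area}: {percent:.0f}% (guideline >{minimum:.0f}%)")
    return warnings
```

A warning here is a prompt to check whether e2e tests already cover the workflow. It is not a reason to fail the build.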
After all tests are written and passing, dispatch a separate sub-agent to validate test quality. The validation agent must check:
No # type: ignore to silence test failures, no overly broad exception catching, no tests that pass regardless of input.

This step is mandatory before submitting work as complete.
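To illustrate the checks on broad exception catching and on tests that pass regardless of input, the sketch below contrasts a test that cannot fail with one that can. `parse_port` is a hypothetical function under test and is deliberately broken.

```python
from collections.abc import Callable


def parse_port(text: str) -> int:
    """Hypothetical function under test, deliberately broken: it ignores its input."""
    return 0


def vacuous_test() -> None:
    # Anti-pattern: the broad except swallows every failure,
    # including the AssertionError, so this "test" always passes.
    try:
        assert parse_port("8080") == 8080
    except Exception:
        pass


def meaningful_test() -> None:
    # A bare assertion fails as soon as the implementation is wrong.
    assert parse_port("8080") == 8080


def passes(test: Callable[[], None]) -> bool:
    """Report whether a test function completes without raising AssertionError."""
    try:
        test()
    except AssertionError:
        return False
    return True
```

Here `passes(vacuous_test)` is true even though `parse_port` is wrong, which is the kind of test the validation agent should flag.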
When lightweight testing isn't enough. Same philosophy, higher infrastructure complexity.
Status: Design document. Not yet fully implemented.
Ask before building heavyweight infrastructure:
If yes to all three: build it. If not: stick with lightweight.
| Instead of... | Use... |
|---------------|--------|
| @patch("requests.get") | Real HTTP server (pytest-httpserver or custom) |
| @patch("subprocess.run") | Custom lightweight binary that mimics the real one |
| unittest.mock.Mock() for DB | Real database in container |
| Monkeypatched file operations | Real filesystem in tmp_path or container volume |
| Mocked system services (DBus) | Real daemon instance for tests |
tests/
├── containers/
│ ├── Dockerfile.test-env # Base test environment
│ ├── Dockerfile.mock-api # Mock API server
│ ├── docker-compose.test.yml # Orchestration
│ └── mock-bins/ # Custom mock binaries
│ ├── mock-telegram # Fake Telegram Desktop
│ └── mock-ffmpeg # Fake ffmpeg (returns predefined output)
├── integration/
│ └── test_with_containers.py
└── conftest.py # Container lifecycle fixtures
# tests/containers/docker-compose.test.yml
services:
  mock-api:
    build:
      context: .
      dockerfile: Dockerfile.mock-api
    ports:
      - "18080:8080"
  test-db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: test
      POSTGRES_PASSWORD: test
    ports:
      - "15432:5432"
@pytest.fixture(scope="session")
def test_services() -> Generator[None, None, None]:
    """Start all test containers, yield, then tear down."""
    compose_file = Path(__file__).parent / "containers" / "docker-compose.test.yml"
    subprocess.run(
        ["podman-compose", "-f", str(compose_file), "up", "-d", "--wait"],
        check=True,
    )
    yield
    subprocess.run(
        ["podman-compose", "-f", str(compose_file), "down", "-v"],
        check=True,
    )
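A container may start accepting connections some time after `up` returns, depending on whether health checks are configured. A minimal sketch of a readiness poll that a fixture could call before yielding: `wait_for_port` is a hypothetical helper, and the port would be one published in the compose file (for example 15432).

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.1)  # not accepting connections yet; retry shortly
    return False


def demo() -> tuple[bool, bool]:
    """Check one listening port and then the same port after it is closed."""
    with socket.socket() as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen()
        open_port = listener.getsockname()[1]
        reachable = wait_for_port("127.0.0.1", open_port, timeout=2.0)
    # The listener is closed now, so the port should refuse connections.
    still_reachable = wait_for_port("127.0.0.1", open_port, timeout=0.3)
    return reachable, still_reachable
```

An open port does not guarantee that the service behind it is ready; a database may need an additional query-level check.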
Instead of patching subprocess.run(), provide a real binary that behaves predictably:
#!/usr/bin/env python3
# tests/containers/mock-bins/mock-telegram
import os
import sys
import time

print("Telegram Desktop Mock v1.0")
print(f"Working directory: {os.getcwd()}")
if "-many" in sys.argv and "-workdir" in sys.argv:
    # Flush so a test reading the pipe sees the line before the sleep.
    print(f"Mock Telegram started in {sys.argv[sys.argv.index('-workdir') + 1]}", flush=True)
    time.sleep(int(os.environ.get("MOCK_TELEGRAM_LIFETIME", "5")))
    sys.exit(0)
print("Unknown arguments", sys.argv, file=sys.stderr)
sys.exit(1)
@pytest.fixture
def mock_telegram_bin(tmp_path: Path) -> Path:
    mock_bin = tmp_path / "telegram"
    mock_bin.write_text(MOCK_TELEGRAM_SCRIPT)
    mock_bin.chmod(0o755)
    return mock_bin


async def test_start_instance(mock_telegram_bin: Path) -> None:
    result = await start_instance(profile, binary_path=mock_telegram_bin)
    assert result.is_ok
    pid = result.unwrap()
    assert pid > 0
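`MOCK_TELEGRAM_SCRIPT` is not defined in the snippet above. One way to supply it is a module-level string written to an executable file. The sketch below uses a shortened stand-in script (`MOCK_SCRIPT`) and a hypothetical `write_mock_binary` helper. The shebang points at the current interpreter, an assumption that suits POSIX systems only.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Shortened stand-in for the mock script; the real one would mirror
# tests/containers/mock-bins/mock-telegram.
MOCK_SCRIPT = f"""#!{sys.executable}
import sys
if "-workdir" in sys.argv:
    print("mock started")
    sys.exit(0)
print("Unknown arguments", file=sys.stderr)
sys.exit(1)
"""


def write_mock_binary(directory: Path, name: str = "telegram") -> Path:
    """Write the mock script to `directory` and mark it executable."""
    mock_bin = directory / name
    mock_bin.write_text(MOCK_SCRIPT)
    mock_bin.chmod(0o755)
    return mock_bin


def demo() -> tuple[int, int]:
    """Run the mock with recognised and unrecognised arguments."""
    with tempfile.TemporaryDirectory() as tmp:
        mock_bin = write_mock_binary(Path(tmp))
        ok = subprocess.run([str(mock_bin), "-workdir", tmp], capture_output=True)
        bad = subprocess.run([str(mock_bin), "--bogus"], capture_output=True)
    return ok.returncode, bad.returncode
```

Because the mock is a real executable, the code under test goes through its normal process-spawning path, and the exit codes stay predictable.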
For APIs that need to maintain state across requests:
import json
import threading
from collections.abc import Generator
from http.server import BaseHTTPRequestHandler, HTTPServer

import pytest


class MockAPIHandler(BaseHTTPRequestHandler):
    # Class-level dict: state persists across requests for the server's lifetime.
    profiles: dict[str, dict[str, str | int | bool]] = {}

    def do_POST(self) -> None:
        if self.path == "/api/profiles":
            data = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
            self.profiles[data["id"]] = data
            self.send_response(201)
            self.end_headers()
            self.wfile.write(json.dumps(data).encode())
        else:
            self.send_error(404)  # answer unknown paths instead of sending no response

    def do_GET(self) -> None:
        if self.path == "/api/profiles":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(json.dumps(list(self.profiles.values())).encode())
        else:
            self.send_error(404)


@pytest.fixture(scope="session")
def mock_api() -> Generator[str, None, None]:
    server = HTTPServer(("127.0.0.1", 0), MockAPIHandler)  # port 0: OS picks a free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    yield f"http://127.0.0.1:{port}"
    server.shutdown()
    server.server_close()
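Because `profiles` lives on the class and the fixture is session-scoped, state written by one test is visible to the next. A minimal sketch of one way to keep tests independent is to clear the store before each test. It uses a reduced, hypothetical handler (`StoreHandler`) and `urllib` as the client so that it is self-contained.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Opener without proxies, so localhost requests never go through a system proxy.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class StoreHandler(BaseHTTPRequestHandler):
    """Reduced stand-in for the stateful mock handler."""

    profiles: dict[str, dict] = {}

    def do_POST(self) -> None:
        length = int(self.headers["Content-Length"])
        data = json.loads(self.rfile.read(length))
        self.profiles[data["id"]] = data
        self.send_response(201)
        self.end_headers()

    def do_GET(self) -> None:
        body = json.dumps(list(self.profiles.values())).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep test output quiet


def list_profiles(base_url: str) -> list[dict]:
    with OPENER.open(f"{base_url}/api/profiles") as response:
        return json.loads(response.read())


def demo() -> tuple[int, int]:
    """Return the profile counts seen before and after a reset."""
    server = HTTPServer(("127.0.0.1", 0), StoreHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = f"http://127.0.0.1:{server.server_address[1]}"
    try:
        request = urllib.request.Request(
            f"{base_url}/api/profiles",
            data=json.dumps({"id": "p1"}).encode(),
            method="POST",
        )
        OPENER.open(request).close()
        before = len(list_profiles(base_url))
        StoreHandler.profiles.clear()  # what a per-test reset fixture would do
        after = len(list_profiles(base_url))
    finally:
        server.shutdown()
        server.server_close()
    return before, after
```

In a test suite, the `clear()` call would sit in a function-scoped fixture that depends on the session-scoped server, so the server starts once and each test still begins with an empty store.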
@pytest.fixture(scope="session")
def dbus_session() -> Generator[str, None, None]:
    """Start a real DBus session daemon for tests."""
    process = subprocess.Popen(
        ["dbus-daemon", "--session", "--print-address", "--nofork"],
        stdout=subprocess.PIPE,
    )
    assert process.stdout is not None  # set because stdout=PIPE; narrows the type
    address = process.stdout.readline().decode().strip()
    os.environ["DBUS_SESSION_BUS_ADDRESS"] = address
    yield address
    process.terminate()
    process.wait()