oborchers/testing-strategy

Name: testing-strategy
Author: oborchers

python-package/skills/testing-strategy/SKILL.md

npx skillsauth add oborchers/fractional-cto testing-strategy

Configure pytest Strictly, Test Behavior Not Implementation

Many widely used Python packages -- attrs, httpx, Pydantic, FastAPI, Rich -- share the same pytest configuration philosophy: strict by default, warnings as errors, no silent regressions. Without strict settings, marker typos go unnoticed, deprecated upstream APIs break you without warning, and xfail tests that have started passing stay marked for months, hiding the fact that the underlying bug was fixed.

Testing strategy failures are quiet. Coverage regresses 1% at a time. A missing --strict-markers lets @pytest.mark.solw pass silently. filterwarnings without "error" lets upstream deprecation warnings accumulate until a dependency update breaks everything at once. The configuration below prevents all of this.

Pytest Configuration

These settings are non-negotiable. They appear in every major package's pyproject.toml:

[tool.pytest.ini_options]
testpaths = ["tests"]
xfail_strict = true
filterwarnings = ["error"]
addopts = ["--strict-markers", "--strict-config", "-ra"]
markers = [
    "slow: marks tests as slow (deselect with '-m \"not slow\"')",
    "network: marks tests that require network access",
    "integration: marks integration tests requiring external services",
]

| Setting | What It Prevents |
|---------|-----------------|
| testpaths = ["tests"] | Scanning src/, docs/, node_modules/ -- faster collection |
| xfail_strict = true | Unexpectedly passing xfail silently succeeding instead of failing |
| filterwarnings = ["error"] | Missing upstream DeprecationWarning until it breaks you |
| --strict-markers | Typos like @pytest.mark.solw passing without error |
| --strict-config | Typos like filterwarning (missing 's') being silently ignored |
| -ra | Forgetting to check which tests were skipped or xfailed |

Add targeted warning ignores only for known upstream issues you cannot control:

filterwarnings = [
    "error",
    "ignore::DeprecationWarning:some_dependency.*",
]
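
pytest's filterwarnings entries follow the same filter syntax as Python's own warnings module, so the effect of "error" plus one targeted ignore can be sketched with the standard library alone. The function legacy_api and its message are hypothetical, and the ignore here matches on message text rather than on module name (as the config above does) so that the snippet runs anywhere:

```python
import warnings


def legacy_api():
    """Hypothetical upstream function that emits a deprecation warning."""
    warnings.warn("legacy_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def call_strict():
    """Rough equivalent of filterwarnings = ["error"]: the warning becomes an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            legacy_api()
        except DeprecationWarning as exc:
            return f"raised: {exc}"
    return "no error"


def call_with_targeted_ignore():
    """Rough equivalent of "error" followed by one narrow ignore entry."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        # A filter added later is checked first, so this narrow ignore
        # overrides the blanket "error" for this one message only.
        warnings.filterwarnings(
            "ignore",
            message=r"legacy_api\(\) is deprecated",
            category=DeprecationWarning,
        )
        return legacy_api()
```

Keeping the ignore as narrow as possible (one category, one module or message) means any other new warning still fails the run.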

Test Organization

Mirror the source directory -- the tests/ directory must mirror src/my_package/ exactly, with the same subdirectories and a test_-prefixed file for every source module. This makes it obvious where tests live and immediately reveals untested modules. See the project-structure skill for the full directory mapping.

Start flat within that mirror. Refactor to additional directories (e.g., tests/unit/, tests/integration/) only when test count exceeds 500 or different test layers need different infrastructure.

| Structure | When | Run Subsets |
|-----------|------|-------------|
| Flat mirror (tests/test_*.py matching src/) | < 500 tests, same fixtures | pytest -m "not slow" |
| Directories (tests/unit/, tests/integration/) | > 500 tests, different infrastructure per layer | pytest tests/unit/ |
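
Because the mirror rule is mechanical, it can be checked by a script. The following is a minimal sketch, not part of the skill itself: the helper name missing_test_files is hypothetical, and it assumes every non-__init__ module under the package should have a test_-prefixed counterpart at the same relative path under tests/.

```python
from pathlib import Path


def missing_test_files(src_pkg: Path, tests_dir: Path) -> list[Path]:
    """Return expected test files (relative to tests_dir) that do not exist."""
    missing = []
    for module in sorted(src_pkg.rglob("*.py")):
        if module.name == "__init__.py":
            continue  # package markers do not need their own test file
        rel = module.relative_to(src_pkg)
        # src/my_package/io/reader.py -> tests/io/test_reader.py
        expected = tests_dir / rel.with_name(f"test_{rel.name}")
        if not expected.exists():
            missing.append(expected.relative_to(tests_dir))
    return missing
```

Called as missing_test_files(Path("src/my_package"), Path("tests")), an empty list means the mirror is complete; anything else names the untested modules.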

conftest.py rules

Fixtures only -- never put test functions in conftest.py
Fixtures flow downward to all tests in the directory and below
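Past ~150 lines, extract fixtures into modules and register them in the top-level conftest.py (pytest does not support pytest_plugins in nested conftest.py files): pytest_plugins = ["tests.fixtures.database"]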

Fixtures

Prefer factory fixtures

# DO: Factory with sensible defaults (User is the application's own model)
import pytest

@pytest.fixture
def make_user():
    def _make_user(name="test_user", email="test@example.com", role="user"):  # placeholder address
        return User(name=name, email=email, role=role)
    return _make_user

def test_admin_permissions(make_user):
    admin = make_user(role="admin")
    assert admin.can_delete(make_user())

| Good | Bad |
|------|-----|
| One factory fixture with parameters | Separate fixture per variant (admin_user, inactive_user) |
| Compose fixtures: client(app(config)) | Monolithic fixture that sets up everything |
| Use built-ins: tmp_path, capsys, monkeypatch | Reinvent temporary directories or stdout capture |
| autouse=True only for leak prevention | autouse=True for convenience |
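
The factory example relies on a User model with a can_delete method that the source does not show. To try the pattern in isolation, a hypothetical stand-in is enough; the deletion rule below is an assumption made purely for illustration, and the factory is written as a plain function (inside a test suite it would be returned from the make_user fixture):

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical model: only the fields the factory example touches."""

    name: str
    email: str
    role: str = "user"

    def can_delete(self, other: "User") -> bool:
        # Assumed rule for illustration: admins may delete any other user.
        return self.role == "admin" and other is not self


def make_user(name="test_user", email="test@example.com", role="user"):
    """Plain-function version of the factory; every argument has a default."""
    return User(name=name, email=email, role=role)
```

Each test states only the attribute it cares about (role="admin") and inherits the rest, which is what keeps the variant-specific fixtures in the "Bad" column unnecessary.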

Scope rules

A fixture can only depend on fixtures with equal or broader scope. Expensive resources (DB engines, HTTP servers) use scope="session"; cheap per-test resources (DB transactions, test clients) use the default function scope and roll back in teardown.
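
The lifetimes involved can be sketched without pytest by using plain context managers in place of yield fixtures. This is an illustration under stated assumptions, not the skill's own code: engine stands in for a scope="session" fixture, transaction for a default-scope fixture, and an in-memory sqlite3 database replaces a real server.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def engine():
    """Expensive resource, created once (a scope="session" fixture in pytest)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.commit()
    try:
        yield conn
    finally:
        conn.close()


@contextmanager
def transaction(conn):
    """Cheap per-test resource (default scope): undo all writes in teardown."""
    try:
        yield conn
    finally:
        conn.rollback()


def count_users(conn):
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def run_two_isolated_tests():
    """Simulate two tests sharing one engine; return the row counts observed."""
    counts = []
    with engine() as conn:
        for name in ("alice", "bob"):
            with transaction(conn) as tx:
                tx.execute("INSERT INTO users VALUES (?)", (name,))
                counts.append(count_users(tx))  # each test sees only its own row
        counts.append(count_users(conn))  # nothing survives the rollbacks
    return counts
```

The narrow-scoped resource depends on the broad-scoped one, never the reverse, which is the same direction pytest enforces between fixture scopes.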

Parametrize

Always use ids for readable test output. Include expected values in parameters -- never use conditionals inside parametrized tests.

# DO: Expected value in parameters
@pytest.mark.parametrize(("fmt", "expected"), [
    pytest.param("json", '"name"', id="json_format"),
    pytest.param("xml", "<name>", id="xml_format"),
])
def test_export(fmt, expected):
    assert expected in export(data, fmt)

# DON'T: Conditionals inside parametrized test
@pytest.mark.parametrize("fmt", ["json", "xml"])
def test_export(fmt):
    result = export(data, fmt)
    if fmt == "json": assert '"name"' in result    # Two tests pretending to be one
    elif fmt == "xml": assert "<name>" in result

Stack decorators for cartesian products:

@pytest.mark.parametrize("method", ["GET", "POST", "PUT"])
@pytest.mark.parametrize("auth", ["token", "api_key"])
def test_endpoint(method, auth):  # 3 x 2 = 6 tests
    ...
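
The six cases are the cartesian product of the two parameter lists. As a quick way to see which combinations a stacked pair generates, the same product can be enumerated with itertools (this only reproduces the set of combinations, not pytest's test IDs or ordering):

```python
from itertools import product

methods = ["GET", "POST", "PUT"]
auths = ["token", "api_key"]

# Every (method, auth) pair that the stacked decorators would cover.
cases = list(product(methods, auths))

for method, auth in cases:
    print(f"{method} with {auth}")
```

Adding a third stacked decorator multiplies the count again, so check the total before stacking parameters on slow tests.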

Coverage

[tool.coverage.run]
source_pkgs = ["my_library"]
branch = true
parallel = true

[tool.coverage.report]
show_missing = true
fail_under = 85
exclude_also = [
    "if TYPE_CHECKING:",
    "@overload",
    "raise NotImplementedError",
    "assert_never",
    "\\.\\.\\.",
]

| Decision | Recommendation |
|----------|---------------|
| Branch coverage | Always enable (branch = true). Line coverage misses untested else paths. |
| fail_under | Start at 80, raise as coverage improves. Never lower it. Prevents silent regression. |
| Target | 80-85% for libraries, 85-90% for production APIs, never chase 100% |
| Exclusions | TYPE_CHECKING blocks, @overload, abstract methods, sentinel ... |
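
A small case shows why line coverage alone is not enough. The function and test below are hypothetical: the single test executes every line of apply_discount, so line coverage reports 100%, yet the path where the condition is false is never exercised. With branch = true, coverage.py flags the if statement as only partially covered.

```python
def apply_discount(price: int, is_member: bool) -> int:
    """Hypothetical pricing rule: members get 10% off (integer arithmetic)."""
    if is_member:
        price -= price // 10
    return price


def test_member_discount():
    # Runs every line above, but never takes the is_member=False path.
    assert apply_discount(100, True) == 90
```

A second test with is_member=False closes the gap; without branch measurement nothing would have asked for it.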

Run locally: pytest --cov=my_library --cov-report=term-missing

Async Testing

Enable pytest-asyncio auto mode to avoid decorating every async test:

[tool.pytest.ini_options]
asyncio_mode = "auto"

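Any async def test_* is automatically detected. The asyncio_mode option belongs to pytest-asyncio only. For trio or anyio backends, use the anyio pytest plugin instead: mark tests with @pytest.mark.anyio and choose the backend through the anyio_backend fixture.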

For FastAPI, use httpx.AsyncClient with ASGITransport:

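import pytest
from httpx import ASGITransport, AsyncClient

from myapp.main import app  # hypothetical import path for your FastAPI app

@pytest.fixture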
async def client():
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as ac:
        yield ac

async def test_create_item(client):
    response = await client.post("/items/", json={"name": "Foo"})
    assert response.status_code == 201

Property-Based Testing with Hypothesis

Use Hypothesis for serialization round-trips, parsers, data transformations, and mathematical properties. Used by Pydantic, attrs, CPython, NumPy. Not worth it for simple CRUD or UI tests.

import os

from hypothesis import settings, HealthCheck

settings.register_profile("ci", max_examples=1000, deadline=None,
                           suppress_health_check=[HealthCheck.too_slow])
settings.register_profile("dev", max_examples=50, deadline=400)
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "default"))

Pin regression cases with @example() so they run on every invocation, not just when Hypothesis rediscovers them.

Add .hypothesis/ to .gitignore.

Mocking Best Practices

| Mock | Do Not Mock |
|------|-------------|
| External HTTP APIs, databases in unit tests | Your own pure functions |
| Time/dates (time-machine), third-party services | Data structures, simple transformations |
| Environment variables (monkeypatch) | The thing you are testing |

Patch where the name is used, not where it is defined: mocker.patch("myapp.email.SMTP") (correct) vs mocker.patch("smtplib.SMTP") (wrong). Prefer dependency injection over mocking -- pass InMemoryDatabase() instead of patching PostgresDatabase.
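
The "patch where it is used" rule follows from how from-imports bind names. A minimal sketch with the standard library's unittest.mock (which pytest-mock's mocker wraps) makes the difference visible. To stay self-contained and avoid network access it swaps SMTP for uuid4 and builds a hypothetical module, demo_tokens, in memory as a stand-in for myapp.email:

```python
import sys
import types
from unittest import mock

# Hypothetical application module, created in memory for the demo.
_SOURCE = """
from uuid import uuid4

def new_token():
    # Looks up uuid4 in *this* module's namespace at call time.
    return str(uuid4())
"""
demo_tokens = types.ModuleType("demo_tokens")
exec(_SOURCE, demo_tokens.__dict__)
sys.modules["demo_tokens"] = demo_tokens


def patch_where_used():
    """Correct: replaces the name that new_token() actually looks up."""
    with mock.patch("demo_tokens.uuid4", return_value="fixed"):
        return demo_tokens.new_token()


def patch_where_defined():
    """Wrong: demo_tokens already holds its own reference to the real uuid4."""
    with mock.patch("uuid.uuid4", return_value="fixed"):
        return demo_tokens.new_token()
```

patch_where_used() returns the stubbed value, while patch_where_defined() still returns a real random UUID, because the patch landed on a name the code under test never reads.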

CI Test Matrix

Run the full Python version matrix on Linux. Add macOS and Windows only if your package has platform-specific behavior; when they are needed, test only the oldest and newest Python versions there. See the ci-cd skill for the full GitHub Actions workflow and reusable workflow patterns.

strategy:
  fail-fast: false
  matrix:
    python-version: ["3.10", "3.11", "3.12", "3.13"]
    os: [ubuntu-latest]
    include:
      # Add these only if your package has platform-specific behavior
      - { python-version: "3.10", os: macos-latest }
      - { python-version: "3.13", os: macos-latest }
      - { python-version: "3.10", os: windows-latest }
      - { python-version: "3.13", os: windows-latest }

Test with uv sync --resolution lowest-direct to verify minimum dependency bounds are correct.

Reference Configuration

Combine the pytest and coverage sections from above into pyproject.toml:

[tool.pytest.ini_options]
testpaths = ["tests"]
xfail_strict = true
filterwarnings = ["error"]
addopts = ["--strict-markers", "--strict-config", "-ra"]
asyncio_mode = "auto"

[tool.coverage.run]
source_pkgs = ["my_library"]
branch = true
parallel = true

[tool.coverage.report]
show_missing = true
fail_under = 85
exclude_also = ["if TYPE_CHECKING:", "@overload", "raise NotImplementedError", "assert_never", "\\.\\.\\.",]

Review Checklist

When reviewing tests and test configuration:

[ ] xfail_strict = true is set -- unexpectedly passing xfail tests fail the build
[ ] filterwarnings = ["error"] is set with only targeted, module-specific ignores
[ ] --strict-markers and --strict-config are in addopts
[ ] All custom markers are declared in the markers list
[ ] Factory fixtures with defaults are used instead of one fixture per test variant
[ ] pytest.raises always includes match="..." -- no bare exception catching
[ ] Parametrized tests use ids for readable output and include expected values in parameters
[ ] Coverage uses branch = true and fail_under is set (minimum 80)
[ ] TYPE_CHECKING blocks and @overload are excluded from coverage
[ ] Async tests use asyncio_mode = "auto" -- no manual decorators
[ ] Mocks target external services only -- own code tested directly via dependency injection
[ ] fail_under is treated as a ratchet -- raise it as coverage improves, never lower it
[ ] Hypothesis profiles exist for CI (max_examples=1000) and dev (max_examples=50)
[ ] Hypothesis regression cases are pinned with @example() decorators
[ ] CI matrix tests all Python versions on Linux; oldest + newest on macOS/Windows only if platform-specific behavior exists
[ ] Tests verify behavior and outcomes, not internal method call order

This skill should be used when the user is configuring pytest, writing tests, setting up test fixtures, using parametrize, measuring code coverage, writing async tests with pytest-asyncio, using Hypothesis for property-based testing, choosing between nox and tox, building CI test matrices, setting up snapshot testing with syrupy, mocking with pytest-mock, or reviewing test organization. Covers pytest configuration, fixtures, coverage thresholds, async testing, Hypothesis profiles, CI matrices, and mocking best practices.

10 stars

development

Updated May 13, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add oborchers/fractional-cto testing-strategy

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---------|-------------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: May 13, 2026, 6:50 AM (224.4s, 1 file scanned)

SKILL.md

name: testing-strategy
version: 1.0.0

Install manually

# Clone the repo
git clone https://github.com/oborchers/fractional-cto.git

# Copy into Claude Code skills folder (global)
cp -r fractional-cto/python-package/skills/testing-strategy ~/.claude/skills/

Repository

oborchers/fractional-cto

10 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT