Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

santosomar/python-test-updater

Name: python-test-updater
Author: santosomar

skills/testing/python-test-updater/SKILL.md

npx skillsauth add santosomar/general-secure-coding-agent-skills python-test-updater

Python Test Updater

Same triage discipline as java-test-updater. Python differences: there are no compile-time breaks (everything fails at runtime), tests tend to couple more tightly to internals through mocker.patch, and snapshot libraries reduce some updates to a single command.

Python failure taxonomy

No compile step means everything surfaces at test runtime:

| Failure signal | Likely cause | Action |
| --- | --- | --- |
| `AttributeError: 'X' object has no attribute 'foo'` | Renamed/removed method | Update call site |
| `TypeError: f() missing 1 required positional argument` | Signature changed | Add arg or use default |
| `TypeError: f() got an unexpected keyword argument` | Kwarg renamed/removed | Update kwarg |
| `ImportError` / `ModuleNotFoundError` | Module moved | Update import |
| `AssertionError` with value diff | Behavior changed (intentional?) or regression | Triage |
| `AssertionError: Expected 'mock' to have been called` | Over-mocked internal | Loosen/delete mock |
| `AssertionError` in snapshot compare | Snapshot stale | Review diff, `--snapshot-update` if intentional |
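The taxonomy can be applied mechanically to the last line of each traceback. The following is a minimal sketch, not part of the skill itself: `RULES` and `classify` are hypothetical names, the patterns assume the message wording used by recent CPython and `unittest.mock` releases, and snapshot failures are left out because their wording depends on the snapshot library.

```python
import re

# (pattern, bucket, suggested action); first match wins, so the specific
# mock rule is listed before the generic AssertionError rule.
RULES = [
    (r"^(ImportError|ModuleNotFoundError)\b", "import", "Update import"),
    (r"^AttributeError: .* has no attribute ", "attribute", "Update call site"),
    (r"^TypeError: .* missing \d+ required ", "signature", "Add arg or use default"),
    (r"^TypeError: .* got an unexpected keyword argument ", "signature", "Update kwarg"),
    (
        r"^AssertionError: (Expected '.*' to (be|have been) called|expected call not found)",
        "mock",
        "Loosen/delete mock",
    ),
    (r"^AssertionError\b", "assertion", "Triage"),
]


def classify(error_line: str) -> tuple[str, str]:
    """Map the final traceback line to a (bucket, action) pair."""
    for pattern, bucket, action in RULES:
        if re.search(pattern, error_line):
            return bucket, action
    # Anything unrecognized still needs a human to read the traceback.
    return "unknown", "Read the traceback"
```

Anything that lands in the "assertion" bucket is only sorted, not resolved: deciding between an intentional change and a regression still requires reading the diff and the commit history.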

Automating the mechanical fixes

Python's runtime errors point straight at the problem. For signature changes across many tests:

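# conftest.py: one-time shim during migration
# OLD: Order(items, region)   NEW: Order(items, region, currency)  (currency now required)
# Tests still call Order(items, region).  Temporary compat:
import pytest

from orders.models import Order  # import path assumed; point at the real Order

@pytest.fixture(autouse=True)
def _order_compat(mocker):  # `mocker` comes from the pytest-mock plugin
    orig_init = Order.__init__  # keep a handle on the real constructor

    def compat_init(self, items, region, currency="USD"):
        # Supply the old implicit default so two-argument calls keep working.
        orig_init(self, items, region, currency)

    mocker.patch.object(Order, "__init__", compat_init)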

This is a bridge, not a fix. Use the shim to get the suite green, update the call sites to pass currency explicitly as you touch each test, then delete the shim once nothing depends on it. Don't leave compat shims in permanently.
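Before writing a shim, introspection can tell you whether an old-style call would still be accepted. A minimal sketch, assuming a stand-in `Order` whose new constructor requires `currency`; `call_still_binds` is a hypothetical helper name. It relies on `inspect.signature(...).bind`, which raises `TypeError` when the arguments do not fit the signature, without running the constructor.

```python
import inspect


class Order:
    """Stand-in for the real class; currency is assumed to be required now."""

    def __init__(self, items, region, currency):
        self.items = items
        self.region = region
        self.currency = currency


def call_still_binds(func, *args, **kwargs) -> bool:
    """Return True if func(*args, **kwargs) would satisfy func's signature."""
    try:
        # bind() checks the arguments against the signature without calling func.
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


old_style_ok = call_still_binds(Order, ["widget"], "EU")
new_style_ok = call_still_binds(Order, ["widget"], "EU", currency="USD")
```

Here `old_style_ok` is False and `new_style_ok` is True, which confirms that two-argument call sites are the ones needing a mechanical update.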

Mock coupling — the `mocker.patch` problem

def test_process(mocker, order):  # `order` is assumed to be a fixture providing a valid Order
    mock_validate = mocker.patch("orders.service._validate_order")
    mock_save = mocker.patch("orders.service._save_order")
    process(order)
    mock_validate.assert_called_once_with(order)
    mock_save.assert_called_once()

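After a refactor, the logic of _validate_order was inlined into process, so the helper is no longer called. The test fails: Expected '_validate_order' to have been called once. Called 0 times. (If the helper had been deleted outright, mocker.patch itself would raise AttributeError before the assertion ran.)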

The behavior didn't change — the order is still validated — but the test was asserting structure. Delete the mock assertion. The test should assert the outcome of validation:

def test_process_rejects_invalid():
    bad = Order(items=[])
    with pytest.raises(InvalidOrder, match="empty"):
        process(bad)

This survives refactors because it tests what validation does, not that a function named _validate_order was called.
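The difference between the two styles can be reproduced in a few lines. This is a self-contained sketch with toy stand-ins for `Order`, `InvalidOrder`, and `process` (none of them the real project code), using the standard library's `unittest.mock` in place of the pytest-mock `mocker` fixture so it runs without plugins. Validation is inlined in `process`, as in the scenario above.

```python
from dataclasses import dataclass, field
from unittest import mock


class InvalidOrder(Exception):
    pass


@dataclass
class Order:
    items: list = field(default_factory=list)


def _validate_order(order):
    # Legacy helper: still defined, but process() no longer calls it.
    if not order.items:
        raise InvalidOrder("empty order")


def process(order):
    # Validation was inlined here during the refactor.
    if not order.items:
        raise InvalidOrder("empty order")
    return "saved"


def structural_check_passes() -> bool:
    """Mock-coupled check: was the private helper called?"""
    mock_validate = mock.Mock()
    with mock.patch.dict(globals(), {"_validate_order": mock_validate}):
        process(Order(items=["widget"]))
    try:
        mock_validate.assert_called_once()
    except AssertionError:
        return False
    return True


def behavioral_check_passes() -> bool:
    """Outcome check: is an empty order rejected?"""
    try:
        process(Order(items=[]))
    except InvalidOrder as exc:
        return "empty" in str(exc)
    return False
```

With validation inlined, `structural_check_passes()` returns False even though nothing is broken, while `behavioral_check_passes()` returns True both before and after the refactor.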

Snapshot updates — review first

If using syrupy / pytest-snapshot:

pytest --snapshot-update

This updates all failing snapshots to current output. Dangerous if any failure is a regression. Workflow:

1. Run without --snapshot-update. Read every diff.
2. For each: intentional change? Or regression?
3. Only after confirming all are intentional: --snapshot-update.
4. Commit the snapshot changes with a message explaining why they changed.
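Snapshot plugins print their own diffs on failure. When you want to review a stored snapshot against fresh output outside a test run, the standard library is enough. A minimal sketch: `snapshot_diff` is a hypothetical helper, not part of syrupy or pytest-snapshot, and it assumes both snapshots are available as plain text.

```python
import difflib


def snapshot_diff(stored: str, current: str, name: str = "snapshot") -> str:
    """Return a unified diff between stored and current snapshot text."""
    lines = difflib.unified_diff(
        stored.splitlines(),
        current.splitlines(),
        fromfile=f"{name} (stored)",
        tofile=f"{name} (current)",
        lineterm="",
    )
    # An empty string means the two snapshots are identical.
    return "\n".join(lines)


report = snapshot_diff("total: 27.80\nregion: EU", "total: 27.55\nregion: EU", "invoice")
```

Each removed and added line in `report` is one question to answer in step 2: was this change intended?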

Assertion triage — same as Java

assert invoice.total == Decimal("27.80")   # fails: actual 27.55

git log -p -- src/pricing.py → "Fix: half-even rounding." Intentional. Update with reason:

# abc123: half-even rounding fix. Was 27.80 (half-up bug).
assert invoice.total == Decimal("27.55")

Versus:

assert len(results) == 5   # fails: actual 4

The change was to a completely different module. Why are there fewer results? Regression. Don't update. Investigate.
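A rounding-mode change of the kind named in that commit message is easy to confirm in isolation before updating an expected value. The sketch below uses an illustrative tie value of 2.665, which is an assumption and not taken from the invoice code, to show that half-up and half-even rounding diverge exactly on ties.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")


def round_money(amount: Decimal, mode: str) -> Decimal:
    """Quantize to cents using the given decimal rounding mode."""
    return amount.quantize(CENT, rounding=mode)


tie = Decimal("2.665")  # exactly halfway between 2.66 and 2.67
half_up = round_money(tie, ROUND_HALF_UP)      # ties round away from zero
half_even = round_money(tie, ROUND_HALF_EVEN)  # ties round to the even digit

non_tie = Decimal("2.664")  # not a tie, so both modes agree
```

Here `half_up` is `Decimal("2.67")` and `half_even` is `Decimal("2.66")`. If the changed expectation cannot be explained by tie cases like this one, treat the commit message as insufficient evidence and keep investigating.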

`pytest.approx` drift

assert score == 0.8472819  # now fails: 0.8472820

Last digit changed — floating-point ops reordered. If precision to 7 decimals isn't spec'd, this was over-tight:

assert score == pytest.approx(0.8473, rel=1e-4)

This isn't "loosening to make it pass." It's fixing an over-specific assertion that should never have been that tight.
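To check that a chosen tolerance is neither too tight nor vacuous, compare a few values by hand. This sketch uses the standard library's `math.isclose` as a stand-in so it runs without pytest. The two are close but not identical: `pytest.approx` scales `rel` by the expected value, while `math.isclose` scales `rel_tol` by the larger of the two magnitudes. For the numbers here the verdicts are the same.

```python
import math

old_expected = 0.8472819
score = 0.8472820  # value after floating-point operations were reordered

# Exact equality breaks on the last digit.
exact_match = score == old_expected

# A relative tolerance of 1e-4 around 0.8473 absorbs that drift...
within_tolerance = math.isclose(score, 0.8473, rel_tol=1e-4)

# ...but still rejects a change that actually matters.
rejects_real_change = not math.isclose(0.85, 0.8473, rel_tol=1e-4)
```

All three checks behave as intended: `exact_match` is False, `within_tolerance` is True, and `rejects_real_change` is True, so the looser assertion still constrains the result.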

Do not

- Do not blindly run --snapshot-update. Review diffs first. A regression in a snapshot looks identical to an intentional change.
- Do not fix mocker.patch("module._private") failures by updating the patch path. You're chasing implementation. Replace with behavioral assertions.
- Do not widen pytest.approx tolerance until the test passes. If rel=0.5 is what it takes, the test isn't testing anything.
- Do not skip investigating assertion failures in "unrelated" tests. Unrelated is where regressions hide.
- Do not leave compat shims in conftest.py after the migration. They mask further API drift.

Output format

## Failing tests
Total: <N>  Import/Attribute: <N>  Signature: <N>  Assertion: <N>  Mock: <N>  Snapshot: <N>

## Mechanical fixes
| Test | Error | Fix |
| ---- | ----- | --- |

## Mock decoupling
| Test | Over-coupled patch | Replacement assertion |
| ---- | ------------------ | --------------------- |

## Assertion triage
| Test | Old | New | Cause commit | Classification | Action |
| ---- | --- | --- | ------------ | -------------- | ------ |

## Snapshot review
| Snapshot | Diff summary | Intentional? |
| -------- | ------------ | ------------ |

## Regressions
<tests correctly failing — file bugs, don't update>

## After
Passing: <N>  Updated: <N>  Decoupled: <N>  Deleted: <N>  Bugs filed: <N>

Updates broken pytest tests after intentional code changes: distinguishes assertion failures, mock-coupling failures, and genuine regressions, and uses Python's introspection to automate fixes where safe. Use when a refactor or API change leaves a pile of failing tests and you need to decide update vs. fix vs. delete.

development

Updated Apr 13, 2026

npx skillsauth add santosomar/general-secure-coding-agent-skills python-test-updater

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Purpose | Status | Reported % |
| --- | --- | --- | --- |
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 13, 2026, 4:37 AM (68.5s, 1 file scanned)

SKILL.md

name: python-test-updater
description: Updates broken pytest tests after intentional code changes: distinguishes assertion failures, mock-coupling failures, and genuine regressions, and uses Python's introspection to automate fixes where safe. Use when a refactor or API change leaves a pile of failing tests and you need to decide update vs. fix vs. delete.
license: Apache-2.0
category: testing
suite: general-secure-coding-agent-skills
version: 0.3.0
related: java-test-updater, behavior-preservation-checker

Related Skills

santosomar/verified-pseudocode-extractor

Category: development. Updated Apr 13, 2026.

Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.

santosomar/tlaplus-spec-generator

Category: development. Updated Apr 13, 2026.

Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.

santosomar/tlaplus-model-reduction

Category: testing. Updated Apr 13, 2026.

Reduces a TLA+ model so TLC can actually check it (by shrinking constants, adding state constraints, abstracting data, or applying symmetry) when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.

santosomar/tlaplus-guided-code-repair

Category: development. Updated Apr 13, 2026.

TLA+-specific instance of model-guided repair: reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.

Install manually

# Clone the repo
git clone https://github.com/santosomar/general-secure-coding-agent-skills.git

# Copy into Claude Code skills folder (global)
cp -r general-secure-coding-agent-skills/skills/testing/python-test-updater ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

santosomar/general-secure-coding-agent-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
