santosomar/bug-reproduction-test-generator

Name: bug-reproduction-test-generator
Author: santosomar

skills/debugging/bug-reproduction-test-generator/SKILL.md

npx skillsauth add santosomar/general-secure-coding-agent-skills bug-reproduction-test-generator

Bug Reproduction Test Generator

A bug you can't reliably reproduce is a bug you can't reliably fix. This skill turns prose ("it crashes when I upload a big file") into an executable test that is red on the buggy code and will be green once it's fixed.

Step 1 — Classify the input you have

| Input available | First move |
| --- | --- |
| Stack trace | Top project-code frame → target function; exception → assertion |
| "Steps to reproduce" prose | Translate each step to a setup line; last step → action |
| Log excerpt | Find the first anomalous line; grep for the producer |
| Failing production request (curl, HAR) | Replay against a test harness; wrap in an assertion |
| Screenshot / "it looks wrong" | Not machine-checkable; ask for the data, not the UI |
| "It's intermittent" | Do not write a test yet; first stabilize (Step 4) |
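The stack-trace row can be sketched as a small parser. This is a minimal sketch, assuming a standard CPython traceback; `top_project_frame`, `exception_type`, the `/app` project root, and the sample trace are all hypothetical names made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    path: str
    line: int
    function: str


# Matches CPython's standard frame lines, e.g.:
#   File "/app/server/storage.py", line 31, in save
_FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def top_project_frame(traceback_text: str, project_root: str) -> Optional[Frame]:
    """Return the innermost frame whose file lives under project_root.

    CPython prints the most recent call last, so the innermost project
    frame is the last matching frame, not the first.
    """
    frames = [
        Frame(m["path"], int(m["line"]), m["func"])
        for m in _FRAME_RE.finditer(traceback_text)
    ]
    project_frames = [
        f for f in frames
        if f.path.startswith(project_root) and "site-packages" not in f.path
    ]
    return project_frames[-1] if project_frames else None


def exception_type(traceback_text: str) -> str:
    """Return the exception class name from the final traceback line."""
    last_line = traceback_text.strip().splitlines()[-1]
    return last_line.split(":", 1)[0].strip()


# Hypothetical sample trace: the crash surfaces in a third-party package,
# but the frame to target is the last one inside the project.
SAMPLE = '''Traceback (most recent call last):
  File "/app/server/handlers.py", line 88, in handle_upload
    save(upload)
  File "/app/server/storage.py", line 31, in save
    buf = codec.encode(upload.data)
  File "/usr/lib/python3/site-packages/codec/core.py", line 12, in encode
    raise ValueError("payload too large")
ValueError: payload too large
'''
```

Here the target function is `save` in `storage.py`, and `ValueError` tells you what the fixed code must no longer raise (or must raise deliberately), which becomes the assertion.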

Step 2 — Identify the triple

Every reproduction test is (setup, action, assertion). Extract each:

Setup: the minimal state needed. A user? One user, not a seeded database. A file? The smallest file that triggers it.
Action: the single call/request/event that triggers the bug. If the bug needs two actions in sequence, both are the action.
Assertion: what should happen, not what currently happens. The test must describe correct behavior, because once fixed this test is your regression guard.
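A minimal sketch of the triple laid out in a test body. The `paginate_*` functions are hypothetical stand-ins for buggy and fixed code, and `is_red` is an illustrative helper, not part of any test framework.

```python
def paginate_buggy(items, page_size):
    """Hypothetical buggy code: drops the final partial page."""
    full_pages = len(items) // page_size
    return [items[i * page_size:(i + 1) * page_size] for i in range(full_pages)]


def paginate_fixed(items, page_size):
    """Hypothetical fix: keeps the final partial page."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


def check_last_partial_page_is_kept(paginate):
    # Setup: the smallest list that has a partial last page.
    items = [1, 2, 3]
    # Action: the single call that triggers the bug.
    pages = paginate(items, page_size=2)
    # Assertion: the correct behavior, not the current behavior.
    assert pages == [[1, 2], [3]]


def is_red(check, implementation):
    """Return True if the check fails against the given implementation."""
    try:
        check(implementation)
    except AssertionError:
        return True
    return False
```

The same check is red against `paginate_buggy` and green against `paginate_fixed`, which is what lets it serve as the regression guard after the fix.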

Step 3 — Write, then minimize

Write a test that reproduces. It will be too big. Minimize mechanically:

| Dimension | Minimization |
| --- | --- |
| Setup state | Delete one setup line → still red? Keep deleting. Add back when it goes green. |
| Input size | Binary-bisect the input (half the file, half the list) until it goes green |
| Dependencies | Inline mocks; if removing a mock turns it green, that mock was load-bearing |

The target is: a test so small that when it fails, the fault is obvious from reading the test alone.
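The first two rows of the minimization table can be sketched mechanically. This is an illustrative sketch, not a full delta-debugging implementation: `minimize`, `bisect_size`, and the `still_red` callback (which should return True while the reproduction still fails) are hypothetical names, and `bisect_size` assumes the trigger survives in the prefix of the input.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def minimize(items: List[T], still_red: Callable[[List[T]], bool]) -> List[T]:
    """Greedily drop items one at a time while the reproduction stays red."""
    if not still_red(items):
        raise ValueError("the starting input does not reproduce the bug")
    current = list(items)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if still_red(candidate):
            current = candidate  # item i was not needed; drop it
        else:
            i += 1  # item i is load-bearing; keep it and move on
    return current


def bisect_size(data: bytes, still_red: Callable[[bytes], bool]) -> bytes:
    """Halve the input, keeping the prefix, until halving turns the test green."""
    current = data
    while len(current) > 1 and still_red(current[: len(current) // 2]):
        current = current[: len(current) // 2]
    return current
```

For setup state, `items` would be the list of setup steps and `still_red` would rerun the test with only those steps applied.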

Step 4 — Stabilize intermittents

If the bug is non-deterministic, the test can't simply assert correct behavior — it'll pass by luck half the time.

| Source of flakiness | Stabilization |
| --- | --- |
| Timing / sleep-based | Replace wall-clock with an injected clock you advance manually |
| Thread interleaving | Force the bad interleaving with a latch/barrier; assert on the state you now deterministically reach |
| Network | Mock the transport; inject the specific response/error that triggers the bug |
| Test order | Run the test in isolation; if it passes alone, the bug is test pollution, not product code |
| Randomness | Seed the RNG with the seed that reproduces (if logged); if not, loop 1000× and assert all pass |

If you cannot stabilize, do not ship a test that retries. A retry loop in a reproduction test is an admission you don't understand the bug.
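For the timing row of the stabilization table, a minimal sketch of an injected clock. `FakeClock` and `SessionStore` are hypothetical; the assumption is that the code under test accepts a clock object instead of calling `time.time()` directly.

```python
class FakeClock:
    """Test double for wall-clock time; only moves when the test says so."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def now(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


class SessionStore:
    """Hypothetical code under test: sessions expire after ttl seconds."""

    def __init__(self, clock, ttl: float) -> None:
        self._clock = clock
        self._ttl = ttl
        self._created = {}

    def create(self, session_id: str) -> None:
        self._created[session_id] = self._clock.now()

    def is_valid(self, session_id: str) -> bool:
        created = self._created.get(session_id)
        if created is None:
            return False
        return self._clock.now() - created < self._ttl


def test_session_expires_at_ttl():
    # Setup: one session, created at a known instant.
    clock = FakeClock()
    store = SessionStore(clock, ttl=30.0)
    store.create("s1")
    # Action + assertion: step to just before, then exactly onto, the boundary.
    clock.advance(29.0)
    assert store.is_valid("s1")
    clock.advance(1.0)
    assert not store.is_valid("s1")
```

No `sleep` appears anywhere, so the test takes the same path on every run and needs no retry.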

Worked example

Report: "Exporting a CSV with a quote in a column value produces a file Excel can't open."

Triple extraction:

Setup: one row with one field containing a double-quote
Action: call export_csv(rows)
Assertion: output field is RFC-4180 escaped (" → "" and field is quoted)

Minimal test:

def test_csv_export_escapes_quotes():
    rows = [{"name": 'say "hi"'}]
    out = export_csv(rows)
    assert out == 'name\r\n"say ""hi"""\r\n'

Three lines of meaning. When this fails, the reader immediately sees: quotes aren't escaped.
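To see the test go red and then green, here is a sketch of two hypothetical implementations of `export_csv`. Neither is the real code from the bug report: the naive one is an assumed shape for the bug, and the fixed one is one possible fix using Python's standard `csv` module, whose default dialect quotes fields containing `"` and doubles the embedded quotes.

```python
import csv
import io


def export_csv_naive(rows):
    """Hypothetical buggy exporter: joins fields with no quoting at all."""
    header = list(rows[0].keys())
    lines = [",".join(header)]
    lines += [",".join(str(row[col]) for col in header) for row in rows]
    return "".join(line + "\r\n" for line in lines)


def export_csv(rows):
    """Hypothetical fixed exporter built on the standard csv module."""
    buf = io.StringIO(newline="")  # leave the writer's \r\n untouched
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


ROWS = [{"name": 'say "hi"'}]
EXPECTED = 'name\r\n"say ""hi"""\r\n'
```

Against `export_csv_naive` the output is `name\r\nsay "hi"\r\n`, so the assertion fails; against `export_csv` it matches `EXPECTED`.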

Edge cases

The bug is a hang, not a crash: Run the call under a timeout (for example pytest-timeout's @pytest.mark.timeout) and assert on its result afterwards. The test is red while the call exceeds the timeout and green once it returns in time.
The bug is a leak: Can't assert on memory from a unit test. Instead, assert that close()/__exit__ was called on the resource mock.
The correct behavior is undefined: Ask. A test must assert something; if nobody knows what "correct" is, you found a requirements bug. Hand off → requirement-enhancer.
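The hang and leak cases can be sketched with the standard library alone. This is an illustrative sketch: `call_with_timeout` is a hypothetical helper (a timeout plugin for your test runner would serve the same purpose), and `read_header` is a made-up function under test.

```python
import threading
from unittest import mock


def call_with_timeout(fn, timeout_s):
    """Run fn in a daemon thread; raise TimeoutError if it has not returned in time."""
    result = {}

    def runner():
        try:
            result["value"] = fn()
        except BaseException as exc:  # hand the failure back to the caller
            result["error"] = exc

    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        raise TimeoutError(f"call did not return within {timeout_s}s")
    if "error" in result:
        raise result["error"]
    return result["value"]


def read_header(resource):
    """Hypothetical code under test: must close the resource even on error."""
    try:
        return resource.read(4)
    finally:
        resource.close()


def test_read_header_closes_resource_on_error():
    # Setup: a resource mock whose read fails.
    resource = mock.Mock()
    resource.read.side_effect = OSError("disk error")
    # Action: the call that used to leak.
    try:
        read_header(resource)
    except OSError:
        pass
    # Assertion: the resource was released exactly once.
    resource.close.assert_called_once()
```

A hang reproduction then reads `assert call_with_timeout(lambda: suspect_call(), 1.0) == expected`, where `suspect_call` and `expected` are placeholders: it is red while the call hangs and green once the call returns the right value in time.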

Do not

Do not assert on the current (buggy) behavior. The test should describe what should happen and must currently fail.
Do not reproduce via the UI when you can reproduce via the API. UI tests are typically much slower and flakier than API-level tests.
Do not add this test to the "flaky" quarantine. If it's flaky, you haven't finished reproducing.
Do not skip minimization because "the test works." A 50-line reproduction nobody can read is a 50-line liability.

Handoff

Once red: → bug-localization (test is the signal) → bug-to-patch-generator (test is the oracle).

Creates minimal, reproducible test cases from bug reports to confirm the defect before and after a fix. Use when a bug is reported without a failing test, when the user needs a regression test for a fix, or when the user asks to reproduce a bug as a test.

testing

Updated Apr 13, 2026

npx skillsauth add santosomar/general-secure-coding-agent-skills bug-reproduction-test-generator

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report

| Scanner | Description | Status | Percentage |
| --- | --- | --- | --- |
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 24, 2026, 8:44 PM · 1.9s · 1 file scanned

SKILL.md

name: bug-reproduction-test-generator
description: Creates minimal, reproducible test cases from bug reports to confirm the defect before and after a fix. Use when a bug is reported without a failing test, when the user needs a regression test for a fix, or when the user asks to reproduce a bug as a test.
license: Apache-2.0
category: debugging
suite: general-secure-coding-agent-skills
version: 0.3.0
related: bug-localization, test-case-reducer, bug-to-patch-generator

Related Skills

santosomar/verified-pseudocode-extractor

development

Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.

SKILL.md · Updated Apr 13, 2026

santosomar/tlaplus-spec-generator

development

Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.

SKILL.md · Updated Apr 13, 2026

santosomar/tlaplus-model-reduction

testing

Reduces a TLA+ model so TLC can actually check it (shrinks constants, adds state constraints, abstracts data, or applies symmetry) when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.

SKILL.md · Updated Apr 13, 2026

santosomar/tlaplus-guided-code-repair

development

TLA+-specific instance of model-guided repair: reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.

SKILL.md · Updated Apr 13, 2026

Install manually

# Clone the repo
git clone https://github.com/santosomar/general-secure-coding-agent-skills.git

# Copy into Claude Code skills folder (global)
cp -r general-secure-coding-agent-skills/skills/debugging/bug-reproduction-test-generator ~/.claude/skills/

Repository

santosomar/general-secure-coding-agent-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT