Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

santosomar/test-case-reducer

Name: test-case-reducer
Author: santosomar

skills/testing/test-case-reducer/SKILL.md

npx skillsauth add santosomar/general-secure-coding-agent-skills test-case-reducer

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Test Case Reducer

Fuzzer found a crash with a 50KB JSON input. The bug is probably triggered by 3 bytes of it. Reduction finds those 3 bytes.

The interestingness test

Everything hinges on a boolean oracle: does this (smaller) input still trigger the bug?

def interesting(input_bytes):
    proc = subprocess.run(["./parser", "-"], input=input_bytes,
                          capture_output=True, timeout=5)
    return proc.returncode == -11   # SIGSEGV

Make it specific. "Crashes" is too loose — you might reduce to a different crash. Check for the same stack trace, same error message, same assertion.

Delta debugging (ddmin) — the general algorithm

Works on any sequence (bytes, lines, list elements):

ddmin(input, n=2):
    Split input into n chunks.
    For each chunk: is input-minus-this-chunk still interesting?
        YES → recurse on the reduced input, reset n=2
    For each chunk: is this-chunk-alone interesting?
        YES → recurse on just this chunk, reset n=2
    No reduction at granularity n?
        n < len(input) → try n*=2 (finer chunks)
        else → done, input is 1-minimal

"1-minimal" means: removing any single element breaks the interestingness. Not globally minimal (that's NP-hard), but small enough.

Structured shrinking — faster for known formats

Bytes-level ddmin on JSON wastes time producing syntactically invalid JSON. Shrink at the grammar level:

| Format | Shrink operations | | -------- | ----------------------------------------------------------- | | JSON | Remove object keys; remove array elements; replace strings with ""; replace numbers with 0; replace objects with null | | Source code | Remove statements; inline variables; simplify expressions (a + b → a); remove function params | | Lists/trees | Remove children; collapse single-child nodes; replace subtrees with leaves | | Integers | Try 0, 1, -1, value/2 — smaller magnitude | | Strings | Try "", first char, each half |

Hypothesis/QuickCheck do this automatically for inputs they generated. For external inputs, do it manually.

Worked example

Crash input (from fuzzer):

{"users": [
  {"id": 1, "name": "alice", "tags": ["a", "b", "c"], "meta": {"x": 1, "y": 2}},
  {"id": 2, "name": "bob", "tags": [], "meta": {"x": null}},
  {"id": 3, "name": "", "tags": ["d"], "meta": {}}
]}

Interestingness: parser.parse(j) raises KeyError: 'y'.

Structured shrink:

Remove users[0] → still crashes? No. Keep users[0].
Remove users[1] → still crashes? Yes. Drop it.
Remove users[2] → still crashes? Yes. Drop it.

Now: {"users": [{"id": 1, "name": "alice", "tags": ["a","b","c"], "meta": {"x": 1, "y": 2}}]}

Remove "tags" from users[0] → Yes. Drop.
Remove "name" → Yes. Drop.
Remove "id" → Yes. Drop.
Remove "meta" → No. Keep. ← aha
Remove "x" from meta → No. Keep.
Remove "y" from meta → Yes. Drop.

Now: {"users": [{"meta": {"x": 1}}]}

Replace 1 with 0 → Yes. {"users": [{"meta": {"x": 0}}]}
Can we simplify further? Remove the outer "users"? No — parser expects it. Done.

Minimal: {"users": [{"meta": {"x": 0}}]}

The bug: parser assumes meta has "y" when "x" is present. 50KB → 30 bytes. The bug is now obvious.

Tools

| Target | Tool | | ------------------------ | ----------------------------------------------------- | | C/C++ source (compiler bugs) | creduce, cvise | | Arbitrary text | halfempty, delta | | Hypothesis-generated | Built-in — hypothesis.find or just @given shrinking | | AFL crashes | afl-tmin | | Custom formats | Write ddmin yourself — it's ~30 lines |

Flaky interestingness

If the bug is non-deterministic (race condition, uninitialized memory), interesting() flips between True and False on the same input. Reduction goes haywire.

Fix: require N-out-of-M successes:

def interesting(inp):
    crashes = sum(1 for _ in range(10) if _run_once(inp) == CRASH)
    return crashes >= 3   # crashes at least 30% of the time

Slower (10× the runs), but stable.

Do not

Do not use "any error" as the interestingness test. You'll reduce to a trivially malformed input that triggers a different (boring) error.
Do not skip verifying the final reduced input. Run it manually, confirm it's the same bug.
Do not reduce on a machine different from where the bug was found. Environment-dependent bugs (locale, filesystem, libc version) might not reproduce.
Do not stop at "small enough to read." Keep going to 1-minimal — the last few removals often reveal exactly which field matters.
Do not throw away intermediate reductions. If the full reduction takes hours and the machine dies, you want the 80%-reduced checkpoint.

Output format

## Original input
Size: <bytes/lines/elements>
<truncated sample>

## Interestingness test
<exact predicate — error type, stack frame, message pattern>
Stability: <deterministic | N-of-M with N, M>

## Reduction trace
| Step | Operation | Size after | Still interesting |
| ---- | --------- | ---------- | ----------------- |

## Minimal input
Size: <bytes/lines/elements>  (reduction: <%>  from original)
<full input>

## Verified
<ran minimal input N times, triggered the bug M times>

## Bug hypothesis
<what the minimal input reveals about the bug>

santosomar/test-case-reducer

skills/testing/test-case-reducer/SKILL.md

Shrinks a failing test input to its minimal form while preserving the failure — delta debugging and structured shrinking to find the smallest input that still triggers the bug. Use when a fuzzer or property test finds a failure with a huge input, when a bug report has an unwieldy reproduction, or when you need a minimal test case for a regression suite.

development

Updated Apr 13, 2026

$ install --global

skillsauth

npx skillsauth add santosomar/general-secure-coding-agent-skills test-case-reducer

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 13, 2026, 4:38 AM47.6s1 file scanned

SKILL.md

name:: test-case-reducer
description:: Shrinks a failing test input to its minimal form while preserving the failure — delta debugging and structured shrinking to find the smallest input that still triggers the bug. Use when a fuzzer or property test finds a failure with a huge input, when a bug report has an unwieldy reproduction, or when you need a minimal test case for a regression suite.
license:: Apache-2.0
category:: testing
suite:: general-secure-coding-agent-skills
version:: 0.3.0
related:: bug-reproduction-test-generator, unit-test-generator

Test Case Reducer

Fuzzer found a crash with a 50KB JSON input. The bug is probably triggered by 3 bytes of it. Reduction finds those 3 bytes.

The interestingness test

Everything hinges on a boolean oracle: does this (smaller) input still trigger the bug?

def interesting(input_bytes):
    proc = subprocess.run(["./parser", "-"], input=input_bytes,
                          capture_output=True, timeout=5)
    return proc.returncode == -11   # SIGSEGV

Make it specific. "Crashes" is too loose — you might reduce to a different crash. Check for the same stack trace, same error message, same assertion.

Delta debugging (ddmin) — the general algorithm

Works on any sequence (bytes, lines, list elements):

ddmin(input, n=2):
    Split input into n chunks.
    For each chunk: is input-minus-this-chunk still interesting?
        YES → recurse on the reduced input, reset n=2
    For each chunk: is this-chunk-alone interesting?
        YES → recurse on just this chunk, reset n=2
    No reduction at granularity n?
        n < len(input) → try n*=2 (finer chunks)
        else → done, input is 1-minimal

"1-minimal" means: removing any single element breaks the interestingness. Not globally minimal (that's NP-hard), but small enough.

Structured shrinking — faster for known formats

Bytes-level ddmin on JSON wastes time producing syntactically invalid JSON. Shrink at the grammar level:

Hypothesis/QuickCheck do this automatically for inputs they generated. For external inputs, do it manually.

Worked example

Crash input (from fuzzer):

{"users": [
  {"id": 1, "name": "alice", "tags": ["a", "b", "c"], "meta": {"x": 1, "y": 2}},
  {"id": 2, "name": "bob", "tags": [], "meta": {"x": null}},
  {"id": 3, "name": "", "tags": ["d"], "meta": {}}
]}

Interestingness: parser.parse(j) raises KeyError: 'y'.

Structured shrink:

Remove users[0] → still crashes? No. Keep users[0].
Remove users[1] → still crashes? Yes. Drop it.
Remove users[2] → still crashes? Yes. Drop it.

Now: {"users": [{"id": 1, "name": "alice", "tags": ["a","b","c"], "meta": {"x": 1, "y": 2}}]}

Remove "tags" from users[0] → Yes. Drop.
Remove "name" → Yes. Drop.
Remove "id" → Yes. Drop.
Remove "meta" → No. Keep. ← aha
Remove "x" from meta → No. Keep.
Remove "y" from meta → Yes. Drop.

Now: {"users": [{"meta": {"x": 1}}]}

Replace 1 with 0 → Yes. {"users": [{"meta": {"x": 0}}]}
Can we simplify further? Remove the outer "users"? No — parser expects it. Done.

Minimal: {"users": [{"meta": {"x": 0}}]}

The bug: parser assumes meta has "y" when "x" is present. 50KB → 30 bytes. The bug is now obvious.

Tools

Flaky interestingness

If the bug is non-deterministic (race condition, uninitialized memory), interesting() flips between True and False on the same input. Reduction goes haywire.

Fix: require N-out-of-M successes:

def interesting(inp):
    crashes = sum(1 for _ in range(10) if _run_once(inp) == CRASH)
    return crashes >= 3   # crashes at least 30% of the time

Slower (10× the runs), but stable.

Do not

Do not use "any error" as the interestingness test. You'll reduce to a trivially malformed input that triggers a different (boring) error.
Do not skip verifying the final reduced input. Run it manually, confirm it's the same bug.
Do not reduce on a machine different from where the bug was found. Environment-dependent bugs (locale, filesystem, libc version) might not reproduce.
Do not stop at "small enough to read." Keep going to 1-minimal — the last few removals often reveal exactly which field matters.
Do not throw away intermediate reductions. If the full reduction takes hours and the machine dies, you want the 80%-reduced checkpoint.

Output format

## Original input
Size: <bytes/lines/elements>
<truncated sample>

## Interestingness test
<exact predicate — error type, stack frame, message pattern>
Stability: <deterministic | N-of-M with N, M>

## Reduction trace
| Step | Operation | Size after | Still interesting |
| ---- | --------- | ---------- | ----------------- |

## Minimal input
Size: <bytes/lines/elements>  (reduction: <%>  from original)
<full input>

## Verified
<ran minimal input N times, triggered the bug M times>

## Bug hypothesis
<what the minimal input reveals about the bug>

Related Skills

santosomar/verified-pseudocode-extractor

development

VerifiedTrustedCommunity

Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.

SKILL.mdUpdated Apr 13, 2026

santosomar/verified-pseudocode-extractor

santosomar/tlaplus-spec-generator

development

VerifiedTrustedCommunity

Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.

SKILL.mdUpdated Apr 13, 2026

santosomar/tlaplus-spec-generator

santosomar/tlaplus-model-reduction

testing

VerifiedTrustedCommunity

Reduces a TLA+ model so TLC can actually check it — shrinks constants, adds state constraints, abstracts data, or applies symmetry — when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.

SKILL.mdUpdated Apr 13, 2026

santosomar/tlaplus-model-reduction

santosomar/tlaplus-guided-code-repair

development

VerifiedTrustedCommunity

TLA+-specific instance of model-guided repair — reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.

SKILL.mdUpdated Apr 13, 2026

santosomar/tlaplus-guided-code-repair

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/santosomar/general-secure-coding-agent-skills.git

# Copy into Claude Code skills folder (global)
cp -r general-secure-coding-agent-skills/skills/testing/test-case-reducer ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

santosomar/general-secure-coding-agent-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT