santosomar/counterexample-to-test-generator

Name: counterexample-to-test-generator
Author: santosomar

skills/verification/counterexample-to-test-generator/SKILL.md

npx skillsauth add santosomar/general-secure-coding-agent-skills counterexample-to-test-generator

Counterexample → Test Generator

A model checker trace is a proof that the bug exists — in the model. A test is a proof it exists in the code. The translation is: map each model action back to a code operation, sequence them, and assert the invariant.

This is the model-checking analogue of → bug-reproduction-test-generator.

The mapping problem

| Model trace element | Test element |
| --- | --- |
| Initial state | Test setup / fixture |
| Action(p) firing | Call the code function that Action models, as process p |
| Interleaving of actions | Thread scheduling, controlled or forced |
| Final (violating) state | Assertion on the corresponding code state |
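One way to make the mapping table concrete is an action table keyed by action name. The sketch below is illustrative only: the `Step` and `system` types and the `replay` helper are hypothetical names, and the Read/Write bodies are borrowed from the lost-update counter used in the worked example.

```go
package main

import "fmt"

// Step is one fired action from the trace: the action name and the process
// it fired for. (Hypothetical type, introduced for this sketch.)
type Step struct {
	Action  string
	Process string
}

// system is a hypothetical stand-in for the code under test.
type system struct {
	counter int
	tmp     map[string]int
}

// replay builds the fixture (initial state), runs each trace step through
// the action table in order, and returns the final code state.
func replay(trace []Step) (*system, error) {
	s := &system{tmp: map[string]int{}}
	actions := map[string]func(p string){
		"Read":  func(p string) { s.tmp[p] = s.counter },
		"Write": func(p string) { s.counter = s.tmp[p] + 1 },
	}
	for _, st := range trace {
		fn, ok := actions[st.Action]
		if !ok {
			return nil, fmt.Errorf("no code mapping for action %q", st.Action)
		}
		fn(st.Process)
	}
	return s, nil
}

func main() {
	trace := []Step{{"Read", "p1"}, {"Read", "p2"}, {"Write", "p1"}, {"Write", "p2"}}
	s, err := replay(trace)
	if err != nil {
		panic(err)
	}
	// Final (violating) state becomes an assertion on the code state.
	fmt.Println("counter =", s.counter, "invariant holds:", s.counter == 2)
}
```

An unmapped action returns an error instead of being skipped, so a gap in the mapping shows up as a test setup failure rather than a misleading pass.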

The hard part is interleaving. The model checker found a specific schedule that breaks things. Your test has to reproduce that schedule.

Controlling the schedule

| Determinism level | Technique | Reliability |
| --- | --- | --- |
| Full control | Single-threaded simulation: call each step explicitly in order | 100% (preferred) |
| Partial control | Barriers/latches between steps to force ordering | High if used carefully |
| No control | Run threads concurrently many times, hope for the interleaving | Flaky (last resort) |

Always try single-threaded first. If the model actions are Read(p1), Read(p2), Write(p1), Write(p2), you don't need real threads — just call p1.read(); p2.read(); p1.write(); p2.write(); in sequence. The interleaving that matters is the order of operations on shared state, not actual OS threads.

Step-by-step

1. Extract the action sequence from the trace. Ignore state dumps; you want the action names and parameters between states.
2. Map each action to a code call. Use the mapping from → program-to-tlaplus-spec-generator (or equivalent extraction).
3. Identify shared state. What model variables correspond to real shared objects? Those need to be set up in the fixture.
4. Sequence the calls. One after another, in trace order. If an action was \E p : Foo(p), use the p TLC chose.
5. Translate the invariant. Inv == counter = N becomes assert counter == N.
6. Run it. If the test passes, your model doesn't match your code (see below). If it fails, good: you have a repro.

Worked example

TLC trace (from the lost-update counter):

```
State 1: counter=0, tmp=[p1|->0, p2|->0], pc=[p1|->"start", p2|->"start"]
State 2: <Read(p1)>    counter=0, tmp=[p1|->0, p2|->0], pc=[p1|->"write", p2|->"start"]
State 3: <Read(p2)>    counter=0, tmp=[p1|->0, p2|->0], pc=[p1|->"write", p2|->"write"]
State 4: <Write(p1)>   counter=1, ...
State 5: <Write(p2)>   counter=1    ← Inv violated: 2 increments, counter should be 2
```

Action sequence: Read(p1), Read(p2), Write(p1), Write(p2).

Code mapping:

Read(p) ↔ first half of increment(): tmp = counter
Write(p) ↔ second half of increment(): lock(); counter = tmp + 1; unlock()

Problem: increment() is one function in code, two actions in the model. To reproduce the interleaving, split it:

Test (Go):

```go
package lostupdate // package name is illustrative

import (
	"sync"
	"testing"
)

// TestLostUpdate_FromTLC replays the TLC trace
// Read(p1), Read(p2), Write(p1), Write(p2) on a single goroutine.
func TestLostUpdate_FromTLC(t *testing.T) {
	counter := 0
	var mu sync.Mutex

	// increment() is split into its two atomic halves.
	tmp1 := counter // Read(p1): reads 0
	tmp2 := counter // Read(p2): reads 0

	mu.Lock()
	counter = tmp1 + 1 // Write(p1): counter=1
	mu.Unlock()

	mu.Lock()
	counter = tmp2 + 1 // Write(p2): counter=1
	mu.Unlock()

	// Invariant: two increments → counter == 2
	if counter != 2 {
		t.Fatalf("lost update: 2 increments, counter=%d (TLC trace reproduced)", counter)
	}
}
```

Test fails with counter=1. Bug reproduced, deterministically, no real threads.

When the test doesn't reproduce

The trace violated the model invariant but the test passes. Three reasons:

| Cause | How to tell | Fix |
| --- | --- | --- |
| Model is more abstract than code | Code has a check the model doesn't | Fix the model; it was over-approximating |
| Wrong code mapping | You called the wrong function for an action | Re-check the extraction mapping |
| Atomicity mismatch | Model treats X as one step; code has finer interleaving | Refine the model's actions |

A test that doesn't reproduce is good news: the discrepancy lies in the model or in the mapping, not in the code. Fix that first, then re-run the model checker.

Do not

- Do not use real threads when a sequential simulation reproduces the interleaving. Real threads make the test flaky and slow.
- Do not assert the entire final state. Assert the invariant. Other state may legitimately differ between model and code.
- Do not skip the test because the model "already proved it." The model is an abstraction, and the test confirms the abstraction was faithful.
- Do not leave the test in the suite after the fix without checking that its outcome has flipped. The invariant assertion itself stays the same; post-fix it should pass (counter == 2), and the test becomes a regression guard.

Output format

```
## Action sequence (from trace)
1. <Action(params)>
2. ...

## Code mapping
<Action> ↔ <function/code region>

## Schedule strategy
<single-threaded simulation | barriers | stress loop> — <why>

## Test
<code block — executable>

## Expected
FAIL before fix: <what assertion trips>
PASS after fix:  <same assertion, now holds>
```

Converts a model checker counterexample trace into an executable test case in the source language, so the bug found in the model is reproducible (and regression-guarded) in the real code. Use when TLC/NuSMV/Spin finds a violation and you want a failing test before writing the fix.

development

Updated Apr 13, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy (container and dependency vulnerability scanner): Clean, 95%
- Semgrep (static code analysis for vulnerabilities): Clean, 95%
- mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
- Snyk (dep) (open source security scanning): Skipped, 50%
- Socket.dev (supply chain security analysis): Skipped, 50%
- VirusTotal (multi-engine malware detection): Skipped, 50%
- CrowdStrike (advanced threat intelligence): Skipped, 50%
- OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 13, 2026, 4:39 AM (26.6s, 1 file scanned)

SKILL.md

name: counterexample-to-test-generator
description: Converts a model checker counterexample trace into an executable test case in the source language, so the bug found in the model is reproducible (and regression-guarded) in the real code. Use when TLC/NuSMV/Spin finds a violation and you want a failing test before writing the fix.
license: Apache-2.0
category: verification
suite: general-secure-coding-agent-skills
version: 0.3.0
related: counterexample-debugger, bug-reproduction-test-generator, model-guided-code-repair

Related Skills

santosomar/verified-pseudocode-extractor (development)

Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.

SKILL.md, updated Apr 13, 2026

santosomar/tlaplus-spec-generator (development)

Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.

SKILL.md, updated Apr 13, 2026

santosomar/tlaplus-model-reduction (testing)

Reduces a TLA+ model so TLC can actually check it (shrinks constants, adds state constraints, abstracts data, or applies symmetry) when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.

SKILL.md, updated Apr 13, 2026

santosomar/tlaplus-guided-code-repair (development)

TLA+-specific instance of model-guided repair: reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.

SKILL.md, updated Apr 13, 2026

Install manually

```shell
# Clone the repo
git clone https://github.com/santosomar/general-secure-coding-agent-skills.git

# Copy into Claude Code skills folder (global)
cp -r general-secure-coding-agent-skills/skills/verification/counterexample-to-test-generator ~/.claude/skills/
```

Repository

santosomar/general-secure-coding-agent-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT