skills/debugging/counterexample-debugger/SKILL.md
Interprets and explains counterexamples produced by model checkers or property-based testing tools to make them actionable. Use when TLC, NuSMV, CBMC, or a property-based test emits a counterexample the user doesn't understand, when a trace is too long to read, or when mapping a model-level trace back to source code.
npx skillsauth add santosomar/general-secure-coding-agent-skills counterexample-debuggerInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
A counterexample is a proof that a property fails — a concrete sequence of states that leads to the violation. The trace is correct by construction; the job is to make it understandable.
| Tool | Counterexample format | Key decoding move |
| ---------------------- | ---------------------------------------- | ------------------------------------------------------ |
| TLC (TLA+) | Numbered states with full variable dumps | Diff consecutive states — only the changes matter |
| NuSMV / nuXmv | State trace + -> State: N.M <- headers | Same: diff states, ignore unchanged vars |
| CBMC / ESBMC | C-level trace with assignments | Map each assignment to a source line |
| Alloy | Instance graph (relational) | Project to the atoms involved in the violated assertion|
| Hypothesis / QuickCheck| Shrunk failing input | The input IS the counterexample — no trace to decode |
Raw traces are unreadable because they show every variable at every step. Compress:
i → i+1, show only variables that changed.(... 5 irrelevant steps ...).Step 3 [ClientSend] beats Step 3.A 40-state TLC trace usually compresses to 5–8 relevant transitions.
After compression, narrate the trace as a story:
Property: Inv == (lock_holder /= NULL) => (waiters = {})
(If someone holds the lock, no one should be waiting — i.e., the lock is supposed to be fair.)
Raw trace: 12 states, 7 variables each. 84 lines.
Compressed:
Step 1 [Init] lock_holder=NULL waiters={} pc=<<idle,idle>>
Step 4 [Acquire(1)] lock_holder=1 ← trigger
Step 7 [Wait(2)] waiters={2} ← point of no return
Step 8 [CheckInv] VIOLATION: lock_holder=1 ∧ waiters={2}
Narrative: Process 1 acquires the lock. Three steps later — before process 1 releases — process 2 enters the wait set. The invariant assumed acquire-release would be atomic from the waiters' perspective. It isn't.
Map to code: Step 7's Wait(2) action corresponds to Lock::wait() at lock.c:44. The check-then-wait in that function isn't guarded.
Liveness counterexamples (<>[]P failures) are lassos: a finite prefix followed by a cycle that repeats forever. TLC marks the loop-back point with Back to state N.
WF_vars(SomeAction). The trace will show an enabled action that never fires.counterexample-to-test-generator consumes the narrative, not the raw trace.## Property violated
<property, restated in English>
## Compressed trace
Step <N> [<action>] <changed vars only> ← <annotation: trigger / PNR / violation>
...
## Narrative
<setup → trigger → point of no return → violation, in prose>
## Source mapping
Step <N> ↔ <file>:<line> (<function>)
## Suggested fix location
<file>:<line> — <what about this code allows step N>
development
Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.
development
Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.
testing
Reduces a TLA+ model so TLC can actually check it — shrinks constants, adds state constraints, abstracts data, or applies symmetry — when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.
development
TLA+-specific instance of model-guided repair — reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.