santosomar/semantic-equivalence-verifier

Name: semantic-equivalence-verifier
Author: santosomar

skills/code-quality/semantic-equivalence-verifier/SKILL.md

npx skillsauth add santosomar/general-secure-coding-agent-skills semantic-equivalence-verifier

Semantic Equivalence Verifier

→ behavior-preservation-checker tests on a sample; this skill proves over the full input space. Use when "we ran it on 500 inputs" isn't enough.

When proof beats testing

| Situation | Why testing fails |
| --- | --- |
| Input space is infinite and adversarial | Attacker picks the input you didn't test (crypto, parsers, sanitizers) |
| Rare edge case matters (overflow, boundary) | 1-in-2³² inputs; fuzzing takes forever |
| Regulatory / safety requirement | "We tested it" isn't certifiable evidence |
| The transformation is algebraic | `x*2 ↔ x<<1` is easy to prove, tedious to test |
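To put a number on the "rare edge case" row: assume exactly one of the 2³² inputs exposes the difference and that test inputs are drawn uniformly at random with replacement. Both are assumptions of this sketch, and hit_probability is a hypothetical helper name. Under those assumptions, the chance that 500 samples hit the bad input is roughly 1.2 in ten million.

```python
from fractions import Fraction

def hit_probability(bad_inputs: int, space_size: int, samples: int) -> Fraction:
    """Chance that at least one of `samples` uniform draws (with replacement)
    lands on one of the `bad_inputs` inputs that expose the difference."""
    miss_one = Fraction(space_size - bad_inputs, space_size)
    return 1 - miss_one ** samples

# One bad input out of 2**32, 500 random tests.
p = hit_probability(bad_inputs=1, space_size=2**32, samples=500)
print(float(p))
```

Exact fractions avoid the rounding that would swamp a probability this close to zero if it were computed directly in floating point.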

Proof strategies — pick by code shape

| Shape | Strategy |
| --- | --- |
| Straight-line, no loops | Symbolic execution → discharge old_out == new_out with an SMT solver |
| Loop, both versions have same structure | Show loop bodies equivalent + same termination → induction |
| Loop, restructured (unrolled, fused) | Find a simulation relation between iteration states |
| Recursive | Structural induction on the recursion argument |
| Different algorithms, same spec | Can't prove A≡B directly. Prove A⊨spec ∧ B⊨spec separately. This yields A≡B only when the spec determines the output uniquely. |
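The "different algorithms, same spec" row can be illustrated with two sorting routines. The sketch below is a bounded exhaustive check, not a proof: it enumerates every list up to a small length over a small alphabet and checks each implementation against the spec on its own. All names, the alphabet, and the length bound are illustrative assumptions. Because "ordered permutation of the input" pins down a single output for integers, both implementations meeting the spec implies they agree with each other on the checked inputs.

```python
from collections import Counter
from itertools import product

def insertion_sort(xs):
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:   # walk left to the insertion point
            i -= 1
        out.insert(i, x)
    return out

def merge_sort(xs):
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

def satisfies_spec(inp, out):
    """Spec: the output is ordered and is a permutation of the input."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(inp) == Counter(out)

def check_against_spec(impl, alphabet=(0, 1, 2), max_len=4):
    """Return the first input on which impl violates the spec, or None."""
    for n in range(max_len + 1):
        for inp in product(alphabet, repeat=n):
            if not satisfies_spec(list(inp), impl(list(inp))):
                return list(inp)
    return None

print(check_against_spec(insertion_sort), check_against_spec(merge_sort))
```

A full proof would replace the enumeration with an inductive argument per implementation; the structure (one obligation per implementation, both against the same spec) stays the same.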

Step-by-step — SMT-backed equivalence (common case)

1. Translate both fragments to a common logical form. Same input variable names, same state model.
2. Assert the preconditions: assume(pre(x)).
3. Assert inequivalence of outputs: assert(old_out != new_out).
4. Solve. If UNSAT → no input satisfying the precondition makes them differ → equivalent under that precondition. If SAT → the model is a counterexample.
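The shape of that query can be sketched without a solver by shrinking the bit width until enumeration is feasible. This is brute force standing in for the SMT call, not what a solver does internally. check_equivalence, the 8-bit width, and the three toy fragments are assumptions made for illustration.

```python
MASK8 = 0xFF  # model values as 8-bit bitvectors so enumeration is cheap

def check_equivalence(old, new, pre=lambda x: True, width=8):
    """Look for a `width`-bit input that satisfies `pre` and makes old and new differ.

    Returns ("unsat", None) if no such input exists, else ("sat", x).
    """
    for x in range(2 ** width):
        if pre(x) and old(x) != new(x):
            return ("sat", x)
    return ("unsat", None)

def double_mul(x):      # x * 2, wrapped to 8 bits
    return (x * 2) & MASK8

def double_shift(x):    # x << 1, wrapped to 8 bits
    return (x << 1) & MASK8

def double_or(x):       # x | (x << 1): a deliberately wrong rewrite
    return (x | (x << 1)) & MASK8

print(check_equivalence(double_mul, double_shift))
print(check_equivalence(double_mul, double_or))
```

The first call finds no differing input; the second returns a concrete counterexample. Narrowing the precondition changes the verdict: the wrong rewrite is "equivalent" under pre = (x == 0), which is why the precondition belongs in the claim.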

For loops: replace the loop with its invariant. You need the invariant to be the same for both versions (or for one to imply the other).
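As a rough illustration of a shared invariant, take a summation loop and a version unrolled by two. The sketch checks the two usual obligations (the invariant holds on entry, and each loop body preserves it) by enumeration over a bounded range of counters, which is weaker than the inductive proof a verifier would produce. The function names, the bound, and the restriction to even n in the unrolled loop are assumptions of the example.

```python
def inv(i, acc):
    """Shared invariant: acc holds 0 + 1 + ... + (i - 1)."""
    return acc == i * (i - 1) // 2

def old_body(i, acc):
    """One iteration of: while i < n: acc += i; i += 1"""
    return i + 1, acc + i

def new_body(i, acc):
    """One iteration of the loop unrolled by two (assumes n is even)."""
    return i + 2, acc + i + (i + 1)

def invariant_is_inductive(body, bound=1000):
    """Check initiation and consecution for every counter below `bound`."""
    if not inv(0, 0):                   # initiation: holds before the loop
        return False
    for i in range(bound):
        acc = i * (i - 1) // 2          # the only acc for which inv(i, acc) holds
        if not inv(*body(i, acc)):      # consecution: the body preserves inv
            return False
    return True

def run(body, n):
    """Execute the loop concretely, for comparison."""
    i, acc = 0, 0
    while i < n:
        i, acc = body(i, acc)
    return acc

print(invariant_is_inductive(old_body), invariant_is_inductive(new_body))
print(run(old_body, 10), run(new_body, 10))
```

Once both loops maintain the same invariant, the exit condition i == n gives acc == n(n-1)/2 for each, so the outputs agree without comparing the loops iteration by iteration.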

Worked example

Old:

int abs_old(int x) {
    if (x < 0) return -x;
    return x;
}

New (branchless):

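int abs_new(int x) {
    /* Assumes 32-bit int and an arithmetic right shift
       (implementation-defined in C for negative x). */
    int mask = x >> 31;          /* 0 if x >= 0, -1 (all ones) if x < 0 */
    return (x + mask) ^ mask;    /* x if mask == 0; ~(x - 1) == -x if mask == -1 */
}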

Encode (SMT-LIB, bitvector theory):

(declare-const x (_ BitVec 32))
(define-fun old () (_ BitVec 32)
  (ite (bvslt x #x00000000) (bvneg x) x))
(define-fun new () (_ BitVec 32)
  (let ((mask (bvashr x #x0000001f)))
    (bvxor (bvadd x mask) mask)))
(assert (not (= old new)))
(check-sat)

Solve: unsat. Equivalent for all 32-bit inputs. ∎

But wait: what about INT_MIN? -INT_MIN overflows in C (UB). The SMT encoding uses modular arithmetic, so bvneg(INT_MIN) = INT_MIN, and both versions return INT_MIN. They are equivalent, including at the UB point, because they both return the wrong thing. The proof shows equivalence in the bitvector model, not correctness. If you need correctness, exclude that input with (assert (not (= x #x80000000))) before the check, or prove each version against the spec result ≥ 0.
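The INT_MIN behaviour can be reproduced by modelling the two bitvector terms directly. The sketch below mirrors the SMT-LIB encoding with masked Python integers. The helper names and the 8-bit exhaustive run are assumptions added for illustration, and the model says nothing about what a C compiler does at the UB point.

```python
def to_signed(v, width=32):
    """Interpret a width-bit pattern as a two's-complement integer."""
    v &= (1 << width) - 1
    return v - (1 << width) if v >> (width - 1) else v

def abs_old_bv(x, width=32):
    """ite(x <s 0, bvneg x, x) on width-bit patterns."""
    all_ones = (1 << width) - 1
    x &= all_ones
    return (-x) & all_ones if to_signed(x, width) < 0 else x

def abs_new_bv(x, width=32):
    """bvxor(bvadd(x, mask), mask) with mask = bvashr(x, width - 1)."""
    all_ones = (1 << width) - 1
    x &= all_ones
    mask = all_ones if x >> (width - 1) else 0   # sign bit smeared across the word
    return ((x + mask) & all_ones) ^ mask

INT_MIN_BITS = 0x80000000

# Both terms map INT_MIN to INT_MIN: equivalent, and both violate result >= 0.
print(hex(abs_old_bv(INT_MIN_BITS)), hex(abs_new_bv(INT_MIN_BITS)))

# Exhaustive at 8 bits: equivalent everywhere, non-negative everywhere but one input.
equivalent_8bit = all(abs_old_bv(x, 8) == abs_new_bv(x, 8) for x in range(256))
spec_violations = [x for x in range(256) if to_signed(abs_old_bv(x, 8), 8) < 0]
print(equivalent_8bit, [hex(x) for x in spec_violations])
```

The 8-bit run shows the two theorems side by side: the equivalence claim holds on all 256 inputs, while the correctness claim fails on exactly the minimum value.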

When proof fails

| Failure | What it means | Next move |
| --- | --- | --- |
| SAT: solver found a counterexample | They're NOT equivalent | Look at the model. Is it a real input, or outside the precondition you forgot to state? |
| Unknown / timeout | Solver couldn't decide | Add lemmas; break into smaller pieces; try a different solver |
| Can't encode the construct | Heap, I/O, unbounded recursion | Abstract it: model the heap as an uninterpreted function; axiomatize I/O |
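A small case of the first row: replacing a signed divide-by-two with a right shift. The sketch models C-style truncating division and an arithmetic shift on Python integers and searches a small signed domain by enumeration. The domain, the helper names, and the choice of rewrite are assumptions of the example. Without a precondition the search returns a counterexample; with x ≥ 0 stated, it finds none.

```python
def c_div2(x):
    """C-style x / 2: integer division that truncates toward zero."""
    return -((-x) // 2) if x < 0 else x // 2

def shift_right1(x):
    """Arithmetic x >> 1: rounds toward negative infinity."""
    return x >> 1

def find_counterexample(old, new, pre, domain):
    """First x in domain with pre(x) and old(x) != new(x), or None."""
    for x in domain:
        if pre(x) and old(x) != new(x):
            return x
    return None

domain = range(-128, 128)   # small signed domain, chosen for the sketch

no_pre = find_counterexample(c_div2, shift_right1, lambda x: True, domain)
with_pre = find_counterexample(c_div2, shift_right1, lambda x: x >= 0, domain)
print(no_pre, with_pre)
```

Whether the negative odd counterexample is "real" depends on the caller: if negative values can reach the code, the rewrite is a bug; if they cannot, the missing precondition is the bug in the claim.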

Limits — when to fall back to testing

Heavy heap mutation → state space explodes
Floating point → solvers handle it but slowly and with surprising edge cases (NaN, -0.0)
External calls → can't symbolically execute what you don't have the body of
Code > ~500 lines → decompose, or give up on a full proof and fall back to → behavior-preservation-checker with heavy fuzzing
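The floating-point edge cases are easy to trip over even in a one-line rewrite. The sketch below compares x + 0.0 with x, a rewrite chosen here as an illustration, under two notions of equality: the == operator and the raw IEEE 754 bit pattern. The helper name bits is hypothetical.

```python
import math
import struct

def bits(x: float) -> int:
    """Raw IEEE 754 binary64 bit pattern of x."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def old(x: float) -> float:
    return x + 0.0

def new(x: float) -> float:      # the tempting simplification
    return x

neg_zero = -0.0
nan = float("nan")

# -0.0: equal under ==, different bit patterns (old yields +0.0).
print(old(neg_zero) == new(neg_zero), bits(old(neg_zero)) == bits(new(neg_zero)))

# NaN: both results are NaN, yet == reports them as different.
print(math.isnan(old(nan)) and math.isnan(new(nan)), old(nan) == new(nan))
```

An equivalence claim over floats therefore has to say which equality it uses. Under ==, NaN inputs make even identical functions look different; under bit equality, the -0.0 case makes this rewrite unsound.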

Do not

Do not claim equivalence without stating the precondition. "Equivalent for all inputs" and "equivalent for non-null inputs" are different theorems.
Do not confuse "equivalent" with "correct." Two functions can be identically wrong. Equivalence is A↔B; correctness is A↔spec.
Do not trust a proof of a hand-translation. If you translated the code to SMT-LIB by hand, the bug might be in the translation. Use a trusted frontend (→ python-to-dafny-translator, CBMC's C frontend) when possible.

Output format

## Claim
old: <fragment>
new: <fragment>
Equivalent under precondition: <P(x)>

## Proof strategy
<SMT | induction | simulation>

## Result
<PROVEN EQUIVALENT | COUNTEREXAMPLE: x=<val>, old=<o>, new=<n> | UNKNOWN — <why>>

## Trust base
<what the proof relies on: solver correctness, translation fidelity, axioms assumed>

Proves two program fragments semantically equivalent using symbolic reasoning — stronger than testing, applicable when differential testing is insufficient or impossible. Use when behavior preservation must be proven rather than sampled, when the input space is too large to enumerate, or when a transformation needs a correctness argument.

development

Updated Apr 13, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Purpose | Status | Reported % |
| --- | --- | --- | --- |
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 13, 2026, 4:07 AM (149.5s, 1 file scanned)

SKILL.md

name: semantic-equivalence-verifier
description: Proves two program fragments semantically equivalent using symbolic reasoning; stronger than testing, and applicable when differential testing is insufficient or impossible. Use when behavior preservation must be proven rather than sampled, when the input space is too large to enumerate, or when a transformation needs a correctness argument.
license: Apache-2.0
category: code-quality
suite: general-secure-coding-agent-skills
version: 0.3.0
related: behavior-preservation-checker, python-to-dafny-translator

Related Skills

santosomar/verified-pseudocode-extractor

development · Updated Apr 13, 2026

Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.

santosomar/tlaplus-spec-generator

development · Updated Apr 13, 2026

Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.

santosomar/tlaplus-model-reduction

testing · Updated Apr 13, 2026

Reduces a TLA+ model so TLC can actually check it (shrinks constants, adds state constraints, abstracts data, or applies symmetry) when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.

santosomar/tlaplus-guided-code-repair

development · Updated Apr 13, 2026

TLA+-specific instance of model-guided repair: reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.

Install manually

# Clone the repo
git clone https://github.com/santosomar/general-secure-coding-agent-skills.git

# Copy into Claude Code skills folder (global)
cp -r general-secure-coding-agent-skills/skills/code-quality/semantic-equivalence-verifier ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

santosomar/general-secure-coding-agent-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT