skills/verification/python-to-dafny-translator/SKILL.md
Translates Python functions into Dafny, adding types, pre/postconditions, and loop invariants sufficient for Dafny to verify. Use when formally verifying a Python algorithm, when the user wants machine-checked correctness for a critical function, or when building a verified reference implementation.
npx skillsauth add santosomar/general-secure-coding-agent-skills python-to-dafny-translatorInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Dafny is a verification-aware language: you write code + spec together, and the verifier checks the code meets the spec. Translating from Python means adding everything Dafny needs that Python omits: types, termination measures, loop invariants, and a precise contract.
| Python has | Dafny needs | You must supply |
| ----------------------------- | ----------------------------------- | -------------------------------------------------------- |
| Duck typing | Static types | Type annotations — infer from usage if not annotated |
| Unbounded int | Unbounded int — ✅ matches | Nothing. This is the easy one. |
| list (mutable, heterogeneous) | seq<T> (immutable, homogeneous) or array<T> (mutable, heap) | Choose: algorithm is functional → seq. Mutates in place → array + frame conditions. |
| dict | map<K,V> (immutable) | Usually fine — most dict uses are read-mostly |
| for x in xs | while i < |xs| with index | Rewrite as indexed loop; Dafny's forall is a quantifier, not a loop |
| Exceptions | No exceptions | Encode as Result datatype, or as a precondition that rules out the failing case |
| Implicit termination | Explicit decreases clause | Find the measure — usually the loop bound or recursion arg size |
| No spec | requires/ensures | The hard part. See below. |
The contract has to come from somewhere. In priority order:
ensures.binary_search, sort, gcd) → use the textbook postcondition.ensures/requires.| Python construct | Dafny translation |
| ----------------------------- | ------------------------------------------------------------------ |
| def f(x): ... return y | method f(x: T) returns (y: U) or function f(x: T): U if pure |
| x = []; x.append(...) | var x: seq<T> := []; x := x + [elem]; |
| for i in range(n): | var i := 0; while i < n { ...; i := i + 1; } |
| for x in xs: | var i := 0; while i < \|xs\| { var x := xs[i]; ...; i := i + 1; } |
| if c: return a; return b | if c { y := a; } else { y := b; } (in a method) |
| x[i] = v (list) | If array: x[i] := v;. If seq: x := x[..i] + [v] + x[i+1..]; |
| raise ValueError | Precondition requires <not-that-case> — or return Failure(msg) |
| a // b | a / b (Dafny / on int is floor for positives — sign semantics differ for negatives, add requires b > 0) |
Every while needs invariants strong enough that invariant ∧ ¬guard ⟹ postcondition.
Pattern — accumulator loops: The invariant says "the accumulator holds the result for the prefix processed so far."
// Python: total = sum(xs)
var i, total := 0, 0;
while i < |xs|
invariant 0 <= i <= |xs|
invariant total == Sum(xs[..i]) // "correct so far"
decreases |xs| - i
{ total := total + xs[i]; i := i + 1; }
Pattern — search loops: The invariant says "the thing isn't in the part we've already looked at."
Pattern — two-pointer: Two index-bound invariants plus the relationship between them.
If you can't find the invariant: → invariant-inference.
Python:
def index_of(xs: list[int], target: int) -> int:
"""Return index of target in xs, or -1."""
for i, x in enumerate(xs):
if x == target:
return i
return -1
Dafny:
method IndexOf(xs: seq<int>, target: int) returns (r: int)
ensures r == -1 ==> target !in xs
ensures r >= 0 ==> 0 <= r < |xs| && xs[r] == target
{
var i := 0;
while i < |xs|
invariant 0 <= i <= |xs|
invariant target !in xs[..i] // not found in the prefix
decreases |xs| - i
{
if xs[i] == target { return i; }
i := i + 1;
}
return -1;
}
Verifies. The target !in xs[..i] invariant plus loop exit (i == |xs|) gives target !in xs, which discharges the r == -1 postcondition.
| Python feature | Why it's hard | Workaround |
| ----------------------------- | ---------------------------------------------------- | ------------------------------------------- |
| Deep mutation of aliased refs | Dafny's heap model needs modifies frames; aliasing blows them up | Rewrite to functional style with seq |
| **kwargs, *args | No variadic in Dafny | Fix arity for the specific call you're verifying |
| Dynamic dispatch / isinstance | Dafny has traits but dispatch is resolved at verification | One translated version per concrete type |
| I/O, network, filesystem | Dafny has no I/O model | Model as inputs/outputs; verify the pure core |
| Float | Dafny has real (mathematical), not IEEE 754 | If IEEE semantics matter, this is the wrong tool |
Flag these. A translation with assume false hidden where the hard part is is worse than no translation.
ensures true always verifies and proves nothing.assume to paper over a hard invariant. assume is a trust hole — mark it LOUDLY and justify it.decreases * means "I gave up on proving this terminates" — only acceptable if the function genuinely might not.## Spec (extracted from)
<docstring | algorithm name | tests | user-provided>
## Dafny
<code block — must verify clean or failures must be annotated>
## Trust holes
<every `assume` with justification, or "none">
## Untranslated
<Python constructs with no Dafny equivalent — and what you did instead>
development
Extracts human-readable pseudocode from a verified formal artifact (Dafny, Lean, TLA+) while preserving the verified properties as annotations, so the proof-carrying logic can be reimplemented in a production language. Use when porting verified code to an unverified target, when documenting what a formal spec actually does, or when handing a verified algorithm to an implementer.
development
Translates natural-language or pseudocode descriptions of concurrent and distributed systems into TLA+ specifications ready for the TLC model checker. Identifies state variables, actions, type invariants, safety properties, and liveness properties from the description. Use when formalizing a protocol, when the user describes a distributed algorithm to verify, when designing a consensus or locking scheme, or when starting formal verification of a concurrent system.
testing
Reduces a TLA+ model so TLC can actually check it — shrinks constants, adds state constraints, abstracts data, or applies symmetry — when the state space is too large to enumerate. Use when TLC runs out of memory, when checking takes hours, or when a spec works at N=2 and you need confidence at larger scale.
development
TLA+-specific instance of model-guided repair — reads a TLC error trace, identifies the enabling condition that should have been false, strengthens the corresponding action, and maps the fix to source code. Use when TLC reports an invariant violation or deadlock and you have the code-to-TLA+ mapping from extraction.