skills/proof-driven/SKILL.md
Proof-driven development. Use when implementing with formal verification using property-based testing, theorem proving, or proof tactics; zero unproven property policy enforced.
npx skillsauth add outlinedriven/odin-codex-plugin proof-drivenInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Prove properties from requirements before writing code. Proofs guide implementation, not the reverse. Zero unproven properties in final code.
Modern insight (2025): PBT + example tests pairing is the standard -- properties discover edge cases, example tests prevent regressions and serve as documentation. Counterexamples from shrinking should always become permanent regression tests. AI-assisted PBT (Anthropic 2025) can generate properties from docstrings, but human judgment for property selection remains essential.
See frameworks for language-specific PBT and stateful testing tools. See examples for brief property test patterns per language. See formal-tools for theorem provers and bounded model checkers.
| Category | Description | Example |
|----------|-------------|---------|
| Postcondition | Output satisfies contract | sorted(sort(xs)) |
| Invariant | Property preserved by operation | len(xs) == len(sort(xs)) |
| Idempotence | f(f(x)) == f(x) | deduplicate(deduplicate(xs)) |
| Inverse / Round-trip | g(f(x)) == x | decode(encode(x)) == x |
| Model-based | Implementation matches reference | my_sort(xs) == stdlib_sort(xs) |
| Commutativity | Order doesn't matter | a + b == b + a |
| Metamorphic | Relationship between outputs | sin(-x) == -sin(x) |
Most effective (OOPSLA 2025): Model-based properties (~80% bug detection), postconditions (~65%). Least effective: properties that reimplement the logic under test.
Anti-pattern: Don't reimplement the function in the property. Properties should be simpler than the code they test.
Before attempting a proof, reason through the property — SHORT-form KEYWORDS for internal scratchwork, break down the property into hypothesis and assumptions, critically review which proof strategy fits (induction, case analysis, contradiction, construction), validate each strategy against the property structure. Work through the proof step-by-step, verifying each step against the axioms. If a step fails, diagnose why before revising the strategy. For numeric calculation arising in the proof (e.g., bound arithmetic, complexity sums), invoke fend per the baseline rule; never self-calculate. Symbolic reasoning, case enumeration, and induction structure are in-head — they are not arithmetic.
assert fast_check works is not assert my_code worksShrinking transforms a failing complex input into the minimal input that still fails. This is the most valuable feature of PBT frameworks.
| Aspect | PBT | Fuzzing | |--------|-----|---------| | Input generation | Guided by properties | Guided by code coverage | | Oracle | User-written property assertions | Crashes/exceptions/timeouts | | Best for | Correctness, algorithms, contracts | Security, memory safety, crash detection | | Convergence (2025) | Hybrid tools (Bolero, Antithesis) combine both approaches |
Main Property (Goal)
|-- Supporting Property 1
| +-- Helper Property 1a
|-- Supporting Property 2
+-- Edge Case Property 3
| Gate | Pass Criteria | Blocking | |------|---------------|----------| | Framework | PBT framework available and configured | Yes | | Properties | All property tests pass | Yes | | Unproven | Zero skipped/pending properties | Yes | | Coverage | >= 80% line coverage | If present |
| Code | Meaning | |------|---------| | 0 | All properties pass, zero unproven/skipped | | 11 | Property testing framework not available | | 12 | No property tests created | | 13 | Property tests failed or proofs incomplete | | 14 | Coverage gaps (properties missing) |
testing
ODIN's compress-operations dispatcher under the Compressor/Extender role. Invoke on "tidy", "clean up", "tidy this file/memory/workspace/git/docs", or when active context (current file, diff, stack, memory directory) has structural rot to resolve before touching behavior. Detects target domain from context and routes to the sibling skill. Requires explicit target or clear active-context signal — do not invoke speculatively.
development
Cross-domain taste skill — apply distinctive judgment to any artifact (prose, code, design, decisions) instead of converging to AI defaults. Two modes — `audit` (judge work against the two-sided charter and portable anchors) and `anchor` (load register before producing). Auto-detects by phrasing; override via `/taste audit | anchor`. Trigger on "is this slop?", "overkill?", "elegant?", "taste-test this".
tools
One-shot bootstrap of strict-mode tooling per ecosystem plus per-task GOALS.md scaffolding so an agentic loop can self-verify. Writes typechecker/linter/schema-validator config for TS (strict + noUncheckedIndexedAccess + exactOptionalPropertyTypes), Python (Pyright strict, Ruff strict), Rust (Clippy deny-correctness), Go (golangci-lint with staticcheck), OCaml (dune --release); establishes `.agent-tasks/<id>/GOALS.md` per-task convention distinct from project-stable AGENTS.md. C++/Java/Kotlin and framework specifics (Spring Boot, Nest, React-strict) are out of scope. Trigger on new project bootstrap, agentic-task setup, "make this self-verifying", "set the loop's goal", "scaffold goals for this issue". Pairs with `llm-self-loop` runtime.
tools
Install git pre-commit hooks via the project's hook tool — Husky+lint-staged (JS), pre-commit (Python/OCaml), lefthook (Go), cargo-husky (Rust). Use when the user wants commit-time formatting, linting, type-checking, or test gates. Detects ecosystem first.