skills/formal-verify/SKILL.md
Continuous formal verification of architectural constraints and code quality. Use when asked to verify, audit, or validate codebase integrity. Runs automatically via hooks on every edit (structural) and pre-commit (full). Catches ownership violations, boundary crossings, state machine bugs, and code smells that grep ratchets miss. Triggers: "verify", "formal verify", "check architecture", "audit code quality", "run verification", "/verify", "/verify --bootstrap", "/verify --grade".
npx skillsauth add petekp/claude-skills formal-verifyInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Use this skill when architectural intent matters more than "it compiles."
This skill runs a three-layer verification loop:
The layers are intentionally tiered:
/verify: all three layersBootstrap a target project with:
/verify --bootstrap
Bootstrap runs four phases:
.verifier//verify
Runs all layers in verbose mode and prints a unified report./verify --bootstrap
Installs dependencies, creates .verifier/, and scaffolds the first rule set./verify --evolve
Checks for drift between architectural docs and existing verification specs./verify --grade
Runs Layer 3 only and reports the current elegance grade.The runner extracts facts from Rust and Swift source files, then checks
structural.yaml rules such as:
Structural checks are the default PostToolUse hook because they are the fastest.
Behavioral verification covers state transitions and protocol contracts:
Use this layer at slice checkpoints, before risky merges, and whenever a change touches coordination logic or cross-language contracts.
Elegance auditing scores code for:
It produces a grade and line-level deductions so the agent can clean up code, not just make it technically correct.
When a violation is found, tailor the output to the audience:
If the agent fails to resolve the same violation three times, stop the fix loop and escalate with:
Bootstrap creates and maintains:
.verifier/
├── structural.yaml
├── elegance.yaml
├── specs/
├── facts/
└── reports/
structural.yaml stores declarative Layer 1 ruleselegance.yaml stores thresholds and grade policyspecs/ stores Z3Py and TLA+ behavioral specsfacts/ caches extracted AST factsreports/ stores the most recent verification outputsfacts/ and reports/ should be gitignored in the target project.
/verify before claiming a migration is complete./verify --grade when the code is correct but still feels rough.SKILL.md focused on orchestration; pull detailed mechanics from the
references below.@references/layer1-structural.md
Fact extraction, Z3 encoding, reachability, and incremental invalidation.@references/layer2-behavioral.md
When to use TLA+/Apalache versus Z3Py, plus spec execution contracts.@references/layer3-elegance.md
Metric families, grading, thresholds, and the Layer 3 sub-module layout.@references/constraint-yaml-spec.md
Structural rule schema, selectors, assertions, and fact pattern operators.@references/bootstrap-process.md
The install, discover, interview, validate bootstrap workflow.@references/agent-feedback-loop.md
Hook integration, violation injection, retries, and escalation policy.@references/spec-authoring-guide.md
Translating plain-English architectural intent into formal specs.tools
Comprehensively manually test the Circuit plugin's user-facing surface in either Claude Code or Codex. Use this skill whenever the user asks to "manually test Circuit", "QA the Circuit plugin", "exercise the Circuit surface", "run the Circuit checklist", "smoke test Circuit", "find regressions in Circuit", "test the Claude Circuit plugin", "test the Codex Circuit plugin", or when preparing a Circuit release for marketplace publication. Argument is the host package to test — `claude` or `codex`. Produces a Markdown report with per-command pass/fail, exploratory findings ranked by severity, run-folder evidence links, and a concise terminal summary. Use even if the user does not say the word "test" — phrases like "go through every Circuit command" or "make sure Circuit still works end-to-end" should also trigger.
development
Turn the prompt supplied with this skill into a concise, auditable Codex Goal or explain why a Goal is not the right fit. Use when the user asks to draft, formulate, rewrite, tighten, or create a `/goal` from a plain-language task, especially for multi-step work that needs a durable objective, evidence-based completion, constraints, iteration policy, and a default adversarial review loop.
development
Give the human a fast, plain-English catch-up on what changed in the project: what the agents did, why, and what decisions need their input. Use this whenever the user asks to "catch me up", "what changed", "where are we", "recap", "brief me", "give me the rundown", "what did you do", "summarize the session", "fill me in", or otherwise signals they have been away and want to get back up to speed quickly. Built for someone steering several agent-driven projects at once who does not read the code closely but needs to grasp the core ideas, the choices made, and the open decisions well enough to steer. Trigger even if they do not use these exact words: any request to get oriented on recent progress should use this skill.
tools
Expert Unix and macOS systems engineer for shell scripting, system administration, command-line tools, launchd, Homebrew, networking, and low-level system tasks. Use when the user asks about Unix commands, shell scripts, macOS system configuration, process management, or troubleshooting system issues.