
Discovery-driven planning for tasks in existing codebases. Explores code, gathers requirements, identifies edge cases, and produces prd.json + plan.json under .prove/runs/<branch>/<slug>/ for the orchestrator.
Create a focused handoff prompt file for clean conversation-level handoffs. Deterministically assembles context from git state and prove artifacts, generates a small LLM pickup note, recommends the right agent, and outputs the exact command to start a fresh session. Use when hitting context limits, transitioning between phases, or wanting a clean restart without losing context.
Clean up all task artifacts (plans, reports, branches, handoff context) with optional archiving to .prove/archive/. Use after a task lifecycle is complete to archive key documents and remove working artifacts.
Iterative code quality audit that loops until clean. Runs the code-steward agent in a bounded fix-audit cycle — first pass gets human approval, subsequent passes auto-fix until the audit returns clean or the iteration cap is hit.
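
The bounded fix-audit cycle described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `run_fix_audit_loop`, the `audit`/`fix`/`approve` callables, and the default cap of 3 are all hypothetical names and values chosen for the example.

```python
from typing import Callable, Dict, List


def run_fix_audit_loop(
    audit: Callable[[], List[str]],
    fix: Callable[[List[str]], None],
    approve: Callable[[List[str]], bool],
    max_passes: int = 3,  # hypothetical default; the real cap is not stated in the source
) -> Dict[str, object]:
    """Alternate audit and fix passes until the audit is clean or the cap is hit."""
    for pass_number in range(1, max_passes + 1):
        findings = audit()
        if not findings:
            # Audit came back clean before using up all passes.
            return {"status": "clean", "fix_passes": pass_number - 1, "remaining": []}
        # Only the first pass is gated on human approval; later passes auto-fix.
        if pass_number == 1 and not approve(findings):
            return {"status": "rejected", "fix_passes": 0, "remaining": findings}
        fix(findings)
    # Cap reached: run one last audit to report what is still open.
    remaining = audit()
    status = "clean" if not remaining else "cap_reached"
    return {"status": status, "fix_passes": max_passes, "remaining": remaining}
```

The final audit after the loop is a design choice of this sketch, so that a run which fixes its last finding on the final pass is still reported as clean.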
Orchestrate a multi-agent round-table discussion among expert subagents who build on each other's ideas across multiple rounds. Use when the user wants diverse expert perspectives on a hard problem -- game design, architecture, debugging, balance, strategy, or any domain where a single viewpoint is insufficient. Also triggers on: "get multiple perspectives", "debate this", "what would experts say", "round-table", "brainstorm with agents", "devil's advocate", or when a problem clearly benefits from structured multi-expert deliberation even if the user doesn't explicitly ask for it.
Interactive brainstorming sessions for software architecture, product scoping, and general engineering. Use when the user wants to explore ideas, gather requirements, narrow down solutions, weigh trade-offs, or make technical decisions. Triggers on "brainstorm", "let's think through", "help me decide", "what approach should", "pros and cons", or any open-ended design/architecture discussion. Saves decisions to .prove/decisions/ directory.
Unified notification skill. Configures orchestrator reporters (Slack, Discord, MCP, custom) by generating bash scripts and updating .claude/.prove.json, and sends test notifications through configured reporters. Triggers: "notify setup", "set up notifications", "configure alerts", "slack reporter", "discord reporter", "test notifications".
Planning for tasks and individual steps. Use to plan a task (discovery-driven requirements gathering producing prd.json + plan.json for the orchestrator), plan implementation, or drill into a numbered plan step for design decisions, edge cases, and test strategy. Triggers on "plan a task", "plan this", "plan implementation", "requirements gathering", "plan step", "Let's work on step 1.2.3", "design decisions", "edge cases", "test strategy", "drill into step".
Create Claude Code skills, slash commands, subagents, or technical specs. Dispatches by type. Use when the user wants to create a new skill, new slash command, new subagent, new agent, write a spec, draft a specification, formalize a design, audit a spec, promote a decision record, or needs help with skill descriptions, command frontmatter, agent tool permissions, or RFC-style technical documentation. Triggers on "create a skill", "new skill", "create a command", "new slash command", "create an agent", "new subagent", "write a spec", "spec for", "draft an RFC", "formalize this", "audit this spec", "revise the spec".
Prompt engineering toolkit — craft optimized LLM prompts, manage the research cache, and count tokens. Dispatches to three subcommands. Use when the user wants to craft a prompt, write a prompt, generate a system prompt, create an agent prompt, optimize a prompt, do prompt engineering, manage the prompt cache, prune the cache, refresh cached research, check token count, or measure prompt size.
Assemble per-commit intent manifests into a review document. Launch the browser-based review UI for structured accept/reject per intent group. Falls back to LLM reconstruction when manifests are missing.
Generate optimized LLM prompts using the bundled prompt engineering guide and optional research. Delegates to the llm-prompt-engineer agent. Use when the user wants to create a new prompt, system instruction, or agent definition from scratch. Triggers on "craft a prompt", "write a prompt", "generate a system prompt", "create an agent prompt", "prompt for", or any request to produce a new LLM prompt.
Guide for creating Claude Code slash commands with best practices. Use when the user wants to create a new slash command, asks about slash command structure, or needs help with command frontmatter, arguments, or tool restrictions.
Create, revise, and audit technical specifications following RFC/IETF conventions. Use when the user wants to write a new spec, edit an existing spec, review a spec for completeness, or formalize a design decision into a specification document. Triggers on "write a spec", "spec for", "formalize this", "draft a specification", "audit this spec", "revise the spec", "protocol spec", "format spec", or any request to create structured technical documentation with normative requirements. Also triggers when the user has a brainstorm decision record they want to turn into a formal spec.
Create Claude Code subagents. Triggers on "create an agent", "new subagent", agent design questions, or tool permission/role definition requests.
Semantic commit assistant. Reads scopes from .claude/.prove.json, detects scope gaps and offers to register new ones, groups changes into logical units, and creates conventional commits.
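
As a rough illustration of scope-gap detection and conventional-commit formatting, consider the sketch below. It assumes the registered scopes have already been loaded into a plain list; how `.claude/.prove.json` actually stores them is not shown in the source, and `find_scope_gaps` and `format_commit` are hypothetical helper names.

```python
from typing import Iterable, List, Optional


def find_scope_gaps(used_scopes: Iterable[str], registered: Iterable[str]) -> List[str]:
    """Return scopes that appear in the pending changes but are not registered."""
    known = set(registered)
    return sorted({scope for scope in used_scopes if scope not in known})


def format_commit(commit_type: str, scope: Optional[str], subject: str) -> str:
    """Build a conventional-commit header: type(scope): subject."""
    prefix = f"{commit_type}({scope})" if scope else commit_type
    return f"{prefix}: {subject}"


# Example with made-up scopes: "docs" is used but not registered.
gaps = find_scope_gaps(["orchestrator", "docs"], ["orchestrator", "steward"])
header = format_commit("fix", "orchestrator", "handle empty plan")
```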
Generate machine-parseable, LLM-optimized documentation for Claude Code agents. Use when documenting agents, subagents, APIs, modules, or any code that other agents will consume. Triggers include "document this agent", "write agent docs", "create API docs for agents", or when documentation needs to be actionable by LLMs.
Analyze scope and delegate to docs-writer and/or agentic-doc-writer. Triggers on "auto docs", "document everything", "generate docs".
Generate and maintain an LLM-optimized CLAUDE.md for the target project. Scans the codebase (tech stack, conventions, structure), reads .claude/.prove.json config, and composes a concise CLAUDE.md with behavioral directives that Claude Code follows during the session. Full ownership of the file — safe to re-run, always produces deterministic output.
Manage the prompt engineering research cache. List, add, prune, or refresh cached research artifacts across three tiers — plugin-bundled, global, and project. Use when the user wants to manage their prompt research cache.
Post-diff Socratic quiz that builds deep comprehension of agent-generated code. Analyzes recent changes, generates causal/design questions, quizzes the developer interactively, and logs comprehension gaps.
---
name: docs-writer
description: Generate human-readable documentation (READMEs, guides, API references, contributor docs). Triggers: "document this", "write docs", "create a README", "write a guide for".
---

# Docs Writer

Delegate writing to the `technical-writer` subagent. Follow `references/interaction-patterns.md` for all `AskUserQuestion` usage.

## Workflow

1. Identify subject type and audience
2. Gather context -- targeted reads, single pass
3. Delegate to `technical-writer` with str
Configure orchestrator notification reporters (Slack, Discord, MCP, custom). Generates bash scripts and updates .claude/.prove.json. Triggers on "notify setup", "set up notifications", "configure alerts".
Estimate token counts for prompt files. Wraps the token-count script for measuring agents, skills, commands, references, or any text file. Use when the user wants to check prompt size, compare token budgets, or measure files before/after optimization.
Project management skill for roadmapping, backlog grooming, vision documents, and shipping tracking. Use when the user wants to plan what to work on next, review the roadmap, add/prioritize backlog items, update the vision, log shipped work, or run a retro. Also triggers on "what should I work on", "update the roadmap", "add to backlog", "what have we shipped", "project status", or any planning/prioritization discussion. This skill provides the file structure and conventions; the product-owner agent provides the strategic reasoning.
Unified documentation skill. Generates human-readable docs (READMEs, guides, API references, getting-started), LLM-optimized agent/API docs, and regenerates CLAUDE.md. Triggers: "write docs", "document this", "README", "API reference", "getting-started", "LLM docs", "agent docs", "CLAUDE.md". For single-directive CLAUDE.md growth, use `/prove:remember` instead.
Lift reasoning-log findings into durable scrum decisions at milestone close — the milestone-close curation pass that promotes findings into the durable decision store. Triggers on "curate", "curate the milestone", "promote findings", "journal to codex", "promote to decisions", "curate reasoning log", "milestone curation", "promote hacks/risks/decisions". You are the driver Claude session: the scrum reconciler emits a `curation_proposed` event per closed-milestone task with candidate findings; you read the candidates, classify each as adr|glossary|pattern, human-gate each promotion with AskUserQuestion, and record it as a first-class scrum decision.
Drive two structured-agent methodologies on prove primitives: the top-down decompose ladder (charter/VISION/milestone → epic → story → task) and AC-gated story-close. Triggers on "decompose", "decompose the milestone", "break this epic into stories", "ladder down", "decompose ladder", "close the story", "story close", "verify acceptance criteria", "AC-gated close", "run the acceptance criteria", "decompose into epics/stories/tasks". You are the driver Claude session: a planning subagent (Agent tool, native structured output) emits each layer's child list, you write children as layered scrum tasks, an AskUserQuestion gate promotes them, and you recurse. Story-close dispatches each acceptance criterion by kind (bash/assert/gate/agent), writes a verification reasoning-log entry per criterion, promotes the run's durable decisions into the scrum decision store (human-gated), then delegates worktree/validation/review/merge to orchestrator full-mode.
Gather charter, team, or decomposition-kickoff answers through a self-contained HTML intake form instead of a conversational interview, then drive the same writer the conversation would. Triggers on "intake form", "fill out a form", "charter form", "team form", "decompose form", "HTML form", "render an intake form", "form instead of questions". You are the driver: render the form with `claude-prove intake render`, hand the operator the file to fill and copy, read the pasted-back payload, validate it with `claude-prove intake validate`, and map the validated answers onto the existing writer (bootstrap for charter, `scrum team` for team, the decompose ladder for decompose). The form and the interview are two front-ends to ONE writer — never a second writer.
Autonomous task orchestrator with mode dispatch. Absorbs autopilot (plan-exists, skip PRD) and full-auto (no plan, requirements-first) as invocation modes. Auto-scales between simple mode (<=3 steps, sequential, no worktrees) and full mode (4+ steps, parallel worktrees with mandatory principal-architect review). Each run stores state as JSON under .prove/runs/<branch>/<slug>/ (prd.json, plan.json, state.json, reports/*.json) and is mutated only through the run_state CLI. Creates feature branches, runs validation gates, commits per step, and supports rollback via git. Triggers on "orchestrate", "orchestrator", "autopilot", "full auto", "full-auto", "autonomous execution", "run autonomously", "hands-off", "hands-off mode", "implement without me".
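
The auto-scaling rule and run layout described above can be sketched as follows. The step threshold and file names come from the description; the function names are hypothetical, and the sketch assumes the branch name is used verbatim as a path segment, which the source does not confirm.

```python
from pathlib import PurePosixPath
from typing import Dict

SIMPLE_MODE_MAX_STEPS = 3  # per the description: <=3 steps runs in simple mode


def select_mode(step_count: int) -> str:
    """Pick 'simple' (sequential, no worktrees) or 'full' (parallel worktrees)."""
    if step_count < 1:
        raise ValueError("a plan needs at least one step")
    return "simple" if step_count <= SIMPLE_MODE_MAX_STEPS else "full"


def run_dir(branch: str, slug: str) -> PurePosixPath:
    """Directory holding one run's state: .prove/runs/<branch>/<slug>/."""
    return PurePosixPath(".prove") / "runs" / branch / slug


def run_files(branch: str, slug: str) -> Dict[str, PurePosixPath]:
    """Paths of the top-level JSON artifacts named in the description."""
    base = run_dir(branch, slug)
    return {name: base / f"{name}.json" for name in ("prd", "plan", "state")}
```

These helpers only compute names. According to the description, the files themselves are mutated only through the run_state CLI, so a real caller would not write to these paths directly.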
Prune stale cached versions of the prove plugin from Claude Code's plugin cache. Use when superseded versions pile up under plugins/cache and agents read stale skills/references from them, or when reclaiming plugin-cache disk space. Triggers on "clean up cached plugin versions", "prune the plugin cache", "remove old prove versions", "stale plugin cache".
Analyze the active task plan and .claude/.prove.json to configure .claude/settings.local.json with scoped permission rules. Use before orchestrator, autopilot, or implementation. Triggers on "prep permissions", "setup permissions", "configure permissions", "allow tools", "stop asking me".
Anchor session context into prove primitives before compaction and rehydrate from them after. Built-in compaction summarizes by recency and drops the claude-prove state an agent needs to reorient; this skill externalizes volatile context into durable anchors (scrum tasks, decisions, run-state, a compact-anchors pointer file) pre-compact, then runs a deterministic reorientation sequence post-compact. Use before a manual /compact, when context is about to auto-compact, or immediately after a compaction. Triggers on "smart compact", "prepare for compaction", "anchor before compact", "context is getting long", "rehydrate", "reorient after compact".
Code quality audit and fix orchestration. Three modes: session review of branch diff (default), full deep audit with PCD pipeline + parallel fixes, and iterative fix-audit loop bounded by max-passes. Invoked by /prove:steward, /prove:steward-review, /prove:auto-steward. Triggers on "steward", "auto-steward", "steward review", "code audit", "code quality", "clean up the code", "fix code quality", "refactor for clarity".
Task-lifecycle operations on orchestrator runs — handoff, pickup, resume context, progress, orchestrator status, complete task, merge task, task cleanup, archive task, clean artifacts. Dispatches subcommands: `handoff` serializes conversation context to .prove/handoff.md; `pickup` resumes from it; `progress` reports read-only status across active runs; `complete <slug>` merges a task branch to main and cleans up; `cleanup <slug>` archives artifacts without merging. Use when hitting context limits, transitioning sessions, checking run status, or closing out tasks.
Synthesize the 7-section risk-forward Review Brief from a run's reasoning log. Triggers on "reasoning brief", "review brief", "synthesize the brief", "generate the brief", "brief the run", "brief for review", "story brief". You are the driver: the `acb brief` CLI renders a mechanical preservation-safe backbone and proves preservation; you synthesize the narrative prose (summary + changes), single-pass or multipass over episode chunks, then gate it through Stage-1 (mechanical, blocking) and Stage-2 (prose judge, advisory).
Apply model-driven CONTENT reshaping to stored run artifacts that sit behind the current schema, on explicit operator invocation only. Triggers on "migrate runs", "migrate run artifacts", "run content migration", "reshape run artifacts", "bring runs to current schema". You are the driver: the `run-state migrate-runs` CLI mechanically detects which artifacts are behind and emits a plan naming each one plus its migration-instruction file; you read the instructions and reshape the prose/findings, gated by the operator. The deterministic `schema migrate` handles structural column moves; this skill covers only the content reshaping beyond them. Never run as a background or resident loop — only when the operator asks.
Execute a whole scrum milestone or a plan.json task tree as one parallel fan-out run. Triggers on "run the milestone", "execute milestone", "workflow", "fan out the milestone", "parallel milestone", "run the whole task tree", "milestone autopilot". Compiles the dependency graph to a plan, runs its tasks in parallel waves through the orchestrator's full-mode machinery (worktrees, validators, principal-architect review, sequential merge), mirrors task status back to the scrum store, and auto-rebounds on merge conflicts. Raises per-wave fan-out above the orchestrator default.
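
One common way to turn a dependency graph into parallel waves is a layered topological sort, sketched below. Whether the skill compiles its plan this way is an assumption; `compile_waves` and the task ids are hypothetical.

```python
from typing import Dict, List, Set


def compile_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; each wave depends only on earlier waves.

    `deps` maps a task id to the set of task ids it depends on.
    """
    remaining = {task: set(needs) for task, needs in deps.items()}
    unknown = {d for needs in remaining.values() for d in needs} - set(remaining)
    if unknown:
        raise ValueError(f"dependencies on unknown tasks: {sorted(unknown)}")

    waves: List[List[str]] = []
    done: Set[str] = set()
    while remaining:
        # A task is ready once all of its dependencies have completed.
        ready = sorted(t for t, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


# Example: "b" and "c" can run in parallel once "a" is finished.
waves = compile_waves({"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}})
```

Tasks within a wave are sorted only to make the output stable; a per-wave fan-out limit would be applied on top of this grouping.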
Build, update, or query the content-addressable file index (CAFI) that helps agents navigate the codebase via routing-hint descriptions.
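
The idea behind a content-addressable index can be illustrated with a small sketch: each file is keyed by a hash of its content, so a description is regenerated only when the content changes. The use of SHA-256, the record shape, and the `describe` callback are assumptions made for illustration and are not taken from CAFI itself.

```python
import hashlib
from typing import Callable, Dict


def content_key(data: bytes) -> str:
    """Hash file content; identical content always yields the same key."""
    return hashlib.sha256(data).hexdigest()


def update_index(
    index: Dict[str, dict],
    files: Dict[str, bytes],
    describe: Callable[[str, bytes], str],
) -> Dict[str, dict]:
    """Return a new index, re-describing only files whose content changed."""
    updated: Dict[str, dict] = {}
    for path, data in files.items():
        key = content_key(data)
        previous = index.get(path)
        if previous is not None and previous["hash"] == key:
            updated[path] = previous  # unchanged content: keep the old description
        else:
            updated[path] = {"hash": key, "description": describe(path, data)}
    return updated
```

Files missing from `files` are dropped from the returned index, which is one possible way to handle deletions.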
Session-scoped code quality review. Audits only source files changed in the current branch/task, skipping tests. Lighter version of /prove:steward for use during active work.
Create Claude Code skills with best practices for description tuning, resource bundling, and interaction patterns. Use when the user wants to create a new skill, asks about skill structure, or needs help with skill descriptions and trigger phrases.
Interactive planning and requirement gathering for a specific task/step from the active run's plan.json. Use when the user wants to dig into a numbered step (e.g., "Let's work on step 1.2.3") to create detailed requirements, make design decisions, identify edge cases, and define test strategies BEFORE implementation.