skills/orchestrator/SKILL.md
Autonomous task orchestrator with mode dispatch. Absorbs autopilot (plan-exists, skip PRD) and full-auto (no plan, requirements-first) as invocation modes. Auto-scales between simple mode (<=3 steps, sequential, no worktrees) and full mode (4+ steps, parallel worktrees with mandatory principal-architect review). Each run stores state as JSON under .prove/runs/<branch>/<slug>/ (prd.json, plan.json, state.json, reports/*.json) and is mutated only through the run_state CLI. Creates feature branches, runs validation gates, commits per step, and supports rollback via git. Triggers on "orchestrate", "orchestrator", "autopilot", "full auto", "full-auto", "autonomous execution", "run autonomously", "hands-off", "hands-off mode", "implement without me".
npx skillsauth add mjmorales/claude-prove orchestratorInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Simple mode (<=3 steps): sequential, no worktrees, lightweight reporting. Full mode (4+ steps): parallel worktrees, architect review, full tracking.
Requires .prove/runs/<branch>/<slug>/plan.json for execution. If missing and the user wants a plan-first flow, use --full (starts with PRD phase) or run /prove:plan --task first.
The skill is entered via $ARGUMENTS. Parse the first token as a mode flag:
| Flag | Meaning | Entry Point |
|------|---------|-------------|
| --autopilot [plan-id] | Plan already exists; skip PRD phase. Derive slug from arg (or active run), locate .prove/runs/<branch>/<slug>/plan.json, begin execution. | Phase 0 |
| --full [task-desc] | No plan yet; gather requirements first. task-desc seeds the PRD. | "Full Mode: Requirements Gathering (PRD)" |
| (none) | Auto-detect (see below). | Resolved mode |
.prove/runs/<branch>/<slug>/ with a plan.json. If exactly one exists → act as --autopilot with that slug.AskUserQuestion header "Mode":
AskUserQuestion listing slugs, plus a "Full (new run)" option..prove-wt-slug.txt in the current worktree, or from the single run dir on the branch.After mode + slug are resolved, proceed to the appropriate phase. All downstream phases (Initialization, Plan Review, Execution Loop, Completion, Merge & Cleanup) are identical across modes — only the entry point differs.
All run artifacts are JSON and live under .prove/runs/<branch>/<slug>/:
prd.json, plan.json — write-once inputsstate.json — hot path, mutated only via scripts/prove-run ... (thin wrapper over claude-prove run-state ...)reports/<step_id>.json — write-once per stepDirect edits to state.json are blocked by a PreToolUse hook. Render human views JIT (run_state show).
All .prove/... paths resolve from the main worktree ($MAIN_ROOT), not the orchestrator worktree. Scripts find this via git worktree list.
Derive slug + branch namespace from user input:
feature, fix, chore (default feature) — matches the task's intent, not the git branchVerify plan exists: .prove/runs/<branch>/<slug>/plan.json. If missing → stop, suggest /prove:plan --task.
Initialize run state (if not already done):
scripts/prove-run init \
--branch <branch> --slug <slug> \
--plan .prove/runs/<branch>/<slug>/plan.json \
--prd .prove/runs/<branch>/<slug>/prd.json
init is the ONLY subcommand that requires explicit --branch/--slug (no state.json exists yet). After init, the orchestrator worktree's .prove-wt-slug.txt carries the slug and every subsequent invocation resolves it automatically.
Auto-scale — total step count <=3: simple mode; 4+: full mode. Record mode in the plan at creation time; orchestrator does not override.
Create feature branch + worktree:
git worktree add .claude/worktrees/orchestrator-<slug> -b orchestrator/<slug>
printf '%s\n' "<slug>" > .claude/worktrees/orchestrator-<slug>/.prove-wt-slug.txt
Load validators from .claude/.prove.json or auto-detect per references/validation-config.md.
Load reporters from .claude/.prove.json reporters array. Reporter dispatch is automatic via Claude Code hooks (see references/reporter-protocol.md).
plan.json; walk tasks[] in order. Each task carries its own steps[], deps[], wave, optional worktree block..claude/.prove.json and any step-level acceptance_criteria.scripts/prove-run show planEvery state mutation uses scripts/prove-run — the agent-facing wrapper. Never inline python/jq/sed. Slug is auto-resolved from .prove-wt-slug.txt; if missing, prove-run hard-errors (exit 2).
For each step N:
scripts/prove-run step-start <step_id>[Step <id>] Starting: <title>scripts/prove-run validator <step_id> build pass
scripts/prove-run validator <step_id> lint pass
scripts/prove-run validator <step_id> test pass
scripts/prove-run step-complete <step_id> --commit <SHA>scripts/prove-run report <step_id> --status completed --commit <SHA>scripts/prove-run step-halt <step_id> --reason "<short>"orchestrator: [WIP] ...)Group steps into waves by dependency. Max 4 agents per wave; split larger waves.
For each task in the wave:
.prove-wt-slug.txt with the run slug):
WT_PATH=$(claude-prove worktree create <slug> <task-id>)
RUN_DIR=".prove/runs/<branch>/<slug>"
PROMPT=$(claude-prove orchestrator task-prompt \
--run-dir "$RUN_DIR" --task-id <task-id> --project-root <project-root> --worktree "$WT_PATH")
isolation: "worktree"):
Agent(subagent_type: "general-purpose", run_in_background: true, prompt: $PROMPT)
scripts/prove-run step-start <task-id>.<first-step-seq>
Create ALL worktrees first, then launch ALL agents as parallel calls in a single message.
The SubagentStop hook auto-completes the step with the subagent's latest commit SHA (or halts with a diagnostic if no commits landed). You only need to intervene when:
halted → inspect halt_reason, decide retry or abortscripts/prove-run report <step_id> --status completed --commit <SHA>
Sub-agents themselves must NOT call scripts/prove-run step-complete — the orchestrator owns step transitions and the hook is the safety net. Their contract is: record typed findings, commit, exit.
Findings backstop (per completed task). The task prompt instructs each worker to land its substantive findings as typed reasoning-log entries — hack/risk/decision/assumption, the kinds milestone-close curation mechanically sweeps. A finding that exists only in the worker's handoff message is invisible to curation. When a handoff message reports findings, verify they landed: claude-prove acb log list --run-dir "$RUN_DIR" (worker entries carry agent: "task-<task-id>"). Transcribe any missing finding yourself as a typed entry (agent: "driver", the finding prose in body, plus the type's required fields) via claude-prove acb log append --run-dir "$RUN_DIR" --file <entry.json>. Never fold worker findings into synthesis prose alone — synthesis is not swept.
Before review, re-run validators. Implementation agents run command validators during work; the orchestrator verifies and runs LLM validators (agents cannot spawn validation-agent).
For each completed task:
cd into the task worktree (slug auto-resolves from the worktree marker)scripts/prove-run validator <step_id> <phase> <status>Every task requires principal-architect approval before merge.
REVIEW LOOP (max 3 iterations per task):
1. Build review prompt:
WT_PATH=$(claude-prove worktree path <slug> <task-id>)
RUN_DIR=".prove/runs/<branch>/<slug>"
REVIEW_PROMPT=$(claude-prove orchestrator review-prompt \
--worktree "$WT_PATH" --task-id <task-id> --run-dir "$RUN_DIR" --base-branch <base-branch>)
2. Launch review:
Agent(subagent_type: "principal-architect", prompt: $REVIEW_PROMPT)
3. Parse verdict:
- APPROVED → exit loop, proceed to merge
- CHANGES_REQUIRED → continue
4. Record review:
scripts/prove-run review <task-id> rejected --notes "<summary>" --reviewer principal-architect
5. Launch fix agent in the SAME worktree (slug resolves automatically):
Agent(
subagent_type: "general-purpose",
prompt: """
Fix review findings for Task <task-id>.
## Findings
<paste CHANGES_REQUIRED items>
## Rules
- Fix ONLY flagged items
- Do not refactor beyond the flags
- Run tests after fixes
- Commit: fix(<scope>): address review feedback (round N)
"""
)
6. Go to step 1 (re-review)
If 3 iterations without APPROVED:
- AskUserQuestion header "Resolution" — "Force Approve" / "Fix Manually" / "Abort"
On APPROVED:
scripts/prove-run review <task-id> approved --reviewer principal-architect
After all wave tasks are approved:
Merge each task (in order) into the orchestrator worktree:
cd .claude/worktrees/orchestrator-<slug>
BRANCH=$(claude-prove worktree branch <slug> <task-id>)
git merge "$BRANCH" --no-ff -m "merge: task <id> - <name>"
Clean up task worktree + branch:
claude-prove worktree remove <slug> <task-id>
Merge conflict: halt, ask user. No force-merge.
Run full test suite after the wave merges.
Repeat 2a-2e for subsequent waves.
Run validators in phase order:
validation-agent subagent; see references/llm-validator-protocol.md for launch contract, diff scope, and failure handlingValidators load per references/validation-config.md.
scripts/prove-run validator <step_id> <phase> failscripts/prove-run step-halt <step_id> --reason "<phase> validation failed"orchestrator: [WIP] <step> (validation failed))git add <files modified in this step>
git commit -m "$(cat <<'EOF'
orchestrator: step <step_id> - <step title>
Part of: <task title>
Validated: <phases that passed>
Co-Authored-By: Claude <[email protected]>
EOF
)"
Capture the SHA and pass it to scripts/prove-run step-complete --commit <SHA> and scripts/prove-run report --status completed --commit <SHA>.
Render the final status:
scripts/prove-run show state
state.json is the single source of truth. Do NOT write a report markdown file — the CLI renders it JIT. For the user-facing summary, combine:
scripts/prove-run show state outputgit diff --stat main...orchestrator/<slug> for file changesPresent: status, JSON artifact paths, next action (review / fix blocker / merge). Never persist a markdown summary — it would drift from state.json.
Runs after user review and approval. Skip if execution halted.
AskUserQuestion header "Merge & Cleanup":
/prove:task cleanup)git merge --no-ff orchestrator/<slug> -m "merge: <task-name>"
If another run merged first, pull/merge main first. On conflict: halt and inform the user — no force-merge.
PROJECT_ROOT="." bash scripts/cleanup.sh --auto <slug>
Archives to .prove/archive/<date>_<slug>/, removes run directory, worktree, branch. Generates SUMMARY.md from JSON in archive.
Present: merge SHA, archived location, skipped items. If "Skip": remind to run /prove:task cleanup <slug>.
Triggered by --full without an existing plan. Gathers PRD + plan via a requirements subagent, then gates on two AskUserQuestion approvals (PRD, Plan) before handing off to Phase 0 (Initialization).
references/prd-workflow.mdreferences/prd-template.md| Scenario | Action |
|----------|--------|
| No plan found | Stop, suggest /prove:plan --task |
| Branch exists | AskUserQuestion: Resume / Start Fresh |
| Build/test fails | One retry, then step halt, commit WIP, halt |
| Subagent produces no changes | Log in report, skip commit, continue |
| Git conflict | Halt, report to user |
| Unclear requirements | Halt, ask user |
| Review deadlock (3 rounds) | AskUserQuestion: Force Approve / Fix Manually / Abort |
| state.json write rejected by hook | Use scripts/prove-run; never Write/Edit state.json directly |
| No slug resolved | Ensure you are inside a worktree with .prove-wt-slug.txt (created by claude-prove worktree create) |
To stop in-flight work, cancel → discard → re-dispatch fresh:
claude-prove scrum task cancel <id> --cascade --reason <r> [--detail <d>]. Cascades down parent_id, stamping cancelled on the root and parent_cancelled on descendants so the abandonment is auditable.claude-prove run-state init --overwrite --branch <b> --slug <g> .... Resets state.json so a re-dispatch starts clean instead of resuming the abandoned run.This is the deterministic guarantee. Native subagents are not externally signalable
mid-run, so the only hard stop is the /workflows token budget + the subagent
timeout — they bound and terminate the run without cooperation.
Tradeoff: cancel-and-redispatch discards in-flight work unless a synthesis
reasoning-log entry was already written. The graceful Layer 2 path below recovers
that work cooperatively, but is best-effort and layers ON TOP of this floor — it never
replaces it, because a non-polling or stuck agent only stops at the budget/timeout.
Layer 1 is always the backstop.
A best-effort graceful interrupt that lets a re-dispatch RESUME rather than restart. It sits on top of the Layer-1 floor and never replaces it.
CANCEL file under the run dir —
.prove/runs/<branch>/<slug>/CANCEL (resolve .prove/... from the main worktree,
$MAIN_ROOT, not a task worktree). An optional one-line body records the reason.
echo "<reason>" > "$MAIN_ROOT/.prove/runs/<branch>/<slug>/CANCEL"
claude-prove orchestrator task-prompt already instructs workers to poll this flag
at natural checkpoints. When the flag is present, the worker performs a graceful
handoff: write a synthesis reasoning-log entry capturing progress + next steps
(claude-prove acb log append --run-dir <dir> --file <entry.json>), commit
work-in-progress, and self-exit — so the re-dispatch resumes from the handoff.CANCEL file so the resumed run is
not immediately re-interrupted: rm -f "$MAIN_ROOT/.prove/runs/<branch>/<slug>/CANCEL".A non-polling or stuck worker will not stop here; the /workflows token budget and the
subagent timeout (Layer 1) remain the hard stop.
scripts/prove-run — never direct-edit state.json, never inline python3 -c, jq, or sed for run statePROGRESS.md, run-log.md, report.md) — views render JIT from JSON via the CLI.prove-wt-slug.txt; agents must not invent or pass slugs ad-hoc. If no slug, hard-error and surface the fix (create marker or run claude-prove worktree create)git add <files> over git add -ABranches: orchestrator/<slug> (worktree: .claude/worktrees/orchestrator-<slug>). Sub-tasks: task/<slug>/<task-id> (worktree: .claude/worktrees/<slug>-task-<task-id>). Managed by claude-prove worktree.
Run directory:
.prove/runs/<branch>/<slug>/
├── prd.json # Requirements
├── plan.json # Task graph
├── state.json # Live run state (mutated only via CLI)
├── state.json.lock
└── reports/
└── <step_id>.json # Per-step report (write-once)
| Script | Purpose |
|--------|---------|
| scripts/prove-run | Agent wrapper around the run_state CLI — all JSON mutations and renders go through this |
| claude-prove orchestrator task-prompt | Prompt for worktree implementation agents (CLI subcommand) |
| claude-prove orchestrator review-prompt | Review prompt for principal-architect (CLI subcommand) |
| claude-prove worktree | Create/remove/list/reset sub-task worktrees (writes .prove-wt-slug.txt) |
| scripts/cleanup.sh | Archive and remove run artifacts |
| File | Purpose |
|------|---------|
| references/prd-workflow.md | Full-mode PRD + plan generation workflow |
| references/prd-template.md | PRD field reference |
| references/llm-validator-protocol.md | Prompt validator dispatch contract |
| references/handoff-protocol.md | Phase handoff |
| references/reporter-protocol.md | Reporter dispatch |
| references/validation-config.md | Validators (schema, auto-detect, execution order) |
| references/interaction-patterns.md | AskUserQuestion usage |
Follow the commit skill: scopes from .claude/.prove.json, <type>(<scope>): <description>. One atomic commit per step.
tools
Clean and compact prove's durable memory layers — team Lore, the Codex (scrum decisions), annotations, and contributor artifacts — keeping the tribal knowledge that grows team accuracy and folding away the rot. Triggers on "janitor", "clean the lore", "compact the lore", "compact the codex", "memory cleanup", "clean up team memory", "prune stale decisions", "tidy tribal knowledge", "lore cleanup". You are the driver: the CLI emits inventories and executes writes; the `memory-janitor` agent judges each entry; a per-team batch gate approves; nothing is ever deleted — consolidation, promotion, and supersession only.
testing
Anchor session context into prove primitives before compaction and rehydrate from them after. Built-in compaction summarizes by recency and drops the claude-prove state an agent needs to reorient; this skill externalizes volatile context into durable anchors (scrum tasks, decisions, run-state, a compact-anchors pointer file) pre-compact, then runs a deterministic reorientation sequence post-compact. Use before a manual /compact, when context is about to auto-compact, or immediately after a compaction. Triggers on "smart compact", "prepare for compaction", "anchor before compact", "context is getting long", "rehydrate", "reorient after compact".
tools
Apply model-driven CONTENT reshaping to stored run artifacts that sit behind the current schema, on explicit operator invocation only. Triggers on "migrate runs", "migrate run artifacts", "run content migration", "reshape run artifacts", "bring runs to current schema". You are the driver: the `run-state migrate-runs` CLI mechanically detects which artifacts are behind and emits a plan naming each one plus its migration-instruction file; you read the instructions and reshape the prose/findings, gated by the operator. The deterministic `schema migrate` handles structural column moves; this skill covers only the content reshaping beyond them. Never run as a background or resident loop — only when the operator asks.
tools
Synthesize the 7-section risk-forward Review Brief from a run's reasoning log. Triggers on "reasoning brief", "review brief", "synthesize the brief", "generate the brief", "brief the run", "brief for review", "story brief". You are the driver: the `acb brief` CLI renders a mechanical preservation-safe backbone and proves preservation; you synthesize the narrative prose (summary + changes), single-pass or multipass over episode chunks, then gate it through Stage-1 (mechanical, blocking) and Stage-2 (prose judge, advisory).