plugins/spec-plugin/skills/orchestrate/SKILL.md
Execute a version end-to-end with a coordinated agent team. Cycles through architect-version → DoD gate → build-stories → execute → validate until the version ships. A version is shipped when the human signs off.
You are a team lead. You take a version from spec to shipped deliverables by coordinating a living team: the PO and QA stay alive for the whole session and carry context, while engineers are spawned fresh per story so each starts with a clean, small context and verification stays independent.
architect-version → [ DoD gate → architect fix ]* (loop until the DoD is sound)
→ build-stories (PO; stays live)
→ PREP (setup-playbook · context.md · lessons.md · logs/ — committed before any engineer spawns)
→ [ execute story → hand over to live QA ]* (engineers; red-button halt/resume)
→ PO reviews all work against the spec → human validation
→ human signs off → ship
Every project has two workspaces. Treat them differently:
/orchestrate runs). This is shared context. No worktrees here. Every role reads and writes on the current branch directly so all agents see the latest specs, context, and lessons immediately. Git on this workspace is serialized through you (the team lead); see the Git Protocol.
If the project has no code workspace (pure docs/research), there is only the spec workspace and everything happens on its current branch, with no worktrees at all.
| Role | Lives | Notes |
|---|---|---|
| architect | per pass | Revised in the DoD-gate loop. |
| auditor | fresh, one-shot | Independent DoD gate — no execution context, so it can't vouch for work it helped build. |
| product-owner | whole session | Breaks down stories, then stays: answers engineer questions, re-refines scope, consolidates lessons, runs the final spec review. |
| engineer | per story | Spawned fresh for one story; re-warmed by preloaded story + context.md + lessons.md (not carried context). Halted on red-button, torn down on report-back. |
| qa | whole execution | One live QA. Engineers hand over to it continuously, before a story is "done." |
| designer | per design story | Unchanged role; code-worktree policy. |
| intern | ad hoc, one-shot | Haiku worker for cheap, mechanical skills (/verify-symbol, /setup-env) dispatched when you want them cheap. |
Independence is preserved without churning agents: the auditor is fresh, QA never wrote the code it checks, the PO reviews against the spec, and the human validates. Engineers are therefore spawned fresh per story; the preload (story + context.md + lessons.md) re-warms them cheaply, so context never accumulates or compacts across stories.
Spawn agents with definitions from agents/ as subagent_type:
Agent({ subagent_type: "<role>", team_name: "<version>", name: "<instance>",
prompt: "<teammate spawn preamble> + <workspace paths, base branches, what to do, why, context>" })
You are the protocol authority. Inject this verbatim block into the prompt of every teammate you spawn (engineer, QA, PO, architect, auditor, designer, intern) so the team-coordination facts are stated once, by you:
Team coordination protocol:
- Address teammates by their bare name via SendMessage (team-lead, qa-1, engineer-1). Never suffix the team: a team-suffixed name is rejected.
- Shutdown handshakes (shutdown_request / shutdown_response) route to team-lead, even if a different teammate sent you the request.
Beyond this block, keep every spawn prompt minimal — preamble + workspace paths + base branches + which skill to run + a one-line what & why. Do not paste story / architecture / DoD / stories.md content into any teammate's prompt (engineer, QA, PO, architect): each role's skill loads — and preloads — what it needs. Restating it bloats the teammate's context and tempts it to act on your paraphrase instead of the authoritative source.
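The minimal-prompt rule above can be sketched as a small builder. This is an illustration, not part of the plugin: `SpawnRequest`, `build_spawn_prompt`, the abbreviated `PREAMBLE` text, and the example paths are all hypothetical names introduced here.

```python
from dataclasses import dataclass

# Abbreviated stand-in for the coordination block quoted above (assumption:
# the real prompt would carry the block verbatim, not this shortened text).
PREAMBLE = (
    "Team coordination protocol:\n"
    "- Address teammates by bare name via SendMessage.\n"
    "- Shutdown handshakes route to team-lead."
)


@dataclass(frozen=True)
class SpawnRequest:
    spec_workspace: str
    spec_branch: str
    code_repo: str
    code_branch: str
    skill_command: str  # the one skill the teammate should run
    what_and_why: str   # a single line, never story or architecture content


def build_spawn_prompt(req: SpawnRequest) -> str:
    """Compose preamble + paths + branches + skill + one-line what & why."""
    summary = req.what_and_why.strip()
    if "\n" in summary:
        raise ValueError("what_and_why must be a single line")
    return "\n".join([
        PREAMBLE,
        f"Spec workspace: {req.spec_workspace} (branch {req.spec_branch})",
        f"Code repo: {req.code_repo} (base branch {req.code_branch})",
        f"Run: {req.skill_command}",
        summary,
    ])
```

Rejecting a multi-line summary is one cheap way to keep pasted story or DoD content out of the prompt; the skill itself loads the authoritative files.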
- Locate specs/ (here or in a subdirectory). This is the spec workspace; record its path and its current branch (git branch --show-current) as spec_branch. Never hardcode main.
- Record code_repo (absolute path) and its base branch as code_branch. If specs live inside the code repo, code_repo == spec workspace. If there is no code, mark "docs-only".
- Ask the user via AskUserQuestion. Never guess; keep asking until they choose.
- Check specs/<version>/ for existing artifacts: architecture.md, setup-playbook.md, context.md, stories.md, story files with ## Execution Log, qa/, lessons.md. Present the resume point and confirm.
- Create the team: TeamCreate({ team_name: "<version>" }).

Skip if architecture.md exists and the user confirms resume.
Spawn architect to run /architect-version <version>. Prepend the teammate spawn preamble (Roles & Lifecycle) to its prompt, as for every teammate. The architect works on the spec workspace, current branch, no worktree; it writes architecture.md and reports — you commit it (Git Protocol). On completion, notify the user with key decisions. Keep the architect addressable — the gate may send it back.
The most expensive failure mode is building a whole version against a Definition of Done that measures the wrong thing. Catch it here, cheaply.
Spawn a fresh auditor (no prior context) to audit the version's DoD in specs/<version>/architecture.md against the behavior-not-artifact rubric (the auditor agent carries it). The auditor is read-only and returns its verdict; record it to specs/<version>/qa/dod-audit.md and commit.
Prerequisite (BLOCKING): confirm specs/<version>/architecture.md exists on the spec workspace's current-branch HEAD before spawning the PO. Stories built without the version architecture diverge from it.
- Spawn product-owner to run /build-stories <version>. The PO works on the spec workspace, current branch, no worktree. It produces self-contained story files, stories.md, and context.md (shared version context: conventions, file manifest, key decisions, pointers, so engineers never reload the full architecture). Do NOT shut the PO down; it stays live for the rest of the session.
- Setup playbook (specs/<version>/setup-playbook.md): author it by running /setup-env against code_repo (it forks to a haiku Explore child to inspect the repo; do not inspect it inline with Bash/Read, which bloats your context). Document how to spin up a code worktree for this repo: base the worktree on the branch ref <code_branch>, never a captured commit SHA (git worktree add <path> -b <story-branch> <code_branch>; a pinned SHA goes stale as the branch advances); only the gitignored files the app actually reads (an SDK/library usually needs none, so don't prescribe a .env it doesn't use); dependency install; and the exact copy-paste gate commands (cd <worktree> && <test/typecheck/lint>, with cwd handling, since shell state doesn't persist between Bash calls). Rely on the worktree's .tool-versions for runtime resolution; do not bake ASDF_*_VERSION= prefixes into gate commands (a worktree of the repo already has .tool-versions; if a runtime resolves wrong it isn't installed, so flag it, don't paper over it). Seed from the previous version's playbook if one exists (a diff, not a rewrite). Confirm completeness with the user before execution.
- context.md: produced by the PO during /build-stories.
- lessons.md: you create it empty here (the PO owns its content; engineers feed it via their own logs). Creating the file is the lead's job, not build-stories'.
- logs/: you create the empty specs/<version>/logs/ directory here, so engineers have a place to write engineer-N.md.

The core loop. Engineers build in code worktrees; one live QA verifies continuously.
Analyze the dependency graph in stories.md for parallelism. Present via AskUserQuestion: graph, parallel tracks, suggested size (1 sequential / 2 recommended / 3 max). Code+UI stories use a designer. Then spawn one fresh engineer per active track (a new engineer per story, up to the chosen parallelism) and the single live QA — include the teammate spawn preamble (see Roles & Lifecycle) in every spawn prompt.
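One way to read parallel tracks off a dependency graph is to group stories into waves, where each wave depends only on earlier waves. The sketch below assumes the graph has already been parsed from stories.md into a plain dict; that dict shape, `parallel_waves`, and `suggested_team_size` are hypothetical, and the size heuristic is only a starting suggestion for the user to confirm.

```python
def parallel_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group stories into waves; each wave depends only on earlier waves."""
    known = set(deps)
    for story, needs in deps.items():
        missing = set(needs) - known
        if missing:
            raise ValueError(f"{story} depends on unknown stories: {sorted(missing)}")

    remaining = {story: set(needs) for story, needs in deps.items()}
    done: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        # A story is ready once every dependency has landed in a prior wave.
        ready = sorted(s for s, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for story in ready:
            del remaining[story]
    return waves


def suggested_team_size(waves: list[list[str]], cap: int = 3) -> int:
    """Widest wave, clamped to the 1..cap range (cap of 3 per the text above)."""
    widest = max((len(wave) for wave in waves), default=1)
    return max(1, min(cap, widest))
```

For a diamond-shaped graph (one root, two independent middle stories, one join) this yields three waves and a suggested size of 2.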
Each role carries a default model (architect/auditor/product-owner → opus; engineer/designer/qa → sonnet; intern → haiku). Tier per story — model is the per-spawn lever (effort is fixed on each agent file and cannot be overridden per spawn): for a trivial/mechanical story spawn the engineer at model: haiku, or dispatch the intern (haiku, effort: low) for the cheapest mechanical work; for a gnarly algorithmic story spawn at model: opus (Agent({ subagent_type, model })). The primitive skills (/verify-symbol, /trace-flow, /probe-contract, /explore-conventions, /setup-env) are context: fork — when any role invokes one, it runs in an isolated Explore (haiku) child and returns only its conclusion, so the caller's context stays lean regardless of the caller's own tier. This is why engineers and the recon step invoke these skills instead of grepping / Read-ing the codebase in-context.
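The tiering rule can be pictured as a lookup with a per-story override for engineers. The role defaults below are the ones listed above; the complexity labels ("trivial", "normal", "gnarly") and the function name `model_for` are assumptions made for this sketch.

```python
# Role defaults as stated in the paragraph above.
DEFAULT_MODEL = {
    "architect": "opus",
    "auditor": "opus",
    "product-owner": "opus",
    "engineer": "sonnet",
    "designer": "sonnet",
    "qa": "sonnet",
    "intern": "haiku",
}

# Hypothetical per-story labels; the source only describes them in prose.
ENGINEER_OVERRIDE = {"trivial": "haiku", "gnarly": "opus"}


def model_for(role: str, complexity: str = "normal") -> str:
    """Pick the model for one spawn; only engineers are tiered per story here."""
    if role not in DEFAULT_MODEL:
        raise KeyError(f"unknown role: {role}")
    if role == "engineer":
        return ENGINEER_OVERRIDE.get(complexity, DEFAULT_MODEL[role])
    return DEFAULT_MODEL[role]
```

Effort is deliberately absent from the lookup, since it is fixed on each agent file and cannot be overridden per spawn.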
- Set subagent_type to the story's Agent field.
- Prompt the engineer to run /execute-task <story-path>, nothing more. Do not restate the story, its acceptance criteria, or setup steps: /execute-task preloads the story + context.md + lessons.md + setup-playbook itself. Restating them bloats the engineer's context and tempts it to act on your paraphrase instead of the authoritative story file.
- The engineer (via /execute-task): creates a code worktree per the setup-playbook, reads its story + context.md + lessons.md (not the full architecture), builds, hands over directly to the live QA before declaring done, and writes its learnings to logs/engineer-N.md in the spec workspace. The clearance is peer-to-peer: the engineer messages QA, QA's PASS reply to the engineer IS the clearance, and the engineer proceeds on its own to merge and report back. You do not relay or re-confirm the handover; you are CC'd only on a QA failure.
- By report-back, the engineer has already merged into code_branch and removed its code worktree. Your job on report-back is solely to commit its spec-workspace files (logs/engineer-N.md, the story's ## Execution Log, stories.md status), per the Git Protocol, and update progress. You do not broker the QA clearance. Once committed, shut the engineer down (shutdown_request → wait for confirmation); it handled its one story.
- The PO consolidates the learnings into lessons.md, re-refines upcoming stories if needed, and tells you which engineers should re-read lessons.md.

QA runs the whole time. Engineers hand over each story to it before "done"; QA records findings in specs/<version>/qa/. You don't spawn a fresh QA per round; there is one.
Clearance is peer-to-peer. The engineer hands over directly to QA, and QA's PASS reply to the engineer is the clearance — the engineer then proceeds on its own to merge and report back. You do not relay or re-confirm handovers. QA CCs you only on a failure; on report-back your only QA-related job is to forward the engineer's learnings to the PO.
Don't hardcode verification commands in the QA spawn prompt. Point QA at the version DoD and let /validate-execution derive the test cases (TC→DoD) itself. You may pass gotchas to watch — not a full command script.
If any engineer hits the unexpected or finds a story much larger/different than specified, it halts the team and reports options to you. You decide with the user; scope issues go to the live PO to re-refine.
When all stories are done and QA's continuous findings are addressed:
- The PO reviews all work against the spec.
- Request human validation via AskUserQuestion.

Once the human confirms:
- Update specs/roadmap.md (version shipped).
- Add a ## Shipped section to specs/<version>.md (date, notes); record any ## Deferred to Next Version.
- Send shutdown_request to each teammate and wait for its confirmation before calling TeamDelete; TeamDelete refuses while any member is still active.
- Suggest the next version, <next>, from the roadmap.

Before spawning any agent: commit pending spec-workspace changes (worktrees and fresh reads only see committed HEAD). BLOCKING: verify the spec workspace's current-branch HEAD includes the previous phase's commit (git log --oneline -1) before proceeding. A missing commit means the previous phase didn't land, so investigate, don't spawn.
Spec workspace (no worktrees) — only you commit it. Every role (architect, auditor, PO, engineers, QA) writes files to the spec workspace but never runs git there. You (the team lead) are the single committer: at each coordination point (before a spawn, on each report) you stage only the relevant files for that unit of work (git add specs/<version>/logs/engineer-N.md specs/<version>/<story>.md specs/<version>/stories.md) and commit — never git add -A while other roles may be mid-write. One committer is what makes the shared working tree safe without worktrees.
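The single-committer rule comes down to naming every path explicitly. A minimal sketch, assuming the file layout described above; `staging_command` and the example file names are hypothetical, and the function only builds the argument list rather than running git.

```python
from pathlib import PurePosixPath


def staging_command(version: str, engineer_n: int, story_file: str) -> list[str]:
    """Build a `git add` invocation that stages one engineer's report-back only."""
    base = PurePosixPath("specs") / version
    paths = [
        base / "logs" / f"engineer-{engineer_n}.md",  # the engineer's learnings
        base / story_file,                            # story with its Execution Log
        base / "stories.md",                          # status table
    ]
    # "--" separates options from paths, so an odd file name is never
    # parsed as a flag. Note there is no -A: other roles may be mid-write.
    return ["git", "add", "--", *(str(p) for p in paths)]
```

The resulting list can be handed to `subprocess.run(cmd, cwd=spec_workspace, check=True)`; keeping command construction separate makes the "never -A" rule easy to check in isolation.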
Code workspace (worktrees):
code_branch, merge --ff-only, remove the worktree. Each engineer lands exactly one commit on code_branch.
Goal: no engineer struggles long on a surprise, and no story splits mid-flight.
- The engineer halts the team (via SendMessage) and reports the challenge + options to you. Engineers are halted, not killed: they stay paused until the issue is resolved (a fresh per-story engineer is still mid-story, so it pauses rather than being torn down).
- Resumed engineers re-read lessons.md (PO has updated it) before continuing; any NEW engineer spawned afterward gets the updated lessons.md via its preload, so the same trap isn't hit twice.

In a 1-engineer (sequential) run this degrades to: halt → report → PO/user decide → resume.
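The halt/resume cycle can be modelled as a tiny state tracker. This is a sketch under assumptions: `RedButton`, `EngineerState`, and the instruction strings are hypothetical, and real halting happens through messages between agents rather than in-process state.

```python
from enum import Enum


class EngineerState(Enum):
    RUNNING = "running"
    HALTED = "halted"


class RedButton:
    """Tracks halt/resume for the engineers of one version (illustrative only)."""

    def __init__(self, engineers):
        self.states = {name: EngineerState.RUNNING for name in engineers}

    def halt(self, reporter, challenge, options):
        """Pause every engineer and return the report for the team lead."""
        if reporter not in self.states:
            raise KeyError(reporter)
        for name in self.states:
            self.states[name] = EngineerState.HALTED  # paused, not torn down
        return {"from": reporter, "challenge": challenge, "options": list(options)}

    def resume(self, reread_lessons=()):
        """Resume halted engineers; those named by the PO re-read lessons.md first."""
        reread = set(reread_lessons)
        unknown = reread - self.states.keys()
        if unknown:
            raise KeyError(sorted(unknown))
        instructions = {}
        for name, state in self.states.items():
            if state is EngineerState.HALTED:
                instructions[name] = (
                    "re-read lessons.md, then continue" if name in reread else "continue"
                )
                self.states[name] = EngineerState.RUNNING
        return instructions
```

With a single engineer the same object covers the sequential case: one halt, one report, one resume.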
context.md + lessons.md, committed up front; engineers read those, not the full architecture.