skills/execution-executor/SKILL.md
Reads an approved product plan with typed scopes (feature, optimization, spike) and routes each scope to its correct executor. Acts as the autonomous overnight "set and forget" orchestrator — the pi equivalent of /goal for approved plans.
npx skillsauth add renatocaliari/agent-sync-public-skills execution-executor
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Autonomous plan execution orchestrator. Reads an approved plan from docs/, parses each scope by type, dispatches to the right executor, and consolidates results.
This skill is designed to run after the Plannotator gate approves the plan. It replaces manual step-by-step execution with a single autonomous orchestration pass.
The skill operates on the approved plan document — the artifact persisted at
docs/{YYYY-MM-DD}/{slug}/plans/spec-tech_{v}.md after the Plannotator gate passes.
Where {slug} is a short kebab-case identifier for the project (e.g. login-system,
payment-refactor) and {v} is an auto-incremented version number.
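A minimal sketch of how the plan path could be assembled and how the highest version could be picked from a directory listing. The helper names `plan_path` and `latest_version` are hypothetical, and it assumes `{v}` is a plain integer (as in `spec-tech_1.md`) and that the highest version is the one to execute.

```python
import re
from pathlib import PurePosixPath

# Assumed naming convention: spec-tech_{v}.md with an integer version.
PLAN_NAME = re.compile(r"^spec-tech_(\d+)\.md$")


def plan_path(date: str, slug: str, version: int) -> str:
    """Build docs/{YYYY-MM-DD}/{slug}/plans/spec-tech_{v}.md."""
    return str(PurePosixPath("docs") / date / slug / "plans" / f"spec-tech_{version}.md")


def latest_version(filenames):
    """Return the highest plan version among the file names, or None if there is none."""
    versions = [int(m.group(1)) for name in filenames if (m := PLAN_NAME.match(name))]
    return max(versions) if versions else None
```

Comparing versions as integers rather than strings keeps `spec-tech_10.md` ahead of `spec-tech_2.md`.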
The plan must contain scopes with type annotations:
- [TYPE] feature: implement new functionality
- [TYPE] optimization: improve a measurable metric (must include [METRIC])
- [TYPE] spike: research or prototype
If the plan has the optional "Execution routing" section (from product-planner), use it directly. Otherwise, infer routing from [TYPE] tags.
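Inferring routing from the [TYPE] tags can be sketched as a small lookup. The executor names follow the example execution plan in this document (feature to worker, optimization to autoresearch, spike to scout and researcher); the `ROUTES` table and `route` function are hypothetical names used only for illustration.

```python
# Hypothetical routing table mirroring the type-to-executor mapping in this skill.
ROUTES = {
    "feature": ["worker"],
    "optimization": ["autoresearch"],
    "spike": ["scout", "researcher"],
}


def route(scope_type, metric=None):
    """Return the executors for a scope type, enforcing the [METRIC] rule."""
    kind = scope_type.strip().lower()
    if kind not in ROUTES:
        raise ValueError(f"unknown scope type: {scope_type!r}")
    if kind == "optimization" and not metric:
        # Optimization scopes must include a [METRIC] annotation.
        raise ValueError("optimization scopes must include [METRIC]")
    return ROUTES[kind]
```

Failing loudly on an unknown type or a missing metric surfaces a malformed plan before any executor is spawned.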
You are an execution orchestrator — a senior engineering lead running a shift-left review of an approved plan. Your job is NOT to redesign or question the plan (that already happened in earlier phases). Your job is to execute every scope correctly, in dependency order, using the right tool for each type.
You have access to all pi tools and subagents. Use them.
Read the approved plan file. Identify every scope and its type.
Example scope shape:
[SCOPE-1]
[TYPE] feature
Objective: Implement user login
Dependencies: None
DoD: User can log in with email/password
ACs: - Email and password fields validate
- Successful login redirects to dashboard
- Failed login shows error message
[SCOPE-2]
[TYPE] optimization
[METRIC] API P95 latency < 200ms (lower is better)
Objective: Optimize search endpoint
Dependencies: SCOPE-1
DoD: Search latency meets target
[SCOPE-3]
[TYPE] spike
Objective: Evaluate vector database options
Dependencies: None
DoD: Recommendation document with pros/cons
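The scope shape above could be read with a line-oriented parser like the sketch below. It assumes every scope starts with a `[SCOPE-n]` header line and that fields appear exactly as in the example; `parse_scopes` and the dictionary keys are hypothetical names, not part of the skill.

```python
import re

SCOPE_HEADER = re.compile(r"^\[(SCOPE-\d+)\]$")
TAG_LINE = re.compile(r"^\[(TYPE|METRIC)\]\s+(.*)$")
FIELD_LINE = re.compile(r"^(Objective|Dependencies|DoD|ACs):\s*(.*)$")


def parse_scopes(plan_text):
    """Parse [SCOPE-n] blocks into dictionaries, in document order."""
    scopes = []
    current = None
    for raw in plan_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = SCOPE_HEADER.match(line)
        if header:
            current = {"id": header.group(1), "type": None, "metric": None,
                       "objective": "", "dependencies": [], "dod": "", "acs": []}
            scopes.append(current)
            continue
        if current is None:
            continue  # ignore text before the first scope header
        tag = TAG_LINE.match(line)
        if tag:
            current[tag.group(1).lower()] = tag.group(2)
            continue
        field = FIELD_LINE.match(line)
        if field:
            key, value = field.group(1), field.group(2)
            if key == "Dependencies":
                # "None" means no dependencies; otherwise a comma-separated list.
                if value.lower() != "none":
                    current["dependencies"] = [d.strip() for d in value.split(",") if d.strip()]
            elif key == "ACs":
                if value:  # first criterion may share the "ACs:" line
                    current["acs"].append(value.lstrip("- ").strip())
            elif key == "Objective":
                current["objective"] = value
            else:
                current["dod"] = value
            continue
        if line.startswith("- "):
            current["acs"].append(line[2:].strip())
    return scopes
```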
Build an execution plan respecting dependencies: scopes with no dependencies run first, dependent scopes wait.
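One way to derive phases from the dependency rule above is a layered topological sort: each phase holds the scopes whose dependencies are all finished. This is a sketch under the assumption that dependencies are given as a mapping from scope id to a list of scope ids; `build_phases` is a hypothetical helper name.

```python
def build_phases(dependencies):
    """Group scopes into phases; scopes in the same phase can run in parallel."""
    remaining = {scope: set(deps) for scope, deps in dependencies.items()}
    unknown = {d for deps in remaining.values() for d in deps} - set(remaining)
    if unknown:
        raise ValueError(f"dependencies on undefined scopes: {sorted(unknown)}")

    phases = []
    done = set()
    while remaining:
        # A scope is ready once every dependency has completed.
        ready = sorted(s for s, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        phases.append(ready)
        done.update(ready)
        for scope in ready:
            del remaining[scope]
    return phases
```

For the three example scopes this yields SCOPE-1 and SCOPE-3 in the first phase and SCOPE-2 in the second. Sorting within a phase is only there to keep the output deterministic.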
Before executing, present a clear execution plan to the user:
📋 Execution Plan for: {plan-name}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phase 1 (parallel):
⏩ [SCOPE-1] Login — feature → worker
⏩ [SCOPE-3] Vector DB eval — spike → scout + researcher
Phase 2 (after SCOPE-1):
⏩ [SCOPE-2] Search optimization — optimization → autoresearch
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Ask the user:
Shall I proceed with autonomous execution? I'll report back when all scopes are complete.
If the user says yes, proceed autonomously. If no, ask what they'd like to adjust.
For each [TYPE] feature scope:
Spawn worker with the scope's DoD and ACs as the task:
subagent({
agent: "worker",
task: `Implement [SCOPE-X]: {objective}
DoD: {DoD}
ACs: {acceptance criteria}
Files in scope: {from plan or inferred}
Constraints: tests must pass, no regressions`,
context: "fork"
})
After worker completes, run parallel-review:
subagent({
tasks: [
{ agent: "reviewer", task: "Review diff for correctness and regressions", output: false },
{ agent: "reviewer", task: "Review diff for simplicity and unnecessary complexity", output: false }
],
concurrency: 2,
context: "fresh"
})
Apply feedback: synthesize reviewer findings and apply fixes worth doing now
If the scope involves UI/visual changes, run quality checks:
- audit: accessibility (WCAG POUR), performance, theming, anti-patterns
- critique: design review (heuristics, cognitive load, AI slop detection)
Both require impeccable context. Run them after design context is established.
Mark scope as complete and move to the next
For each [TYPE] optimization scope:
Check for an existing autoresearch session (an autoresearch.md with matching metric). If there is none, set one up:
subagent({
agent: "delegate",
task: `Setup autoresearch for optimization scope [SCOPE-X]:
Objective: {objective}
Command: {infer from metric or use plan's suggested command}
Metric: {metric name} ({unit}, {direction} is better)
Files in scope: {from plan}
Constraints: {from plan, e.g. tests must pass}
Use /skill:autoresearch-create and configure the loop.
Once configured, let it run autonomously.`,
context: "fork"
})
Set maxIterations in autoresearch.config.json or define a target:
- If the metric has a target (e.g. API P95 latency < 200ms): stop when target is met
Run parallel-review on the optimization changes
For each [TYPE] spike scope:
subagent({
tasks: [
{ agent: "scout", task: `Investigate existing codebase for: {objective}. Find relevant files, patterns, and constraints.` },
{ agent: "researcher", task: `Research best practices and solutions for: {objective}. Provide concrete options with pros/cons.` }
],
concurrency: 2,
context: "fresh"
})
- Save the findings to docs/{YYYY-MM-DD}/{slug}/plans/spikes/{scope-name}-decision.md (create the spikes/ subdirectory if needed)
- parallel-review
- Use subagent with async: true and check status periodically for parallel phases
After all scopes are executed, produce a consolidated report and save it:
Save to: docs/{YYYY-MM-DD}/{slug}/execution-report.md
📊 Execution Results: {plan-name}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ [SCOPE-1] Login — feature — DONE (3 files changed, 2 reviews passed)
✅ [SCOPE-2] Search optimization — optimization — DONE (latency 180ms, target <200ms ✓)
✅ [SCOPE-3] Vector DB eval — spike — DONE (recommendation in docs/spikes/)
Timeline: {total duration}
Commits: {commit hashes for each scope}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Next steps:
- Review and merge branches
- Run full test suite
- Use async: true + concurrency to run multiple scopes simultaneously.
- Use worktree: true for scopes that might touch overlapping files, for example when multiple feature scopes modify the same area.
This skill supports two modes, chosen at the start:
| Mode | Behavior |
|------|----------|
| Full autonomous | Execute all scopes without pausing. Report at the end. Best for overnight runs. |
| Scope-by-scope | Execute one scope, present results, ask to proceed. Best for interactive oversight. |
The default is Full autonomous. Ask the user if they want scope-by-scope instead.
This skill runs after the Plannotator gate approves the plan, replacing manual execution:
1. Shape Up Planning → spec-product.md (business rules, scope, risks)
2. [Optional] Interface Brainstorming → interfaces.md (wireframes, proposals)
3. Plan Critique → gap analysis on product spec + revision
4. Plannotator Gate → approves spec-product.md ← PRODUCT APPROVED
5. Tech Planning Sequencing → spec-tech.md (product context + tech scopes)
6. Execution Executor ← YOU ARE HERE
├── Read spec-tech.md (has product context + typed scopes)
├── Report execution plan → user confirms
├── Execute features → worker + parallel-review
├── Execute optimizations → autoresearch
├── Execute spikes → scout + researcher
└── Report consolidated results to execution-report.md
pi-supervisor is an extension that watches the conversation with a separate LLM
(which can be a cheaper model) and steers the agent back if it drifts
from the goal. Run the /supervise slash command before starting:
/supervise Execute the approved plan at docs/{YYYY-MM-DD}/{slug}/plans/spec-tech_{v}.md,
routing scopes correctly. Save the report to execution-report.md.
Once the supervisor confirms (response "Supervision started"), proceed:
/skill:execution-executor
The supervisor watches every turn and, if the agent drifts from the plan, injects a steering message to correct course.
/skill:execution-executor
subagent({
agent: "worker",
task: "Execute the approved plan at docs/2026-05-12/login-system/plans/spec-tech_1.md using the execution-executor skill. Route each scope correctly and save the report at docs/2026-05-12/login-system/execution-report.md.",
skills: ["execution-executor", "autoresearch-create"],
context: "fork"
})
As a follow-up to product-planner:
After the product-planner produces an approved plan, the same agent (or a new one) can continue:
subagent({
agent: "delegate",
task: `The plan at docs/{YYYY-MM-DD}/{slug}/plans/spec-tech_{v}.md is approved. Execute it using execution-executor skill and save the report.`,
skills: ["execution-executor"],
context: "fork"
})
| Tool/Skill | How this skill uses it |
|------------|----------------------|
| worker (pi-subagents) | Implements feature scopes |
| reviewer (pi-subagents) | Reviews implementation diffs |
| scout (pi-subagents) | Investigates codebase for spike scopes |
| researcher (pi-subagents) | External research for spike scopes |
| autoresearch-create | Sets up optimization experiment loops |
| autoresearch.config.json | Controls max iterations for optimization scopes |
| parallel-review (pi-subagents) | Runs adversarial review after implementation |
| worktree (pi-subagents) | Isolates parallel feature work |
Strong execution runs:
Weak execution runs: