plugin/skills/skill-pipeline/SKILL.md
Pipeline orchestrator that classifies incoming coding tasks and routes them through the correct combination of skills at the right depth. Implements two feedback loops: the inner loop (detect, verify, recover) runs within a session via plan-interview, intent-framed-agent, context-surfing, verify-gate, self-healing (active recovery on failure), simplify-and-harden, and self-improvement. The outer loop (inspect, encode, regress-test) runs across sessions via learning-aggregator, harness-updater, and eval-creator. pre-flight-check bridges the two by surfacing accumulated knowledge — past heals and learnings — at session start. Handles standard, team-based, CI, and outer-loop pipeline variants. Does not replace individual skills; dispatches to them.
Install this skill globally with one command: `npx skillsauth add pskoett/pskoett-ai-skills skill-pipeline`. Works with Claude Code, Cursor, and Windsurf.
The conductor, not a player. This skill classifies tasks, selects the pipeline variant, calibrates depth, and orchestrates handoffs between skills. It produces no artifacts of its own — its output is routing decisions that activate other skills.
On every coding task, classify before acting. Evaluate scope signals and map to a task class.
Input signals: file count, task description, existing plan/handoff files, batch indicators, CI environment.
Task received
│
├─ Trivial (typo, rename, version bump)
│ → No skills. Just do it.
│
├─ Small (isolated fix, single-file, <10 logic lines)
│ → verify-gate + simplify-and-harden
│
├─ Medium (feature in known area, 2-5 files)
│ → intent-framed-agent + verify-gate + simplify-and-harden
│
├─ Large (complex refactor, new architecture, unfamiliar codebase, high-risk logic)
│ → Full standard pipeline
│ → Recommend /plan-interview before starting
│
├─ Long-running (multi-session, high context pressure, prior handoff exists)
│ → Full standard pipeline with context-surfing as critical skill
│
└─ Batch (multiple features from spec, 5+ discrete tasks, issue triage)
→ Team-based pipeline (agent-teams-simplify-and-harden)
When uncertain, start with Medium. Add skills if drift or quality issues appear mid-task.
For detailed heuristics, edge cases, and examples: read references/classification-rules.md.
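The decision tree above can be sketched as a small function. This is a minimal illustration, not the skill's actual logic: the `TaskSignals` fields, the `classify` name, and the order in which the checks run are all assumptions made for the example. The authoritative heuristics live in references/classification-rules.md.

```python
from dataclasses import dataclass
from enum import Enum


class TaskClass(Enum):
    TRIVIAL = "trivial"
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"
    LONG_RUNNING = "long-running"
    BATCH = "batch"


@dataclass
class TaskSignals:
    """Hypothetical container for the input signals; field names are assumptions."""
    file_count: int = 1
    logic_lines: int = 0
    discrete_tasks: int = 1
    is_cosmetic: bool = False        # typo, rename, version bump
    is_high_risk: bool = False       # complex refactor, new architecture, unfamiliar code
    has_prior_handoff: bool = False  # a handoff file from an earlier session exists
    is_multi_session: bool = False


def classify(s: TaskSignals) -> TaskClass:
    # Check order is an assumption: broadest scope signals win first.
    if s.discrete_tasks >= 5:
        return TaskClass.BATCH
    if s.has_prior_handoff or s.is_multi_session:
        return TaskClass.LONG_RUNNING
    if s.is_high_risk:
        return TaskClass.LARGE
    if s.is_cosmetic:
        return TaskClass.TRIVIAL
    if s.file_count == 1 and s.logic_lines < 10:
        return TaskClass.SMALL
    # Covers 2-5 files in a known area, and is the stated default when uncertain.
    return TaskClass.MEDIUM
```

Falling through to Medium mirrors the "when uncertain, start with Medium" rule: anything the explicit checks do not catch gets scope monitoring plus verification, and skills can be added later if drift appears.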
Route task class to the right variant:
| Task Class | Variant | Rationale |
|------------|---------|-----------|
| Trivial | None | No overhead needed |
| Small | Standard (minimal) | Verify + S&H only |
| Medium | Standard (partial) | Scope monitoring + verify + review |
| Large | Standard (full) | Full inner loop with planning |
| Long-running | Standard (full) | Context-surfing is critical |
| Batch | Team-based | Breadth over depth |
| CI environment | CI | Headless review |
| Periodic | Outer loop | Cross-session improvement |
Heuristic: Standard pipeline for depth (single complex feature). Team-based pipeline for breadth (batch of tasks). CI pipeline when CI=true or GITHUB_ACTIONS=true.
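A minimal sketch of that heuristic, assuming the variables are compared against the exact string `true` and that CI detection takes precedence over the task class (neither detail is confirmed by the text). The function names and the lowercase class strings are hypothetical, and the periodic outer loop is left out because it is not chosen per task.

```python
import os
from typing import Mapping, Optional


def is_ci(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when CI=true or GITHUB_ACTIONS=true, per the heuristic above."""
    env = os.environ if env is None else env
    return env.get("CI") == "true" or env.get("GITHUB_ACTIONS") == "true"


def select_variant(task_class: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Map a task class to a pipeline variant."""
    if is_ci(env):
        return "ci"          # assumption: CI overrides the class-based choice
    if task_class == "batch":
        return "team-based"  # breadth: many discrete tasks
    if task_class == "trivial":
        return "none"        # no pipeline overhead
    return "standard"        # depth: small, medium, large, long-running
```

Passing `env` explicitly keeps the function easy to exercise without touching the real process environment.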
pre-flight-check (SessionStart hook — surfaces prior learnings + heals)
→ classify
→ (recommend /plan-interview if Large or Long-running)
→ intent-framed-agent (at planning-to-execution transition)
→ context-surfing (auto-activates when intent frame + plan exist; concurrent with intent monitoring)
→ [IMPLEMENTATION]
→ self-healing ← inner-loop recovery primitive; called whenever a command/test/build/external call fails or a helper is missing.
→ Diagnoses, patches, verifies, files HEAL- entry. Resumes when verified.
→ verify-gate (compile + test + lint; fix loop if red — fix loop calls self-healing)
→ simplify-and-harden (post-completion, if non-trivial diff)
→ self-improvement (on errors, corrections, S&H learning candidates, recurring heal handoffs)
Skill-by-class activation:
| Skill | Trivial | Small | Medium | Large | Long-running |
|-------|---------|-------|--------|-------|-------------|
| pre-flight-check | Hook | Hook | Hook | Hook | Hook |
| plan-interview | - | - | - | Recommend | Recommend |
| intent-framed-agent | - | - | Activate | Activate | Activate |
| context-surfing | - | - | - | Activate | Critical |
| verify-gate | - | Activate | Activate | Activate | Activate |
| self-healing | On failure | On failure | On failure | On failure | On failure |
| simplify-and-harden | - | If non-trivial | If non-trivial | If non-trivial | If non-trivial |
| self-improvement | On error only | On error only | On error/completion | On error/completion | On error/completion |
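The activation table can be read as a lookup from task class to the skills the orchestrator activates proactively. The sketch below is one hedged way to encode it; the `ACTIVATION` mapping and `skills_for` helper are hypothetical names. It deliberately omits pre-flight-check (a hook), self-healing (failure-triggered), plan-interview (recommended, not activated), and self-improvement (event-triggered).

```python
# Skills activated proactively per task class, transcribed from the table above.
ACTIVATION = {
    "trivial": [],
    "small": ["verify-gate", "simplify-and-harden"],
    "medium": ["intent-framed-agent", "verify-gate", "simplify-and-harden"],
    "large": ["intent-framed-agent", "context-surfing", "verify-gate", "simplify-and-harden"],
    "long-running": ["intent-framed-agent", "context-surfing", "verify-gate", "simplify-and-harden"],
}


def skills_for(task_class, diff_is_non_trivial=True):
    """Return the skills to activate; simplify-and-harden only runs on a non-trivial diff."""
    skills = list(ACTIVATION[task_class])
    if not diff_is_non_trivial and "simplify-and-harden" in skills:
        skills.remove("simplify-and-harden")
    return skills
```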
classify (Batch)
→ (recommend /plan-interview if no spec exists)
→ agent-teams-simplify-and-harden
├─ Team lead emits Intent Frame #1
├─ Phase 1: parallel implementation agents
├─ verify-gate (compile + test + lint)
├─ Phase 2: parallel audit agents (simplify, harden, spec)
├─ Fix loop (up to 3 audit rounds)
└─ Learning loop output
→ self-improvement
classify (CI detected)
→ simplify-and-harden-ci (headless scan, PR changed files only)
→ self-improvement-ci (pattern aggregation, promotion recommendations)
The outer loop runs across sessions, not within them. Trigger on cadence (weekly, sprint boundary) or when pre-flight-check surfaces promotion-ready patterns.
learning-aggregator (read .learnings/, find patterns, rank promotion candidates)
→ harness-updater agent (apply promotions to CLAUDE.md, AGENTS.md, copilot-instructions.md)
→ eval-creator (create permanent test cases from promoted patterns)
→ eval-creator run (regression check on all existing evals)
When to trigger the outer loop:
pre-flight-check reports promotion-ready count > 3
/learning-aggregator
Outer loop is always human-gated. learning-aggregator produces a gap report. harness-updater shows diffs for approval. No automatic writes to instruction files without human review.
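The trigger and gating rules above might be sketched as follows. This is illustrative only: the function names, the `cadence_due` flag, and the `approve` callback (standing in for a human reviewer) are assumptions, not part of any skill's interface.

```python
PROMOTION_READY_THRESHOLD = 3  # trigger when the reported count is greater than 3


def should_run_outer_loop(promotion_ready_count, cadence_due=False):
    """Trigger on cadence (weekly, sprint boundary) or on enough promotion-ready patterns."""
    return cadence_due or promotion_ready_count > PROMOTION_READY_THRESHOLD


def approved_promotions(proposed_diffs, approve):
    """Keep only the diffs a human reviewer approved; nothing is written automatically."""
    return [diff for diff in proposed_diffs if approve(diff)]
```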
Not just which skills — how deep each goes:
| Dimension | Small | Medium | Large | Long-running | Batch |
|-----------|-------|--------|-------|-------------|-------|
| Pre-flight check | Hook | Hook | Hook | Hook | Hook |
| Planning passes | 0 | 0-1 | 1-2 | Deep iterative | Per-task or umbrella |
| Intent frame | - | Single frame | Full frame + monitoring | Full + handoff | Team lead frame |
| Context-surfing | - | - | Active | Critical (exit protocol ready) | Lightweight drift checks |
| Verify-gate | Compile + test | Compile + test | Compile + test + lint | Compile + test + lint | Compile + test (per round) |
| Self-healing | On failure (file HEAL) | On failure (file HEAL) | On failure + recurrence check | On failure + recurrence check | On failure (per task) |
| S&H budget | 20% diff, 60s | 20% diff, 60s | 20% diff, 60s | 20% diff, 60s | 30% team growth cap |
| Audit rounds (teams) | - | - | - | - | Up to 3 |
| Self-improvement | Error-triggered | Error-triggered | Error + S&H feed | Error + S&H feed | Error + teams feed |
Artifacts flow between skills. The orchestrator ensures each skill receives what it needs.
Key handoffs:
Plan file (docs/plans/plan-NNN-<slug>.md) — produced by plan-interview, consumed by intent-framed-agent (context), context-surfing (wave anchor), agent-teams (task extraction).
Intent Frame — produced by intent-framed-agent, consumed by context-surfing (wave anchor strengthening). Copied into handoff files on drift exit.
Handoff file (.context-surfing/handoff-[slug]-[timestamp].md) — produced by context-surfing on drift exit, consumed by next session for resume.
Verify-gate signal — produced by verify-gate (pass/fail + diagnostics), consumed by simplify-and-harden (only activates after green gate) and the heal loop (on failure — verify-gate hands the diagnostics to self-healing, which diagnoses + patches + re-verifies, then signals verify-gate to re-check).
HEAL entries + artifacts (.learnings/HEALS.md, .learnings/heals/<HEAL-ID>/) — produced by self-healing, consumed by pre-flight-check (surfaces prior heals at session start by Pattern-Key / Active-Context), learning-aggregator (cross-session recurrence analysis), and self-improvement (when Handoff block flags promotion at Recurrence ≥ 3).
Learning candidates (learning_loop.candidates) — produced by simplify-and-harden and agent-teams, consumed by self-improvement for pattern tracking.
Learning entries (.learnings/*.md) — produced by self-improvement, consumed by learning-aggregator for cross-session analysis and by pre-flight-check at session start.
Gap report — produced by learning-aggregator, consumed by harness-updater agent for promotion and eval-creator for test case generation.
Eval cases (.evals/cases/*.md) — produced by eval-creator, consumed by regression runs and surfaced by pre-flight-check.
Precedence: If context-surfing and intent-framed-agent both fire simultaneously, context-surfing's exit takes precedence. Degraded context makes scope checks unreliable.
For the full artifact/signal/budget table: read references/handoff-matrix.md.
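One way to picture the handoffs listed above is as a producer/consumer map that the orchestrator can query before dispatching a skill. The sketch below transcribes the list; the dictionary keys (such as `plan-file` or `next-session`) and the helper names are hypothetical labels chosen for the example, and the full matrix remains in references/handoff-matrix.md.

```python
# artifact -> (producers, consumers), transcribed from the handoff list above.
HANDOFFS = {
    "plan-file": (["plan-interview"],
                  ["intent-framed-agent", "context-surfing", "agent-teams"]),
    "intent-frame": (["intent-framed-agent"], ["context-surfing"]),
    "handoff-file": (["context-surfing"], ["next-session"]),
    "verify-gate-signal": (["verify-gate"], ["simplify-and-harden", "self-healing"]),
    "heal-entries": (["self-healing"],
                     ["pre-flight-check", "learning-aggregator", "self-improvement"]),
    "learning-candidates": (["simplify-and-harden", "agent-teams"], ["self-improvement"]),
    "learning-entries": (["self-improvement"], ["learning-aggregator", "pre-flight-check"]),
    "gap-report": (["learning-aggregator"], ["harness-updater", "eval-creator"]),
    "eval-cases": (["eval-creator"], ["regression-runs", "pre-flight-check"]),
}


def inputs_for(skill):
    """Artifacts that the given skill consumes, in alphabetical order."""
    return sorted(name for name, (_, consumers) in HANDOFFS.items() if skill in consumers)


def missing_inputs(skill, available):
    """Artifacts the skill consumes that are not yet available."""
    return [name for name in inputs_for(skill) if name not in available]
```

For example, `missing_inputs("context-surfing", {"plan-file"})` reports that the Intent Frame has not been produced yet.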
The orchestrator intervenes at these moments:
Classify the task. Select pipeline variant and depth. Emit routing decision. If Large/Long-running, recommend /plan-interview. If Batch, recommend team-based variant.
When the user approves a plan from plan-interview, flow directly into the execution stage with no separate "should I proceed?" prompt. This means activating intent-framed-agent to emit an Intent Frame. The intent frame itself still requires user confirmation before coding begins (that confirmation is part of intent-framed-agent, not an extra gate). Populate task tracking with checklist items.
When no plan-interview was used and the user signals readiness ("go ahead", "implement this", "let's start"), activate intent-framed-agent. Emit Intent Frame. Wait for user confirmation of the frame before coding.
A command, test, build, lint, or external call fails before verify-gate even runs (or any other mid-task gap appears — missing helper, env drift, API change). Route into self-healing: diagnose, patch, verify, file the HEAL entry. Resume execution from the working state. Most heals are recurrences — self-healing searches HEALS.md by Pattern-Key first.
Activate verify-gate to run compile, test, and lint checks. If any fail, route into self-healing for the diagnosis/patch/verify loop (up to 3 attempts per phase). After each heal, verify-gate re-runs the checks. Once all checks pass and the diff meets the non-trivial threshold (see references/classification-rules.md), activate simplify-and-harden. If the diff is trivial, signal completion directly after verify-gate passes.
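The gate-and-heal loop described above can be sketched like this. It is a simplification under stated assumptions: each check (compile, test, lint) is treated as one "phase", only the failing check is re-run after a heal rather than the whole gate, and `run_gate`, `checks`, and `heal` are hypothetical names standing in for verify-gate and self-healing.

```python
from typing import Callable, Dict

MAX_HEAL_ATTEMPTS = 3  # "up to 3 attempts per phase"


def run_gate(checks: Dict[str, Callable[[], bool]], heal: Callable[[str], None]) -> bool:
    """Run each check in order; on failure call heal() and re-check, up to the cap.

    Returns True when every check is green, False when a phase exhausts its attempts.
    """
    for name, check in checks.items():
        attempts = 0
        while not check():
            if attempts == MAX_HEAL_ATTEMPTS:
                return False  # still red after the allowed heals
            heal(name)        # diagnose + patch; the loop then re-verifies
            attempts += 1
    return True
```

Only a `True` result would lead on to simplify-and-harden, and then only if the diff meets the non-trivial threshold.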
If context-surfing fires a drift exit, stop execution. Write handoff file. If the task was classified below Large, consider re-classifying upward for the next session.
Check for handoff files in .context-surfing/. If found, read completely. Re-establish context from handoff. Re-classify if needed. Resume from recommended re-entry point.
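A minimal sketch of the resume check, assuming handoff files match `handoff-*.md` inside `.context-surfing/` and that the most recently modified file is the one to resume from. The selection rule and the `latest_handoff` name are assumptions; the source only says to check the directory and read what is found completely.

```python
from pathlib import Path
from typing import Optional


def latest_handoff(root: Path = Path(".")) -> Optional[Path]:
    """Return the most recently modified handoff file, or None if there is none."""
    handoff_dir = root / ".context-surfing"
    if not handoff_dir.is_dir():
        return None
    files = list(handoff_dir.glob("handoff-*.md"))
    # Assumption: newest modification time identifies the handoff to resume from.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

A caller would read the returned file in full before re-classifying the task and resuming from the recommended re-entry point.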
Users can override any routing decision:
depth=small / depth=large: override classification
variant=teams / variant=standard: override pipeline selection
--no-review: skip simplify-and-harden
/plan-interview on any task regardless of classification
Tasks can change class mid-execution. Watch for:
context-surfing drift exit, intent-framed-agent detects significant scope change
When re-classification is warranted:
/plan-interview if no plan exists
intent-framed-agent monitors scope and context-surfing monitors quality. The orchestrator dispatches at decision points, then gets out of the way.
simplify-and-harden has a 20% budget cap, the orchestrator respects it.
For complete step-by-step walkthroughs of each variant, including hybrid scenarios and session resume: read references/pipeline-variants.md.
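The user overrides listed earlier (`depth=...`, `variant=...`, `--no-review`) could be picked out of a request with a parser along these lines. How overrides actually reach the orchestrator is not specified, so the token-list input, the `parse_overrides` name, and the shape of the returned dictionary are all assumptions.

```python
def parse_overrides(tokens):
    """Extract the documented override tokens; anything else is ignored."""
    overrides = {"depth": None, "variant": None, "review": True}
    for tok in tokens:
        if tok == "--no-review":
            overrides["review"] = False  # skip simplify-and-harden
        elif tok.startswith("depth="):
            overrides["depth"] = tok.split("=", 1)[1]
        elif tok.startswith("variant="):
            overrides["variant"] = tok.split("=", 1)[1]
    return overrides
```

An override that is left as `None` means the orchestrator's own classification or variant choice stands.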