skills/harness/harness-engineering-plan/SKILL.md
Harness engineering planning skill. Decomposes software goals into milestones, TaskNodes, dependency-aware execution waves, stage gates, and validation evidence. Use when the user asks to plan a feature implementation, create a task breakdown, design milestones, create a TaskBoard, or structure a multi-step engineering effort.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add escapewu/skills harness-engineering-plan
Turn a vague software goal into a controlled execution system: explicit milestones, small TaskNodes, dependency-aware execution waves, stage gates, validation evidence, safety boundaries, and integration checkpoints.
The goal is to make execution boring and auditable. Every task has a target, every milestone has a gate, and every gate has evidence.
A milestone represents a phase that can be validated independently. It is complete only when:
- docs/ as SSOT.

The next stage starts only after the previous stage has passed its gate. Partial completion is not enough; this prevents downstream work from building on unstable contracts or undocumented assumptions.
A good milestone exposes parallel work. Serial chains longer than two tasks indicate the milestone is designed incorrectly.
Bad shape:
M2
└── T01 → T02 → T03 → T04
Better shape — extract the shared contract into an earlier milestone:
M2A - Contract/Foundation
└── T01
M2B - Parallel Implementation
├── T02
├── T03
└── T04
M2C - Integration Gate
└── T05
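The "serial chain longer than two tasks" check can be automated. The following is a minimal sketch, assuming each milestone's dependencies are available as a plain dictionary; the function name `longest_serial_chain` and the input shape are hypothetical and not part of this skill.

```python
from functools import lru_cache


def longest_serial_chain(depends_on: dict[str, list[str]]) -> int:
    """Return the number of tasks on the longest dependency chain.

    `depends_on` maps each task ID in one milestone to the task IDs it
    depends on. Dependencies outside the milestone are ignored, and the
    graph is assumed to be acyclic.
    """

    @lru_cache(maxsize=None)
    def depth(task: str) -> int:
        # Only follow dependencies that belong to this milestone.
        local_deps = [d for d in depends_on.get(task, []) if d in depends_on]
        return 1 + max((depth(d) for d in local_deps), default=0)

    return max((depth(task) for task in depends_on), default=0)


# The "bad shape" M2: one serial chain of four tasks.
bad_m2 = {"T01": [], "T02": ["T01"], "T03": ["T02"], "T04": ["T03"]}
# The "better shape" M2B: three tasks that only depend on M2A's T01.
m2b = {"T02": ["T01"], "T03": ["T01"], "T04": ["T01"]}

print(longest_serial_chain(bad_m2))  # 4
print(longest_serial_chain(m2b))     # 1
```

A result above 2 suggests the milestone should be split the way M2 is split into M2A, M2B, and M2C above.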
When drawing the wave dependency graph in a milestone's tasks.md, always use a top-down vertical tree with one wave per block, separated by a downward arrow ▼ indicating the wave gate. This format is robust to narrow terminal widths, Markdown preview rendering, and copy/paste into review comments.
Bad shape (multi-column ASCII):
W0 (Contracts) W1 (Implementation) W2 (Gate)
───────────────── ────────────────── ──────────
M1-T1 ─┐
M1-T3 ─┼─► M1-T5b ──────► M1-T5c ──┐
M1-T4 ─┤ ├─► M1-T-Gate
M1-T6 ─┘ M1-T9 ──┘
Side-by-side columns break visually below ~120 cols (Markdown preview, narrow editor split, mobile, Slack/Linear paste). Arrows and box lines wrap mid-row and become unreadable.
Good shape (top-down tree, one wave per block, downward gate arrow):
W0 — Contracts (parallel)
├── M1-T1 build_verdict_core
├── M1-T3 summary_schema
├── M1-T4 plan_schema
├── M1-T6 sampling_tools
└── M1-T5a bdstate_schema
▼ (entry gate: all of W0 done)
W1 — Implementation (parallel)
├── M1-T5b nodes_impl
├── M1-T5c routing_compile
├── M1-T9 verdict_dispatch
└── M1-T2 invariants_tests
▼ (entry gate: all of W1 done)
W2 — Integration + Gate (serial)
├── M1-T5d entry_persistence
├── M1-T5e mermaid_export
├── M1-T7 cap_rules
├── M1-T10 antibias_unit_tests
└── M1-T-Gate E2E_mock_and_real
Properties:
- (├── / └──) + downward gate arrow (▼) convey "list within wave" + "wave-to-wave gate" without horizontal arrows.
- | Wave | TaskNodes | Parallelism | Entry gate | table for at-a-glance metadata.

If a wave's parallelism or dependency requires more detail (e.g., a single intra-wave fan-in), prefer expressing it in the per-TaskNode Depends On field rather than enriching the diagram with horizontal arrows.
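The top-down format is regular enough to generate. Below is a minimal sketch of a renderer, assuming waves are given as ordered `(title, task labels)` pairs; `render_waves` and the exact gate wording are illustrative assumptions, not something this skill ships.

```python
def render_waves(waves: list[tuple[str, list[str]]]) -> str:
    """Render waves as a top-down tree, one block per wave.

    Each wave is a (title, task_labels) pair. A downward arrow line is
    emitted between consecutive waves to mark the wave gate.
    """
    lines: list[str] = []
    for index, (title, tasks) in enumerate(waves):
        if index > 0:
            # Gate between the previous wave and this one.
            lines.append(f"▼ (entry gate: all of W{index - 1} done)")
        lines.append(f"W{index} - {title}")
        for position, task in enumerate(tasks):
            # The last task in a wave closes the tree branch.
            glyph = "└──" if position == len(tasks) - 1 else "├──"
            lines.append(f"{glyph} {task}")
    return "\n".join(lines)


example = render_waves(
    [
        ("Contracts (parallel)", ["M1-T1 build_verdict_core", "M1-T3 summary_schema"]),
        ("Implementation (parallel)", ["M1-T5b nodes_impl"]),
    ]
)
print(example)
```

Because every line is a single column, the output stays readable at any width, which is the property the diagram rules above are after.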
Implementation tasks should not invent field names, schema names, lifecycle states, or safety semantics. A strong plan starts with:
Then implementation tasks run in parallel against stable names.
Parallel task outputs should not be merged by last-writer-wins. The integration gate must synthesize accepted behavior from all completed tasks, resolve conflicts intentionally, and run milestone-level validation.
A passing test suite is necessary but not sufficient. The gate must verify:
Every executable unit is represented as a TaskNode with these fields:
## TaskNode: M2-T03
**Title:** Short descriptive name
**Milestone:** M2 - Input Adapters
**Parent:** Root feature or parent TaskNode
**Layer:** 1
**Status:** planned | ready | running | validating | repair_needed | done | blocked | failed | abandoned
**Depends On:**
- M1-T01
- M1-T02
**Preconditions:**
- Contract names are finalized.
- Required fixtures exist.
**Input Context:**
- `path/to/file.py` — why this file matters.
- `docs/path.md` — governing design or contract.
**Expected Output:**
- Concrete files, behavior, schema, API response, UI component, or docs.
**Acceptance Criteria:**
- [ ] Checkable condition 1.
- [ ] Checkable condition 2.
**Validation Commands:**
- `pytest tests/path/test_file.py -v`
- `python -m compileall src tests -q`
**Safety Rules / Non-Goals:**
- Do not change unrelated contracts.
- Do not introduce prohibited capabilities.
**Done Evidence:**
- Test result summary.
- Commit or diff summary.
- Any manual or smoke validation evidence.
Normal: planned → ready → running → validating → done
Repair: validating → repair_needed → running → validating → done
Exception: running → blocked | failed, blocked → ready, failed → abandoned | redesigned
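The three transition lines above can be encoded as a lookup table so that an illegal status change is caught mechanically. This is a sketch under the assumption that no transitions exist beyond those listed; `ALLOWED_TRANSITIONS`, `can_transition`, and `is_valid_path` are hypothetical names. Note that `redesigned` appears in the exception line but not in the Status field list, so it is kept here only as a terminal target.

```python
# Transition table transcribed from the Normal / Repair / Exception lines.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "planned": {"ready"},
    "ready": {"running"},
    "running": {"validating", "blocked", "failed"},
    "validating": {"done", "repair_needed"},
    "repair_needed": {"running"},
    "blocked": {"ready"},
    "failed": {"abandoned", "redesigned"},
}


def can_transition(current: str, target: str) -> bool:
    """Return True if `current -> target` is one of the listed transitions."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def is_valid_path(path: list[str]) -> bool:
    """Return True if every consecutive pair in `path` is allowed."""
    return all(can_transition(a, b) for a, b in zip(path, path[1:]))


normal = ["planned", "ready", "running", "validating", "done"]
repair = ["validating", "repair_needed", "running", "validating", "done"]

print(is_valid_path(normal))               # True
print(is_valid_path(repair))               # True
print(can_transition("planned", "done"))   # False
```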
Parent status derived from children:
| Children state | Parent status |
|---|---|
| all done | done |
| any blocked | blocked |
| any running | in_progress |
| dependencies incomplete | pending |
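The derivation rules can be sketched as a small function. The source does not say which rule wins when several match, so this sketch assumes the rows are checked in the order listed, and it treats `pending` as the fallback for any combination the table does not cover; both choices are assumptions, as is the name `derive_parent_status`.

```python
def derive_parent_status(child_statuses: list[str]) -> str:
    """Derive a parent status from its children's statuses.

    Rules are checked in table order (an assumption):
    all done -> done, any blocked -> blocked, any running -> in_progress,
    otherwise -> pending.
    """
    if not child_statuses:
        raise ValueError("a parent needs at least one child status")
    if all(status == "done" for status in child_statuses):
        return "done"
    if any(status == "blocked" for status in child_statuses):
        return "blocked"
    if any(status == "running" for status in child_statuses):
        return "in_progress"
    # Covers "dependencies incomplete" and, by assumption, everything else.
    return "pending"


print(derive_parent_status(["done", "done"]))        # done
print(derive_parent_status(["done", "blocked"]))     # blocked
print(derive_parent_status(["running", "planned"]))  # in_progress
print(derive_parent_status(["planned", "ready"]))    # pending
```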
A release wave is the set of TaskNodes that can be executed together:
status == ready
AND milestone == active_milestone
AND layer == active_layer
AND all dependencies are done
AND all preconditions are true
AND no product decision is missing
AND no safety boundary is ambiguous
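The release condition above translates directly into a filter over the task board. The sketch below assumes a simplified in-memory `TaskNode`; the field names `preconditions_met`, `open_product_decisions`, and `ambiguous_safety_boundary` are hypothetical stand-ins for the last three conditions, which the source leaves to human judgment.

```python
from dataclasses import dataclass, field


@dataclass
class TaskNode:
    task_id: str
    milestone: str
    layer: int
    status: str
    depends_on: list[str] = field(default_factory=list)
    preconditions_met: bool = True           # hypothetical flag
    open_product_decisions: int = 0          # hypothetical counter
    ambiguous_safety_boundary: bool = False  # hypothetical flag


def release_wave(tasks: list[TaskNode], active_milestone: str, active_layer: int) -> list[str]:
    """Return the IDs of tasks that satisfy every release condition."""
    status_by_id = {task.task_id: task.status for task in tasks}
    return [
        task.task_id
        for task in tasks
        if task.status == "ready"
        and task.milestone == active_milestone
        and task.layer == active_layer
        # Unknown dependency IDs count as not done.
        and all(status_by_id.get(dep) == "done" for dep in task.depends_on)
        and task.preconditions_met
        and task.open_product_decisions == 0
        and not task.ambiguous_safety_boundary
    ]


board = [
    TaskNode("M1-T01", "M1", 0, "done"),
    TaskNode("M2-T01", "M2", 1, "ready", depends_on=["M1-T01"]),
    TaskNode("M2-T02", "M2", 1, "ready", depends_on=["M1-T01"], open_product_decisions=1),
    TaskNode("M2-T03", "M2", 1, "planned", depends_on=["M1-T01"]),
    TaskNode("M2-T04", "M2", 2, "ready", depends_on=["M2-T01"]),
]

print(release_wave(board, "M2", 1))  # ['M2-T01']
```

Here M2-T02 is held back by an open product decision, M2-T03 is not yet `ready`, and M2-T04 belongs to a later layer.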
Exclude from parallel execution when tasks:
Each milestone should include:
# M2 - Input Adapters
**Purpose:** Normalize upstream data into a stable internal contract.
**Parallelism Expectation:** High after M1 contracts are complete.
**Milestone Gate:** All adapters pass unit tests, missing/stale data degrades safely, integrated output matches the input snapshot contract.
**TaskNodes:**
- M2-T01: Adapter A
- M2-T02: Adapter B
- M2-T03: Adapter C
- M2-T04: Aggregator / integration helper
**Gate Validation:**
- targeted adapter tests;
- integrated contract tests;
- compile/typecheck;
- safety scan;
- docs update check.
A typical feature follows this milestone sequence:
| Milestone | Purpose | Parallelism |
|---|---|---|
| M0 | Planning Gate: plan, task board, scope, safety boundaries | n/a |
| M1 | Contract Foundation: schemas, statuses, degradation, test matrix, docs | high |
| M2 | Input Adapters: normalize upstream data into stable contract | high |
| M3A | Evidence Scorers: independent, auditable evidence items | high |
| M3B | Resolver & Document Builder: combine evidence into decisions | medium |
| M4 | Product Surface Integration: pipeline, API, UI, notifications, docs | medium-high |
| M5 | Integration & Verification: final gate, full suite, smoke, safety scan | serial gate |
Not every feature uses all milestones. Adapt the shape — but keep the invariant: contracts before implementation, integration as a first-class gate.
For detailed milestone descriptions and example TaskNodes, see templates.md.
Use project-analysis as the read-only evidence layer when M0 or M1 cannot confidently define TaskNodes from existing docs and code.
Trigger it before finalizing the task board when:
- docs/OVERVIEW.md -> docs/feature/INDEX.md or docs/reference/INDEX.md;
- Input Context would otherwise contain guesses instead of concrete entry points and files;

Expected handoff from project-analysis:

- Entry Points, Relevant Files, Contracts / Data Shapes, Risks / Open Questions, and Validation Candidates.

Then copy those facts into the TaskNode Input Context, Acceptance Criteria, and Validation Commands. Do not use project-analysis to replace the task board; it supplies evidence for the harness, while this skill owns milestones, waves, gates, and TaskNode shape.
1. Write M0 plan and task board.
2. Complete M1 contracts before implementation.
3. Release independent M2 input tasks.
4. Integrate M2 and pass gate.
5. Release independent M3A scorer tasks.
6. Integrate M3A and pass gate.
7. Complete M3B resolver/document builder.
8. Integrate into product surfaces in M4.
9. Run M5 full verification.
10. Record final evidence and close the feature.
taskBoard is a temporary execution artifact, not a permanent doc. It lives under this skill's own tasks/ directory, separate from docs/.
.agents/skills/harness/harness-engineering-plan/
├── SKILL.md
├── templates.md
└── tasks/                        ← temporary execution files (WIP)
    ├── <module>/
    │   └── taskBoard.md          ← active taskBoard
    └── archive/                  ← completed taskBoards
        └── <module>/
            └── taskBoard-<phase>.md
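1. Task start → generate the taskBoard at tasks/<module>/taskBoard.md
2. During execution → update status under tasks/; do not touch docs/
3. On completion → move the taskBoard into tasks/archive/
4. Then → distill the conclusions and update the SSOT in docs/feature/<module>/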
Why: taskBoard carries transient execution state (Status: planned→running→done, Evidence, wave gate status). Mixing it into docs/ confuses WIP process with stable truth. docs/ is SSOT — update it only after completion, not during execution.
docs/ stays clean of taskBoards. After a feature ships, the stable conclusions (data model, design rationale, new API contracts) flow into docs/feature/<module>/ and docs/reference/. The spent taskBoard lives in tasks/archive/ for audit only.
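The archive step of this lifecycle can be scripted. The following is a minimal sketch, assuming the directory layout shown above; `archive_task_board` is a hypothetical helper, and refusing to overwrite an existing archive file is a design choice of this sketch rather than a rule stated by the skill.

```python
import shutil
from pathlib import Path


def archive_task_board(skill_root: Path, module: str, phase: str) -> Path:
    """Move tasks/<module>/taskBoard.md to tasks/archive/<module>/taskBoard-<phase>.md.

    Returns the archived path. Raises if the active board is missing or
    if an archive file for this phase already exists.
    """
    active = skill_root / "tasks" / module / "taskBoard.md"
    target = skill_root / "tasks" / "archive" / module / f"taskBoard-{phase}.md"
    if not active.is_file():
        raise FileNotFoundError(active)
    if target.exists():
        # Keep earlier audit evidence intact instead of overwriting it.
        raise FileExistsError(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(active), str(target))
    return target


# Example call (the module and phase names are placeholders):
# archive_task_board(Path(".agents/skills/harness/harness-engineering-plan"), "billing", "m2")
```

Updating docs/feature/<module>/ remains a separate, deliberate step after the move, in line with the rule that docs/ changes only after completion.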
Before execution, review the plan:
This structure reduces ambiguity at every layer:
The result is an engineering harness: not just a plan, but a repeatable control structure for moving from vague intent to verified implementation.
When a wave of independent TaskNodes is ready and the user wants to parallelize implementation via subagents (each running in its own git worktree), use the canonical 8-section prompt structure in templates.md §"Per-TaskNode Codex Subagent Prompt Template".
Two implementation paths are supported — same prompt structure, different dispatchers:
| Path | Dispatcher | Skill | When to use |
|---|---|---|---|
| Codex CLI | codex (gpt-5.5/high) | templates.md §"Per-TaskNode Codex Subagent Prompt Template" | You care about GPT-series quality; lightweight tasks; alignment with the existing sandbox pipeline |
| Claude Code CLI headless | claude --bare -p (opus default) | ../claude-headless-subagent/SKILL.md | You need a hard budget cap + real-time progress observability + JSON audit; long tasks; mixed parallel waves |
Key requirements (apply to both paths):
- git worktree (branch feat/<task-id>-<slug>)
- _shared_context.md + N × <TASK_ID>.prompt.md

A wave can mix Codex + Claude TaskNodes when the prompt files share the same 8-section structure. The default mode (single agent, single worktree, sequential TaskNodes) remains valid and lower-risk.
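The per-task worktree and prompt-file naming can be derived from the TaskNode ID and title. The sketch below only builds the `git worktree add -b <branch> <path>` argument list and the file names without running anything; the slug rule, the `../worktrees` location, and the helper names are assumptions for illustration, not conventions defined by this skill.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def worktree_command(task_id: str, title: str, worktrees_dir: str = "../worktrees") -> list[str]:
    """Build the argument list that creates one worktree on a new branch."""
    branch = f"feat/{task_id}-{slugify(title)}"
    path = f"{worktrees_dir}/{task_id}"  # hypothetical location
    return ["git", "worktree", "add", "-b", branch, path]


def prompt_files(task_ids: list[str]) -> list[str]:
    """List the shared context file plus one prompt file per TaskNode."""
    return ["_shared_context.md"] + [f"{task_id}.prompt.md" for task_id in task_ids]


print(worktree_command("M2-T03", "Adapter C"))
# ['git', 'worktree', 'add', '-b', 'feat/M2-T03-adapter-c', '../worktrees/M2-T03']
print(prompt_files(["M2-T01", "M2-T02"]))
# ['_shared_context.md', 'M2-T01.prompt.md', 'M2-T02.prompt.md']
```

Returning an argument list rather than a shell string keeps the command safe to pass to `subprocess.run` without quoting concerns.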