skills/harness-engineering/SKILL.md
Harness engineering for Harness Kit primitives: skills, shared doctrine, provider roster, harness configs, gates, evals, bootstrap, and sync logic. Use for "improve the harness", "harness engineering", "bootstrap is wrong", "AGENTS.md is stale", "skill health", "skill usage", "undertriggering skill", "description tax", "eval skill", "sync primitives", "roster defaults". Trigger: /harness-engineering, /harness, /skill.
Install this skill globally with one command: `npx skillsauth add phrazzld/spellbook harness-engineering`. Works with Claude Code, Cursor, and Windsurf.
Engineer the harness. Keep it thin.
| Need | Load |
|---|---|
| create skill or prompt | references/mode-create.md |
| eval skill | references/mode-eval.md |
| lint skill | references/mode-lint.md |
| apply skill-design lessons | references/skill-design-principles.md |
| clean Codex skill catalog | external steipete-skill-cleaner |
| convert agent/skill | references/mode-convert.md |
| sync externals | references/mode-sync.md |
| engineer doctrine/gates/hooks | references/mode-engineer.md |
| measure skill usage/health/staleness | references/mode-audit.md |
| current model/provider/harness facts | the roster skill's references/model-provider-harness-index.md |
| open-model defaults | references/open-model-roster.md |
Repo-local skills for consumer repos (bespoke QA drivers, persona probes)
are written directly into that repo's .agents/skills/ with its real
routes and commands; this skill owns the craft either way. For a repo's
verification skill, interview the operator first: the manual checks they
run after the agent responds and before merge are the spec — encode each
check that has a tool. Turning a proven session pattern into a first-party
primitive starts at the primitive test below — most patterns are prompts,
not skills.
- SKILL.md is primary.
- Current model, provider, and harness facts live in the roster index (skills/roster/references/model-provider-harness-index.md). Keep that file factual: model ids, context, price, latency/smoke evidence, tool support, benchmark sources, deprecations, and freshness. Do not encode role-fit policy there; the lead agent composes task-specific teams from current evidence.
- Source skills live in skills/; repo-local .agents/skills/ and harness-specific skill bridge dirs are /seed output for consumer repos.
- AGENTS.md is a router, not a manual. Keep non-obvious facts only.
- Delegate on judgment per the shared Roster contract: native subagents by default; add cross-model critics, roster providers, or sprite lanes (/sprites) only when they answer a distinct question. See harnesses/shared/AGENTS.md (Roster).
Local lane guidance: Use lanes for doctrine critique, runtime compatibility, gate design, and regression risk. Do not treat a missing repo-local roster as a waiver; use the resolver-backed probe.
Before creating or growing anything, classify it (2026-06 audit, backlog 103):
History: slash commands were collapsed into skills when skills arrived, so saved prompts masqueraded as skills and the catalog tripled. Do not recreate that.
- SKILL.md encodes judgment, not a procedure the model already knows.
- The description carries `Use when:` phrases and `Trigger:` aliases.
- Depth lives in references/; keep the entry file short.

After changing skills, shared doctrine, generated docs, bootstrap, roster, or harness projections, prove the output is repo-fit, not merely structurally valid.
## Acceptance Evidence
- Live repo evidence read: source skill, shared doctrine, generated docs, bootstrap output, roster, or harness projection inspected.
- Acceptance source: backlog oracle, skill contract, generated index/docs contract, bootstrap contract, or explicit absence.
- Evidence that proves it: command output, diff, generated artifact, bootstrap transcript, or gate output.
- Exact command/path/route exercised: check, generator, bootstrap, smoke path, projection path, or route run.
- Oracle / acceptance artifact hash: sha256 digest for any fixture, generated artifact, transcript, or contract used as the oracle, or state that no artifact-backed oracle exists.
- Contract-change acknowledgment: reason when the change alters an acceptance contract, generated source, or assertion surface, or state that no contract changed.
- Repo-fit check: source/generator/projection agree; no stale generated docs, wrong skill root, stale command, or copied bridge remains.
- Structural gate: `check --repo .` result, or the specific sub-gate exercised.
- Residual risk: skipped harness, external dependency, or none with reason.
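
The oracle hash entry above asks for a sha256 digest of whatever artifact serves as the oracle. A minimal sketch of producing one with Python's standard library follows; the helper names and the output line format are hypothetical and are not part of Harness Kit.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def evidence_line(path: Path) -> str:
    """Format a hash entry for an evidence note (hypothetical format)."""
    return (
        "- Oracle / acceptance artifact hash: "
        f"sha256:{sha256_of_file(path)} ({path.name})"
    )
```

Calling `evidence_line(Path("some/fixture"))` on a fixture, transcript, or generated artifact yields a line that can be pasted into the evidence list; when no artifact backs the oracle, state that instead of hashing a stand-in.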
Run `cargo run --locked -p harness-kit-checks -- check --repo .` after
changing harness primitives, gates, roster, bootstrap, or sync logic. For
bootstrap changes, also re-run the bootstrap and confirm the installed
symlinks (skills, prompts, configs) match the source tree.
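
The symlink confirmation can be scripted. The sketch below assumes a layout where each top-level entry in a source directory is installed as one symlink of the same name in an install directory. That layout and the function name are assumptions for illustration, not the documented bootstrap contract.

```python
from pathlib import Path


def stale_links(source_root: Path, install_root: Path) -> list[str]:
    """List source entries whose installed symlink is missing or points elsewhere."""
    problems = []
    for source in sorted(source_root.iterdir()):
        installed = install_root / source.name
        if not installed.is_symlink():
            # Covers both a missing entry and a copied (non-symlink) one.
            problems.append(f"{source.name}: not a symlink or missing")
        elif installed.resolve() != source.resolve():
            problems.append(f"{source.name}: points to {installed.resolve()}")
    return problems
```

An empty result means every source entry has a matching symlink. It does not catch extra entries in the install directory that have no source, so a reverse pass would be needed to find leftovers.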