.agents/skills/harness-setup/SKILL.md
Discover your team's development lifecycle through a deep dive conversation, then generate a phased workflow tailored to how you actually work.
npx skillsauth add cowcow02/agentfleet harness-setup
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Safety: This skill is primarily read-only during setup. It scans your codebase (read), asks you questions, then generates new skill files and a .harness/ directory — all with your approval before each write. It does not modify your existing code, delete files, run builds, or start services. Every file write is presented for approval first.
When invoked, print this banner before doing anything else:
██╗ ██╗ █████╗ ██████╗ ███╗ ██╗███████╗███████╗███████╗
██║ ██║██╔══██╗██╔══██╗████╗ ██║██╔════╝██╔════╝██╔════╝
███████║███████║██████╔╝██╔██╗ ██║█████╗ ███████╗███████╗
██╔══██║██╔══██║██╔══██╗██║╚██╗██║██╔══╝ ╚════██║╚════██║
██║ ██║██║ ██║██║ ██║██║ ╚████║███████╗███████║███████║
╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═══╝╚══════╝╚══════╝╚══════╝
Then, before anything else, read the existing agent-facing config if one exists — CLAUDE.md, AGENTS.md, .cursorrules, GEMINI.md, or similar. This gives you a head start on project context, team conventions, tech stack, and workflow preferences. Use what you learn to score coverage against the seven dimensions (see Phase 0) and determine which deep dive questions to skip.
Then output the intro and in the same message, use the Agent tool with run_in_background: true to dispatch the scan (Phase 1). Do NOT wait for the scan to return. Immediately start the deep dive (Phase 2) in the same response.
Output this intro, then ask your first deep dive question right after it:
What is this? Harnessable gives AI agents a disciplined, phased workflow for shipping software. Instead of one big prompt, every task follows focused phases — each with explicit instructions and an exit gate. A universal engine orchestrates them through a state file.
Launcher → Engine → [phase 1] → [phase 2] → ... → [phase N] → COMPLETE
I'm scanning your codebase in the background. Let me learn about how you work while that runs.
Read CLAUDE.md, AGENTS.md, .cursorrules, GEMINI.md — whatever exists. Score coverage against seven dimensions to determine which deep dive questions are needed:
| Dimension               | Status          |
| ----------------------- | --------------- |
| Tech stack              | known / unknown |
| Commands                | known / unknown |
| Lifecycle steps         | known / unknown |
| Ownership boundaries    | known / unknown |
| Verification strategies | known / unknown |
| Conventions/constraints | known / unknown |
| Gates/handoffs          | known / unknown |
The question count maps to how many dimensions are unknown:
| Unknown dimensions              | Questions needed      |
| ------------------------------- | --------------------- |
| 5-7 (no config)                 | 6                     |
| 3-4 (tech stack known)          | 4-5                   |
| 1-2 (lifecycle partially known) | 3-4                   |
| 0 (fully documented)            | 3 (verification only) |
Minimum is always 3 (narrative, ownership, failure points) because even fully documented workflows have implicit knowledge.
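The mapping from unknown dimensions to question count can be sketched in a few lines. This is only an illustration of the table above: the function names `count_unknown` and `questions_needed` and the lowercase dimension keys are hypothetical, not part of the skill.

```python
from typing import Dict, Tuple

# The seven coverage dimensions from the scoring table (keys are illustrative).
DIMENSIONS = (
    "tech stack",
    "commands",
    "lifecycle steps",
    "ownership boundaries",
    "verification strategies",
    "conventions/constraints",
    "gates/handoffs",
)


def count_unknown(status: Dict[str, str]) -> int:
    """Count dimensions not explicitly marked 'known' (missing keys count as unknown)."""
    return sum(1 for dim in DIMENSIONS if status.get(dim, "unknown") != "known")


def questions_needed(unknown: int) -> Tuple[int, int]:
    """Return the (min, max) number of deep dive questions for a given unknown count."""
    if not 0 <= unknown <= len(DIMENSIONS):
        raise ValueError(f"unknown must be between 0 and {len(DIMENSIONS)}")
    if unknown >= 5:
        return (6, 6)
    if unknown >= 3:
        return (4, 5)
    if unknown >= 1:
        return (3, 4)
    return (3, 3)  # the floor of 3 applies even when everything is documented
```

For example, a config that documents only the tech stack leaves six dimensions unknown, so `questions_needed(count_unknown({"tech stack": "known"}))` falls in the top row of the table.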
Use Agent tool with run_in_background: true to dispatch a scan agent. You will be notified when it completes. Do NOT poll or wait for it.
The scan validates and enriches the deep dive — it does NOT drive phase selection. Phases come from the human's description of their workflow.
The scan agent should explore:
Build & Test Tooling: package.json (npm/yarn/pnpm), Cargo.toml, go.mod, pyproject.toml, Gemfile. Build system (turbo.json, nx.json, Makefile). Test framework (vitest/jest, pytest, go test, rspec). Linter & formatter (eslint, prettier, ruff, golangci-lint). Type checker (tsconfig.json, mypy, pyright). Monorepo structure (workspace config, package directories).
CI/CD & Deploy: CI config (.github/workflows/, .circleci/, .gitlab-ci.yml). Deploy config (vercel.json, railway.toml, Dockerfile). Preview environments.
Project Management: PM tool config (.linear/, jira config, GitHub project references). Issue references in commit messages (KIP-XX, PROJ-XX, #123). Branching model from git log.
Database & Services: Database (PostgreSQL, MySQL, SQLite, MongoDB — migration tools, seed data). External services (APIs, message queues, caches). Dev environment (docker-compose, devcontainers, local setup scripts).
Repository Structure: README and docs. Existing workflow docs (CONTRIBUTING.md, ADRs). Code organization (src layout, test location conventions).
Installed Skills: Use ls or Glob to explicitly list directories inside .claude/skills/, .cursor/skills/, and .agents/ in the project root. For each directory found that is NOT a harness-* skill, read its SKILL.md frontmatter (name, description). Only report skills that have an actual SKILL.md file on disk. Do NOT use your loaded skill context or system-level skills — only what exists in the project's skill directories. If no non-harness skills are found, report "none found".
The agent should return a structured summary of everything found — including the list of installed skills with their names and descriptions — and what remains unclear.
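As a rough sketch of the installed-skills part of the scan, the listing could look like the following. It assumes frontmatter is a leading `---` block of simple `key: value` lines and parses it by hand rather than with a YAML library; the helper names `read_frontmatter` and `list_installed_skills` are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# Directories named in the scan instructions, relative to the project root.
SKILL_ROOTS = (".claude/skills", ".cursor/skills", ".agents")


def read_frontmatter(skill_md: Path) -> Dict[str, str]:
    """Parse simple `key: value` pairs from a leading `---` block (assumed format)."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip("\"'")
    return fields


def list_installed_skills(project_root: Path) -> List[Dict[str, str]]:
    """Report non-harness skills that have an actual SKILL.md file on disk."""
    found: List[Dict[str, str]] = []
    for root in SKILL_ROOTS:
        base = project_root / root
        if not base.is_dir():
            continue
        for entry in sorted(base.iterdir()):
            skill_md = entry / "SKILL.md"
            if not entry.is_dir() or entry.name.startswith("harness-"):
                continue
            if not skill_md.is_file():
                continue  # a directory without SKILL.md is not reported
            meta = read_frontmatter(skill_md)
            found.append({
                "name": meta.get("name", entry.name),
                "description": meta.get("description", ""),
            })
    return found
```

An empty return value corresponds to the "none found" report.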
This is the core of setup. Start the deep dive immediately — don't wait for the scan to complete. Use AskUserQuestion (or the platform's equivalent) for each question.
The six first principles serve as a listening framework — use them to parse the human's narrative and identify gaps. They are NOT a questionnaire. The human never hears "what's the trigger for this step?"
"Walk me through the last thing your team shipped — from the moment it started to the moment it was done. Include the boring parts."
If existing config describes conventions but not lifecycle:
"Your config describes [tech stack / conventions]. But I don't know your workflow yet. Walk me through the last thing your team shipped — from the moment it started to the moment it was done."
Listen for all six principles. Extract steps. Most principles (Trigger, Artifact, Verification, Destination, Ownership) surface from this single narrative answer.
Present the extracted lifecycle back to the human:
"Here's what I heard:
- [step] → [step] → [step] → ... → [step]
For each step, who should own it — you, the agent, or automated?"
This defines which steps become phase skills (agent), gates (human), or checklist items (automated).
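The ownership split can be illustrated with a small grouping function. This is a sketch under assumptions: the owner labels follow the `human | agent | automated` values used in the lifecycle document, while the result keys and the name `classify_steps` are hypothetical.

```python
from typing import Dict, List, Tuple

# Owner label -> what the step becomes in the generated harness.
OWNER_TO_KIND = {
    "agent": "phase_skills",
    "human": "gates",
    "automated": "checklist_items",
}


def classify_steps(steps: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (step name, owner) pairs by the artifact each owner maps to."""
    result: Dict[str, List[str]] = {kind: [] for kind in OWNER_TO_KIND.values()}
    for name, owner in steps:
        if owner not in OWNER_TO_KIND:
            raise ValueError(f"unknown owner {owner!r} for step {name!r}")
        result[OWNER_TO_KIND[owner]].append(name)
    return result
```

Order within each group follows the order in which the human described the steps, so the lifecycle sequence is preserved.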
"Where do things typically break or get stuck?"
Surfaces verification gaps, pain points, and friction that the concept library can address.
"Where do you need to approve before the agent continues?"
"For the steps the agent will own — what would a new engineer get wrong?"
"I found [tools/configs] in your repo. Here's how that maps to what you described:
- [step X]: I'll use `[exact command]` for this
- [step Y]: I found [tool] but you didn't mention it. Include it?
- [step Z]: You mentioned [thing] but I couldn't find it — where is it?"
Environment isolation probe: If the scan detected Docker Compose, a database, or a dev server, AND the user didn't already describe isolation, ask:
"I found [Docker Compose / PostgreSQL / dev server] in your repo. When you have multiple features in flight — do they share the same database and ports, or does each get its own? And how do you clean up when a feature is done?"
This surfaces whether to generate environment setup and cleanup phases with isolation (worktrees, per-feature DBs, dynamic ports) or keep it simple (shared environment). Don't assume isolation is needed — many solo developers share one database and that's fine.
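The condition for asking the isolation question is a simple conjunction, sketched below. The finding labels (`docker-compose`, `database`, `dev-server`) and the function name are hypothetical; a real scan summary may name these differently.

```python
from typing import Iterable

# Scan findings that make environment isolation worth asking about (labels are illustrative).
ISOLATION_SIGNALS = {"docker-compose", "database", "dev-server"}


def should_ask_isolation(scan_findings: Iterable[str], user_described_isolation: bool) -> bool:
    """Ask only when the scan found a relevant signal AND the user has not covered isolation."""
    has_signal = bool(ISOLATION_SIGNALS & set(scan_findings))
    return has_signal and not user_described_isolation
```

Note that a positive result only triggers the question. It does not decide that isolation is needed.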
Write to .harness/lifecycle.md for session resilience. If the session drops between synthesis and skill generation, the lifecycle document survives and can be read on resume. This file is overwritten on each /harness-setup run.
Produce from deep dive + scan:
Work unit: [description]
Typical size: [timeframe]
Lifecycle:
1. [Step name in their words]
Owner: human | agent | automated
Trigger: [what starts this step]
Artifact: [what it produces]
Verification: [how correctness is checked]
Destination: [where output goes]
Constraints: [rules]
Gate: [what must be true to proceed]
2. ...
Agent-owned phases (become phase skills):
- [list]
Human-owned gates (become status: "waiting"):
- [list]
Automated steps (become checklist items):
- [list]
Profiles:
[default]: all phases
[lighter]: skip [phases] — for [simpler work]
[lightest]: only [phases] — for [trivial changes]
Friction-to-concept matches:
[pain point] → [concept]
Show the lifecycle to the human for approval:
Your workflow:
| Phase  | Owner | Produces   | Verified by | Gate          |
| ------ | ----- | ---------- | ----------- | ------------- |
| [name] | Agent | [artifact] | [method]    | [condition]   |
| [name] | You   | [decision] | -           | Your approval |
Profiles:
[name]: [phases], for [when]
Concepts I'd recommend based on your pain points:
- [concept]: [one sentence]
Launcher name: What should I call it? (default: `/implement`)
Does this look right?
Human can adjust phases, ownership, names, concepts.
When the engine encounters a phase with status: "waiting", it announces what it's waiting for and stops. The human resolves the gate by re-invoking the launcher:
- `/<launcher>` (or resumes the session)
- `waiting` phase
- `done`, engine continues to next phase
- `blocked` with reason, engine stops
Synthesize findings into a `## Harness Context` section. Write to the AI config file:
- `CLAUDE.md`
- `.cursorrules`
- `GEMINI.md`
## Harness Context
### Repository
[from scan]
### Workflow Lifecycle
[lifecycle document in human-readable form]
### Verification
[per-phase verification strategies, enriched with exact commands from scan]
### Conventions
[constraints from deep dive, enriched with scan findings]
### Team & Process
[ownership map, gates, autonomy level]
### CI/CD
[from scan]
### Project Management
[from scan + deep dive if PM tool discussed]
Create the .harness/ directory in the project root:
.harness/
├── lifecycle.md # written in Phase 3, overwritten each setup run
├── conversations/ # per-implementation records (phase progress, decisions, evidence)
└── retros/ # past retro results
Conversations and retros should be committed to the repo — /harness-retro reads them to identify cross-round patterns and improve the workflow over time. Do NOT add .harness/ to .gitignore.
Note: lifecycle.md is already present from Phase 3. The .harness/ directory may have been created then — ensure conversations/ and retros/ subdirectories exist.
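A minimal sketch of the directory initialization, assuming the layout shown above. The function name `init_harness_dir` is hypothetical, and the approval prompt that the skill requires before each write is left out for brevity.

```python
from pathlib import Path


def init_harness_dir(project_root: Path) -> Path:
    """Ensure .harness/conversations/ and .harness/retros/ exist.

    Safe to call when .harness/ was already created in Phase 3:
    existing directories and lifecycle.md are left untouched.
    """
    harness = project_root / ".harness"
    for sub in ("conversations", "retros"):
        (harness / sub).mkdir(parents=True, exist_ok=True)
    return harness
```

Because `exist_ok=True` is used, re-running setup does not fail on an existing directory tree.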
Based on the lifecycle document, generate the skill architecture:
Named by the user (default: /implement). The launcher:
- `--profile` flag, default, or task labels
- `.harness/state.json` with the lifecycle array: phases and statuses based on profile. Skipped phases get `{ "status": "skipped", "reason": "profile:<name>" }`
- `/harness-engine` with the state file path
Named `harness-<phase-name>` (using the phase names from the deep dive, NOT a preset list). Each phase skill contains:
- `null` and must be set to `true`, `false`, or `"skipped"`
- `.harness/conversations/` at each phase transition
Profile names and shapes come from the deep dive (work unit variance), not from a fixed matrix. Here are examples to illustrate:
Web team:
- feature: pickup → understand → plan → execute → verify → ship
- bugfix: pickup → execute → verify → ship
- quick: execute → verify → ship
AI app team:
- full: design-prompt → build-eval → iterate → shadow → promote
- prompt-only: iterate → shadow → promote
- guardrail: add-rule → test-adversarial → deploy
Data team:
- experiment: hypothesis → explore → engineer → train → evaluate → deploy
- retrain: train → evaluate → deploy
- hotfix: fix-pipeline → verify → deploy
DevOps team:
- change: plan → apply-staging → verify → apply-prod → monitor
- hotfix: apply-staging → verify → apply-prod → monitor
If the deep dive reveals coordination needs (human described parallel work or multi-agent scenarios), generate:
- tickets block and coordination field
If no coordination is needed, skip entirely. Teams adopt coordination later via /harness-retro when friction signals appear.
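The launcher behaviour described earlier (pick a profile, then write a lifecycle array in which phases outside the profile are marked skipped) can be sketched as follows. Only the `skipped` status and the `profile:<name>` reason come from the text above. The `pending` status, the `name`, `profile`, and `lifecycle` keys, and the function name `build_state` are assumptions made for illustration, using the web team example.

```python
from typing import Dict, List

# Web team example from the profile list above.
LIFECYCLE = ["pickup", "understand", "plan", "execute", "verify", "ship"]
PROFILES: Dict[str, List[str]] = {
    "feature": ["pickup", "understand", "plan", "execute", "verify", "ship"],
    "bugfix": ["pickup", "execute", "verify", "ship"],
    "quick": ["execute", "verify", "ship"],
}


def build_state(lifecycle: List[str], profiles: Dict[str, List[str]], profile: str) -> dict:
    """Build the state dict for one run; phases outside the profile are marked skipped."""
    if profile not in profiles:
        raise KeyError(f"unknown profile: {profile}")
    active = set(profiles[profile])
    stray = active - set(lifecycle)
    if stray:
        raise ValueError(f"profile references phases not in lifecycle: {sorted(stray)}")
    phases = []
    for name in lifecycle:
        if name in active:
            # "pending" is an assumed initial status, not confirmed by the skill text.
            phases.append({"name": name, "status": "pending"})
        else:
            phases.append({"name": name, "status": "skipped", "reason": f"profile:{profile}"})
    return {"profile": profile, "lifecycle": phases}
```

Serializing the result with `json.dumps(build_state(LIFECYCLE, PROFILES, "bugfix"), indent=2)` would give one plausible shape for `.harness/state.json`, in which `understand` and `plan` carry the skipped status and reason.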
For each concept the user chose to adopt, integrate it into the appropriate phase skills. The specific phases affected depend on the discovered lifecycle. Accepted concepts are woven into the skills, not bolted on as separate steps.
- `<placeholder>` syntax in any generated skill
After generating skills but before writing HARNESS.md, cross-reference the generated phase skills against the scan findings to catch obvious gaps. This is a mechanical check, not a second brainstorm.
For each agent-owned phase skill, check: does this phase depend on something the scan detected (Docker, database, dev server, package install) that no earlier phase sets up?
Examples of what to catch:
- `docker compose` commands but no earlier phase starts Docker
Does each phase produce something that a later phase consumes? Is the handoff explicit in the state file outputs?
Examples:
Are there tools, scripts, or configs the scan found that no phase skill uses?
Examples:
- `cleanup` or `teardown` script but no phase cleans up
If gaps are found, present them concisely:
Pre-flight check — I found some gaps when cross-referencing your skills against the codebase:
- Your build phase runs `compose.sh run` but nothing starts Docker first. Add an environment setup step? (yes/no)
- Your test phase starts the server but doesn't run migrations. Add that to the startup? (yes/no)
- I found `compose.sh down` in your scripts but no phase cleans up. Add a cleanup step? (yes/no)
These would prevent hard failures on the first run.
For each "yes": update the affected phase skill (add the missing step) or generate a new phase skill if needed. Update the lifecycle document, launcher profile matrix, and state file template to match.
If no gaps found, skip this phase silently — don't announce "no gaps found."
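The mechanical nature of the pre-flight check can be illustrated with a small dependency walk. This is a sketch, not the skill's actual implementation: it assumes each phase can be summarized by hypothetical `requires` and `provides` sets, which the skill itself does not define.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Phase:
    name: str
    requires: Set[str] = field(default_factory=set)  # what the phase depends on
    provides: Set[str] = field(default_factory=set)  # what the phase sets up or produces


def find_missing_prerequisites(phases: List[Phase]) -> List[str]:
    """Walk phases in order; flag any requirement no earlier phase provides."""
    available: Set[str] = set()
    gaps: List[str] = []
    for phase in phases:
        for need in sorted(phase.requires - available):
            gaps.append(f"{phase.name} needs '{need}' but no earlier phase provides it")
        available |= phase.provides
    return gaps


def find_unused_findings(phases: List[Phase], scan_findings: Set[str]) -> List[str]:
    """List scan findings (tools, scripts, configs) that no phase references."""
    used: Set[str] = set().union(*(p.requires | p.provides for p in phases))
    return sorted(scan_findings - used)
```

An empty result from both functions corresponds to the silent skip. Any non-empty result would feed the yes/no prompts shown above.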
Generate a HARNESS.md at the project root. This is the human-readable front door to the harness — it explains what harness engineering is, how the workflow works for this project, and references everything in .harness/.
The content reflects the discovered lifecycle — not a preset template. Include:
- `/harness-setup` (shape) → real work → `/harness-retro` (reflect) → reshape
- `.harness/conversations/`, `.harness/retros/`)
- `/harness-retro` after work, re-run `/harness-setup` to rethink)
HARNESS.md is committed to the repo. It's documentation for the team.
After writing context, initializing .harness/, and generating skills, output:
You're set up. Here's what was created:
Architecture:
- `/<launcher-name>`: your launcher (the skill you invoke)
- N phase skills: focused work per phase ([list phase names from the discovered lifecycle])
- `harness-engine`: universal state machine that loops through phases (ships with Harnessable)
Data:
- `.harness/` directory for recording progress and state
- `HARNESS.md` documenting the workflow for your team
How to use it: Run `/<launcher-name> ISSUE-123` (or a plain text description). The launcher creates a state file, then the engine loops through your phase skills, each of which does focused work with clear exit gates. Progress is recorded automatically.
How it improves: After a few rounds of real work, run `/harness-retro`. It reads the recorded conversations, maps friction to specific phase skills, and suggests targeted improvements.
The harness loop: `/harness-setup` (shape) → real work → `/harness-retro` (reflect) → reshape
/harness-setup can be run again anytime:
- `/harness-retro` surfaces new concepts to explore
On re-runs, read HARNESS.md, .harness/lifecycle.md, and existing skills. Show what's changed since last exploration. Regenerate or update skills as needed.