skills/dev-loop/skills/setup-dev-loop/SKILL.md
Scaffold per-repo dev-loop config (PRD layer, knowledge layer, release config, vault path) and build the project glossary with grill-with-docs. Run once per repo before using dev-loop.
Scaffold the per-repo configuration that dev-loop consumes:
This is a prompt-driven skill, not a deterministic script. Explore, present findings, confirm with user, then write.
Look at the current repo to understand its starting state:
git remote -v and .git/config — is this a GitHub repo? Which one?
CLAUDE.md and AGENTS.md — does either exist?
CONTEXT.md and CONTEXT-MAP.md at repo root
docs/adr/ and any src/*/docs/adr/ directories
./.claude/dev-loop.config.md — does it already exist?
Installed skills — ls ~/.claude/skills/ for available PRD backends
Installed interview backends — check for grill-with-docs, grill-me under ~/.claude/skills/
skillwiki path — is a vault configured?
Dependency doctor — spawn dev-loop:doctor-worker (sonnet) to enumerate
missing optional plugins:
Agent(description: "Setup dep doctor", subagent_type: "dev-loop:doctor-worker", model: "sonnet", prompt: "Probe skills/dev-loop/dependencies.yaml. Report JSON.")
Use the missing_optional[] output to drive install hints in Sections D
(grill-with-docs), E (work-item interview backends), H (web search MCP for
fact-check), I (deep-research), and J (playwright-cli). Each install hint
pairs the missing ref with its documented fallback so users can decide:
install for richer behavior, or accept the fallback.
If status: broken (any required dep missing), abort setup with the
required install commands — dev-loop can't run without them.
Summarise what's present and what's missing. Walk through decisions one at a time:
Section A — PRD layer.
Explainer: The PRD layer is the skill suite that drives the brainstorm → spec → plan → execute → review pipeline. Pick the workflow that matches how you want to work.
Default posture: if superpowers skills are installed, propose superpowers. If only TDD skills are available, propose tdd. Otherwise manual.
Options:
Section B — Knowledge layer.
Explainer: The knowledge layer controls how dev-loop captures retros, distills patterns, and maintains project knowledge. With skillwiki, everything persists to a queryable vault. Without it, dev-loop uses local files.
Default posture: if skillwiki path succeeds, propose skillwiki. Otherwise none.
Options:
Section C — Release config.
Explainer: How are changes in this repo published and deployed? This controls the PUSH and DEPLOY steps.
Ask:
Section D — Domain glossary (delegate to grill-with-docs).
Explainer: A shared language document (CONTEXT.md) helps agents use precise terminology instead of 20 words where 1 will do. Invest 5 minutes now — it pays off every session.
If grill-with-docs is installed, tell the user: "I'll now invoke grill-with-docs to build the project glossary." Load it via Skill("grill-with-docs") and follow its interview process. When it finishes, resume here.
If grill-with-docs is NOT installed, tell the user: "For a richer glossary-building experience, install grill-with-docs: npx skills@latest add mattpocock/skills --skill grill-with-docs -a claude-code -g -y. For now, I'll capture key terms in CONTEXT.md manually." Then ask 2-3 domain questions and write a basic CONTEXT.md.
Section E — Interview config.
Explainer: Dev-loop can ask clarifying questions before writing a spec. Two capabilities: setup interview (the bootstrap flow you're in now — always available) and work-item interview (optional per-work-item grilling before the SPEC step). The work-item interview uses a backend — native (3 fixed questions, zero dependencies), grill-with-docs (adaptive + terminology + CONTEXT.md), or grill-me (adaptive, no persistent files). You can also control when it fires.
Present the available backends:
| Backend | Install | When to pick |
|---------|---------|--------------|
| native | None (always available) | Quick alignment, CI contexts, minimal interaction |
| grill-with-docs | npx skills@latest add mattpocock/skills --skill grill-with-docs -a claude-code -g -y | Codebases you'll revisit, building shared language |
| grill-me | npx skills@latest add mattpocock/skills --skill grill-me -a claude-code -g -y | Adaptive questioning without persistent docs |
Default posture:
- native: always works, no install required
- If grill-with-docs or grill-me is already installed, note it and offer as alternatives
Then ask about trigger mode:
Explainer: When does the interview fire? auto detects ambiguity (conflicting prior decisions, vague descriptions, zero prior art) and interviews only when needed. manual only interviews when a work item explicitly requests it. never disables interviews — the loop runs fully automated.
Options:
- manual: only work items with grill: true trigger an interview
Default posture: propose auto. Most projects never notice the interview is there, because ambiguity detection skips it for clear, well-scoped tasks.
Section F — CI Setup.
Explainer: Dev-loop can create PRs with auto-merge after each cycle, but auto-merge without CI checks means code can merge untested. Setting up a minimal CI workflow ensures every PR is validated before it lands on main.
Check for existing CI:
- Does .github/workflows/ci.yml already exist? If yes, skip to confirmation.
- Does .github/workflows/ exist with any other workflow? Note it for context.
If no CI workflow exists, detect the repo framework:
| Signal | Framework | CI steps |
|--------|-----------|----------|
| package.json with scripts.lint | Node.js | lint + type-check + test |
| package.json without scripts.lint | Node.js (minimal) | npm install + npm test |
| Makefile with check target | Make-based | make check |
| pyproject.toml | Python | ruff check + pytest |
| Cargo.toml | Rust | cargo clippy + cargo test |
| None of the above | Generic | echo "No CI steps detected — add manually" |
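The detection table above can be sketched as a small lookup. This is a minimal illustration, not the skill's actual logic: the function name `detect_ci_steps`, its parameters, and the rule that rows are checked in table order are all assumptions.

```python
from __future__ import annotations


def detect_ci_steps(
    root_files: set[str],
    npm_scripts: set[str] | None = None,
    make_targets: set[str] | None = None,
) -> tuple[str, list[str]]:
    """Map repo signals to (framework, CI steps), checking rows in table order.

    All three parameters are hypothetical inputs: the file names at the repo
    root, the script names from package.json, and the Makefile target names.
    """
    npm_scripts = npm_scripts or set()
    make_targets = make_targets or set()

    if "package.json" in root_files:
        if "lint" in npm_scripts:
            return "Node.js", ["lint", "type-check", "test"]
        return "Node.js (minimal)", ["npm install", "npm test"]
    if "Makefile" in root_files and "check" in make_targets:
        return "Make-based", ["make check"]
    if "pyproject.toml" in root_files:
        return "Python", ["ruff check", "pytest"]
    if "Cargo.toml" in root_files:
        return "Rust", ["cargo clippy", "cargo test"]
    # Generic row: the table uses a placeholder echo step here, so this
    # sketch returns no real steps.
    return "Generic", []
```

Because this is a prompt-driven skill, the result is only a proposal; the user still confirms the steps before anything is written.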
Present the detected framework and proposed CI steps. Ask:
"I'll generate .github/workflows/ci.yml with these steps. Should I also enable branch protection on main (require CI to pass before merge)?"
Options:
- CI + branch protection: generate the workflow and enable protection via gh api branch protection
- CI workflow only
- Skip: ci_configured: false
Default posture: propose "CI + branch protection" for new projects, "CI workflow only" for existing repos where branch protection might conflict with current workflows.
The generated workflow uses:
- concurrency: { group: ci-${{ github.ref }}, cancel-in-progress: true } to avoid duplicate runs
- Triggers on push to main and pull_request targeting main
After generating the workflow:
- Set ci_configured: true in the dev-loop.config.md output
- Set ci_workflow: .github/workflows/ci.yml in the config
Section G — Critical paths.
Explainer: Critical paths declare project hot-spots — code files, vault pages, and incident references that matter more than average files. The dev-loop engine biases research, query, and work-item priority toward these paths.
Ask:
"Which areas of the codebase are most critical — the ones where bugs hurt most or changes are most frequent? Name 1-3 critical paths."
For each path the user names, ask:
If the user has no critical paths, leave the section empty. The engine defaults to equal priority for all files.
Default posture: if the repo has a CLAUDE.md or CONTEXT.md with known hot-spots, pre-populate suggestions.
Config emitted:
critical_paths:
  <name>:
    code:
      - <glob-or-path>
    vault:
      - <concept-or-query-page-slug>
    history_pins:
      - <free-text incident reference>
Runtime behavior: Loaded at REFRESH into CRITICAL_PATHS. QUERY biases vault search toward *.vault slugs. WORK auto-escalates priority: high if changed files match *.code globs. Research agent ranks coverage gaps in *.code above other files. Schema reference: templates/project-config.md § Critical paths.
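The priority escalation described above could look roughly like the following. It is a sketch under assumptions: the helper names, the `"normal"` default priority, and the use of `fnmatch`-style matching for the code globs are all hypothetical and not confirmed by the engine.

```python
from fnmatch import fnmatchcase


def matched_critical_paths(critical_paths, changed_files):
    """Return the names of critical paths whose code globs match a changed file."""
    hits = []
    for name, spec in critical_paths.items():
        globs = spec.get("code", [])
        if any(fnmatchcase(path, glob) for path in changed_files for glob in globs):
            hits.append(name)
    return hits


def work_priority(critical_paths, changed_files, default="normal"):
    """Escalate to "high" when any changed file falls under a critical path."""
    return "high" if matched_critical_paths(critical_paths, changed_files) else default
```

With an empty `critical_paths` mapping nothing matches, which lines up with the equal-priority default described for users who declare no critical paths.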
Section H — Fact-check tier.
Explainer: Dev-loop agents can consult external knowledge sources (web search, library docs, vault queries) when writing specs and plans. Without fact-checking, agents rely on local context only — which can lead to version-sensitive errors or stale API assumptions.
Detect installed web MCP servers by checking available tool names:
- mcp__grok-search__*, mcp__brave-search__*, or built-in WebSearch/WebFetch
Ask:
"Should dev-loop agents be able to search the web for facts when writing specs and plans? I've detected: [list installed tools]."
Options:
If the user picks full fact-checking:
Default posture: if grok-search is installed, propose full fact-checking with grok-search as primary. Otherwise, propose local + vault only.
Config emitted:
fact_check:
  enabled: true
  source_order:
    - local_repo
    - context7
    - vault
    - web
  web_tools:
    primary: mcp__grok-search__web_search
  evidence_contract:
    require_sources_used_section: true
  triggers:
    - "version "
    - "deprecat"
    - "CVE-"
Runtime behavior: Loaded at REFRESH into FACT_CHECK_CAPS (source_order, web_available bool, evidence_contract). Passed to SPEC and PLAN — PRD skills consult sources in declared order for version/API/deprecation claims. Output specs include ## Sources Used if contract requires it. REVIEW gate flags missing section. Schema reference: templates/project-config.md § Fact-check tier.
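One plausible reading of the triggers list is a substring scan over the spec or plan text. The sketch below assumes case-insensitive substring matching, which the source does not state; `fired_triggers` is a hypothetical helper name.

```python
def fired_triggers(text, triggers):
    """Return the triggers that occur in text (assumed: case-insensitive substring match)."""
    lowered = text.lower()
    return [trigger for trigger in triggers if trigger.lower() in lowered]
```

A non-empty result would indicate that the claim deserves a lookup through the sources in the declared source_order.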
Section I — Idle deep-research.
Explainer: When dev-loop's IDLE cycle finds no claimable work, it normally exits after maintenance. Idle deep-research turns those dead cycles into a research backlog by invoking /deep-research on rotating topics, building up forward-looking ideas that compound over weeks.
Ask:
"Should idle dev-loop cycles run deep-research on rotating topics? This is useful for long-running cron loops that would otherwise exit idle."
Options:
- Enable with custom topics: config output is topic_seeds: [user-provided list], bias_toward: critical_paths (if Section G declared).
- Enable with critical-path bias: derive topic_seeds from critical paths declared in Section G. Config output: topic_seeds: [<auto-derived from critical_paths.*.code filenames and vault slugs>], bias_toward: critical_paths. Same as custom topics but pre-filled; the user can edit the derived list before writing.
- Skip for now: no idle_deep_research section in config.
If the user picks enable:
- Suggest topic seeds based on critical_paths.*.code and any detected pain points.
Default posture: if the project has critical_paths, propose "enable with critical-path bias" as the default. Otherwise, propose "skip for now"; idle deep-research is most valuable on projects with long-running cron loops.
Config emitted:
idle_deep_research:
  enabled: true
  topic_seeds:
    - <topic-1>
    - <topic-2>
  bias_toward: critical_paths
  cooldown_cycles: 3
  max_per_day: 4
  skip_if_recent_query_page_exists: 7
  budget:
    web_searches: 3
    deep_fetches: 3
    context7_calls: 3
Runtime behavior: Idle Discovery step 4.5 — fires only when research step 4 returns no P2+ findings and cooldown allows. Round-robins through topic_seeds, biased toward critical_paths.*.code matches when bias_toward is set. Honors budget.* caps per run. Output ideas route through the schema-compatible vault queue; use raw transcript captures when the active schema lacks a non-executing work-item status. Default score: p_score_default: P3. Schema reference: templates/project-config.md § Idle deep-research.
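The gating and rotation described above can be approximated as below. This is a sketch under assumptions: the function names and state arguments are hypothetical, "cooldown allows" is read as "at least cooldown_cycles cycles since the last run", and the critical-path bias is left out for brevity.

```python
def should_run_idle_research(cfg, p2_plus_findings, cycles_since_last_run, runs_today):
    """Decide whether an idle cycle may start a deep-research run."""
    if not cfg.get("enabled", False):
        return False
    if p2_plus_findings > 0:  # research step already produced P2+ findings
        return False
    if cycles_since_last_run < cfg.get("cooldown_cycles", 0):
        return False
    if runs_today >= cfg.get("max_per_day", float("inf")):
        return False
    return True


def next_topic(topic_seeds, run_index):
    """Plain round-robin over topic_seeds; returns None when the list is empty."""
    if not topic_seeds:
        return None
    return topic_seeds[run_index % len(topic_seeds)]
```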
Section J — Browser verification gate.
Explainer: Browser-facing changes can ship regressions (a11y violations, console errors, broken routes) if there's no automated verification gate. The browser verification step runs /playwright-cli between code review and merge to catch these before the PR is created.
Detect web frameworks:
- vite.config.*, next.config.*, package.json with React/Vue/Svelte deps
- playwright.config.*
- Whether the /playwright-cli skill is available
If no web framework detected, skip this section with: "No web framework detected — browser verification not applicable."
If web framework detected, ask:
"I detected [framework]. Should dev-loop verify browser changes before merge? This adds a /playwright-cli gate between code review and PR creation."
Options:
If enabled:
- Trigger globs (e.g. apps/**/*.tsx)
- Dev server command (from package.json scripts)
- Base URL (http://localhost:5173 for Vite, http://localhost:3000 for Next.js)
Default posture: if Vite/React is detected and playwright-cli is installed, propose "enable" with sensible defaults. Otherwise, propose "skip."
Config emitted:
browser_verification:
  enabled: true
  trigger:
    - "apps/**/*.tsx"
  prerequisites:
    - "curl -fsS http://localhost:5173 >/dev/null"
  base_url: http://localhost:5173
  smoke_routes:
    - /
    - /login
  reviser_workflow:
    - take_snapshot
    - list_console_messages
    - evaluate_script
  e2e_fallback: npm run test:e2e
Runtime behavior: Step 6a (between REVIEW and MERGE). Skipped unless changed files match trigger globs. Validates prerequisites (block on unhealthy), spawns playwright-cli:browser-worker (model: sonnet) to walk reviser_workflow on smoke_routes. Console errors fail the gate → return to EXECUTE. Schema reference: templates/project-config.md § Browser verification.
Section K — Reactive debugging budget.
Explainer: When EXECUTE fails, dev-loop invokes systematic-debugging and retries. Without a budget, a reproducible failure can burn the daily web budget and lock the loop into the same broken step. The reactive-debug budget caps retries, captures evidence, fact-checks external-lib errors, and escalates persistent failures.
Ask:
"Should reactive debugging have a retry budget and escalation policy? Without it, dev-loop retries indefinitely on the same error."
Options:
- Enable with budget: reactive_debugging: with auto_retry_attempts, evidence_capture, fact_check_tool, escalate_after, and escalation_action. Cap retries, capture evidence, fact-check external libs, escalate after N cycles.
- Retry cap and fact-check only: reactive_debugging: with auto_retry_attempts and fact_check_tool only. Skip evidence_capture and evidence_dir (not needed). Cap retries + fact-check external libs, no evidence capture.
- Legacy: no reactive_debugging: section in config. Unbounded retries, no budget, no escalation.
If not legacy:
Default posture: if fact-checking is enabled (Section H), propose "enable with budget" using the same fact_check_tool. Otherwise, propose "enable with budget" without fact-check.
Config emitted:
reactive_debugging:
  enabled: true
  auto_retry_attempts: 2
  evidence_dir: .claude/dev-loop-debug/
  evidence_capture:
    - "make check 2>&1 | tee {evidence_dir}/{cycle}-check.log"
    - "git diff > {evidence_dir}/{cycle}-diff.patch"
  fact_check_tool: mcp__grok-search__web_search
  escalate_after:
    consecutive_idle_cycles: 3
    same_error_signature: true
Runtime behavior: Sub-step of EXECUTE — fires only when a when: failure, mode: reactive discipline is matched. Captures evidence under evidence_dir (with {evidence_dir}/{cycle} interpolation), hashes error signature, fact-checks external libs via fact_check_tool, retries up to auto_retry_attempts. On exhaustion + escalate_after match, files a P1 finding to raw/transcripts/ keyed by hash (future cycles dedup). evidence_dir MUST be in .gitignore. Schema reference: templates/project-config.md § Reactive debugging.
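The retry-then-escalate flow above hinges on a stable error signature. The sketch below shows one way that could work; the normalization rule (collapsing digits), the 12-character truncated SHA-256, and the `"retry"` / `"escalate"` / `"stop"` labels are assumptions for illustration, not the engine's documented behavior.

```python
import hashlib
import re


def error_signature(message):
    """Hash an error message after collapsing digits, so that line numbers
    and durations do not produce a new signature each cycle."""
    normalized = re.sub(r"\d+", "N", message.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def next_action(attempts_so_far, auto_retry_attempts, same_signature_as_last, escalate_on_same_signature):
    """Retry while budget remains; once it is exhausted, escalate a repeating error."""
    if attempts_so_far < auto_retry_attempts:
        return "retry"
    if escalate_on_same_signature and same_signature_as_last:
        return "escalate"
    return "stop"
```

Keying the escalated finding by the signature is what would let later cycles recognize and deduplicate the same failure.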
Section L — Discipline path scoping.
Explainer: Disciplines like TDD can be scoped to specific files via include_paths/exclude_paths. Instead of TDD mandatory everywhere (high friction) or advisory everywhere (no gate), you can make TDD mandatory on critical paths and advisory on everything else.
Ask:
"Should any disciplines be scoped to specific files? For example, TDD mandatory on critical paths, advisory everywhere else."
Options:
- Scope via critical_paths: auto-fill include_paths from critical paths declared in Section G. TDD mandatory on those files, advisory on everything else.
If the user picks "scope via critical_paths":
- skill: superpowers:test-driven-development
  when: execute
  mode: mandatory
  include_paths: [<auto-filled from critical_paths.*.code>]
- skill: superpowers:test-driven-development
  when: execute
  mode: advisory
  # catch-all: no include_paths
If the user picks "scope manually":
Default posture: if critical_paths (Section G) were declared AND TDD is in prd_disciplines[], strongly recommend "scope via critical_paths." This is the primary use case for Section L — the trends worked example needs TDD mandatory only on critical_paths.*.code files.
Config emitted:
prd_disciplines:
  - skill: superpowers:test-driven-development
    when: execute
    mode: mandatory
    include_paths:
      - packages/convex/convex/aiScoring*.ts
    # exclude_paths optional: escape hatch from include_paths
  - skill: superpowers:test-driven-development
    when: execute
    mode: advisory
    # no include_paths → catch-all for everything else
Runtime behavior: Resolved at REFRESH per {skill, when} group. EXECUTE intersects changed-files-since-WORK with each entry's include_paths (omitted = catch-all), applies exclude_paths, picks first match per group. Different {skill, when} groups are independent — matching one does not suppress another. Backwards-compat: omitted include_paths keeps prior global-scope behavior. Warning emitted at REFRESH when mode: mandatory has no include_paths. Schema reference: templates/project-config.md § Cross-cutting disciplines.
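The first-match-per-group resolution above can be sketched as follows. This is a minimal illustration, assuming `fnmatch`-style globs and treating an entry without include_paths as an unconditional catch-all; `resolve_disciplines` is a hypothetical name.

```python
from fnmatch import fnmatchcase


def resolve_disciplines(entries, changed_files):
    """Return {(skill, when): mode}, taking the first matching entry per group."""
    resolved = {}
    for entry in entries:
        key = (entry["skill"], entry["when"])
        if key in resolved:
            continue  # an earlier entry in this group already matched
        include = entry.get("include_paths")
        exclude = entry.get("exclude_paths", [])
        if include is None:
            matched = True  # omitted include_paths acts as a catch-all
        else:
            candidates = [
                path for path in changed_files
                if not any(fnmatchcase(path, glob) for glob in exclude)
            ]
            matched = any(
                fnmatchcase(path, glob) for path in candidates for glob in include
            )
        if matched:
            resolved[key] = entry["mode"]
    return resolved
```

Entry order matters in this sketch: the scoped mandatory entry has to come before the catch-all advisory one, otherwise the catch-all would always win.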
Section M — Code review backends (since v1.15.0).
Explainer: dev-loop's REVIEW step always runs simplify-worker (sonnet) as the base code reviewer. Optionally, a second reviewer can run in parallel, codex:codex-rescue via the dev-loop:codex-review-worker wrapper, to provide an independent out-of-distribution second opinion. Two reviewers, two independent reads, no auto-reconciliation. Opt-in per intensity (normal / high) to avoid cost surprises.
Detect: probe whether the Codex runtime is usable via the companion's own self-check (not a file-existence guess).
1. Glob ~/.claude/plugins/cache/*/codex/*/scripts/codex-companion.mjs to locate the companion script. If zero matches, treat as not-installed and skip Section M with the install hint below.
2. Run node <companion-path> setup --json. Parse the JSON. If the command fails (non-zero exit, missing node, permissions error, etc.), treat as not-installed per step 5 below; do not crash the setup flow.
3. Treat Codex as installed and usable when ready === true AND codex.available === true.
4. If ready === false AND auth.loggedIn === false, surface the auth-specific hint instead of the generic install hint: "Codex installed but not authenticated — run codex login then re-run /setup-dev-loop."
Why runtime probe: filesystem checks for agents/codex-rescue.md give false negatives when the agent file isn't cached locally even though codex-cli is installed and authenticated. The companion's setup --json output is the authoritative signal: it is the same classification the Codex runtime uses when accepting code-review work.
The doctor-worker filesystem probe (Section 1 Explore / REFRESH step 7)
still drives DEP_DRIFT for generic dependency health. Section M uses
the runtime probe because it answers a stricter question ("is the
runtime usable?") than doctor-worker's generic question ("is the agent
file present?"). Do not reuse doctor-worker's filesystem result here.
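The classification of the companion's JSON output can be sketched as a pure function. The field names (ready, codex.available, auth.loggedIn) come from the steps above; the function name, the status labels, and the handling of unparseable output as not-installed are assumptions.

```python
import json


def classify_codex_probe(stdout, exit_code):
    """Classify the result of `node <companion-path> setup --json`.

    Returns one of the hypothetical labels "usable", "needs-login",
    or "not-installed".
    """
    if exit_code != 0:
        return "not-installed"  # command failure must not crash setup
    try:
        data = json.loads(stdout)
    except ValueError:
        return "not-installed"  # assumption: bad JSON is treated like a failure
    if not isinstance(data, dict):
        return "not-installed"

    codex = data.get("codex") if isinstance(data.get("codex"), dict) else {}
    auth = data.get("auth") if isinstance(data.get("auth"), dict) else {}

    if data.get("ready") is True and codex.get("available") is True:
        return "usable"
    if data.get("ready") is False and auth.get("loggedIn") is False:
        return "needs-login"
    return "not-installed"
```

Keeping the parsing separate from the subprocess call makes the three outcomes easy to check without Codex installed.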
If Codex is NOT installed (no companion script found) → skip this
section with install hint: "Install the Codex plugin to enable Codex
code review: /plugin add openai-codex (or check the marketplace for
the current install path). For now, dev-loop will run simplify-worker
only."
If Codex IS installed → present 2 toggles:
"Enable Codex code review for normal-intensity cycles? Each REVIEW step will spawn codex-review-worker in parallel with simplify-worker. Adds latency + Codex cost; useful for catching issues simplify-worker misses (logic errors, security, OOD code paths). Default: no."
"Enable Codex code review for high-intensity cycles (/dev-loop high)? High mode already raises aggressiveness; enabling here turns 'aggressive' into 'two independent reviewers per cycle.' Default: no."
Default posture: propose no/no even when Codex is installed — opt-in is the safe default; the user can toggle freely later by editing config. If the user accepts both, surface a one-line cost reminder.
Config emitted:
code_review:
  parallel: true
  codex:
    enabled_in_normal: false   # or true if user opted in
    enabled_in_high: false     # or true if user opted in
    agent: dev-loop:codex-review-worker
Runtime behavior: Loaded at REFRESH into CODE_REVIEW_BACKENDS session list. Always includes dev-loop:simplify-worker. Appends dev-loop:codex-review-worker when (a) current intensity's enabled_in_* flag is true AND (b) neither dev-loop:codex-review-worker nor codex:codex-rescue is in DEP_DRIFT. REVIEW step 6 spawns each backend in parallel with model: "sonnet"; findings are concatenated under per-backend section headers. No auto-reconciliation. Schema reference: templates/project-config.md § Code review.
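The backend selection rule above could be expressed like this. It is a sketch: the function name is hypothetical, and DEP_DRIFT is assumed to be a collection of dependency refs, which the source does not spell out.

```python
def code_review_backends(intensity, codex_cfg, dep_drift):
    """Build the reviewer list for one cycle; intensity is "normal" or "high"."""
    backends = ["dev-loop:simplify-worker"]  # always present
    enabled = codex_cfg.get("enabled_in_" + intensity, False)
    drifted = {"dev-loop:codex-review-worker", "codex:codex-rescue"} & set(dep_drift)
    if enabled and not drifted:
        backends.append(codex_cfg.get("agent", "dev-loop:codex-review-worker"))
    return backends
```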
Section N — Release policy (since v1.19.0). Controls whether step 10 PUSH auto-bumps version on shippable commits. Optional; omit the block to preserve pre-1.19.0 manual-bump behavior.
Present this section only if publish_via was set to a non-none
value earlier in setup. If publish_via: none, skip Section N (no
PUSH happens regardless).
Ask the user, in order:
"Do you want dev-loop to auto-bump version on shippable commits? (yes/no, default: no)"
If no → skip the rest of Section N; omit release_policy from the
generated config.
If yes:
"Which channel: beta (pre-release tags like v1.2.3-beta.4) or stable (v1.2.3)? Default: stable."
"Which file globs indicate a shippable commit? (comma-separated fnmatch patterns relative to repo root)" Suggest defaults derived from detected layout:
- If cli_src matches packages/*/src/..., suggest packages/cli/**.
- If skills_glob is set, suggest packages/skills/** (or the detected glob's parent directory).
- Always include .claude-plugin/marketplace.json and the bump_script path if set.
"Which file globs indicate a noise-only commit to skip? (comma-separated; default shown)" Default: raw/**, concepts/**, entities/**, queries/**, projects/**, _archive/**, *.md
The default skip list mirrors typical vault directories and standalone markdown commits.
"Enable verify_after_push (ls-remote + gh run watch after tag push)? (yes/no, default: yes)"
tag_format is fixed at v{version} (the canonical SemVer tag format).
If a project needs a different tag format, ask the user to set
release_policy.tag_format manually in the generated config — setup
does not expose this field interactively.
Config emitted:
release_policy:
  auto_bump: true
  channel: <beta|stable>
  trigger_globs:
    - "<pattern>"
    # ...
  skip_globs:
    - "<pattern>"
    # ...
  tag_format: "v{version}"
  verify_after_push: <true|false>
Runtime behavior: Loaded at REFRESH into RELEASE_POLICY session
variable. PUSH step (10) checks changed files since last tag against
trigger_globs/skip_globs and gates whether to invoke bump_script.
When the block is absent, RELEASE_POLICY = None and PUSH preserves
pre-1.19.0 behavior. Schema reference: templates/project-config.md
§ Release policy.
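The trigger/skip check at PUSH can be sketched with fnmatch patterns, which is the pattern style the setup prompt asks for. The precedence is an assumption: here a file matched by skip_globs never counts as shippable, even if it also matches a trigger glob. `is_shippable` is a hypothetical name.

```python
from fnmatch import fnmatchcase


def is_shippable(changed_files, trigger_globs, skip_globs):
    """True when at least one changed file matches a trigger glob
    and is not matched by any skip glob (assumed precedence)."""
    for path in changed_files:
        if any(fnmatchcase(path, glob) for glob in skip_globs):
            continue
        if any(fnmatchcase(path, glob) for glob in trigger_globs):
            return True
    return False
```

Note that fnmatch's `*` also matches `/`, so a skip pattern such as `*.md` covers markdown files in any directory in this sketch.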
Show the user a draft of ./.claude/dev-loop.config.md covering all fourteen sections (PRD, knowledge, release, glossary, interview, CI setup, critical paths, fact-check tier, idle deep-research, browser verification, reactive debugging, discipline path scoping, code review backends, release policy). Let them edit before writing.
Write ./.claude/dev-loop.config.md using the filled-in template from templates/project-config.md.
If vault is available, run skillwiki:proj-init with the project slug.
Tell the user:
- Config: ./.claude/dev-loop.config.md
- Docs: docs/agents/ (and CONTEXT.md if grill-with-docs ran)
- To change settings, edit ./.claude/dev-loop.config.md directly
- CI configured: .github/workflows/ci.yml, ci_configured: true in config
- CI skipped: ci_configured: false; dev-loop MERGE step will warn about missing CI
- Critical paths declared: critical_paths: block in config with 1-3 named hot-spots
- Critical paths skipped: critical_paths: {} (empty, engine uses equal priority)
- Fact-check enabled: fact_check: block with source order, web tools, and evidence contract
- Fact-check skipped: no fact_check section; agents use local context only
- Idle deep-research enabled: idle_deep_research: block with topic seeds, cooldown, budget
- Idle deep-research skipped: no idle_deep_research section; idle cycles exit after maintenance
- Browser verification enabled: browser_verification: block with trigger globs, dev server, smoke routes
- Browser verification skipped: no browser_verification section; no browser gate before merge
- Reactive debugging enabled: reactive_debugging: block with retry budget, evidence capture, escalation
- Reactive debugging skipped: no reactive_debugging section; legacy unbounded retry behavior
- Discipline path scoping: include_paths/exclude_paths on scoped disciplines, first-match-wins resolution