skills/ralph/SKILL.md
--- name: ralph description: Run an autonomous Ralph loop to implement tasks from a PRD in .ralph/ OR from GitHub issues labelled `ready-for-agent` (agent-native repo pattern). Each iteration picks the highest-priority unfinished story, implements it in a fresh isolated context, opens ONE PR per story (squash-merged with `Closes #N` for GH source), validates, and updates progress. Source defaults defer to /ro:repo-mode — `personal` repos read from GitHub issues; `work` repos read from local `.ra
Autonomous coding agent loop based on the Ralph Wiggum technique. Each iteration picks one task from a PRD file under .ralph/ OR from GitHub issues with a configured label (agent-native repo pattern; see --source github:<label>), implements it in a fresh isolated context by default, opens ONE PR per story, validates, commits, and updates progress.
Ralph is one part of the local factory — the family of skills that run autonomous coding loops on Ronan's own machine. Sibling skills in the same family: /ro:planner-worker (alias /ro:swarm), /ro:matt-pocock-coding-workflow, /ro:night-shift, /ro:day-shift. They share conventions: PR-per-story, squash-merge, Husky-pre-push as the trust gate, ready-for-agent labels as the work queue, the per-repo .claude/repo-mode for personal vs work, and the artefact shape described in "Run artefacts" below.
The companion is the remote factory — the separate Factory app (tracked elsewhere) that runs equivalent loops as a service. Where the local factory shells out from Ronan's terminal, the remote factory will run the same shape against a controlled cloud environment. The two are designed to share story formats and PR conventions so an issue queued for one is queued for the other.
The --source flag picks where stories come from:
- --source local: reads from .ralph/prd.json (sequential) or .ralph/issues/*.md (with --kanban). Legacy behaviour. Default for work-mode repos (keeps the flow invisible to the work GH/Jira/ADO project) and for repos with no gh remote.
- --source github:<label>: reads from open GitHub issues with the given label, e.g. --source github:ready-for-agent (or the project's synonym such as Sandcastle). Default for personal-mode repos with a gh remote.

Auto-detect order when --source is omitted:
Resolve repo mode via the 4-line snippet from /ro:repo-mode:
mode=""
[ -f .claude/repo-mode ] && mode="$(tr -d '[:space:]' < .claude/repo-mode)"
[ -z "$mode" ] && [ -f "$HOME/.claude/repo-mode" ] && mode="$(tr -d '[:space:]' < "$HOME/.claude/repo-mode")"
case "$mode" in personal|work) ;; *) mode="unset" ;; esac
Resolve repo state:
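# Discard the command's own stdout so only yes/no is captured.
has_gh="$(gh repo view --json url >/dev/null 2>&1 && echo yes || echo no)"
# Test ls directly; piping through head would mask its exit status.
has_local="$(ls .ralph/issues/*.md >/dev/null 2>&1 && echo yes || echo no)"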
Pick the default:
- mode == work → always --source local. Never query GH issues; the agent flow must stay invisible. If .ralph/ is empty, prompt the user to run /ro:write-a-prd first.
- mode == personal AND has_gh == yes AND .ralph/ empty → --source github:ready-for-agent.
- mode == personal AND .ralph/ populated → --source local to honour existing work in flight.
- mode == unset → run the first-run prompt from /ro:repo-mode § "First-run prompt", persist, then re-resolve.
- has_gh == no → force --source local regardless of mode.

User can always override with an explicit --source flag.
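The default-picking rules above can be sketched as one small function. This is a minimal sketch, not the skill's actual implementation: `pick_source` is a hypothetical name, its third argument stands in for ".ralph/ populated", and checking `has_gh` first is one reading of "regardless of mode".

```shell
# Hypothetical helper: maps (mode, has_gh, has_local) to a default --source value.
# Prints "prompt" when the mode is unset and the first-run prompt should run.
pick_source() {
  _mode="$1"; _has_gh="$2"; _has_local="$3"

  # Assumption: "no gh remote" wins over every mode, including unset.
  if [ "$_has_gh" = "no" ]; then echo "local"; return 0; fi

  case "$_mode" in
    work)
      echo "local" ;;                      # never query GH issues in work mode
    personal)
      if [ "$_has_local" = "yes" ]; then
        echo "local"                       # honour existing work in flight
      else
        echo "github:ready-for-agent"
      fi ;;
    *)
      echo "prompt" ;;                     # unset: ask, persist, then re-resolve
  esac
}

pick_source personal yes no
```

An explicit `--source` flag would bypass this function entirely.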
When --source github:<label> is active, each iteration:
1. Queries the pickup set:
   gh issue list --label kind:slice --label ready-for-agent --state open --json number,title,body,labels
   (Canonical query per ~/Dev/ronan-skills/canon/labels.md: kind:slice + ready-for-agent is the pickup set. Legacy synonyms like Sandcastle are still supported via the --source github:<label> override.)
2. Excludes prd:draft issues. See "Filter / scope: prd:draft is NEVER picked up" below; this is a HARD GUARD.
3. Keeps slices only: kind:slice is now load-bearing for this. Skip anything with kind:prd (parents are tracking issues).
4. Respects Blocked by (parse the ## Blocked by section, treat closed referenced issues as satisfied).
5. Checks that the slice carries ### Close-the-loop tests OR ### Close-the-loop verification matrix. Behaviour per .ronan-skills.json swarm.missing_test_acs (refuse default; inject faster but riskier). Mirrors /ro:planner-worker US-2a; see [[close-the-loop-tests-acs]] and [[close-the-loop-verification-matrix]] for the AC shapes. Ralph is serial so there's no auto-split: a slice that fails after retries is marked needs-human and the next iteration picks up the queue. kind:research issues bypass this gate and route to the research-worker flow (deep research → cited doc in docs/research/ → wiki ingest → bidirectional links → docs PR, no tests). See [[canon:research-tasks]]. The pickup query in step 1 should also accept kind:research when the queue holds research work.
6. Flips ready-for-agent → in-progress on the chosen issue. The two labels are mutually exclusive per canon; do this as one gh issue edit <issue-number> --add-label in-progress --remove-label ready-for-agent.
7. Creates the branch with gh issue develop <issue-number> --name <slug> --checkout. This produces the issue→branch dev-link so the PR's Closes #N is automatic.
8. Implements and validates the story (same as local mode).
9. Opens the PR with Closes #<slice-number> in the body; GitHub auto-closes the slice on merge.
10. On a failed attempt that will be retried: in-progress → ready-for-agent (back into the queue).
11. Once retries are exhausted: in-progress → needs-human and post a structured comment explaining the human action needed.
12. Records each failure in .swarm/failures.jsonl (same format as /ro:planner-worker US-7) so /ro:night-shift-retro can fold the retry count and failure mode into the retro failures[] block.

Ralph's serial nature means there's no auto-split; the retro tracks retries-before-success as a SYSTEM signal across runs.

Closes #N in the PR body is the load-bearing convention: without it, slices don't auto-close and the queue silently grows stale.
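The Blocked by check described above needs the referenced issue numbers pulled out of the issue body. A minimal sketch follows, assuming the section lists blockers as `#N` references under a `## Blocked by` heading; `blocked_by_refs` is a hypothetical helper name, and the exact body layout is an assumption.

```shell
# Hypothetical helper: reads an issue body on stdin and prints one issue
# number per line for every #N found under the "## Blocked by" heading.
# The section ends at the next "## " heading.
blocked_by_refs() {
  awk '
    /^## Blocked by/ { in_section = 1; next }
    /^## /           { in_section = 0 }
    in_section       { print }
  ' | grep -oE '#[0-9]+' | tr -d '#'
}

printf '%s\n' \
  '## What to build' 'See #3 for context' \
  '## Blocked by' '- #12' '- #15' \
  '## Notes' 'Unrelated #99' | blocked_by_refs
```

Each printed number could then be checked with `gh issue view <N> --json state --jq .state`; a blocker counts as satisfied when that prints `CLOSED`.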
prd:draft is NEVER picked up

prd:draft issues are NEVER picked up by this skill. They represent ideas captured in the agent-native repo's "inbox", not ready work. Drafts have freeform bodies, NOT Pocock's 7-section template, and have not been grilled. Picking one up would mean implementing against an unfinished spec.
To promote a draft into ready work, the user runs /grill on the issue. The grill-with-docs flow rewrites the body into the Pocock 7-section template, then the user swaps the label from prd:draft to the repo's gate label (ready-for-agent by default, or the project synonym like Sandcastle / swarm configured in docs/agents/triage-labels.md).
When querying GitHub for backlog issues, ALWAYS exclude prd:draft. gh label semantics are tricky here: --label <gate> matches issues that have the gate label, but an issue can have BOTH prd:draft AND the gate label (e.g. if someone mis-labelled). Defence in depth:
- Use --label <gate> to scope the initial query.
- Client-side, drop any issue whose labels[].name contains prd:draft.
- Optionally use gh issue list --label <gate> --search "-label:prd:draft" ... to push the exclusion server-side.

Refer the user to /ro:list-draft-prds if they want to see what's sitting in the drafts inbox. Tip: if you're not sure what's queued vs what's still an idea, run /ro:list-draft-prds first to see the inbox before kicking off Ralph.
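The client-side half of this guard can be sketched as a pure label check. `is_pickable` is a hypothetical helper, and the comma-separated label format is an assumption chosen so the check needs no JSON tooling.

```shell
# Hypothetical helper: returns 0 when a comma-separated label list is eligible
# for pickup. $1 = labels, $2 = gate label (defaults to ready-for-agent).
# One way to produce such lines (assumed format: "<number> <labels>"):
#   gh issue list --label "$gate" --state open --json number,labels \
#     --jq '.[] | "\(.number) \([.labels[].name] | join(","))"'
is_pickable() {
  _labels=",$1,"
  _gate="${2:-ready-for-agent}"
  case "$_labels" in *,prd:draft,*) return 1 ;; esac   # hard guard: drafts never run
  case "$_labels" in *,kind:prd,*)  return 1 ;; esac   # parents are tracking issues
  case "$_labels" in *,"$_gate",*)  return 0 ;; esac
  return 1
}

# A mis-labelled issue carrying both labels is still skipped.
is_pickable "prd:draft,ready-for-agent" || echo "skip: draft"
```

Wrapping the list in commas keeps `prd:draft` from matching a longer label that merely contains it.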
/ralph # One story, fresh context, then stop (default mode=single)
/ralph --mode fresh # Loop indefinitely; each story spawns a NEW subagent (recommended for autonomous overnight runs)
/ralph --mode fresh --max-iterations 5 # Cap at 5 fresh-context iterations
/ralph --mode batched # ALL stories in ONE continuous context (faster, riskier; opt-in only)
/ralph --prd phase-2-onboarding-2026-05-06 # Use .ralph/phase-2-onboarding-2026-05-06.json (recommended naming for multi-phase projects)
/ralph --plan-only # Show what would be done next (no changes)
The --mode flag is the most important decision. Pick the one that matches your risk tolerance:
| Mode | One iteration = | Use when | Risk |
|---|---|---|---|
| single (default) | One story, one fresh context, stop | Interactive dev; you want to review each story before kicking off the next | Low; you're in the loop |
| fresh | One story, one fresh subagent (Agent tool with a clean context). Loops until all stories pass or --max-iterations hit | Overnight autonomous runs, long-running PRD execution | Medium; subagent budgets are bounded but each story starts clean |
| batched | Many stories, ONE continuous context | You explicitly want speed over isolation, AND the stories share so much context that re-loading per story would be wasteful | Higher; context drift compounds across stories. Easy to silently relax PR-per-story. Use ONLY with explicit user permission. |
| unbounded | Same as fresh but no --max-iterations cap | "Run all night, ship as much as you can" | Same as fresh plus the wall-clock-budget concern |
Critical rule: batched mode requires explicit user opt-in via the --mode batched flag. It must NEVER be the silent default. The night-shift run on 2026-05-06 silently became batched-mode because the agent decided one-context-many-stories was easier; that's exactly what this rule prevents.
By default, Ralph reads a single .ralph/prd.json and walks its stories in sequence. With --kanban, Ralph reads one markdown file per story from .ralph/issues/ (produced by /slice-into-issues) and picks the highest-priority unblocked issue each iteration based on blocked-by frontmatter.
Use Kanban mode when:
- The stories came out of /slice-into-issues (or /matt-pocock-coding-workflow).

In Kanban mode each iteration:

- Reads .ralph/issues/*.md, filters to status: ready with no unmet blocked-by.
- Sets status: done on success; updates any issues whose blocked-by references this one.

Background → llm-wiki-ai-research:phase-n-ralph-loop.
The reviewer gate adds Matt Pocock's implementer/reviewer split to every iteration.
- Default: --reviewer opus when running against a real backlog (--source github:<label> OR populated .ralph/issues/). Confirmed 2026-05-14: the user's Max 20x plan has Opus headroom for review, and the Factory-style flows assume Pocock split is on.
- Off for --plan-only, --mode single (one-off story exploration), or when the user explicitly passes --reviewer none.
- Override with --reviewer sonnet, --reviewer haiku, etc. when you specifically want a cheaper or cross-provider reviewer.

How the split works:

- ~/.claude/skills/coding-principles/ is read on demand only.
- merge: PR can be squash-merged.
- request-changes: implementer gets the reviewer's specific notes and tries again (single retry per iteration).
- reject: issue goes back to status: ready with the rejection note on the issue file.

Background → llm-wiki-ai-research:push-vs-pull-coding-standards.
Do NOT use --reviewer together with --mode batched; the implementer/reviewer split assumes fresh context per story.
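A pre-flight check for this rule might look like the sketch below. `check_flags` is a hypothetical function; the accepted mode names come from the mode table, and the exit codes are an assumption.

```shell
# Hypothetical pre-flight: rejects flag combinations the skill forbids.
# $1 = mode (defaults to single), $2 = reviewer (defaults to none).
check_flags() {
  _mode="${1:-single}"
  _reviewer="${2:-none}"

  case "$_mode" in
    single|fresh|batched|unbounded) ;;
    *) echo "unknown mode: $_mode" >&2; return 2 ;;
  esac

  # The implementer/reviewer split assumes a fresh context per story.
  if [ "$_mode" = "batched" ] && [ "$_reviewer" != "none" ]; then
    echo "--reviewer cannot be combined with --mode batched" >&2
    return 1
  fi
  return 0
}

check_flags fresh opus && echo "flags ok"
```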
When --mode fresh is selected, each iteration MUST:
- Read .ralph/prd.json to find the next unfinished story
- Read .ralph/patterns.md if it exists (carries cross-iteration learnings)
- Spawn a subagent via the Agent tool with subagent_type: general-purpose (or a more specialised type if appropriate)
- Hand the subagent the story, .ralph/patterns.md, and an explicit instruction to record started ISO 8601 timestamp before any work + record finished ISO 8601 timestamp before returning + embed both into the progress.txt entry (see "Progress Report Format" below for the exact shape)
- On success, set passes: true, append to progress.txt (with timestamps)
- Repeat (until all stories pass or --max-iterations)

This gives true context isolation: a mistake in story N can't compound into story N+1's implementation.
Every story MUST be a single PR. No batching. The acceptance bar:
- Branch: ralph/us-NNN-<slug> (one US-NNN per branch)
- PR title: <emoji> <type>(US-NNN): <title> with EXACTLY ONE US-NNN id
- Squash-merged (so the commit on main reads <emoji> <type>(US-NNN): <title> (#<pr>))

If the implementing agent finds itself wanting to bundle multiple stories into one PR, it must STOP and surface the situation back to the user via AskUserQuestion:
"Story US-NNN naturally bundles with US-(NNN+1) because [reason].
Bundle into one PR (and split the story IDs in the title), or
keep as two PRs?"
The default answer is "keep as two PRs". The agent NEVER bundles silently.
To verify post-merge that the rule held: gh pr list --state merged --search "(US-" --json title should show exactly one US-NNN per row. The /lint --artifacts skill (when run on a Ralph-produced repo) flags PRs that bundle.
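The one-id-per-title check can be sketched as a small counter. `count_story_ids` is a hypothetical helper, and it assumes story ids are `US-` followed by digits.

```shell
# Hypothetical helper: prints how many distinct US-NNN ids a PR title contains.
count_story_ids() {
  printf '%s\n' "$1" | grep -oE 'US-[0-9]+' | sort -u | wc -l | tr -d ' '
}

# Flags any title that carries more or fewer than one story id.
check_title() {
  if [ "$(count_story_ids "$1")" -ne 1 ]; then
    echo "bundled or unlabelled PR: $1"
  fi
}

check_title "feat(US-012): add sign-out button"
check_title "feat(US-012, US-013): sign-out and profile page"
```

Titles could be fed in from `gh pr list --state merged --search "(US-" --json title --jq '.[].title'`, one per line.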
The local factory leans on the Husky pre-push hook as the trust gate (it runs the repo's full local CI: build, typecheck, lint, tests). Some checks only exist server-side (integration suites needing live infra, bundle gates), so GitHub CI is normally the second gate before merge. But the shared provider can be unavailable for reasons unrelated to the code:
log not found log.

When the provider is blocked, do NOT stall the queue. Fall back to:

- Local gate: a green Husky pre-push (never --no-verify) is the merge signal. Manually re-check the things that only run server-side (integration tests, bundle size) if the change touches them.
- Admin merge (gh pr merge <n> --squash --admin). Only for infra failures, never to skip a real red test.

Per-repo specifics (the deploy command, the migration step, the account creds) live in that repo's CLAUDE.md. Example: factory documents the full runbook under "When GitHub CI is blocked (fallback runbook)". This is an escape hatch around a broken provider, not a license to skip CI: the local gauntlet must be green.
The --prd <name> flag selects which PRD file to work from:
- --prd <name> → reads .ralph/<name>.json and writes progress to .ralph/<name>.progress.txt
- no flag → reads .ralph/prd.json and writes progress to .ralph/progress.txt (legacy / single-PRD projects)

When generating a new PRD for a phased project, use the form:
phase-<N>-<slug>-<YYYY-MM-DD>
so the file is .ralph/phase-2-onboarding-2026-05-06.json and the progress file is .ralph/phase-2-onboarding-2026-05-06.progress.txt. Sortable by phase, dated for traceability, slugged for readability. Maintain .ralph/index.md (one row per PRD: file, status, started-at, finished-at, PR count, summary). Append to the index on every new PRD; update the row on completion.
The --prd flag accepts the bare name with or without the .json extension; the skill resolves to .ralph/<name>.json and .ralph/<name>.progress.txt either way.
For one-off / unphased projects (the legacy default), prd.json + progress.txt still work — don't force the phase convention on a small repo that doesn't need it. Apply the phase form when there's a real Phase 1 / Phase 2 / Phase 3 arc.
This lets one repo drive multiple concurrent phases/initiatives without progress-file collisions. Each named PRD is independent: its own story list, its own branchName, its own progress log.
Naming convention: use kebab-case slugs tied to the phase or initiative — phase-2a, phase-2b-presentation, auth-migration, docs-refresh. The slug must match [a-z0-9-]+.
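The name-resolution rules above (optional .json extension, slug restricted to [a-z0-9-]+, legacy default) can be sketched as follows. `resolve_prd` is a hypothetical helper, and printing both paths on one line is an arbitrary choice for the sketch.

```shell
# Hypothetical helper: resolves a --prd value to "<prd-file> <progress-file>".
# With no argument it falls back to the legacy prd.json / progress.txt pair.
resolve_prd() {
  _name="${1:-prd}"
  _name="${_name%.json}"                 # accept the name with or without .json

  case "$_name" in
    ''|*[!a-z0-9-]*)                     # slug must match [a-z0-9-]+
      echo "invalid PRD name: $_name" >&2
      return 1 ;;
  esac

  if [ "$_name" = "prd" ]; then
    echo ".ralph/prd.json .ralph/progress.txt"
  else
    echo ".ralph/$_name.json .ralph/$_name.progress.txt"
  fi
}

resolve_prd phase-2-onboarding-2026-05-06.json
```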
Shared Codebase Patterns: .ralph/patterns.md is the single durable knowledge surface — read at start of every iteration, harvested into at loop close.
Five classes of file, each with its own lifecycle:
| File | Lifecycle | Git status | Who writes |
|---|---|---|---|
| .ralph/patterns.md | Durable, cross-run | committed | Orchestrator harvests at loop close. Subagents read at iteration start. |
| .ralph/<phase>.session.md | Per-phase rolling aggregate | committed | Orchestrator at session start (heading) and loop close (finish line + aggregates). |
| .ralph/sessions/<session-id>.md | Per-session detail log, write-once | committed | Orchestrator at loop close after harvest. One file per session, per-story sections, post-harvest. |
| .ralph/sessions/<session-id>/<worker-id>.md | Per-worker live scratch | gitignored | Each worker writes its own file. Survives locally for crash inspection; never committed. |
| .ralph/<phase>.json (PRD) | Per-phase spec | committed | Authored manually or by /ro:write-a-prd once. |
Why the split:
- Even after learnings are promoted to patterns.md at session close, there's still a per-iteration story worth keeping: timestamps, PR/SHA, the bits that weren't reusable enough to promote. sessions/<id>.md captures that as one committed file per session, written ONCE by the orchestrator: no append-conflicts (only one writer), no parallel-worker collision (it's written after the workers are done).
- Run cat .ralph/sessions/2026-05-19T01-48-54Z.md on any machine after git pull and you read what happened that night. The full audit lives in git + GH issue threads; this is the indexed quick-read.

Naming: the session id is the orchestrator's start ISO slugged (2026-05-19T01-48-54Z). The detail log lives AT .ralph/sessions/<id>.md; the worker scratch directory is .ralph/sessions/<id>/. Same id, different file shape; gitignore distinguishes via trailing slash.
Each worker (one per story, in --mode fresh) writes to .ralph/sessions/<session-id>/<worker-id>.md. The session id is the orchestrator's start timestamp slugged (e.g. 2026-05-19T01-48-54Z). The worker id is the story id (us-128). Contents are short:
# Worker scratch — US-128
started: 2026-05-19T01:49:30+02:00
finished: 2026-05-19T02:01:52+02:00
duration: 12m22s
pr: #138 (squash-merged 55e10e0)
## Learnings (for patterns.md harvest)
- <one or two lines a future story would want to know>
- <a gotcha that should not happen again>
## Files changed (informational, already in git)
- src/lib/server/spaced-rep.ts
- src/lib/server/__tests__/spaced-rep-cap.integration.test.ts
The "what was implemented" prose lives in the PR body — don't duplicate it here.
When the loop exits (cleanly, max-iterations, or hard error), the orchestrator MUST:
- Read every .ralph/sessions/<session-id>/*.md file written during the session.
- Promote reusable learnings into .ralph/patterns.md under an existing or new section heading. Skip per-story / PRD-specific gotchas; those are kept, but in the session detail log, not patterns.
- Write the per-session detail log to .ralph/sessions/<session-id>.md. One section per story, structure:
# Session <id> detail — <phase>
<session-level metadata: mode, reviewer, total stories, total duration>
## US-NNN: <Story Title>
- started: <ISO> / finished: <ISO> / duration: <Nm Ns>
- PR: #<n> (squash-merged <sha>) → Closes #<issue>
- What shipped: <one paragraph from the worker scratch + PR body>
- Promoted to patterns.md: <yes/no — and which section if yes>
- Local learnings (not promoted): <story-specific gotchas worth keeping for one cat-and-read but not worth carrying forward>
---
- Finalise .ralph/<phase>.session.md (see "Session timing" below): that's the rolling per-phase summary; the detail log above is the per-session deep-dive.
- Commit patterns.md + sessions/<id>.md + <phase>.session.md together as a single 🧹 chore(ralph): session <id> artefacts for <phase> commit. Fresh chore/ralph-session-<id> branch is fine, or directly on main if the run was purely on main.

Every subagent that ships a slice MUST post a structured comment on the GH issue BEFORE merging. The Closes #N line in the PR body auto-closes the issue on merge; the comment makes the closed issue self-explanatory without having to crack open the PR.
Shape (post via gh issue comment <N> --body "..." immediately after CI greens, before the merge call):
## Shipped
<one paragraph: what landed, in PR #<n>>
## Surprises encountered
<bullets: anything the original story didn't anticipate. Spec drift, missing dependencies, content gaps, schema discoveries. Empty section is fine.>
## Patterns promoted to .ralph/patterns.md
<bullets, with the section headings where each landed. Empty if nothing reusable.>
## Local learnings (not promoted, kept in sessions/<id>.md only)
<bullets: story-specific gotchas. Empty section is fine.>
## Follow-ups
<bullets: any new `ready-for-agent` issues opened mid-implementation, linked by number. "None" if none.>
The orchestrator prompt template MUST embed this in every implementer-subagent dispatch, immediately after the watchdog-discipline clause and before the squash-merge step.
The new default is fixed:
# Ralph / local-factory artefacts.
# Per-worker scratch directories under .ralph/sessions/<id>/ are ignored;
# the per-session detail log .ralph/sessions/<id>.md (sibling file) stays tracked.
.ralph/sessions/*/
.ralph/*.progress.txt
.ralph/.gitignore-policy
The trailing slash on .ralph/sessions/*/ is load-bearing: */ matches only subdirectories, so .ralph/sessions/<id>/<worker>.md is ignored while .ralph/sessions/<id>.md stays committed.
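The trailing-slash behaviour can be checked in a throwaway repository with `git check-ignore`, which exits 0 for an ignored path and 1 otherwise. A minimal sketch, assuming git is installed; the session id and worker id are the example values used elsewhere in this document, and the temporary directory is left behind for inspection.

```shell
# Build a scratch repo containing only the three-line ignore block.
demo="$(mktemp -d)"
cd "$demo" || exit 1
git init -q .
printf '%s\n' '.ralph/sessions/*/' '.ralph/*.progress.txt' '.ralph/.gitignore-policy' > .gitignore

# One worker scratch file (inside the session directory) and one detail log (sibling file).
mkdir -p .ralph/sessions/2026-05-19T01-48-54Z
: > .ralph/sessions/2026-05-19T01-48-54Z/us-128.md
: > .ralph/sessions/2026-05-19T01-48-54Z.md

report() {
  if git check-ignore -q "$1"; then echo "ignored:     $1"; else echo "not ignored: $1"; fi
}

report .ralph/sessions/2026-05-19T01-48-54Z/us-128.md
report .ralph/sessions/2026-05-19T01-48-54Z.md
```

The worker scratch path should come back as ignored and the sibling detail log as not ignored.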
patterns.md, <phase>.session.md, sessions/<id>.md, and <phase>.json (PRDs) stay tracked. The first-run policy prompt is retired — the shape is now standard across the local factory.
For legacy projects that already have a committed .ralph/progress.txt: leave it in git history for the audit trail, gitignore the path going forward, and write a follow-up 🧹 chore(ralph): retire legacy progress.txt, harvest learnings into patterns.md commit that deletes the file from main and copies any surviving learnings into patterns.md.
1. Ensure .gitignore carries the local-factory artefact rules (see "Gitignore policy" above); add the three-line block if it's missing
2. Append a ## Session N heading to .ralph/<name>.session.md with timestamp + flags. Compute the session id (start ISO slugged) and create .ralph/sessions/<session-id>/ for worker scratch files.
3. Read the PRD named by the --prd flag (or default .ralph/prd.json) and find the highest priority story where passes: false
4. Read .ralph/patterns.md (the durable knowledge surface); every iteration reads it
5. In --mode fresh: spawn a fresh subagent for this story. In --mode batched: continue in current context. In --mode single: do this story then stop.
6. Commit as <emoji> <type>(US-NNN): <Story Title>. Weekday timestamps must fall outside 08:30–18:00 (CLAUDE.md rule); pass GIT_AUTHOR_DATE + GIT_COMMITTER_DATE to git commit if running inside that window.
7. Run gh issue edit <NUM> --remove-label in-progress so the closed issue doesn't keep a stale lifecycle label (the Closes #N autoclose strips open-state, not labels); then auto-merge via squash
8. Write the worker scratch file to .ralph/sessions/<session-id>/<worker-id>.md
9. Update the PRD story with status: "passed", passes: true, and notes including the squash SHA
10. At loop close the orchestrator harvests worker scratch into patterns.md, writes the per-session detail log to .ralph/sessions/<session-id>.md, finalises .ralph/<name>.session.md with finish-line + aggregates, commits the three files as one chore(ralph): session <id> artefacts for <phase> commit

In --mode fresh and --mode single, work on ONE story per iteration. After completing it:
- single: stop. Next /ralph invocation picks up the next story.
- fresh: spawn a NEW subagent for the next story (until --max-iterations or all stories pass).

In --mode batched, multiple stories share one context. This is the explicit opt-in for situations where the user accepts the risk in exchange for speed.
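The weekday 08:30–18:00 commit-timestamp rule mentioned in the workflow can be tested before committing. This is a sketch under stated assumptions: `in_work_window` is a hypothetical helper, both window edges are treated as inside, and which replacement timestamp to use is left to the repo's CLAUDE.md.

```shell
# Hypothetical helper: returns 0 when a local time falls inside the weekday
# 08:30-18:00 window. $1 = day of week 1..7 (Monday=1, as `date +%u` prints),
# $2 = "HH:MM".
in_work_window() {
  _dow="$1"
  _hh="${2%%:*}"
  _mm="${2##*:}"
  [ "$_dow" -le 5 ] || return 1                       # weekends are never in the window
  _minutes=$(( ${_hh#0} * 60 + ${_mm#0} ))            # strip a leading zero to avoid octal parsing
  [ "$_minutes" -ge 510 ] && [ "$_minutes" -le 1080 ] # 08:30 = 510, 18:00 = 1080
}

if in_work_window "$(date +%u)" "$(date +%H:%M)"; then
  echo "inside window: set GIT_AUTHOR_DATE and GIT_COMMITTER_DATE before committing"
  # e.g. GIT_AUTHOR_DATE="$stamp" GIT_COMMITTER_DATE="$stamp" git commit ...
  # where $stamp is a timestamp outside the window (chosen per CLAUDE.md).
else
  echo "outside window: commit normally"
fi
```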
ANY /ro:ralph run against a real backlog ends with TWO tail calls, in this order:
1. /ro:completion-report --prd <name> --no-open: writes a browsable HTML report to <repo>/.completion-reports/<ts>-<prd>.html with per-PR cards, file diffs, per-file rollback commands, and a risk panel. Capture the absolute path it prints.
2. /ro:pushover --url file://<path-from-step-1>: sends the done / paused / blocked / crashed ping with the report path as a deep link, so tapping the phone notification opens the report.

If step 1 reports an empty range ("no commits in range, nothing to report"), skip the --url flag on step 2 but STILL send the ping with the state message.
Skip BOTH tail calls ONLY when:
- --plan-only (nothing actually ran)
- --mode single and it's the only story (one-shot exploration, not a real run)
- The user passed --no-ping

For the recipes: report path printing in ~/Dev/ronan-skills/skills/completion-report/SKILL.md; Pushover firing in ~/Dev/ronan-skills/skills/pushover/SKILL.md. Message shape: state + one concrete metric + what Ronan needs to do next. Example: "ralph done — 14/14 stories merged, 0 deferred, ready for visual review" with the report URL attached.
Each worker writes one scratch file at .ralph/sessions/<session-id>/<worker-id>.md. Files are gitignored — they die at session close after the orchestrator's harvest step (see "Orchestrator's harvest step" above). The shape:
# Worker scratch — US-128
started: 2026-05-11T23:35:12+02:00
finished: 2026-05-11T23:44:41+02:00
duration: 9m29s
pr: #61 (squash-merged 8240af6)
## Learnings (for patterns.md harvest)
- <one or two lines a future story would want to know>
- <a gotcha that should not happen again>
## Files changed (informational)
- path/one.ts
- path/two.tsx
Both started and finished are real wall-clock ISO 8601 timestamps with offset, NOT the backdated git commit dates. The subagent records started first thing after reading its prompt (capture via date -u +%Y-%m-%dT%H:%M:%S%z or the local-tz equivalent), and finished right before it returns its one-line summary. duration is computed in the implementer subagent and embedded.
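Recording the three fields might look like the sketch below. `fmt_duration` is a hypothetical helper; the hours branch is an assumption, since the examples in this document only show the minutes-and-seconds form. Note that plain `%z` prints the offset as `+0200` rather than `+02:00`.

```shell
# Hypothetical helper: formats a second count in the "12m22s" style.
fmt_duration() {
  _secs="$1"
  if [ "$_secs" -ge 3600 ]; then
    # Assumed shape for runs over an hour.
    echo "$(( _secs / 3600 ))h$(( _secs % 3600 / 60 ))m$(( _secs % 60 ))s"
  else
    echo "$(( _secs / 60 ))m$(( _secs % 60 ))s"
  fi
}

# Capture wall-clock start before any work (local time zone variant).
started_epoch="$(date +%s)"
started="$(date +%Y-%m-%dT%H:%M:%S%z)"

# ... story work happens here ...

# Capture finish immediately before returning the one-line summary.
finished_epoch="$(date +%s)"
finished="$(date +%Y-%m-%dT%H:%M:%S%z)"
duration="$(fmt_duration $(( finished_epoch - started_epoch )))"

printf 'started: %s\nfinished: %s\nduration: %s\n' "$started" "$finished" "$duration"
```

Keeping the epoch values alongside the formatted strings avoids having to parse ISO timestamps back into seconds.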
The "What was implemented" prose lives in the PR body — don't duplicate it here. The Learnings section is the only part the orchestrator's harvest step cares about; everything else exists for the operator reading the scratch directory locally.
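Pulling just the Learnings bullets out of scratch files in the shape above could be done with awk. A minimal sketch; `extract_learnings` is a hypothetical helper, and it assumes each learning is a single `- ` bullet under a heading that starts with `## Learnings`.

```shell
# Hypothetical helper: prints the "- " bullets found under each file's
# "## Learnings" heading, stopping at the next "## " heading.
extract_learnings() {
  awk '
    FNR == 1        { keep = 0 }          # reset at the start of every file
    /^## Learnings/ { keep = 1; next }
    /^## /          { keep = 0 }
    keep && /^- /   { print }
  ' "$@"
}

# Demo against a throwaway scratch file.
scratch="$(mktemp)"
printf '%s\n' \
  '# Worker scratch: US-128' \
  'pr: #61 (squash-merged 8240af6)' \
  '## Learnings (for patterns.md harvest)' \
  '- crypto.subtle has no MD5 on Workers; use node:crypto' \
  '## Files changed (informational)' \
  '- path/one.ts' > "$scratch"

extract_learnings "$scratch"
```

At harvest time the argument would be the session's scratch files, e.g. `.ralph/sessions/<session-id>/*.md`; deciding which bullets are reusable enough to promote remains a judgment call for the orchestrator.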
.ralph/<name>.session.md is a per-PRD session log that tracks loop-level start/finish across all iterations. Format:
# Ralph session log — <phase name>
## Session 1 — 2026-05-11T23:33:00+02:00 to 2026-05-12T03:42:18+02:00 (4h 09m 18s)
- mode: fresh
- reviewer: opus
- max-iterations: 12
- stories attempted: 7
- stories passed: 7
- stories deferred / blocked / failed: 0
- total subagent wall-clock: 3h 31m 04s
- orchestrator overhead (between iterations): 38m 14s
- notes: <one or two lines if relevant>
## Session 2 — ...
When the loop starts, append a new ## Session N heading with the start timestamp + flags. When the loop exits (cleanly OR via error OR via max-iterations), edit the session entry to add the finish timestamp + duration + per-story aggregates. If the orchestrator session dies mid-loop (token limit, user disconnect), leave the start timestamp in place; the next /ralph invocation appends a new session heading and the prior one is recognisable as "started but never recorded a finish".
Compute the per-story aggregates by summing each worker's duration field across the session's scratch directory; the orchestrator does that during the harvest step at loop-exit time.
Total wall-clock and orchestrator-overhead can both be derived: total = finished - started from the session line, subagent-sum from the per-worker durations, orchestrator-overhead = total minus subagent-sum.
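Worked through with the Session 1 figures from the example log above (total 4h 09m 18s, subagent wall-clock 3h 31m 04s), the derivation looks like this. `to_secs` is a hypothetical helper for the sketch.

```shell
# Hypothetical helper: hours, minutes, seconds -> seconds.
# Pass plain integers (9, not 09) so the shell does not parse them as octal.
to_secs() {
  echo $(( $1 * 3600 + $2 * 60 + $3 ))
}

total="$(to_secs 4 9 18)"        # finished - started from the session line
subagents="$(to_secs 3 31 4)"    # sum of per-worker durations
overhead=$(( total - subagents ))

echo "orchestrator overhead: $(( overhead / 60 ))m $(( overhead % 60 ))s"
# prints "orchestrator overhead: 38m 14s", matching the example log
```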
The session log is committed (it's small, write-once, and conflict-free because only the orchestrator writes it).
.ralph/patterns.md is the single document where carried-over knowledge lives. Read at start of every iteration; written into only at session close by the orchestrator's harvest step.
What goes in:
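- "Cloudflare Workers crypto.subtle doesn't support MD5; use node:crypto createHash via nodejs_compat."
- "Drill components carry a stable data-testid for Playwright. Pattern is <drill-type>-<role>."

What stays out:

- Per-story / PRD-specific gotchas; those go in the session detail log, not patterns.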
Shape:
## Codebase Patterns
- Skills use SKILL.md format with YAML frontmatter
- Vault CLAUDE.md files are thin config, not logic
- Use wikilinks [[page]] syntax for cross-references
- Cloudflare Workers `crypto.subtle` does NOT support MD5; use `node:crypto.createHash('md5')` via nodejs_compat
- Drill components carry stable `data-testid` hooks; e2e specs select on those, not class names
If .ralph/patterns.md does not exist, create it on first use (an empty ## Codebase Patterns section is fine). Legacy projects with a ## Codebase Patterns section at the top of .ralph/progress.txt should migrate that section into patterns.md and gitignore the progress file (see "Gitignore policy" above).
After completing a story, check if ALL stories have passes: true.
| Mode | When to stop |
|---|---|
| single | After 1 story (always) |
| fresh | When all stories pass OR --max-iterations hit OR a story fails after 3 retries |
| batched | When all stories pass OR a hard error |
| unbounded | When all stories pass OR a hard error (no iteration cap) |
Report "All tasks complete!" and stop on success. On hard error, report the failed story + reason and exit; the user can resume with another /ralph invocation.
This section captures patterns observed in production Ralph runs that future invocations should respect. Add to it after each meaningful run.
Run shape: --mode batched (silently, due to skill ambiguity at the time). 17 stories shipped across 6 PRs in 65 minutes via one continuous context.
What went well:
What went badly:
- The CI workflow had continue-on-error: true on the deploy step. Green-CI signal was misleading.

What we changed in the skill (this version):

- --mode flag with explicit semantics; batched requires opt-in; fresh is the recommended autonomous mode
- Lessons learned section (this one) so future runs inherit the wisdom
- Run /spec-to-repo with its full pre-flight first, before /ralph, to catch revoked tokens etc. before the autonomous loop starts

After the Phase 1 build, the user surfaced six gaps that should have been stories from day one but were one-liner mentions in ADRs the agent didn't translate to user stories:

- Populating .dev.vars from secrets so the e2e dev-server boots

Each became a follow-up PR. Root cause: the spec mentioned them in ADRs / DoD bullets, but didn't emit them as US-* stories, so Ralph never had a story-shaped target to build against.
Mitigation upstream: /generate-spec and /write-a-prd now ship a Web-app baseline checklist (sign-in/sign-out UI, lazy auth-mirror, API discoverability, API client collection, integration test layer, CI env injection, per-story deploy verification). Any web-app spec missing one of these gets called out before story generation. See [[ideal-tech-setup#Greenfield Spec Baseline (must-have stories)]] for the canonical list.
What Ralph should do now: before iterating any spec, scan US-* titles for the baseline checklist. If any are missing in a web-app spec, stop and ask the user via AskUserQuestion whether to add them or mark "N/A — <reason>" before starting.
Phase 2 (11 stories) ran in --mode fresh but the Agent/Task tool was NOT available inside the spawned Ralph subagent's context, so the per-story-fresh-subagent pattern collapsed into single-agent execution. Result: 6 of 11 stories shipped (the smaller ones), 3 deferred (UI-heavy: onboarding checklist Home, OAuth device flow, Connections polish), 2 blocked-on-human.
What we changed in the skill (this version):
- Check that the Agent/Task tool is available before starting --mode fresh. If unavailable, downgrade automatically to --mode batched AND surface a warning, OR refuse to start if the user explicitly asked for fresh and the tool is missing. Don't silently single-agent through 11 stories.
- status field: PRD JSON now uses status: "passed" | "deferred" | "blocked-on-human" | "blocked-on-code" | "not-started" instead of bare passes: bool. passes:bool stays as a derived view (passed === true) for back-compat. Free-text notes is for the why, not for distinguishing the kind of incomplete.
- When a Drizzle migration is removed, edit drizzle/meta/_journal.json to drop the entry too. Otherwise the next db:generate numbers from the stale max idx.
- Run pnpm format:write once and commit any unrelated touch-ups as a separate 🔧 chore: commit. Otherwise per-story PRs drag in unrelated whitespace via lint-staged.
- See [[ideal-tech-setup]] § Greenfield Spec Baseline for the canonical pattern.

These mitigations apply going forward; Phase 3 PRD must reflect them.
Phase 4 (7 stories) + Phase 5 (7 stories) shipped over one continuous night-shift orchestration. --mode fresh --reviewer opus worked as designed: 14 fresh subagents (one per story), each spawning a SECOND fresh Opus subagent for review. Zero rejections, zero deferrals. Per-story subagent wall-clock ranged 8m–34m (median ~12m).
What went wrong:
- Only duration_ms came back via task notifications, and git commit dates were backdated to honour the weekday-outside-work-hours rule, so the user could not audit when each story actually ran. The progress.txt entries only had a date prefix (e.g. ## 2026-05-11 - US-407: ...), not a real wall-clock start/finish.
- The orchestrator wrote .ralph/night-shift-state.md and ran git add + commit in the shared working tree, which was at that moment checked out to the subagent's ralph/us-NNN-... branch. The commit landed on the subagent's branch and traveled with its PR. Annoying, not broken.

What we changed in the skill (this version):
- Every progress entry now carries started, finished, duration. Subagent records both via date -u +%Y-%m-%dT%H:%M:%S%z (or local-tz equivalent) at the very start of work and immediately before returning. The orchestrator does NOT compute or merge times; the subagent owns both fields. See "Progress Report Format" for the exact shape.
- .ralph/<name>.session.md: a separate per-PRD log that captures loop-level start + finish + flags + per-story aggregates. Survives orchestrator session death (start timestamp recorded eagerly; finish timestamp added at loop-exit, OR recognisably absent if the orchestrator died mid-loop).
- While a subagent is on its ralph/us-NNN-<slug> branch in the shared repo, the orchestrator's only safe operations are: read PRD JSON files, read progress.txt files, spawn the next subagent. Any git operation the orchestrator wants (e.g. committing a session-state file to main) must be queued for after the loop completes, OR done by a dedicated subagent.

Run shape: project scaffolded from PRD with ~35 Phase 1 stories. First batch (7 drill stories) shipped cleanly. Subsequent 5 batch-agent attempts ALL stalled with no merges. After much diagnosis, switched to orchestrator-spawns-fresh-implementer-per-story (the documented --mode fresh pattern) and shipped.
What went wrong:
Background-agent stream watchdog (600s). The Claude Code background-agent harness kills any agent that goes 600 seconds without printing to stdout. Agent thinking time counts as silence. Batch agents trying to do 7-stories-in-one-context spent >10 min "understanding" the codebase / planning the next story between tool calls and got killed. The harness watchdog is not user-disable-able; the only mitigation is keeping the agent's stream chatty.
Earlier batch agents falsely claimed Agent tool was unavailable. When asked to spawn fresh subagents per story, the previous-night agent reported "no Agent/Task sub-agent spawning available in this harness" and pivoted to serial in-context implementation, which then hit the watchdog. The Agent tool IS available to background-spawned agents; it just needed an explicit instruction to use it. (This contradicts the 2026-05-06 Dataforce afternoon finding — that was a different harness configuration.)
PR-process discipline drift. Under time pressure the orchestrator (and I, the calling agent) started direct-pushing to main with git push origin main to bypass the PR + CI flow. By the time we caught it, 5 direct-to-main commits had landed and the audit trail was muddled. Worse, the CI workflow only triggered on push to main — not on pull_request — so PRs had no checks to gate merges with.
Cloudflare Workers env-at-init-time trap. clerkMiddleware() and similar middleware called at module init time on CF Workers cannot read secrets via process.env (always empty on workers) or via import.meta.env.* for non-VITE_* keys. They CAN read public vars from wrangler.jsonc. Forgetting this and registering middleware that reads CLERK_PUBLISHABLE_KEY at module init crashed cold-start with 500 on every route, including /. Hours lost.
Route handlers throwing instead of redirecting. Several scaffolded auth-gated routes threw new Error("Not signed in") when auth().isAuthenticated === false. TanStack Start surfaced this as a 500, not a 302 to /sign-in. Story implementers must use throw redirect({ to: '/sign-in' }) from @tanstack/react-router for auth gates; never throw new Error().
What we changed in the skill (this version):
Watchdog discipline section (mandatory for every subagent prompt): the subagent MUST echo a heartbeat, echo "[$(date +%H:%M:%S)] <what we're doing>", BEFORE every Bash call that might take >30s AND immediately after returning from each long step. Multiple echoes are free; agent-thinking-time-between-tool-calls is the killer.
Orchestrator-spawns-fresh-implementer pattern, explicit: orchestrator does NOT touch code. Its only loop is: (1) read PRD, (2) pick next unfinished story, (3) Agent tool call with a self-contained implementer prompt (background=false, blocking), (4) wait for one-line return, (5) log to progress.txt, (6) next story. Implementer is short-lived (~10-20 min) and well under the watchdog. Orchestrator stays cheap in context because its work-per-iteration is just dispatch.
PR-only flow is a HARD GUARD. No git push origin main. No --admin flag on gh pr merge. CI must run on pull_request: events (not just push: branches: [main]) — verify the project's workflow has both triggers BEFORE starting Ralph; if it doesn't, the first story must add a pull_request trigger to the workflow. CI passing is required before squash-merge.
First-iteration pre-flight: before any story, scan the project's .github/workflows/*.yml for on: pull_request:. If missing, story 0 is "Add pull_request trigger to CI workflow". Without this, Ralph cannot enforce CI-gated merges and the whole flow degrades to "merge and pray".
CF Workers + Clerk gotcha: when an upstream story requires Clerk middleware on CF Workers, add CLERK_PUBLISHABLE_KEY AND VITE_CLERK_PUBLISHABLE_KEY to wrangler.jsonc vars (publishable keys ship to browsers, so they are safe to commit). Push CLERK_SECRET_KEY as a secret. The same pattern works in @tanstack/react-start and Next.js on CF Workers. Reference implementation: ~/Dev/ai-projects/dataforce/src/start.ts.
Auth gate pattern: server functions that gate on auth must throw redirect({ to: '/sign-in' }) (TanStack Router import), never throw new Error(). Add this as a Codebase Pattern automatically detectable by scanning src/lib/server/*.ts for throw new Error("Not signed in") and similar.
"Ship simpler with TODO" rule: when an implementer hits an API it doesn't understand after 5 min of investigation, ship the simpler working version with // TODO(refinement): <thing> and move on. The watchdog will kill an agent that spends 15 min reading SDK docs silently.
Failure pattern recognition: if 2 consecutive subagent dispatches stall at the SAME story, escalate — either the story spec is bad or there's an environment issue. Don't retry a third time blindly. Mark the story BLOCKED with the failure reason and continue.
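The watchdog heartbeat and the first-iteration pre-flight described above can be sketched as small shell helpers. This is a minimal sketch under assumptions, not the skill's actual implementation: the function names heartbeat, run_step, and has_pr_trigger are hypothetical, and the grep is a coarse heuristic (a commented-out pull_request line would also match).

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not part of the skill.

# Print a timestamped line so the agent's stream never stays silent for long.
heartbeat() {
  printf '[%s] %s\n' "$(date +%H:%M:%S)" "$*"
}

# Wrap a potentially slow command with a heartbeat before and after it,
# preserving the command's exit status.
run_step() {
  local label=$1
  shift
  heartbeat "start: $label"
  "$@"
  local rc=$?
  heartbeat "done: $label (exit $rc)"
  return "$rc"
}

# Succeed when any workflow file mentions the pull_request trigger.
# -w stops pull_request_target from matching; -s hides unmatched-glob errors.
has_pr_trigger() {
  local dir=${1:-.github/workflows}
  grep -qsw 'pull_request' "$dir"/*.yml "$dir"/*.yaml
}
```

A pre-flight could then read `has_pr_trigger || echo "story 0: add pull_request trigger to CI workflow"`, and a long install could run as `run_step "pnpm install" pnpm install --frozen-lockfile`.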
Run shape: 6 PRs queued for the merge orchestrator (/tmp/merge-stack-v4.sh); first PR merged at 22:48. The remaining 5 sat for an hour because every E2E job died in 3-10s with the annotation:
The job was not started because recent account payments have failed or your spending limit needs to be increased.
The Quality job (smaller, fits the free tier) ran fine. E2E (longer) needed paid minutes that were unavailable. The orchestrator polled green CI forever and never advanced.
What went wrong:
What we changed in the skill (this version):
When gh pr checks <pr> shows a FAILURE check AND the gh run view <run-id> annotation contains "recent account payments have failed" OR "spending limit", the orchestrator stops polling that PR and switches to the local-CI fallback path:
- PushNotification once per orchestrator run, naming the billing URL + the org.
- Run the full gate locally: pnpm install --frozen-lockfile && pnpm db:migrate:local && pnpm run quality && pnpm test:e2e. If green, gh pr merge <pr> --squash --delete-branch --admin with a comment quoting the billing block.
- The --admin flag is the deliberate exception to the "PR-only flow is a HARD GUARD" rule above. It is permitted ONLY when (a) the local quality+E2E run passed on the branch tip, (b) the PR comment explaining the billing fallback was posted, and (c) the operator was notified.
- /tmp/local-drain.sh as a canonical pattern; reference implementation in this session.
- The pre-push hook runs pnpm run quality already. The billing-fallback ALSO runs pnpm test:e2e (which pre-push does not) because E2E is the part GH Actions usually owns. Without this, the local-CI fallback would have a coverage gap vs the real CI.

This is the only sanctioned way to skip a GH Actions CI gate. Any other "CI is being slow" excuse must wait it out; the only signal that justifies --admin is the literal "recent account payments have failed" annotation.
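The detection rule above reduces to a text match on the run's annotation. A minimal sketch, assuming the annotation text is available on stdin (for example piped from gh run view <run-id>, as the rule describes); the function names is_billing_block and next_action are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not part of the skill.

# Read annotation text on stdin; succeed only on the billing-block wording
# quoted in the rule above.
is_billing_block() {
  grep -Eq 'recent account payments have failed|spending limit'
}

# Map the annotation text to the orchestrator's next move for that PR.
next_action() {
  if is_billing_block; then
    echo "local-ci-fallback"
  else
    echo "keep-waiting"
  fi
}
```

Any annotation that does not match leaves the PR on the normal wait path, which keeps the --admin exception narrow.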
Two follow-up lessons from running the billing-fallback path repeatedly:
Pre-existing main-side failures must NOT block unrelated PRs. When the local E2E run fails on tests that ALSO fail on the current main (verified via git checkout main && pnpm test:e2e <spec> or by inspecting the test diff), file the failure as a separate issue and proceed with the merge. The local-CI gate's job is to catch new regressions, not to stop unrelated PRs while a pre-existing bug is unresolved. The PR comment quoting the billing block should also enumerate the pre-existing failures and link the tracking issue.
Use test.fixme() to clear the queue when a deterministic regression lands on main. When a regression lands and starts gating every subsequent PR's local-CI run, the immediate ship the orchestrator does is a one-line test.fixme(...) PR with a comment linking the tracking issue, so the queue can drain. The real fix follows as a separate PR. Marking with fixme (not skip) keeps the test visible in suite output as a known-broken case rather than silently absent.
gh pr merge --squash --admin occasionally fails with invalid character '{' after object key:value pair (gh 2.83.x + certain plugin combos). Fallback: gh api -X PUT repos/<org>/<repo>/pulls/<pr>/merge -f merge_method=squash performs the same merge via the raw REST endpoint and avoids whatever JSON-parse step is choking inside gh pr merge. Keep both forms in the local-drain script's merge step so an upstream gh-cli glitch doesn't stall the queue.
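Keeping both merge forms in one step might look like the sketch below. It assumes gh is installed and authenticated; the function name squash_merge and its argument order are hypothetical, and the real /tmp/local-drain.sh may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper; the name and argument order are illustrative.

# squash_merge OWNER/REPO PR_NUMBER
# Try the porcelain command first, then fall back to the raw REST endpoint.
squash_merge() {
  local repo=$1 pr=$2
  if gh pr merge "$pr" --repo "$repo" --squash --admin; then
    return 0
  fi
  echo "gh pr merge failed; retrying via the REST merge endpoint" >&2
  gh api -X PUT "repos/$repo/pulls/$pr/merge" -f merge_method=squash
}
```

The function returns the REST call's exit status when the fallback runs, so the caller can still stop the queue if both forms fail.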
The planner-worker (/ro:swarm) ran 4 waves against RonanCodes/lekkertaal, 18 stories shipped. Four lessons from that run apply equally to Ralph's GitHub-source iteration when multiple stories close in the same session:
Workers must poll CI in foreground bash, NOT via an in-context monitor. One worker exited with the line "I'll wait for the monitor events to come through" — a hallucinated tool flow that doesn't deliver events back into the worker context. The PR sat open with conflicts, no CI run, until the planner intervened. Fix: poll gh api repos/<owner>/<repo>/commits/<sha>/check-runs in a bash loop with sleep 30 between attempts and a hard 15-minute cap. If checks stay pending past the cap, STOP and report; do not retry blindly.
Verify CI fired within 60s of git push. Observed: a fresh branch push that did NOT trigger GitHub Actions. Force-pushing the same branch after a no-op rebase did trigger it. After every push, run gh api .../actions/runs?head_sha=<sha> once and check the count is > 0. If 0, nudge with git commit --amend --no-edit && git push -f. Root cause not isolated (possibly Actions transient or branch-protection quirk); the nudge reliably fixes it.
Rebase onto main when it drifts; do NOT merge main into the branch. Wave 1's prompt-caching PR (#66) rebased cleanly onto wave 1's telemetry PR (#65) that had landed mid-flight, and ended up composing with it (cache token counts flowed into the new telemetry sink). Better outcome than landing in isolation. For Ralph: when push rejects because main moved, git fetch origin && git rebase origin/main && git push --force-with-lease (without a bare -f, which would override the lease check). Never git merge main.
Trust local pre-push CI; skip waiting for GitHub Actions. When the repo has a .husky/pre-push hook running the same gauntlet as GH CI (e.g. pnpm test && pnpm build), waiting for GH CI to re-run it costs 1-2 minutes per PR for zero added safety. New default: after git push succeeds (the hook validates), gh api -X PUT .../pulls/<N>/merge -f merge_method=squash immediately. GH CI still runs on the merged commit on main; if a fluke bug lands, post-merge CI flags it and the orchestrator can revert. Frequency of this in the lekkertaal 18-PR run: zero. Setup-time prompt asks the user once: "skip GH CI wait and merge immediately after a clean push? [Y/n]" — persisted in .ralph/config.json as trust-local-ci: true|false. Branch-protection rules with required-status-checks auto-override to false.
These lessons are now baked into the Pocock pattern page: skill-lab:agent-native-repo-pocock and the planner-worker SKILL: see /ro:planner-worker § "Lessons from live runs".
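The foreground polling loop from the first lesson can be sketched as follows. This is a minimal sketch under assumptions: poll_until and checks_green are hypothetical names, gh must be installed and authenticated, and "green" is taken to mean every check run completed with success, skipped, or neutral. A failed check is not short-circuited here; it simply runs into the cap and is reported.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not part of the skill.

# poll_until INTERVAL_S CAP_S CMD [ARGS...]
# Re-run CMD until it succeeds; give up once CAP_S seconds have been waited.
poll_until() {
  local interval=$1 cap=$2
  shift 2
  local waited=0
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$waited" -ge "$cap" ]; then
      return 1
    fi
    # Heartbeat doubles as watchdog protection while we wait.
    echo "[$(date +%H:%M:%S)] still pending after ${waited}s"
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# checks_green OWNER/REPO SHA
# True when at least one check run exists and all of them finished cleanly.
checks_green() {
  local repo=$1 sha=$2 verdict
  verdict=$(gh api "repos/$repo/commits/$sha/check-runs" --jq \
    '.total_count > 0 and all(.check_runs[]; .status == "completed" and (.conclusion == "success" or .conclusion == "skipped" or .conclusion == "neutral"))') || return 1
  [ "$verdict" = "true" ]
}
```

With the numbers from the lesson, a worker would call `poll_until 30 900 checks_green "$repo" "$sha"` and, on a non-zero return, stop and report rather than retry.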
Run shape: PRD #127 (drill catalogue parity + spaced-rep loop closure), 10 stories, --mode fresh --reviewer none. All 10 PRs merged in ~2h. Follow-up e2e sweep added 8 specs across 3 batch PRs.
What worked:
What went wrong:
- .ralph/<phase>.progress.txt was ~80% denormalised git history. PR number, SHA, files changed: all already in git log + gh pr diff. The valuable bits (timestamps, learnings) were buried inside per-story entries where nobody read them after merge. And in a parallel worktree future this single file becomes a merge-conflict pinch point.
- The weekday backdate was not applied before git commit, and the commits landed at real wall-clock. Not catastrophic, but the rule exists for a reason.

What we changed in the skill (this version):
New artefact shape: patterns.md (committed, durable, harvested at session close) + <phase>.session.md (committed, per-session aggregate) + .ralph/sessions/<id>/<worker>.md (gitignored worker scratch). Old .ralph/<phase>.progress.txt retired; legacy projects gitignore the path and let the historic file live in git history. See "Run artefacts (the canonical shape)" + "Worker scratch format" + "Codebase Patterns (the durable knowledge surface)" above.
Orchestrator harvest step at loop close: read worker scratch files, promote real learnings into patterns.md, finalise <phase>.session.md, commit both as one chore(ralph): session N artefacts for <phase>. Worker scratch dies (gitignored, disk-cheap, operator can read locally if needed).
First-run gitignore-policy prompt is retired. Default rules are now fixed (.ralph/sessions/, .ralph/*.progress.txt, .ralph/.gitignore-policy). No more first-run friction.
Backdate rule for weekday commits is now an explicit subagent-prompt clause. The orchestrator MUST include this in every implementer prompt:
Weekday timestamp rule (CLAUDE.md): commits must fall outside 08:30–18:00 local time. Compute it once and pass via env:
if it's currently inside that window AND it's Mon-Fri:
pick a backdate (this morning's 07:55 if you started before 08:30; last night's 19:00 otherwise)
export GIT_AUTHOR_DATE="<that ISO>"
export GIT_COMMITTER_DATE="<that ISO>"
else: skip the env, use real time.
Use the same backdate for every commit in this iteration so they stay sequential.
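The clause above is written for the subagent prompt; as shell it might look like the sketch below. Assumptions are labelled: the function names in_work_window and pick_backdate are hypothetical, both ends of the 08:30–18:00 window are treated as inside it, and the caller supplies today's and yesterday's dates so the helper stays portable across GNU and BSD date.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not part of the skill.

# in_work_window DOW HHMM
# DOW is `date +%u` (1=Mon .. 7=Sun); HHMM is `date +%H%M`.
# Assumption: both 08:30 and 18:00 count as inside the window.
in_work_window() {
  local dow=$1 hhmm=$((10#$2))
  [ "$dow" -le 5 ] && [ "$hhmm" -ge 830 ] && [ "$hhmm" -le 1800 ]
}

# pick_backdate DOW NOW_HHMM STARTED_HHMM TODAY YESTERDAY
# Prints an ISO timestamp to use for every commit in this iteration,
# or nothing when real time is already outside the window.
pick_backdate() {
  local dow=$1 now=$2 started=$3 today=$4 yesterday=$5
  if ! in_work_window "$dow" "$now"; then
    return 0
  fi
  if [ $((10#$started)) -lt 830 ]; then
    printf '%sT07:55:00\n' "$today"
  else
    printf '%sT19:00:00\n' "$yesterday"
  fi
}

# Example wiring (GNU date shown for "yesterday"; BSD/macOS uses -v-1d):
#   backdate=$(pick_backdate "$(date +%u)" "$(date +%H%M)" "$STARTED_HHMM" \
#              "$(date +%F)" "$(date -d yesterday +%F)")
#   if [ -n "$backdate" ]; then
#     export GIT_AUTHOR_DATE="$backdate" GIT_COMMITTER_DATE="$backdate"
#   fi
```

Computing the value once and exporting it keeps every commit in the iteration on the same timestamp, as the clause requires.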
Local factory naming: this skill (plus planner-worker, matt-pocock-coding-workflow, day-shift, night-shift) is now called the local factory — collectively, the suite of agent-loop skills that run on Ronan's machine. The remote factory is the companion Factory app (tracked separately) that will run equivalent loops as a cloud service. Sibling skills should reference the local-factory family explicitly. As of Phase 4 (2026-05-22) the remote factory is expanding beyond the coding loop to the full lifecycle (idea intake → build → launch) driven by a conversational PI agent over a shared initiative spine; the build-loop story/PR conventions stay compatible with the local factory, and its build stage references the user's tracker (GitHub today, Jira/Linear later) rather than owning it.
```json
{
  "project": "llm-wiki",
  "branchName": "ralph/feature-name",
  "description": "Feature description",
  "userStories": [
    {
      "id": "US-001",
      "title": "Story title",
      "description": "As a [user], I want [feature] so that [benefit]",
      "acceptanceCriteria": [
        "Criterion 1",
        "Criterion 2"
      ],
      "priority": 1,
      "status": "not-started",
      "passes": false,
      "notes": ""
    }
  ]
}
```
status values

| Value | Meaning | Implies passes |
|---|---|---|
| not-started | Hasn't been picked up yet | false |
| in-progress | Currently being worked on (only one story at a time per PRD) | false |
| passed | All DoD criteria met, PR merged, deploy verified | true |
| deferred | Skipped for this run because too big for one context window OR depends on a deferred story; the next Ralph run should pick it up | false |
| needs-human | Needs a manual dashboard step (OAuth registration, Nango webhook URL paste, etc.); Ralph can't complete without operator action. Mirrors the canonical needs-human lifecycle label on GH issues. | false |
| blocked-on-code | Needs a fix in another part of the codebase that's outside this PRD's scope; should become its own story in the next phase | false |
passes: bool is kept as a derived field for back-compat with older readers. status is canonical going forward.
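Because passes is derived, a writer can compute it from status instead of storing the two independently. A minimal sketch of that mapping, following the table above; the function name passes_for_status is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper; the name is illustrative, not part of the skill.

# passes_for_status STATUS
# Prints the back-compat passes value implied by the canonical status.
passes_for_status() {
  case $1 in
    passed)
      echo true
      ;;
    not-started|in-progress|deferred|needs-human|blocked-on-code)
      echo false
      ;;
    *)
      # Reject values outside the table rather than guessing.
      echo "unknown status: $1" >&2
      return 1
      ;;
  esac
}
```

Rejecting unknown values keeps a typo in status from silently turning into passes: false.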
When Ralph hits deferred or blocked-*, it MUST:
- Add a notes line explaining the reason (one sentence).
- Record a line of the form <ISO> US-NNN <STATUS> — <reason>.

Each story must be completable in ONE iteration (one context window). If a story is too big, split it before running.
Right-sized: "Create vault-create skill with SKILL.md"
Too big: "Build the entire ingest system"