skills/ralphify-spec/SKILL.md
Generate a ralphify-approved ralph directory (RALPH.md + optional scripts) from a plain-English description of repetitive or iterative work. Use this skill whenever the user says "ralphify", "create a ralph", "ralph wiggum", "autonomous loop", "/ralphify", references Geoffrey Huntley's Ralph Wiggum method, or asks to wrap iterative work in ralphify (test-until-green, refactor-until-done, lint-until-clean, coverage-until-90, burn-down-todos, resolve-review-comments). Trigger even when the user does not explicitly name ralphify but describes an open-ended loop ("keep fixing tests until they pass", "port files one by one until the directory is done"). Do not trigger for one-shot tasks — ralphs exist for work that benefits from running N times against a stop condition.
Translate a plain-English iterative task into a valid, runnable ralphify ralph directory — a RALPH.md with well-formed YAML frontmatter, useful command blocks, and a prompt body that follows the Ralph Wiggum method.
The user does not need to know how ralphify works. Do not explain frontmatter, placeholders, or shlex quirks to them. Translate their goal into a working ralph and hand them the suggested run command.
Ralphs pay off for work where each iteration makes incremental progress and a stop condition tells the loop when to halt.
If the task is one-shot ("add a button to this page", "explain this function"), a ralph adds nothing. Recommend /fromage or direct implementation and stop.
Ask only what you cannot already infer from the conversation or the current working directory. Skip questions the user already answered.
- Check for pyproject.toml, Cargo.toml, or package.json first and only ask if ambiguous.

Do not ask about command blocks, frontmatter fields, placeholders, or YAML. That is your job.
- Default to ralphs/<name>/ inside the current repo. Confirm only if the cwd is not a sensible home for it.

Start from ralphify's own canonical template so the file parses and you begin from the upstream-endorsed shape:
ralph init ralphs/<name>
If ralph is not on PATH, fall back to ~/.local/bin/ralph — that is where uv tool install ralphify places the binary. After scaffolding, rewrite the file for the user's task rather than shipping the stock template.
REFERENCE.md is the schema spec. Read it when you need exact rules (required vs optional fields, constraints, defaults). Don't re-derive them from this skill body.
Default agent: claude -p --dangerously-skip-permissions, unless the user is on a different agent (Gemini, Cursor agent, etc.).
When the ralph has a clear "all done" condition checkable before spinning up an agent, wrap the agent call in a guard script and point agent: at the script instead:
#!/usr/bin/env bash
# guard.sh — exit 1 to stop ralphify before wasting an agent iteration
set -euo pipefail
TODO="$(dirname "$0")/TODO.md"
if ! grep -q '^\- \[ \]' "$TODO" 2>/dev/null; then
  echo "No unchecked items, stopping." >&2
  exit 1
fi
exec claude -p --dangerously-skip-permissions "$@"
agent: ./guard.sh
The guard runs before the agent, so a failed pre-condition skips the iteration entirely (no token cost). Use this when a check-done.sh command would still burn an agent invocation just to see "nothing to do". Common guards: grep for remaining TODOs, check coverage threshold, verify lint error count > 0.
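The stop condition in guard.sh can also be expressed outside the shell, for instance to unit-test it before trusting it in a loop. The following is a minimal Python sketch, assuming the same TODO.md checkbox convention; `has_unchecked_items` is a hypothetical helper name, not part of ralphify.

```python
import re

# Mirrors the grep pattern in guard.sh: any line that starts with "- [ ]".
# has_unchecked_items is a hypothetical helper, not a ralphify API.
UNCHECKED_ITEM = re.compile(r"^- \[ \]", re.MULTILINE)


def has_unchecked_items(todo_text: str) -> bool:
    """Return True while at least one unchecked checkbox item remains."""
    return UNCHECKED_ITEM.search(todo_text) is not None


pending = "- [x] port parser\n- [ ] port lexer\n"
finished = "- [x] port parser\n- [x] port lexer\n"

print(has_unchecked_items(pending))   # True: the loop should keep running
print(has_unchecked_items(finished))  # False: the guard would exit 1
```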
Set credit: false if the repo forbids automated commit trailers — by default ralphify appends a co-author instruction to each iteration's prompt.
Default `commands` picks by stack:
- git-log → git log --oneline -10
- Python: tests → uv run pytest, lint → uv run ruff check .
- Rust: tests → cargo test, lint → cargo clippy --all-targets -- -D warnings
- Node: tests → npm test, lint → npm run lint
- ./check-done.sh

Add args only if the ralph is meant to be reusable across targets (module, dir, issue). When in doubt, hardcode; generalizing later is cheap.
commands[].run is parsed with shlex.split. Shell features (pipes, &&, redirects, $(...)) parse fine but fail at runtime — see REFERENCE.md for the exhaustive metachar list. When you need any of them, write a script in the ralph directory and reference it with ./name.sh:
commands:
  - name: coverage
    run: ./check-coverage.sh
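To see why shell features survive parsing but break later, it can help to look at what `shlex.split` actually returns. The sketch below uses only Python's standard library. It assumes the resulting argument list is executed without a shell, which is what the "fail at runtime" note implies; the example commands are illustrative.

```python
import shlex

# A plain command splits into the argv list you would expect.
plain = shlex.split("git log --oneline -10")

# A pipe is not interpreted: "|" and "tail" become ordinary arguments,
# so without a shell they would be handed to the first program as-is.
piped = shlex.split("uv run pytest | tail -20")

# Quoting is honoured, so a quoted argument stays a single token.
quoted = shlex.split('grep -c "TODO item" notes.md')

print(plain)   # ['git', 'log', '--oneline', '-10']
print(piped)   # ['uv', 'run', 'pytest', '|', 'tail', '-20']
print(quoted)  # ['grep', '-c', 'TODO item', 'notes.md']
```

The second result is the failure mode to watch for: nothing raises at parse time, but the pipe never happens.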
- Make the script executable (chmod +x).
- Commands with a ./ prefix run with the ralph directory as cwd; commands without the prefix run from the project root.

The body is the prompt piped to the agent every iteration, with {{ commands.X }}, {{ args.X }}, and {{ ralph.X }} resolved. Because each iteration starts with a fresh context, the prompt must re-establish enough of the situation every time to be useful. Follow the ralphify-canonical shape:
- Include `## Iteration: {{ ralph.iteration }}` so the agent knows where it is in the loop, which is useful for "on final iteration, do cleanup" logic.
- Put each `{{ commands.<name> }}` under a `## <Title>` header. The agent can only see what the prompt shows it: if it needs to see failing tests, the prompt needs a `## Test results` section.

Use HTML comments (`<!-- ... -->`) for notes to yourself about why a rule exists or TODOs for prompt maintenance. ralphify strips them before piping to the agent, so they never waste tokens.
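The effect of placeholder resolution and comment stripping on a prompt body can be illustrated with a small stand-in renderer. This is a sketch under stated assumptions, not ralphify's real implementation: `render_prompt`, the regexes, and the order of operations (comments first, then placeholders) are all hypothetical.

```python
import re

# Hypothetical stand-in for the rendering step; not ralphify's actual code.
PLACEHOLDER = re.compile(r"\{\{\s*(commands|args|ralph)\.([\w-]+)\s*\}\}")
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def render_prompt(body: str, context: dict) -> str:
    """Strip HTML comments, then substitute {{ kind.name }} placeholders."""
    without_comments = HTML_COMMENT.sub("", body)

    def substitute(match: re.Match) -> str:
        kind, name = match.group(1), match.group(2)
        # Unknown placeholders are left untouched so mistakes stay visible.
        return str(context.get(kind, {}).get(name, match.group(0)))

    return PLACEHOLDER.sub(substitute, without_comments)


body = (
    "<!-- note to self: keep this section short -->"
    "## Iteration: {{ ralph.iteration }}\n"
    "## Test results\n"
    "{{ commands.tests }}\n"
)
context = {
    "ralph": {"iteration": 3},
    "commands": {"tests": "2 failed, 40 passed"},
}
rendered = render_prompt(body, context)
print(rendered)
```

In this sketch the note to self never reaches the rendered text, while the agent still sees the iteration number and the test output under their headers.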
Run the bundled validator against the draft. It is the gate — do not skip it, and do not try to mentally re-implement what it checks:
uv run --with pyyaml python ~/.claude/skills/ralphify-spec/scripts/validate.py <ralph-path>/RALPH.md
It enforces the schema rules in REFERENCE.md — required fields, name regex, shlex safety, placeholder coverage, agent binary on PATH, timeout type. Exit 0 = clean (warnings are advisory), exit 1 = errors that must be fixed before reporting back, exit 2 = environment problem.
Pay attention to the warnings — declared-but-unused commands or args are usually cleanup signals. The ralph init scaffold ships with args: [focus] that you almost certainly need to remove if you scaffolded from it.
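When the validator is driven from a script rather than by hand, the three exit codes described above map naturally onto a small lookup. The following sketch assumes only those documented meanings; `describe_exit` and `run_validator` are hypothetical helper names, and the demo uses a stand-in process instead of the real validate.py.

```python
import subprocess
import sys

# Meanings taken from the exit-code description above.
EXIT_MEANINGS = {
    0: "clean (warnings, if any, are advisory)",
    1: "errors that must be fixed before reporting back",
    2: "environment problem",
}


def describe_exit(code: int) -> str:
    """Translate a validator exit code into a human-readable meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_validator(command: list) -> tuple:
    """Run a validator command (argv list) and return (code, meaning)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, describe_exit(result.returncode)


# Stand-in command that simply exits with status 2, so the sketch runs
# without the real validator being installed.
code, meaning = run_validator([sys.executable, "-c", "import sys; sys.exit(2)"])
print(code, meaning)
```

For real use, the argv list would be the `uv run --with pyyaml python ...` command shown above, split into separate arguments.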
Show the user:
- `ralph run <path> -n 3 -t 1800 -s -l <path>/logs` runs three iterations with a 30-minute timeout, stops on error, and captures logs. Starting with -n 3 lets them see the loop work before going unbounded.
- The rw shell function (defined in zsh/claude.zsh): `rw ralphs/<name>` is exactly the suggested command above. To run more iterations: `rw ralphs/<name> -n 50`.

ralphs/coverage-climber/
├── RALPH.md
└── check-coverage.sh
RALPH.md:
---
agent: claude -p --dangerously-skip-permissions
commands:
  - name: git-log
    run: git log --oneline -10
  - name: tests
    run: uv run pytest
  - name: coverage
    run: ./check-coverage.sh
---
You are an autonomous Python testing agent running in a loop. Each iteration starts with a fresh context. Your progress lives in the code and git.
## Iteration: {{ ralph.iteration }}
## Recent changes
{{ commands.git-log }}
## Test results
{{ commands.tests }}
If any tests above are failing, fix them before writing new tests.
## Coverage
{{ commands.coverage }}
## Task
Pick one untested function in `src/` and add tests for it. One function per iteration — the goal is steady progress, not breadth.
## Rules
- Do not modify `src/` beyond what is needed to make code testable (dependency injection, extract helpers).
- Do not edit generated files or `tests/fixtures/`.
- Stop when the coverage script reports >= 90%.
- One commit per iteration.
## Commit
Conventional Commits: `test(<module>): cover <function>`.
check-coverage.sh:
#!/usr/bin/env bash
set -euo pipefail
uv run coverage report --format=total
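The stop rule "coverage script reports >= 90%" comes down to one numeric comparison on the script's output. The sketch below assumes check-coverage.sh prints a single number, which is the expected behaviour of `coverage report --format=total`. `coverage_reached` and its default threshold are illustrative names, not part of ralphify.

```python
def coverage_reached(total_output: str, threshold: float = 90.0) -> bool:
    """Compare the printed coverage total against the stop threshold."""
    # Assumption: the script prints only the total, e.g. "87" or "92.5".
    return float(total_output.strip()) >= threshold


print(coverage_reached("87\n"))  # False: keep iterating
print(coverage_reached("90\n"))  # True: the stop condition is met
```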
- Frontmatter beyond agent, commands, args, credit is noise.
- Shell features in run:. Use a script instead.
- Leaving -n unbounded on the first run. Start with -n 3 so the user can see the loop work before committing to unbounded runs.