Ralph Wiggum Loop - Autonomous Coding System

Execute autonomous coding tasks using Ralph loops - a pattern that runs fresh Claude instances per iteration to avoid context degradation and maintain consistent code quality.

When to Use This Skill

Activate this skill when the user:

Says "use ralph" or mentions "ralph loop"
Wants autonomous implementation ("build this overnight", "implement while I sleep")
Mentions autonomous coding or unattended execution
Asks to create a spec for Ralph
Wants to review a spec before running Ralph
Needs help with Ralph workflow

Core Concept

Ralph loops solve the context window problem in autonomous coding:

Traditional Approach:

All work happens in one context window
Context grows with each change
At ~100k tokens, model performance degrades (the "dumb zone")
Requires summarization, which poisons future work
Quality decreases as project grows

Ralph Loop Approach:

Each iteration starts with fresh claude -p (headless mode)
spec.md + implementation-plan.md are source of truth, not chat history
Every iteration stays under 100k tokens
No summarization needed
Consistent quality throughout
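
The loop described above can be sketched in a few lines of shell. This is a minimal illustration under stated assumptions, not the actual ralph-run.sh: the function names run_agent, count_open_tasks, and ralph_loop_sketch are hypothetical, and it assumes the plan uses "- [ ]" checkboxes that the agent checks off as it finishes each task.

```shell
# Hypothetical sketch of a Ralph-style loop; not the real ralph-run.sh.

# Start one fresh headless agent process with the prompt file's contents.
# Each call is a new process, so no chat history carries over.
run_agent() {
  claude -p "$(cat "$1")"
}

# Count unchecked "- [ ]" tasks in the plan file.
# grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
count_open_tasks() {
  grep -c '^- \[ \]' "$1" || true
}

# Loop until the plan has no open tasks or the iteration cap is reached.
ralph_loop_sketch() {
  local plan="$1"
  local prompt="$2"
  local max_iterations="${3:-20}"   # assumed default, chosen for illustration
  local i=0
  while [ "$i" -lt "$max_iterations" ] && [ "$(count_open_tasks "$plan")" -gt 0 ]; do
    i=$((i + 1))
    echo "iteration $i: $(count_open_tasks "$plan") task(s) open"
    run_agent "$prompt"
  done
  echo "stopped after $i iteration(s), $(count_open_tasks "$plan") task(s) open"
}
```

The only state that survives between iterations is what is on disk (the spec, the plan, and the repository), which is why those files rather than the conversation act as the source of truth.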

The Three Workflows

Workflow 1: Full Implementation (Bulletproof Spec Required)

Use when the user has clear requirements and wants autonomous execution.

Step 1: Initialization

cd project-directory/
~/agent-tools/ralph-init.sh "Build a REST API for task management"

This launches bidirectional planning with Claude:

Ask clarifying questions to surface implicit assumptions
Generate spec.md (what the system does)
Generate implementation-plan.md (task checklist)

Step 2: Critical Review

MANDATORY: Review and sign off on EVERY line of both files.

Ask the user to read:

spec.md - Is every decision explicit? Any ambiguity?
implementation-plan.md - Are tasks properly sized? Ordered by dependency?

If the user doesn't understand any line, the spec is not ready.

Step 3: Create Prompt Template

cp ~/agent-tools/ralph-prompt-template.md prompt.md

Edit prompt.md to include:

Repository structure
Tech stack and versions
Coding conventions
Testing approach
Completion criteria

See references/ralph-prompt-template.md for the template.

Step 4: Execute with Watch Mode

Start with watch mode to verify Ralph stays on track:

~/agent-tools/ralph-run.sh --watch

Watch 2-3 iterations. If Ralph goes off track:

Stop (Ctrl+C)
Edit spec.md to add clarity
Restart

Step 5: Autonomous Execution

Once Ralph looks good, run without watch:

~/agent-tools/ralph-run.sh --max-iterations 20

or unlimited:

~/agent-tools/ralph-run.sh

Monitor progress in another terminal:

~/agent-tools/ralph-status.sh
tail -f .ralph-loop.log

Step 6: Verification

When Ralph completes:

# Run all tests
npm test  # or pytest, etc.

# Review changes
git diff

# Check logs
cat .ralph-loop.log

# Skim code for quality

Workflow 2: Spec Creation & Review

Use when the user wants help architecting the spec without running Ralph yet.

Help Create Spec

Ask clarifying questions using bidirectional prompting:

What is the system supposed to do?
What are the core entities and their exact schemas?
What are the API contracts (if applicable)?
What are the technical constraints? (language, framework, database)
What error cases exist and how should they be handled?

Key principle: Surface implicit assumptions. Ask about things that seem "obvious" - those are where bugs hide.

Create spec.md following the pattern in references/example-spec-good.md:

Concise (under 5k tokens)
Every decision explicit
Exact schemas and types
Specific error cases
Technical constraints stated

Avoid patterns in references/example-spec-bad.md:

Vague requirements ("modern tech", "good UX")
Implicit assumptions
Unmeasurable qualities
Missing technical decisions

Help Create Implementation Plan

Create implementation-plan.md following the pattern in references/example-implementation-plan-good.md:

Task Sizing Rules:

Each task must complete in ONE iteration (< 100k tokens)
Rule of thumb: 15-30 minutes of focused work
If you can't describe it in 2-3 sentences, it's too big

Dependency Ordering:

Schema/database first
Backend logic second
UI components third
Integration/testing last

Checklist Format:

- [ ] Create database schema with id, name, status fields
- [ ] Add migration for tasks table
- [ ] Implement GET /api/tasks endpoint

Each task should be verifiable - clear completion criteria.
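
The format and sizing rules above lend themselves to a quick mechanical check before a run. The sketch below is an assumption-laden helper, not part of the Ralph tooling: lint_plan is a hypothetical name, and the 40-word default is an arbitrary stand-in for the "2-3 sentences" rule of thumb.

```shell
# Hypothetical lint for implementation-plan.md; thresholds are illustrative.
# Usage: lint_plan FILE [MAX_WORDS]. Exits non-zero if any line is flagged.
lint_plan() {
  awk -v max_words="${2:-40}" '
    # Well-formed task: "- [ ] text" or "- [x] text".
    /^- \[[ x]\] / {
      # Count the words after the 6-character checkbox prefix.
      n = split(substr($0, 7), parts, " ")
      if (n > max_words) {
        printf "line %d: task has %d words, consider splitting\n", NR, n
        bad = 1
      }
      next
    }
    # Other list-like lines are probably malformed tasks.
    /^[-*] / || /^\[[ x]\]/ {
      printf "line %d: not in \"- [ ] task\" format\n", NR
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}
```

A word count cannot tell whether a task is verifiable, so this only catches the obvious cases; the line-by-line human review is still required.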

Review Existing Spec

When reviewing a spec the user has created:

Check for:

Ambiguity - Any vague terms? ("modern", "clean", "fast")
Missing decisions - Tech stack? Database? API format?
Implicit assumptions - What seems "obvious" but isn't stated?
Task sizing - Can each complete in one iteration?
Dependencies - Are tasks ordered correctly?
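
The ambiguity check can be partly automated with a word scan. This is a rough sketch: find_vague_terms is a hypothetical helper, and the word list is just the vague terms this document quotes, so extend it for your own specs.

```shell
# Hypothetical helper: print spec lines (with line numbers) that contain
# vague terms. Matches whole words only, case-insensitively.
# Exits non-zero when no vague term is found.
find_vague_terms() {
  grep -n -i -w -E 'modern|clean|fast|scalable|good' "$1"
}
```

A hit is a prompt to ask a question, not proof of a problem: "fast" might be followed by an exact latency target, in which case the line is fine.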

Provide specific feedback:

"The 'user management' task is too large. Split into: create users table, add auth middleware, implement login endpoint"
"The spec doesn't specify error response format. Add exact JSON structure for errors"
"Task 3 depends on the database schema from task 5. Reorder dependencies"

Workflow 3: Exploration Mode (Low-Stakes Testing)

Use when the user wants to explore an idea without perfect planning, or has tokens expiring.

Perfect for:

Back burner projects
Research spikes
"I wonder if..." explorations
Using unused tokens before reset

Quick Start

cd ~/scratch-project
~/agent-tools/ralph-init.sh "Spike: Redis caching layer for API"

# Quick review, don't sweat perfection
vim spec.md

# Copy template
cp ~/agent-tools/ralph-prompt-template.md prompt.md

# Run with iteration limit
~/agent-tools/ralph-run.sh --max-iterations 5

Wake up to a rough prototype and notes about what worked and what didn't.

Benefits:

No downside if tokens would expire anyway
Learn Ralph behavior on low-stakes work
Get rough prototypes for decision-making
Explore multiple approaches in parallel

Critical Success Factors

1. Planning is Everything

The biggest skill in Ralph loops is architecting a bulletproof plan.

If the spec is vague or tasks are too large, errors cascade and amplify because you're out of the loop. Each iteration builds on the previous one.

Invest heavily upfront:

Spend 30-60 minutes on bidirectional planning
Read every line of spec and plan
Sign off on every decision
If you don't understand it, Ralph won't either

2. Task Sizing Matters

Too large: Ralph runs out of context mid-task, produces broken code
Too small: Excessive overhead, slow progress
Just right: 15-30 minutes of focused work, clear completion criteria

Examples of right-sized tasks:

Add a database column and migration
Implement one API endpoint
Create one UI component
Add one test suite

3. Watch Initially

Don't go autonomous on first run. Watch 2-3 iterations to verify:

Ralph picks the right tasks
Code quality is acceptable
Tests actually pass before checking off tasks
Spec is clear enough

If Ralph struggles, stop and improve the spec.

4. Keep Specs Small

Entire context must stay under 100k tokens:

spec.md (target: < 5k tokens)
implementation-plan.md (target: < 3k tokens)
prompt.md (target: < 2k tokens)
Repo context loaded

If specs are too large, Ralph hits the dumb zone even within single iterations.
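
The per-file targets above can be checked with a crude size estimate. The sketch below assumes roughly 4 bytes per token, which is only a ballpark heuristic and not a real tokenizer; estimate_tokens and check_budget are hypothetical names.

```shell
# Rough token estimate: about 4 bytes per token.
# Crude heuristic, not a real tokenizer; treat the result as a ballpark.
estimate_tokens() {
  echo $(( $(wc -c < "$1") / 4 ))
}

# Compare a file against a token budget; returns non-zero when over.
check_budget() {
  local file="$1"
  local budget="$2"
  local tokens
  tokens=$(estimate_tokens "$file")
  if [ "$tokens" -gt "$budget" ]; then
    echo "$file: ~$tokens tokens, over budget of $budget"
    return 1
  fi
  echo "$file: ~$tokens tokens, within budget of $budget"
}
```

With the targets listed above, the calls would be check_budget spec.md 5000, check_budget implementation-plan.md 3000, and check_budget prompt.md 2000.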

Common Pitfalls

Vague Specs

Problem: "Build a modern, scalable API with good UX"
Fix: "Node.js 20 + Express + PostgreSQL REST API with specific endpoints: ..."

Tasks Too Large

Problem: "- [ ] Build the entire dashboard"
Fix:

- [ ] Create dashboard schema
- [ ] Implement data queries
- [ ] Add chart component
- [ ] Add filter controls

Wrong Dependency Order

Problem: UI component task before database schema task
Fix: Always go schema → backend → UI

No Completion Criteria

Problem: "- [ ] Make it work well"
Fix: "- [ ] GET /api/users returns 200 with user array, tests pass"

Monitoring & Control

Check Status

~/agent-tools/ralph-status.sh

Shows iteration count, tasks completed, remaining work.
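
If the status script is unavailable, the completed and remaining counts can be read straight from the plan's checkboxes. This is a hypothetical stand-in, not the real ralph-status.sh, whose output format may differ; plan_progress is an illustrative name.

```shell
# Hypothetical progress summary derived from the plan's checkboxes.
plan_progress() {
  awk '
    /^- \[x\] / { done_count++ }   # completed tasks
    /^- \[ \] / { open_count++ }   # remaining tasks
    END {
      total = done_count + open_count
      printf "%d/%d tasks done, %d remaining\n", done_count, total, open_count
    }
  ' "$1"
}
```

Run it as plan_progress implementation-plan.md from the project directory.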

Watch Logs

tail -f .ralph-loop.log

See what Ralph is doing in real-time.

Stop Gracefully

~/agent-tools/ralph-stop.sh

Lets current iteration finish, then stops.

Force Stop

touch .ralph-stop

Or Ctrl+C the ralph-run.sh process.

Integration with Existing Workflow

Before Ralph

git checkout -b feature/new-feature
# Stage pending work so the checkpoint commit has something to record
git add -A
git commit -m "feat: checkpoint before ralph"

After Ralph

# Review changes
git diff main

# Run tests
npm test

# Manual verification
# Check .ralph-loop.log for issues

# Commit or iterate
git commit -m "feat: implement X via ralph loop"

If Ralph Fails

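# Review logs
cat .ralph-loop.log

# Check what went wrong
git diff

# Discard uncommitted changes back to the last commit.
# Do this before editing the spec, or a tracked spec.md edit is lost too.
git reset --hard HEAD

# Fix spec
vim spec.md

# Restart
~/agent-tools/ralph-run.sh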

Reference Materials

Load these references as needed:

references/README-ralph.md - Full CLI documentation
references/ralph-prompt-template.md - Template for prompt.md
references/example-spec-good.md - What good specs look like
references/example-spec-bad.md - What to avoid
references/example-implementation-plan-good.md - Proper task breakdown
references/example-implementation-plan-bad.md - Common mistakes

CLI Commands Reference

All commands in ~/agent-tools/:

ralph-init.sh "description"     # Initialize with planning
ralph-run.sh [options]          # Execute the loop
ralph-stop.sh                   # Stop gracefully
ralph-status.sh                 # Check progress

Options for ralph-run.sh:

--max-iterations N - Stop after N iterations
--watch - Pause between iterations for review

Example Session

User: "I want to build a REST API for task management overnight using Ralph"

You should:

Clarify requirements through bidirectional prompting:
- What entities? (Task with id, title, description, status)
- What endpoints? (GET, POST, PATCH, DELETE)
- What tech stack? (Node + Express + SQLite)
- What error handling? (JSON errors with status codes)

Run initialization:

cd ~/task-api
~/agent-tools/ralph-init.sh "Build REST API for task management with CRUD operations"

Guide spec review: "I've created spec.md and implementation-plan.md. Please review every line and confirm:
- Is the task entity schema complete?
- Are the API endpoints clearly specified?
- Are error cases covered?
- Are tasks properly sized and ordered?"

Help with prompt.md:
```
cp ~/agent-tools/ralph-prompt-template.md prompt.md
```
"I've created prompt.md. Please edit to add your coding conventions."

Start with watch mode:
```
~/agent-tools/ralph-run.sh --watch
```
"I'll run the first few iterations with watch mode so we can verify Ralph is on track."

Go autonomous: After 2-3 good iterations: "Ralph looks good. Running autonomously with max 20 iterations. Monitor with:"
```
~/agent-tools/ralph-status.sh
tail -f .ralph-loop.log
```

Verify results: "Ralph completed! Next steps:
- Run tests: npm test
- Review changes: git diff
- Check logs: cat .ralph-loop.log"

Key Insight

Ralph loops trade tokens for leverage. The further out of the loop you go, the more you multiply your output - but the more critical your planning becomes.

The skill hierarchy:

Beginner: Use Ralph with watch mode (semi-autonomous)
Intermediate: Run Ralph overnight on well-spec'd features (autonomous)
Advanced: Launch multiple Ralph loops in parallel on different features (extreme leverage)

Start with watch mode on low-stakes work. Build your planning muscles. Then scale up.

