Skills/ai/ralph-loop/SKILL.md
A self-evaluating iterative agent loop that pursues a goal through structured cycles of Reason, Act, Look, Probe, and Harden until the goal is confidently achieved. Use when a task requires iterative refinement, quality assurance, or when "good enough" isn't good enough. Works for any goal — document generation, data extraction, content processing, analysis, or creative work.
npx skillsauth add zrosenfield/sharepoint-ai-skills ralph-loop
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
RALPH is an iterative execution pattern where Claude pursues a goal through structured cycles, self-evaluating after each iteration and only stopping when it is genuinely confident the goal has been fully achieved — or when it has exhausted its configured iteration budget.
The loop is designed to prevent the common failure mode of "close enough" — where an agent produces a reasonable first attempt and stops, leaving gaps, inconsistencies, or missed requirements that a more careful pass would catch.
Each iteration follows five phases:
┌─────────────────────────────────────────────────┐
│  R → A → L → P → H                              │
│  │   │   │   │   │                              │
│  │   │   │   │   └─ HARDEN                      │
│  │   │   │   │      Decide: DONE or             │
│  │   │   │   │      loop back to R              │
│  │   │   │   │                                  │
│  │   │   │   └───── PROBE                       │
│  │   │   │          Find specific gaps,         │
│  │   │   │          weaknesses, omissions       │
│  │   │   │                                      │
│  │   │   └───────── LOOK                        │
│  │   │              Examine output against      │
│  │   │              success criteria            │
│  │   │                                          │
│  │   └───────────── ACT                         │
│  │                  Execute the plan,           │
│  │                  produce/refine output       │
│  │                                              │
│  └───────────────── REASON                      │
│                     Decompose goal,             │
│                     define success criteria,    │
│                     plan approach               │
└─────────────────────────────────────────────────┘
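The cycle in the diagram can be sketched as a plain loop. This is a minimal illustration under stated assumptions, not the skill's implementation: `run_ralph` and the four phase callables (`reason`, `act`, `look`, `probe`) are hypothetical names, and HARDEN is reduced here to a single "all criteria at threshold" check.

```python
from typing import Callable, Dict, List, Tuple


def run_ralph(
    goal: str,
    reason: Callable[[str, List[str]], str],
    act: Callable[[str, str], str],
    look: Callable[[str], Dict[str, int]],
    probe: Callable[[str, Dict[str, int]], List[str]],
    max_iterations: int = 5,
    completion_threshold: int = 9,
) -> Tuple[str, int]:
    """Run R -> A -> L -> P -> H cycles; return (output, iterations used).

    All phase callables are hypothetical stand-ins supplied by the caller.
    """
    output = ""
    gaps: List[str] = []
    for iteration in range(1, max_iterations + 1):
        plan = reason(goal, gaps)      # REASON: plan, using gaps from the last pass
        output = act(plan, output)     # ACT: produce or refine the output
        scores = look(output)          # LOOK: score each criterion from 1 to 10
        gaps = probe(output, scores)   # PROBE: list specific gaps
        # HARDEN (simplified): stop once every criterion reaches the threshold.
        if scores and all(s >= completion_threshold for s in scores.values()):
            return output, iteration
    # Iteration budget exhausted: return the best output produced so far.
    return output, max_iterations
```

Passing the previous output into `act` reflects the "build, don't rebuild" guidance: each pass refines existing work rather than regenerating it.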
First iteration:
Subsequent iterations:
Output a structured plan:
ITERATION [N] — REASON PHASE
=============================
Goal: [restated goal]
Success Criteria:
1. [criterion] — [what "10/10" looks like]
2. [criterion] — [what "10/10" looks like]
...
Plan: [what this iteration will do]
Focus Areas: [gaps from previous iteration, or "initial execution" if iteration 1]
Output a structured evaluation:
ITERATION [N] — LOOK PHASE
============================
Criterion Scores:
1. [criterion]: [score]/10 — [brief justification]
2. [criterion]: [score]/10 — [brief justification]
...
Overall Confidence: [weighted average]/10
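The template reports Overall Confidence as a weighted average but does not say where the weights come from. A minimal sketch, assuming per-criterion weights are supplied by the caller and default to equal weighting (`overall_confidence` and `weights` are hypothetical names):

```python
from typing import Dict, Optional


def overall_confidence(
    scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted average of criterion scores, rounded to one decimal place.

    Assumption: criteria without explicit weights are weighted equally.
    """
    if not scores:
        raise ValueError("at least one criterion score is required")
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return round(weighted_sum / total_weight, 1)
```

Note that a high average can hide a single weak criterion, which is why the completion threshold applies to every criterion rather than to this number.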
Scoring calibration guide:
Output a structured gap analysis:
ITERATION [N] — PROBE PHASE
=============================
Gaps Found:
- [Criterion N]: [specific, actionable description of what's missing]
- [Criterion M]: [specific, actionable description of what's weak]
...
Unstated Issues: [anything not captured by criteria but still problematic]
Verdict: [CONTINUE | COMPLETE]
Apply the completion rules (see below) to determine whether to:
If completing: Present the output with a brief summary of the iterations taken and final confidence scores. If iterating: Proceed immediately to the next REASON phase with the gap analysis as input.
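The HARDEN decision can be sketched as a function that returns a termination reason or nothing. The four reasons are the ones named in the completion summary (threshold reached, max iterations, diminishing returns, blocker); the function name `harden`, the `history` list of past overall scores, and the `min_gain` cutoff of 0.5 are assumptions made for illustration, since the source does not quantify "barely moved".

```python
from typing import Dict, List, Optional


def harden(
    scores: Dict[str, int],
    iteration: int,
    max_iterations: int = 5,
    completion_threshold: int = 9,
    history: Optional[List[float]] = None,
    blocker: bool = False,
    min_gain: float = 0.5,  # hypothetical cutoff for "diminishing returns"
) -> Optional[str]:
    """Return a termination reason, or None to loop back to REASON."""
    if blocker:
        return "blocker"
    if scores and all(s >= completion_threshold for s in scores.values()):
        return "threshold reached"
    if iteration >= max_iterations:
        return "max iterations"
    # history holds the overall confidence after each iteration so far.
    if history and len(history) >= 2 and history[-1] - history[-2] < min_gain:
        return "diminishing returns"
    return None
```

A `None` result corresponds to the CONTINUE verdict; any string corresponds to COMPLETE with that reason.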
The loop terminates when ANY of these conditions are met:
RALPH can be configured per goal by specifying parameters at the start:
| Parameter | Default | Description |
|-----------|---------|-------------|
| max_iterations | 5 | Hard cap on iteration count |
| completion_threshold | 9 | Minimum score (1-10) ALL criteria must reach |
| strictness | high | low (7+), medium (8+), high (9+), perfectionist (10) |
| verbose | true | Whether to output phase logs (set false for cleaner output) |
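One way to model these parameters is a small config object with a parser for `key=value` strings. This is a sketch, not part of the skill: `RalphConfig` and `parse_config` are hypothetical names, and the rule that setting `strictness` also sets `completion_threshold` is an assumption drawn from the thresholds listed in the table.

```python
from dataclasses import dataclass

# Thresholds implied by the strictness row of the table.
STRICTNESS_THRESHOLDS = {"low": 7, "medium": 8, "high": 9, "perfectionist": 10}


@dataclass
class RalphConfig:
    max_iterations: int = 5
    completion_threshold: int = 9
    strictness: str = "high"
    verbose: bool = True


def parse_config(text: str) -> RalphConfig:
    """Parse e.g. 'max_iterations=3, strictness=medium' into a RalphConfig."""
    config = RalphConfig()
    for part in text.split(","):
        if not part.strip():
            continue
        key, sep, value = part.partition("=")
        key, value = key.strip(), value.strip()
        if not sep:
            raise ValueError(f"expected key=value, got {part!r}")
        if key == "max_iterations":
            config.max_iterations = int(value)
        elif key == "completion_threshold":
            config.completion_threshold = int(value)
        elif key == "strictness":
            if value not in STRICTNESS_THRESHOLDS:
                raise ValueError(f"unknown strictness: {value!r}")
            config.strictness = value
            # Assumption: strictness overrides the numeric threshold.
            config.completion_threshold = STRICTNESS_THRESHOLDS[value]
        elif key == "verbose":
            config.verbose = value.lower() == "true"
        else:
            raise ValueError(f"unknown parameter: {key!r}")
    return config
```

Because settings are applied left to right, a later `completion_threshold` overrides an earlier `strictness`, and vice versa.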
To configure, the user can say things like:
- max_iterations=3
- max_iterations=3, strictness=medium
- strictness=perfectionist, max_iterations=8
When the user invokes RALPH (or when the skill determines RALPH is appropriate), follow this sequence:
Starting RALPH loop.
Goal: [user's goal]
Configuration: [max_iterations] iterations | threshold: [completion_threshold]/10
Run RALPH cycles. On each iteration, output the phase logs (if verbose=true) so the user can see the agent's reasoning.
When complete, present:
RALPH COMPLETE
==============
Iterations: [N] of [max]
Final Confidence: [score]/10
Criteria Met: [N]/[total] at threshold
Termination Reason: [threshold reached | max iterations | diminishing returns | blocker]
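For illustration, the summary above could be filled in from the final scores like this. The helper name `completion_summary` is hypothetical, and the confidence figure here is a plain unweighted mean, which is an assumption; substitute whatever weighting the LOOK phase used.

```python
from typing import Dict


def completion_summary(
    scores: Dict[str, int],
    iterations: int,
    max_iterations: int,
    threshold: int,
    reason: str,
) -> str:
    """Render the RALPH COMPLETE block from final criterion scores."""
    if not scores:
        raise ValueError("at least one criterion score is required")
    met = sum(1 for s in scores.values() if s >= threshold)
    confidence = sum(scores.values()) / len(scores)  # unweighted mean (assumption)
    lines = [
        "RALPH COMPLETE",
        "==============",
        f"Iterations: {iterations} of {max_iterations}",
        f"Final Confidence: {confidence:.1f}/10",
        f"Criteria Met: {met}/{len(scores)} at threshold",
        f"Termination Reason: {reason}",
    ]
    return "\n".join(lines)
```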
Generous self-scoring: The most common failure. If in doubt, score LOWER. A 9 should mean you'd genuinely be surprised if the user found a problem with that criterion.
Cosmetic iterations: Don't waste iterations on formatting tweaks when substance is lacking. Prioritize the lowest-scoring criteria.
Scope creep: Each iteration should address identified gaps, not invent new requirements. Stick to the success criteria defined in iteration 1 (though PROBE can surface unstated issues that are genuinely important).
Restarting from scratch: Unless PROBE identifies a fundamental approach problem, iterate on existing output rather than regenerating. Build, don't rebuild.
Ignoring diminishing returns: If the score barely moved, don't keep grinding. Present what you have and explain the plateau.
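The advice to prioritize the lowest-scoring criteria can be made concrete with a small helper that picks the next iteration's Focus Areas. A sketch only: `focus_areas` and its `limit` parameter are hypothetical and not defined by the skill.

```python
from typing import Dict, List


def focus_areas(scores: Dict[str, int], threshold: int = 9, limit: int = 3) -> List[str]:
    """Return up to `limit` criteria below threshold, weakest first."""
    below = [(name, score) for name, score in scores.items() if score < threshold]
    # sorted() is stable, so tied criteria keep their original order.
    ranked = sorted(below, key=lambda item: item[1])
    return [name for name, _ in ranked[:limit]]
```

For example, with scores of tone 8, accuracy 5, clarity 9, and milestones 6 at the default threshold, the next REASON phase would focus on accuracy, then milestones, then tone, and would leave clarity alone.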
User: "RALPH: Create a project status report for the SharePoint AI initiative"
RALPH would:
- R: Define criteria (executive summary clarity, data accuracy, completeness of milestones, actionable next steps, appropriate tone)
- A: Generate the report
- L: Score each criterion honestly
- P: "Executive summary buries the lead — revenue impact should be sentence 1, not paragraph 3"
- H: Continue → feed gap back
- R2: Replan to restructure executive summary
- A2: Revise the report
- L2: Re-score — all criteria now 9+
- H2: COMPLETE
User: "RALPH: Extract all recipes from these uploaded documents and validate completeness"
RALPH would:
- R: Define criteria (all recipes found, ingredients complete, instructions parseable, measurements standardized, edge cases handled)
- A: Extract recipes
- L: Score — finds 3 recipes missed from a multi-column page
- P: "Multi-column layout caused parser to skip items. Also, 2 recipes have vague measurements like 'a pinch'"
- H: Continue
- ... iterate until extraction is validated
User: "RALPH with max 3 iterations: Analyze our competitive landscape in the AI agent space"
RALPH would:
- R: Define criteria (market coverage, competitor accuracy, strategic insight depth, actionable recommendations)
- A: Research and produce analysis
- L: Score honestly
- P: Find gaps in coverage or depth
- H: Iterate up to 3 times, then present best result with honest gap assessment