skills/rlm-debugging/SKILL.md
Use when RLM requirement involves debugging a bug, test failure, or unexpected behavior. Insert Phase 1.5 between Phase 1 and Phase 2 to perform systematic root cause analysis before attempting any fixes. Trigger phrases: "debug", "investigate", "failing tests", "crash", "root cause".
When a requirement involves fixing a bug or investigating unexpected behavior, ad-hoc fixes waste time and create new bugs. Systematic debugging finds the root cause before any fix is attempted.
Core Principle: ALWAYS find root cause before attempting fixes. Symptom fixes are failure.
The Iron Law for RLM Debugging:
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
- Tests are failing after the last change; debug it
- Fix crash on empty input
- Investigate why the API returns wrong data
- Do a root cause analysis before making changes

Mandatory Phase 1.5 when:
Use ESPECIALLY when:
Don't skip when:
Phase 1.5 is inserted between Phase 1 (AS-IS) and Phase 2 (TO-BE Plan) when debugging is required:
Phase 0: 00-requirements.md
->
Phase 1: 01-as-is.md (captures current behavior)
->
Phase 1.5: 01.5-root-cause.md <- NEW (this skill)
->
Phase 2: 02-to-be-plan.md (includes fix plan based on root cause)
digraph debugging_phases {
rankdir=TB;
phase1 [label="Step 1:\nRoot Cause Investigation", shape=box, style=filled, fillcolor="#ffcccc"];
phase2 [label="Step 2:\nPattern Analysis", shape=box, style=filled, fillcolor="#ffffcc"];
phase3 [label="Step 3:\nHypothesis & Testing", shape=box, style=filled, fillcolor="#ccffcc"];
phase4 [label="Step 4:\nFix Implementation", shape=box, style=filled, fillcolor="#ccccff"];
phase1 -> phase2 -> phase3 -> phase4;
// Feedback loops
phase3 -> phase1 [label="hypothesis\nfailed", style=dashed];
phase4 -> phase1 [label="fix\nfailed", style=dashed];
}
BEFORE attempting ANY fix:
Record in Phase 1.5 artifact:
## Error Analysis
**Error Message:** [verbatim]
**Stack Trace:** [key frames]
**File:Line:** [locations]
**Error Code:** [if applicable]
**Key Insight:** [what the error is telling you]
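The fields above can be filled in semi-automatically when the failure comes with a traceback. The following is a minimal sketch, assuming the error text uses CPython's standard `File "...", line N, in func` frame format; the helper name `summarize_traceback` is hypothetical and not part of the RLM workflow.

```python
import re

# Matches the frame lines CPython prints, e.g.:  File "core.py", line 4, in run
FRAME_RE = re.compile(r'File "(?P<file>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def summarize_traceback(text):
    """Extract Error Analysis fields from a traceback (hypothetical helper)."""
    lines = text.strip().splitlines()
    frames = [m.groupdict() for m in map(FRAME_RE.search, lines) if m]
    return {
        # The last line of a plain traceback carries the exception message.
        "error_message": lines[-1].strip() if lines else "",
        # File:Line locations in call order (outermost frame first).
        "locations": [f"{f['file']}:{f['line']}" for f in frames],
        "functions": [f["func"] for f in frames],
    }
```

The "Key Insight" field still needs a human reading of the message; the sketch only collects the verbatim evidence.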
Record in Phase 1.5 artifact:
## Reproduction Verification
**Steps:**
1. [exact step]
2. [exact step]
3. [exact step]
**Reproducible:** Yes / No / Intermittent
**Frequency:** [X out of Y attempts]
**Deterministic:** Yes / No
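The "Frequency" and "Reproducible" fields are easier to trust when they come from repeated runs rather than memory. A minimal sketch, assuming the reproduction steps can be wrapped in a callable that returns `True` when the bug appears; `measure_reproduction` and its parameters are hypothetical names.

```python
def measure_reproduction(attempt, attempts=10):
    """Call `attempt()` repeatedly; it must return True when the bug shows up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    hits = sum(1 for _ in range(attempts) if attempt())
    if hits == attempts:
        reproducible = "Yes"
    elif hits == 0:
        reproducible = "No"
    else:
        reproducible = "Intermittent"
    return {
        "reproducible": reproducible,
        "frequency": f"{hits} out of {attempts} attempts",
        # The same outcome on every attempt is treated as deterministic here.
        "deterministic": "Yes" if hits in (0, attempts) else "No",
    }
```

In practice `attempt` might wrap a `subprocess.run(...)` call to the failing test command and check for a non-zero `returncode`.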
Record in Phase 1.5 artifact:
## Recent Changes Analysis
**Git History:** [relevant commits]
**Dependency Changes:** [package.json, requirements.txt, etc.]
**Config Changes:** [relevant files]
**Environment:** [OS, runtime versions]
**Likely Culprit:** [most suspicious change]
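For the "Git History" field, a short script can list recent commits that touched the suspect files. This is a sketch only, assuming `git` is on the PATH and the run happens inside a repository; the function names and the default two-week window are illustrative choices, not something the skill prescribes.

```python
import subprocess


def git_log_command(since="2 weeks ago", paths=()):
    """Build the argv for listing recent commits, optionally limited to paths."""
    cmd = ["git", "log", "--oneline", f"--since={since}"]
    if paths:
        cmd += ["--", *paths]  # "--" separates revisions from file paths
    return cmd


def recent_commits(repo_dir=".", since="2 weeks ago", paths=()):
    """Return one 'hash subject' line per recent commit in repo_dir."""
    result = subprocess.run(
        git_log_command(since, paths),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git itself fails
    )
    return result.stdout.splitlines()
```

Limiting the log to the files named in the error analysis keeps the list of candidate culprits short.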
WHEN system has multiple components (CI -> build -> signing, API -> service -> database):
BEFORE proposing fixes, add diagnostic instrumentation:
For EACH component boundary:
Run once to gather evidence showing WHERE it breaks, THEN analyze evidence.
Record in Phase 1.5 artifact:
## Multi-Layer Evidence
**Layer 1: [Component Name]**
- Input: [data]
- Output: [data]
- Status: ✅ Working / ❌ Broken
**Layer 2: [Component Name]**
- Input: [data from Layer 1]
- Output: [data]
- Status: ✅ Working / ❌ Broken
**Failure Boundary:** Layer X -> Layer Y
**Root Cause Location:** [specific component]
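The multi-layer template above amounts to recording input and output at every component boundary and stopping at the first one that misbehaves. A minimal sketch, assuming each layer can be modeled as a function of the previous layer's output; `find_failure_boundary`, `layers`, and `is_valid` are hypothetical names.

```python
def find_failure_boundary(layers, initial_input, is_valid):
    """Run layers in order, record input/output, and name the first broken one.

    `layers` is a list of (name, function) pairs; `is_valid` checks an output.
    """
    evidence = []
    data = initial_input
    for name, fn in layers:
        record = {"layer": name, "input": data}
        try:
            data = fn(data)
            record["output"] = data
            record["ok"] = bool(is_valid(data))
        except Exception as exc:
            # An exception at this boundary is evidence too, not just a crash.
            record["output"] = repr(exc)
            record["ok"] = False
        evidence.append(record)
        if not record["ok"]:
            break  # later layers would only see corrupted input
    broken = next((r["layer"] for r in evidence if not r["ok"]), None)
    return evidence, broken
```

Each `evidence` entry maps onto one "Layer N" block of the template, and `broken` gives the "Root Cause Location" candidate.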
WHEN error is deep in call stack:
Trace backward:
Record in Phase 1.5 artifact:
## Data Flow Trace
**Error Location:** [file:line - function]
**Bad Value:** [what was wrong]
**Call Stack Trace:**
1. [deepest] `functionA()` at fileA:line - received [value]
2. `functionB()` at fileB:line - passed [value]
3. `functionC()` at fileC:line - passed [value]
4. [source] `functionD()` at fileD:line - ORIGIN of bad value
**Root Cause:** [source location] - [explanation]
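In Python, the backward trace can be started from the exception itself by walking its traceback frames and reading the suspect variable in each one. The sketch below uses the standard `traceback.walk_tb`; the helper `trace_value` and the three-function call chain are hypothetical, invented only to show a bad value surfacing far from its origin.

```python
import traceback


def trace_value(exc, var_name):
    """List (function, line, value of var_name) per frame, deepest first."""
    entries = []
    for frame, lineno in traceback.walk_tb(exc.__traceback__):
        value = frame.f_locals.get(var_name, "<not in scope>")
        entries.append((frame.f_code.co_name, lineno, value))
    entries.reverse()  # walk_tb yields the outermost frame first
    return entries


# Hypothetical call chain: load_user() is the ORIGIN of the bad value.
def load_user():
    return None  # bug: should return a dict


def handle_request():
    user = load_user()
    return render(user)


def render(user):
    return user["name"]  # TypeError surfaces here, far from the origin
```

Calling `handle_request()` inside a `try` block and passing the caught `TypeError` to `trace_value(exc, "user")` shows `user` already bad in both `render` and `handle_request`. The origin, `load_user`, has returned by then and is not in the traceback, so the last step of the trace is reading the assignment in the outermost frame that holds the bad value.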
Find the pattern before fixing:
Record in Phase 1.5 artifact:
## Pattern Analysis
**Working Example:** [file:location]
**Broken Code:** [file:location]
**Key Differences:**
| Aspect | Working | Broken |
|--------|---------|--------|
| [X] | [value] | [value] |
| [Y] | [value] | [value] |
**Likely Cause:** [difference that explains the bug]
**Dependencies:** [what the code needs to work]
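When the working and broken cases can each be described by a flat set of settings, the "Key Differences" table can be generated instead of written by hand. A minimal sketch, assuming both sides are plain dictionaries with string keys; `key_differences` is a hypothetical helper.

```python
def key_differences(working, broken):
    """Return a Markdown table of the aspects whose values differ.

    A key present on only one side counts as a difference.
    """
    missing = "<missing>"
    rows = ["| Aspect | Working | Broken |", "|--------|---------|--------|"]
    for key in sorted(working.keys() | broken.keys()):
        w = working.get(key, missing)
        b = broken.get(key, missing)
        if w != b:
            rows.append(f"| {key} | {w} | {b} |")
    return "\n".join(rows)
```

Identical aspects are omitted on purpose so that the table lists only candidates for the "Likely Cause" field.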
Scientific method:
Record in Phase 1.5 artifact:
## Hypothesis Testing
### Hypothesis 1
**Statement:** [clear hypothesis]
**Rationale:** [why you think this]
**Test:** [minimal change to verify]
**Result:** [confirmed/rejected]
**Evidence:** [output/observation]
### Hypothesis 2 (if needed)
[...]
**Confirmed Root Cause:** [final hypothesis]
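The hypothesis log above implies testing one hypothesis at a time and stopping once one is confirmed. A minimal sketch of that discipline, assuming each hypothesis can be paired with a check that returns `True` when it holds; the `Hypothesis` class and `run_hypotheses` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hypothesis:
    statement: str
    rationale: str
    test: Callable[[], bool]  # returns True when the hypothesis is confirmed
    result: Optional[str] = None  # "confirmed" / "rejected" once tested


def run_hypotheses(hypotheses):
    """Test hypotheses in order, one at a time; stop at the first confirmed."""
    for h in hypotheses:
        h.result = "confirmed" if h.test() else "rejected"
        if h.result == "confirmed":
            return h
    return None  # nothing confirmed: go back and gather more evidence
```

Untested hypotheses keep `result = None`, which makes it visible in the log that they were never run rather than rejected.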
Fix the root cause, not the symptom:
Record in Phase 1.5 artifact:
## Root Cause Summary
**Root Cause:** [one sentence]
**Location:** [file:line]
**Explanation:** [paragraph explaining why]
**Fix Approach:** [high-level]
**Test Strategy:** [how to verify fix]
## Phase 1.5 Gate
Coverage: [Did we find root cause?]
Approval: [Ready to proceed to Phase 2 with fix plan?]
If you catch yourself thinking:
ALL of these mean: STOP. Return to Step 1 (Root Cause Investigation).
| Excuse | Reality |
|--------|---------|
| "Issue is simple, don't need process" | Simple issues have root causes too. Process is fast for simple bugs. |
| "Emergency, no time for process" | Systematic debugging is FASTER than guess-and-check thrashing. |
| "Just try this first, then investigate" | First fix sets the pattern. Do it right from the start. |
| "I'll write test after confirming fix" | Untested fixes don't stick. Test first proves it. |
| "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. |
| "I see the problem, let me fix it" | Seeing symptoms ≠ understanding root cause. |
| "One more fix attempt" (after 2+ failures) | 3+ failures = architectural problem. Question pattern, don't fix again. |
Pattern indicating architectural problem:
STOP and question fundamentals:
Document in Phase 1.5:
## Architectural Concern
**Fix Attempts:** [number]
**Pattern:** [what happens with each fix]
**Recommendation:** [architectural change vs. symptom fix]
**Next Steps:** [escalate, refactor, or accept risk]
File: /.codex/rlm/<run-id>/01.5-root-cause.md
Run: `/.codex/rlm/<run-id>/`
Phase: `01.5 Root Cause Analysis`
Status: `DRAFT` | `LOCKED`
Inputs:
- `/.codex/rlm/<run-id>/01-as-is.md`
- [relevant addenda]
Outputs:
- `/.codex/rlm/<run-id>/01.5-root-cause.md`
Scope note: This document records systematic debugging process and identified root cause.
## Error Analysis
[Section 2.1 - verbatim errors, stack traces]
## Reproduction Verification
[Section 2.2 - exact steps, reproducibility]
## Recent Changes Analysis
[Section 2.3 - git history, dependencies]
## Evidence Gathering
[Section 2.4 - multi-layer diagnostics if applicable]
## Data Flow Trace
[Section 2.5 - backward trace to source]
## Pattern Analysis
[Section 3 - working vs broken comparison]
## Hypothesis Testing
[Section 4 - scientific method log]
## Root Cause Summary
**Root Cause:** [one sentence]
**Location:** [file:line]
**Detailed Explanation:** [paragraph]
**Fix Strategy:** [approach for the Phase 2 fix plan]
**Test Plan:** [how to verify]
## Traceability
- R1 (Bug fix requirement) -> Root cause identified at [location] | Evidence: [section]
## Coverage Gate
- [ ] Error messages analyzed
- [ ] Reproduction verified
- [ ] Recent changes reviewed
- [ ] Data flow traced to source
- [ ] Pattern analysis completed
- [ ] Hypothesis tested and confirmed
- [ ] Root cause documented
- [ ] Fix strategy defined
Coverage: PASS / FAIL
## Approval Gate
- [ ] Root cause identified (not just symptom)
- [ ] Fix approach clear
- [ ] Test strategy defined
- [ ] No "quick fixes" attempted
- [ ] Ready to proceed to Phase 2
Approval: PASS / FAIL
LockedAt: [when locked]
LockHash: [sha256]
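The two gates and the lock hash in the template lend themselves to a mechanical check. The sketch below is an assumption-heavy illustration: it treats a gate as PASS only when every checkbox under its heading is ticked, and it hashes the full UTF-8 text of the artifact. The source does not specify what `LockHash` covers or how gates are evaluated, so both rules and both function names are hypothetical.

```python
import hashlib
import re


def gate_status(artifact_text, gate_heading):
    """Return 'PASS' if every checkbox under `gate_heading` is ticked."""
    if gate_heading not in artifact_text:
        return "FAIL"
    section = artifact_text.split(gate_heading, 1)[1]
    # Stop at the next '## ' heading so only this gate's boxes are counted.
    section = re.split(r"^## ", section, maxsplit=1, flags=re.MULTILINE)[0]
    boxes = re.findall(r"^- \[( |x|X)\]", section, flags=re.MULTILINE)
    return "PASS" if boxes and all(b in "xX" for b in boxes) else "FAIL"


def lock_hash(artifact_text):
    """SHA-256 hex digest of the artifact text (assumed hashing scope)."""
    return hashlib.sha256(artifact_text.encode("utf-8")).hexdigest()
```

A locking step could refuse to set Status to `LOCKED` unless both `gate_status(text, "## Coverage Gate")` and `gate_status(text, "## Approval Gate")` return `PASS`.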
When Phase 1 (AS-IS) identifies a bug/issue that needs fixing:
- Phase 1 (`01-as-is.md`)
- Phase 1.5 (`01.5-root-cause.md`) with Status: DRAFT

Phase 2 (`02-to-be-plan.md`) builds ON Phase 1.5:
## Root Cause Reference
Root cause identified in `01.5-root-cause.md`:
- Location: [file:line]
- Cause: [summary]
- Full analysis: [reference]
## Fix Plan
Based on root cause analysis:
1. [specific fix steps]
2. [test strategy from Phase 1.5]
`01.5-root-cause.md` artifact