.claude/skills/plan-forge/SKILL.md
Use when a task needs an implementation plan that is iteratively created and stress-tested through review-and-revise cycles before implementation begins — catches blind spots, incorrect codebase assumptions, unnecessary complexity, and performance pitfalls while changes are still cheap
Iteratively creates AND refines an implementation plan through review-and-revise cycles. A metallurgy metaphor: the plan is heated (reviewed), hammered (revised), and quenched (finalized) until it holds up under stress.
Unlike /plan-review (one-shot post-hoc review of an existing plan), plan-forge
creates the plan from scratch and runs 1-3 rounds of dual review, consolidation,
and revision before presenting the final artifact.
/plan-review)
/design-tournament, then feed the winner into /plan-forge)
/deep-research or /deeper-research)
/plan-forge <task description>
/plan-forge --rounds=1 <task>
/plan-forge --focus=concurrency <task>
/plan-forge --plan-only <task>
/plan-forge --review-only <path-to-existing-plan>
Phase 0: Plan Creation (orchestrator explores codebase, writes initial plan)
|
v
+---> Phase 1: Dual Review (2 parallel general-purpose agents, fresh context)
| |
| Phase 2: Consolidation (1 general-purpose agent merges findings)
| |
| Phase 3: Revision (orchestrator revises plan inline)
| |
| Decision: continue? ----yes (RETHINK/REVISE items remain, round < max)---+
| | |
| no (only WATCH items, or max round reached) |
| v |
+---- Phase 4: Final Presentation <---------------------------------------------+
Agents per round: 2 reviewers + 1 consolidator = 3
Total agents across 1-3 rounds: 3-9
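The loop condition and agent budget in the diagram above can be sketched in a few lines of Python. This is only a reading aid: the function names are hypothetical, and the orchestrator applies these rules in prose rather than by running code.

```python
def agents_used(rounds: int) -> int:
    """Each round launches 2 reviewers plus 1 consolidator."""
    return rounds * 3


def should_continue(rethink: int, revise: int, round_no: int, max_rounds: int = 3) -> bool:
    """Loop back to Phase 1 only while RETHINK/REVISE items remain and rounds are left.

    WATCH items alone never trigger another round.
    """
    has_actionable_items = rethink > 0 or revise > 0
    return has_actionable_items and round_no < max_rounds
```

For example, `should_continue(rethink=1, revise=0, round_no=3)` is false: the maximum round has been reached, so the flow moves to Phase 4 even though a RETHINK item remains.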
The orchestrator (you, not a sub-agent) creates the initial plan.
crates/gossip-stdx/src/ and neighboring modules for duplication (per CLAUDE.md rules).
~/.claude/plans/{YYYY-MM-DD}-{feature-slug}-v1.md.
Each revision writes a NEW file with an incremented version suffix. Prior versions are kept for reference and diffing.
~/.claude/plans/2026-02-23-retry-logic-v1.md <- Phase 0 output (initial)
~/.claude/plans/2026-02-23-retry-logic-v2.md <- After Round 1 revision
~/.claude/plans/2026-02-23-retry-logic-v3.md <- After Round 2 revision (final)
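The versioned naming scheme above can be sketched as two small path helpers. The function names and the regular expression are illustrative assumptions, not part of the skill itself.

```python
import re

# Matches "<anything>-v<N>.md" and captures the stem and the version number.
_VERSION_RE = re.compile(r"^(?P<stem>.+)-v(?P<n>\d+)\.md$")


def initial_plan_path(date: str, slug: str, plans_dir: str = "~/.claude/plans") -> str:
    """Build the Phase 0 output path, e.g. .../2026-02-23-retry-logic-v1.md."""
    return f"{plans_dir}/{date}-{slug}-v1.md"


def next_version_path(path: str) -> str:
    """Return the path for the next revision; the prior file is left untouched."""
    match = _VERSION_RE.match(path)
    if match is None:
        raise ValueError(f"not a versioned plan path: {path!r}")
    return f"{match['stem']}-v{int(match['n']) + 1}.md"
```

Because each revision gets a fresh path, the helper never overwrites an earlier version, which keeps the files available for diffing.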
# {Plan Title}
| Field | Value |
|------------------|------------------------------|
| Date | {YYYY-MM-DD} |
| Status | Draft / In Review / Final |
| Version | v{N} |
| Rounds completed | {N} |
| Task | {one-line summary} |
## Problem Statement
{What problem does this solve and why does it matter?}
## Codebase Context
{Discovered files, patterns, abstractions relevant to the task.
Include file paths and brief descriptions.}
## Steps
### Step {N}: {Title}
- **What**: {concrete description}
- **Why**: {justification}
- **Files**: {exact paths to create or modify}
- **Tests**: {what to test and how}
- **Acceptance criteria**: {how to verify correctness}
## Testing Strategy
{Overall testing approach --- unit, property-based, integration, etc.}
## Revision Log
{Populated during review rounds. Cumulative across versions.}
| Round | Finding ID | Action Taken |
|-------|-----------|--------------|
## Open Items
{WATCH items and unresolved concerns.}
- --review-only <path>: Skip Phase 0. Load the plan at <path> and jump directly to Phase 1.
- --plan-only: Stop after Phase 0. Write the plan and present it without running any review rounds.

Launch 2 agents in a single message using the Task tool with
subagent_type=general-purpose. Each covers all 4 review lenses but with
different primary emphasis to reduce blind-spot overlap.
| Agent | Label | Primary Emphasis (40%) | Secondary (20% each) |
|-------|------------------|-------------------------------|------------------------------------------|
| Alpha | Forge Inspector | Correctness & Soundness | Footguns, Simplification, Performance |
| Beta | Forge Optimizer | Simplification & Pragmatism | Performance, Correctness, Footguns |
You are {AGENT_LABEL}, a plan reviewer in Round {ROUND} of the Plan Forge
process. You review the plan below through ALL four lenses but emphasize
{PRIMARY_EMPHASIS} (allocate ~40% of your attention there, ~20% each to the
other three).
## Plan Under Review
{PLAN}
## Codebase Context
{CONTEXT}
{PRIOR_ROUND_SECTION}
## Four Review Lenses
### Correctness & Soundness
- Does the plan actually solve the stated problem?
- Are assumptions about existing code accurate? (check the codebase)
- Do referenced types, traits, APIs exist with described signatures?
- Are ordering dependencies correct?
- Do state transitions and invariants hold under all cases?
### Footguns & Failure Modes
- Race conditions, TOCTOU bugs, atomicity gaps
- Edge cases not addressed (empty inputs, overflow, boundaries)
- Error propagation paths that silently swallow failures
- Partial failure scenarios (what if step 3 of 5 fails?)
- Implicit assumptions that break under different configurations
### Simplification
- YAGNI: does the plan build things not yet needed?
- Does the codebase already have utilities the plan reinvents? (search with
Glob/Grep, especially crates/gossip-stdx/src/)
- Could fewer files, types, or steps achieve the same result?
- Are there unnecessary abstraction layers or indirection?
- Could an existing pattern be extended instead of building new?
### Performance & Scalability
- Hot path allocations in loops (Vec, String, Box)
- Lock contention or oversized critical sections
- O(n^2) or worse algorithms hidden in the approach
- Blocking operations in async contexts
- Unbounded growth (queues, buffers, caches without limits)
## Rules
- Explore the codebase (Glob, Grep, Read) to ground findings in reality.
The most valuable findings come from gaps between plan assumptions and
codebase reality.
- Only report findings that REQUIRE action. No nits, no style suggestions.
- Be concrete: cite the specific plan step, section, or quoted text.
- For each finding, state the PROBLEM and the RECOMMENDED CHANGE.
- Rate each finding:
- Impact (1-10): How much does this matter if unaddressed?
- Confidence (0-100%): How sure are you this is a real issue?
## Output Format
Return a markdown document starting with:
`# {AGENT_LABEL} Review --- Round {ROUND}`
For each finding:
### {FINDING_ID}: {title}
- **Plan step**: {which step or section}
- **Lens**: {Correctness | Footguns | Simplification | Performance}
- **Problem**: {what is wrong or missing}
- **Evidence**: {codebase evidence --- file paths, existing code, design docs}
- **Recommended change**: {specific edit to the plan}
- **Impact**: N/10
- **Confidence**: N%
End with: "Total findings: N" (0 is valid --- do not invent issues).
R{round}.A{agent}.F{n}
- a for Alpha, b for Beta.
- R1.Aa.F3 = Round 1, Alpha, Finding 3.

Alpha (Forge Inspector) --- replace {AGENT_LABEL} with Forge Inspector,
{PRIMARY_EMPHASIS} with Correctness & Soundness:
Your primary emphasis is CORRECTNESS & SOUNDNESS (40%). Prioritize verifying
that the plan actually solves the problem, that referenced code exists as
described, and that invariants hold. Give secondary attention (~20% each) to
footguns, simplification, and performance.
Use finding IDs: R{ROUND}.Aa.F1, R{ROUND}.Aa.F2, ...
Beta (Forge Optimizer) --- replace {AGENT_LABEL} with Forge Optimizer,
{PRIMARY_EMPHASIS} with Simplification & Pragmatism:
Your primary emphasis is SIMPLIFICATION & PRAGMATISM (40%). Prioritize finding
YAGNI violations, existing utilities the plan reinvents, and opportunities to
achieve the same result with less complexity. Give secondary attention (~20%
each) to performance, correctness, and footguns.
Use finding IDs: R{ROUND}.Ab.F1, R{ROUND}.Ab.F2, ...
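The finding ID scheme is regular enough to parse mechanically. A minimal sketch follows, assuming a hypothetical `parse_finding_id` helper. It also accepts the consolidated form `R{ROUND}.C.F{n}` that Phase 2 uses.

```python
import re
from typing import NamedTuple


class FindingId(NamedTuple):
    round_no: int
    source: str  # "Alpha", "Beta", or "Consolidated"
    number: int


_ID_RE = re.compile(r"^R(\d+)\.(Aa|Ab|C)\.F(\d+)$")
_SOURCES = {"Aa": "Alpha", "Ab": "Beta", "C": "Consolidated"}


def parse_finding_id(text: str) -> FindingId:
    """Split an ID such as 'R1.Aa.F3' into round, source, and finding number."""
    match = _ID_RE.match(text)
    if match is None:
        raise ValueError(f"not a finding ID: {text!r}")
    round_no, source, number = match.groups()
    return FindingId(int(round_no), _SOURCES[source], int(number))
```

Stable, parseable IDs are what let the Revision Log and the prior-finding tracking table refer back to findings across rounds.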
For rounds 2+, append this section to each agent's prompt:
## Prior Round Findings
The following findings were raised in prior rounds. Check whether the revised
plan adequately addresses them. If a prior finding is STILL present, re-raise
it with a note that it was not resolved.
{PRIOR_CONSOLIDATED_FINDINGS}
After both reviewers complete, launch 1 consolidator agent using the Task
tool with subagent_type=general-purpose.
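The consolidator prompt that follows asks the agent to filter, classify, and issue a verdict using fixed thresholds. As a reading aid, those rules can be sketched in Python. The function names are hypothetical, and the first-match cascade (a high-impact finding with low confidence falls through to the next category) is an assumption about how the overlapping thresholds resolve.

```python
from typing import Optional, Sequence


def classify(impact: int, confidence: int) -> Optional[str]:
    """Return RETHINK, REVISE, or WATCH, or None if the finding is discarded."""
    if impact < 4 or confidence < 50:
        return None  # below threshold or speculative
    if impact >= 8 and confidence >= 70:
        return "RETHINK"
    if impact >= 6 and confidence >= 60:
        return "REVISE"
    return "WATCH"


def is_overloaded(unique_findings: int, rethink_count: int) -> bool:
    """Overload check: too many findings means redesign instead of patching."""
    return unique_findings > 10 or rethink_count > 3


def verdict(categories: Sequence[str]) -> str:
    """Map the surviving findings' categories to the round verdict."""
    if "RETHINK" in categories:
        return "FORGE AGAIN"
    if "REVISE" in categories:
        return "TEMPER"
    return "QUENCH"  # only WATCH items, or no findings at all
```

Under this reading, a finding with impact 9 and confidence 65 becomes REVISE rather than RETHINK, because it misses the 70% confidence bar.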
You are the Forge Consolidator for Round {ROUND}. Two independent reviewers
have examined the same implementation plan. Your job is to merge their findings
into one focused, actionable report and issue a verdict.
## Original Plan
{PLAN}
## Reviewer Reports
{ALPHA_REPORT}
---
{BETA_REPORT}
{PRIOR_TRACKING_SECTION}
## Your Task
### 1. Deduplicate
Group findings that flag the same underlying issue from different angles into
single consolidated findings. Note which reviewers flagged each.
### 2. Overload Check
Count unique findings after deduplication. If there are MORE THAN 10 unique
findings, or MORE THAN 3 that would be classified as RETHINK, emit ONLY:
---
**This plan needs fundamental rework.** The review found {N} issues across
{areas}. Rather than patching individually, redesign the approach. The top 3
structural issues to address first:
1. {highest-impact finding}
2. {second highest}
3. {third highest}
---
Then STOP. Do not produce the full report.
### 3. Score Each Finding (if overload check passes)
For every unique finding, assign:
- **Impact** (1-10):
- 9-10: Fundamental flaw --- approach won't work
- 7-8: Significant gap --- plan needs edits before implementation
  - 4-6: Real concern --- implementation must handle explicitly
  - 1-3: Minor --- below threshold, discard
- **Confidence** (0-100%):
- 90-100: Clear problem with codebase evidence
- 70-89: Very likely, strong reasoning
- 50-69: Plausible, may need investigation
- Below 50: Speculative --- discard
Discard findings with impact < 4 or confidence < 50%.
### 4. Classify
Assign each surviving finding exactly one category:
- **RETHINK** (impact >= 8, confidence >= 70): Fundamental approach change
needed. Non-negotiable.
- **REVISE** (impact >= 6, confidence >= 60): Specific plan edits required.
- **WATCH** (impact >= 4, confidence >= 50): Plan is sound but implementation
must handle this explicitly.
### 5. Issue Verdict
Based on surviving findings:
- **FORGE AGAIN**: Any RETHINK items exist. Plan MUST be revised and
re-reviewed.
- **TEMPER**: No RETHINK items, but REVISE items exist. Plan should be revised
and re-reviewed if round < max.
- **QUENCH**: Only WATCH items (or no findings). Plan is ready.
### 6. Output Format
```markdown
## Forge Consolidation --- Round {ROUND}
**Verdict**: {FORGE AGAIN | TEMPER | QUENCH}
**Unique findings**: {N} (after dedup and filtering)
### RETHINK
| # | Finding ID | Title | Plan Step | Impact | Confidence | Reviewers |
|---|-----------|-------|-----------|--------|------------|-----------|
**Details:**
#### {R{ROUND}.C.F1}: {title}
- **Problem**: {description}
- **Evidence**: {codebase evidence}
- **Recommended change**: {specific plan revision}
- **Original IDs**: {which reviewer finding IDs map here}
### REVISE
{same format}
### WATCH
{same format}
### Prior Finding Tracking
| Prior Finding ID | Status | Notes |
|-----------------|--------|-------|
| R1.C.F2 | RESOLVED | Plan step 3 now addresses this |
| R1.C.F5 | PARTIALLY RESOLVED | Step added but edge case missing |
| R1.C.F7 | UNRESOLVED | Still not addressed |
```
Use: R{ROUND}.C.F{n} (C = consolidated).
### Prior Tracking Section (Rounds 2+)
For rounds 2+, append this to the consolidator prompt:
Track whether each prior finding has been addressed in the revised plan:
{PRIOR_CONSOLIDATED_FINDINGS_WITH_STATUS}
For each prior finding, assign: RESOLVED / PARTIALLY RESOLVED / UNRESOLVED. Include this tracking in your output.
---
## Phase 3 --- Revision (Orchestrator, Inline)
The orchestrator (you, not a sub-agent) revises the plan based on consolidated
findings and writes a **new versioned file**.
### Revision Rules
1. **RETHINK findings**: Make fundamental changes. These are non-negotiable.
2. **REVISE findings**: Make the specific edits recommended.
3. **WATCH findings**: Add to Open Items section. Do NOT restructure the plan
for WATCH items.
4. **Update Revision Log**: Map each finding ID to the action taken.
5. **Increment version** in header and filename (`-v1.md` -> `-v2.md`).
6. **Verify internal consistency**: After edits, re-read the plan to ensure
steps still flow logically and no contradictions were introduced.
7. **Keep prior version file** --- do not delete or overwrite it.
---
## Round Decision
After revision, decide whether to loop back to Phase 1:
| Verdict | Round < max | Round = max |
|-------------|-------------|-------------|
| FORGE AGAIN | -> Phase 1 | -> Phase 4 (forced stop, flag unresolved RETHINK) |
| TEMPER | -> Phase 1 | -> Phase 4 |
| QUENCH | -> Phase 4 | -> Phase 4 |
Default max rounds: 3. Override with `--rounds=N` (1-3).
---
## Phase 4 --- Final Presentation
1. Set plan status to `Final` in the latest version file.
2. Collect all WATCH items into Open Items section.
3. If forced stop with unresolved RETHINK items: add a prominent warning at the
top of the plan file and call it out when presenting to the user.
4. Present a round summary table to the user.
5. Append all review reports as collapsed `<details>` sections at the end of
the plan file.
6. Report version history with file paths.
### Final Presentation Format
```markdown
## Plan Forge Complete
**Plan**: {title}
**Rounds**: {N}
**Final verdict**: {QUENCH | forced stop}
**Version history**:
- `{path}-v1.md` (initial)
- `{path}-v2.md` (round 1 revision)
- `{path}-v3.md` (final)
### Round Summary
| Round | Verdict | RETHINK | REVISE | WATCH | Total |
|-------|---------|---------|--------|-------|-------|
| 1 | FORGE AGAIN | 1 | 3 | 2 | 6 |
| 2 | QUENCH | 0 | 0 | 1 | 1 |
### Open Items (WATCH)
{collected WATCH items from all rounds}
### Review Reports (collapsed)
<details><summary>Round 1 --- Forge Inspector</summary>
{full report}
</details>
<details><summary>Round 1 --- Forge Optimizer</summary>
{full report}
</details>
<details><summary>Round 1 --- Consolidation</summary>
{full report}
</details>
<details><summary>Round 2 --- Forge Inspector</summary>
{full report}
</details>
...
```
| Flag | Effect |
|------|--------|
| --rounds=N | Override max rounds (1-3). Default: 3. |
| --focus=<domain> | Adds domain-specific pitfall context to all agent prompts. |
| --plan-only | Create plan (Phase 0), skip all reviews. |
| --review-only <path> | Skip plan creation, review existing plan at <path>. |
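The flag table can be read as a small command grammar. The sketch below models it with Python's `argparse` purely for illustration: `/plan-forge` is a slash command interpreted by the orchestrator, not a real CLI, and `parse_invocation` is a hypothetical helper.

```python
import argparse
import shlex


def parse_invocation(command: str) -> argparse.Namespace:
    """Parse the text after '/plan-forge' into flags plus the task words."""
    parser = argparse.ArgumentParser(prog="/plan-forge", add_help=False)
    parser.add_argument("--rounds", type=int, choices=[1, 2, 3], default=3)
    parser.add_argument("--focus", default=None)
    parser.add_argument("--plan-only", action="store_true")
    parser.add_argument("--review-only", metavar="PATH", default=None)
    parser.add_argument("task", nargs="*")  # remaining words form the task description
    return parser.parse_args(shlex.split(command))
```

For instance, `parse_invocation("--rounds=1 add retry logic")` yields `rounds == 1` and `task == ["add", "retry", "logic"]`, while an invocation with no flags keeps the default of 3 rounds.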
When --focus=<domain> is specified, append this paragraph to every agent
prompt (Phase 1 and Phase 2):
Additional context: This plan operates in the {DOMAIN} domain. Pay particular
attention to {DOMAIN}-specific concerns.
Domain-specific pitfall lists to include:
- concurrency: data races, deadlock/livelock, lock ordering, priority inversion, false sharing, memory ordering (Acquire/Release vs SeqCst), Send/Sync bounds, async cancellation safety.
- distributed: partial failure, network partitions, clock skew, exactly-once semantics, idempotency, consensus protocol correctness, split-brain, message ordering, retry storms.
- security: input validation, injection (SQL/command/XSS), authentication bypass, authorization escalation, timing side channels, secret management, cryptographic misuse, TOCTOU in security checks.
- performance: allocation hot paths, cache locality, branch prediction, SIMD opportunities, async runtime blocking, lock contention, false sharing, memory layout (SoA vs AoS), tail latency.
- unsafe: soundness holes, aliasing violations, uninitialized memory, lifetime transmutation, Send/Sync impl correctness, drop order, panic safety, provenance.
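One way to assemble the appended paragraph is sketched below. The `focus_paragraph` helper and the "Known pitfalls:" wording are assumptions made for illustration; the skill only specifies the two template sentences and the per-domain lists.

```python
# Pitfall lists copied from the domain entries above.
DOMAIN_PITFALLS = {
    "concurrency": (
        "data races, deadlock/livelock, lock ordering, priority inversion, "
        "false sharing, memory ordering (Acquire/Release vs SeqCst), "
        "Send/Sync bounds, async cancellation safety"
    ),
    "distributed": (
        "partial failure, network partitions, clock skew, exactly-once semantics, "
        "idempotency, consensus protocol correctness, split-brain, "
        "message ordering, retry storms"
    ),
    "security": (
        "input validation, injection (SQL/command/XSS), authentication bypass, "
        "authorization escalation, timing side channels, secret management, "
        "cryptographic misuse, TOCTOU in security checks"
    ),
    "performance": (
        "allocation hot paths, cache locality, branch prediction, "
        "SIMD opportunities, async runtime blocking, lock contention, "
        "false sharing, memory layout (SoA vs AoS), tail latency"
    ),
    "unsafe": (
        "soundness holes, aliasing violations, uninitialized memory, "
        "lifetime transmutation, Send/Sync impl correctness, drop order, "
        "panic safety, provenance"
    ),
}


def focus_paragraph(domain: str) -> str:
    """Build the text appended to every Phase 1 and Phase 2 agent prompt."""
    text = (
        f"Additional context: This plan operates in the {domain} domain. "
        f"Pay particular attention to {domain}-specific concerns."
    )
    pitfalls = DOMAIN_PITFALLS.get(domain)
    if pitfalls:  # unknown domains get the generic paragraph only
        text += f" Known pitfalls: {pitfalls}."
    return text
```

A domain without a predefined list still produces the generic paragraph, so `--focus` degrades gracefully for domains not listed here.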
| Mistake | Why it fails | Do this instead |
|---------|-------------|-----------------|
| Skipping Phase 0 codebase exploration | Plan makes wrong assumptions about existing code | Always Glob/Grep/Read before writing the plan |
| Launching reviewers sequentially | Wastes time and allows anchoring | Always launch both in a single message |
| Orchestrator adding own findings during consolidation | Conflates roles, biases revision | Only the reviewer agents produce findings |
| Revising the plan in-place (overwriting prior version) | Loses diff history | Always write a new -v{N+1}.md file |
| Running 3 rounds on a trivial plan | Overhead exceeds value | Use --rounds=1 for simple plans |
| Treating WATCH items as REVISE | Over-engineering the plan | WATCH goes to Open Items, not plan restructure |
| Ignoring the overload threshold | Patching 15 findings creates a Frankenstein plan | If overload triggers, rethink the approach wholesale |
- /design-tournament: Run a tournament first to pick the approach, then forge the implementation plan for the winning design.
- /plan-review: For a final one-shot validation of the forged plan with 4 specialist lenses instead of 2 generalist reviewers.
- --focus=unsafe: Consider following up with /unsafe-review after implementation.
- diff ~/.claude/plans/*-v1.md ~/.claude/plans/*-v2.md to see exactly how the plan evolved through review rounds.