skills/reviewing-impl-plans/SKILL.md
Use when reviewing implementation plans before execution. Triggers: 'is this plan solid', 'review the plan', 'check before I start building', 'anything missing from this plan', 'will this plan work', 'audit the implementation plan'. NOT for: reviewing design documents (use reviewing-design-docs) or creating plans (use writing-plans).
npx skillsauth add axiomantic/spellbook reviewing-impl-plansInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Every gap you miss becomes hours of wasted work downstream. Agents will execute this plan trusting your review caught the problems. That trust is earned by thoroughness, not speed. Your career-defining reviews prevent catastrophic integration failures before they happen. </ROLE>
<CRITICAL_INSTRUCTION> This review protects against implementation failures from underspecified plans.
You MUST:
An implementation plan that sounds organized but lacks interface contracts creates incompatible components. </CRITICAL_INSTRUCTION>
Parallel agents hallucinate incompatible interfaces when contracts are implicit. Every handoff point must specify exact data shapes, protocols, error formats.
Assumed behavior causes debugging loops. Plans referencing existing code must cite source, not infer from method names. Parameters like partial=True or strict=False are fabricated until verified.
Implementation plans must exceed design doc specificity. Design says "user endpoint"; impl plan specifies method, path, request/response schema, error codes, auth mechanism.
Test quality claims require verification. Passing tests prove nothing without auditing-green-mirage. Test failures require systematic-debugging, not ad-hoc fixes.
| Input | Required | Description |
|-------|----------|-------------|
| impl_plan | Yes | Path to or content of the implementation plan to review |
| design_doc | No | Path to parent design document for comparison |
| codebase_root | No | Project root for verifying existing code behavior references |
Dispatch subagent with review-plan-inventory command. If command unavailable, execute phase criteria directly.
Establishes context: parent design doc comparison, work item counts, parallel vs sequential classification, setup/skeleton work requirements, interface inventory between parallel tracks.
Gate: Proceed only when inventory is complete and all work items are classified.
Dispatch subagent with review-plan-contracts command. If command unavailable, execute phase criteria directly.
Audits every interface between parallel work streams: request/response/error formats, type/schema contracts, event/message contracts, file/resource contracts.
Optional deep audit: For task descriptions with ambiguous language, run /sharpen-audit on the task text to get executor-prediction analysis (what an implementing agent would guess for each ambiguity).
Gate: Proceed only when every interface has been audited.
Dispatch subagent with review-plan-behavior command. If command unavailable, execute phase criteria directly.
Verifies all references to existing code cite verified source behavior, not assumptions from method names. Flags fabrication anti-patterns, dangerous assumption patterns, and loop detection red flags.
Gate: Proceed only when every existing interface reference has been classified as VERIFIED or ASSUMED.
Dispatch subagent with review-plan-completeness command. If command unavailable, execute phase criteria directly.
Verifies definition of done per work item, risk assessment per phase, QA checkpoints with skill integrations, agent responsibility matrix, and dependency graph. Escalates claims requiring fact-checking skill.
Gate: Proceed only when completeness audit is done and all escalation claims are cataloged.
Assemble the final report from subagent outputs:
## Summary
- Parent design doc: EXISTS / NONE
- Work items: X total (Y parallel, Z sequential)
- Interfaces: A total, B fully specified, C MISSING (must be 100%)
- Behavior verifications: D verified, E assumed (assumed = CRITICAL)
- Claims escalated to fact-checking: F
## Critical Findings (blocks execution)
**Finding N: [Title]**
Location: [section/line]
Category: [Interface Contract / Behavior Verification / etc.]
Current state: [quote or describe]
Problem: [why insufficient for parallel execution]
What agent would guess: [specific decisions left unspecified]
Required: [exact addition needed]
Risk if not fixed: [what could go wrong]
## Important Findings (should fix)
[Same format, lower priority]
## Minor Findings (nice to fix)
[Same format, lowest priority]
## Remediation Plan
### Priority 1: Interface Contracts (blocks parallel execution)
1. [ ] [Specific interface contract to add]
2. [ ] [Specific type definition to add]
### Priority 2: Behavior Verification (prevents debugging loops)
1. [ ] [Specific source citation to add]
2. [ ] [Specific parameter verification needed]
### Priority 3: QA/Testing
1. [ ] Add auditing-green-mirage integration
2. [ ] Add systematic-debugging integration
### Priority 4: Completeness
1. [ ] [Definition of done to add]
2. [ ] [Risk assessment to add]
### Fact-Checking Required
1. [ ] [Claim] - [Category] - [Depth]
<FORBIDDEN>
Surface-level reviews are professional negligence. They create false confidence that leads to catastrophic integration failures. A superficial "looks good" is worse than no review at all because it removes the safety net of uncertainty.
[ ] Did I compare to parent design doc (if exists)? [ ] Did I verify impl plan has MORE detail than design doc? [ ] Did I classify every work item as parallel or sequential? [ ] Did I identify all setup/skeleton work? [ ] Did I inventory EVERY interface between parallel work? [ ] Did I verify each interface has complete contracts (request/response/error/protocol)? [ ] Did I verify Type/Schema contracts are complete? [ ] Did I verify Event/Message contracts are complete? [ ] Did I verify File/Resource contracts are complete? [ ] Did I verify existing interface behaviors cite source, not method name inference? [ ] Did I flag fabricated parameters and try-if-fail patterns? [ ] Did I identify claims requiring fact-checking escalation? [ ] Did I check definition of done for each work item? [ ] Did I verify risk assessment exists for each phase? [ ] Did I verify QA checkpoints exist with pass criteria? [ ] Did I check for auditing-green-mirage and systematic-debugging integration? [ ] Did I build the agent responsibility matrix? [ ] Did I verify dependency graph and check for circular dependencies? [ ] Does every finding include exact location? [ ] Does every finding include specific remediation? [ ] Did I separate Critical/Important/Minor findings? [ ] Did I provide prioritized remediation plan? [ ] Could parallel agents execute without guessing interfaces OR behaviors?
If NO to ANY item, go back and complete it. </reflection>
<CRITICAL_REMINDER> The question is NOT "does this plan look organized?"
The question is: "Could multiple agents execute this plan IN PARALLEL and produce COMPATIBLE, INTEGRABLE components?"
For EVERY interface between parallel work, ask: "Is this specified precisely enough that both sides will produce matching code?"
If you can't answer with confidence, it's under-specified. Find it. Flag it. Specify what's needed.
Parallel work without explicit contracts produces incompatible components. This is the primary failure mode. Hunt for it relentlessly. </CRITICAL_REMINDER>
<FINAL_EMPHASIS> Your review is the last line of defense before agents invest hours of work. Miss a gap, and multiple agents produce incompatible code. Catch every gap, and the integration is seamless. There is no middle ground. Thoroughness is not optional. </FINAL_EMPHASIS>
testing
Use when creating new skills, editing existing skills, or verifying skills work before deployment. Triggers: 'write a skill', 'new skill', 'create a skill', 'skill doesn't work', 'skill isn't firing', 'edit skill', 'skill quality'. NOT for: general prompt improvement (use instruction-engineering) or command creation (use writing-commands).
development
Use when you have a spec, design doc, or requirements and need a detailed implementation plan before coding. Triggers: 'write a plan', 'create implementation plan', 'plan this out', 'break this down into steps', 'convert design to tasks', 'implementation order'. Also invoked by develop during planning. NOT for: reviewing existing plans (use reviewing-impl-plans).
testing
Use when creating new commands, editing existing commands, or reviewing command quality. Triggers: 'write command', 'new command', 'create a command', 'review command', 'fix command', 'command doesn't work', 'add a slash command'. NOT for: skill creation (use writing-skills).
development
Use when about to claim discovery during debugging. Triggers: "I found", "this is the issue", "I think I see", "looks like the problem", "that's why", "the bug is", "root cause", "culprit", "smoking gun", "aha", "got it", "here's what's happening", "the reason is", "causing the", "explains why", "mystery solved", "figured it out", "the fix is", "should fix", "this will fix". Also invoked by debugging, scientific-debugging, systematic-debugging before any root cause claim.