skills/dev-review/SKILL.md
This skill should be used when the user asks to 'review the code', 'check implementation quality', or 'run code review'.
npx skillsauth add edwinhu/workflows dev-reviewInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Iteration topology: parallel multi-reviewer fan-out (fresh read-only subagents; main chat reconciles)
Before starting this phase, check remaining context:
| Level | Remaining | Action | |-------|-----------|--------| | Normal | >35% | Proceed | | Warning | 25-35% | Finish the current step, then invoke dev-handoff | | Critical | ≤25% | Invoke dev-handoff immediately — resume fresh |
At Warning/Critical: Read ${CLAUDE_SKILL_DIR}/../../skills/dev-handoff/SKILL.md and follow its instructions.
If the user sends a message NOT about the current review, announce the loop pause before responding — then resume. dev-review runs a REVIEW_STATE.md fix-and-re-review loop; silently abandoning it (as dev-debug:121-139 documents) drops the structure the user invoked.
Protocol:
.planning/REVIEW_STATE.md and continue the review/fix iteration.If the message could be EITHER a new topic OR part of the review, ask before assuming — do NOT silently abandon the loop.
Load shared enforcement:
Auto-load all constraints matching applies-to: dev-review:
!uv run python3 ${CLAUDE_SKILL_DIR}/../../scripts/load-constraints.py dev-review
You MUST have these constraints loaded before proceeding. No claiming you "remember" them.
Dynamic plan re-read: Before starting review, re-read .planning/SPEC.md and .planning/PLAN.md to catch any requirements or tasks added during implementation. Do not rely on cached state from prior phases.
Single-pass code review combining spec compliance and quality checks. Uses confidence-based filtering to report only high-priority issues.
<EXTREMELY-IMPORTANT> ## Prerequisites - Test Output GateDo NOT start review without test evidence.
Before reviewing, verify these preconditions:
.planning/LEARNINGS.md contains actual test output| Valid Evidence | NOT Valid |
|----------------|-----------|
| meson test output with results | "Tests should pass" |
| pytest output showing PASS | "I wrote tests" |
| Screenshot of working UI | "It looks correct" |
| Playwright snapshot showing expected state | "User can verify" |
| D-Bus command output | "The feature works" |
| E2E test output with user flow verified | "Unit tests pass" (for UI changes) |
FOR USER-FACING CHANGES: Unit test evidence is INSUFFICIENT.
Before approving user-facing changes, verify:
| Change Type | Unit Evidence | E2E Evidence | Approval? | |-------------|---------------|--------------|------------| | Internal refactor | Yes | N/A | APPROVE | | API change | Yes | Missing | BLOCKED | | UI change | Yes | Missing | BLOCKED | | User workflow | Yes | Missing | BLOCKED |
Return BLOCKED if E2E evidence is missing for user-facing changes.
"Unit tests pass" without E2E for UI changes is NOT approvable. </EXTREMELY-IMPORTANT>
Check LEARNINGS.md for test output:
rg -E "(PASS|OK|SUCCESS|\d+ passed)" .planning/LEARNINGS.md
If no test output is found, STOP and return to /dev-implement.
"It should work" is NOT evidence. Test output IS evidence. </EXTREMELY-IMPORTANT>
After verifying test output in LEARNINGS.md, choose review strategy.
Skip this choice when:
Codex provides an out-of-process adversarial reviewer that uses a different model family than Claude — the diversity catches issues a Claude-reviewing-Claude loop would miss. When installed and authenticated, it is the default adversarial path. When unavailable, fall back to the existing Claude-based flow without prompting the user about installation.
Read ${CLAUDE_SKILL_DIR}/../../references/codex-availability.md for the full
probe and invocation contract. Execute the probe before asking the user:
CODEX_SCRIPT=$(find "$HOME/.claude/plugins/cache/openai-codex/codex" -maxdepth 3 -name codex-companion.mjs -type f 2>/dev/null | sort -rV | head -1)
if [ -n "$CODEX_SCRIPT" ]; then
node "$CODEX_SCRIPT" setup --json 2>/dev/null | jq -r '.ready // false'
else
echo "false"
fi
Set CODEX_READY=true only when the probe prints true. Otherwise
CODEX_READY=false and skip Codex entirely — do not announce its absence.
If CODEX_READY=true:
AskUserQuestion(questions=[{
"question": "How should we review this implementation?",
"header": "Review Strategy",
"options": [
{"label": "Codex adversarial review (Recommended)", "description": "Out-of-process adversarial review via Codex. Different model family from Claude — catches issues a Claude-on-Claude loop misses. Default for adversarial review."},
{"label": "Single Claude reviewer", "description": "Combined Claude review covering spec compliance and code quality. Faster, lower overhead. Use when Codex is overkill."},
{"label": "Parallel Claude review (Thorough)", "description": "Spawn 3 specialized Claude reviewers (Security, Performance, Tests). Use when Codex is unavailable or you want multi-perspective Claude review."}
],
"multiSelect": false
}])
If CODEX_READY=false:
AskUserQuestion(questions=[{
"question": "How should we review this implementation?",
"header": "Review Strategy",
"options": [
{"label": "Single reviewer (Default)", "description": "Combined review covering spec compliance and code quality. Faster, lower overhead."},
{"label": "Parallel review (Thorough)", "description": "Spawn 3 specialized reviewers (Security, Performance, Tests). Use for security-sensitive, performance-critical, or test-heavy PRs. Requires reconciliation."}
],
"multiSelect": false
}])
Routing:
| Choice | Go to | |--------|-------| | Codex adversarial review | Codex Adversarial Review | | Single (Claude) reviewer | The Iron Law of Review | | Parallel (Claude) review | Parallel Review (Thorough) |
Use this section when the user chose Codex adversarial review.
Reference: See
references/codex-availability.mdfor the full invocation contract, JSON schema, and verdict mapping table.
Before invoking Codex, verify (same as the other review paths):
If any prerequisite fails, STOP and return BLOCKED to /dev-implement.
# Working-tree review
git status --short --untracked-files=all
git diff --shortstat --cached
git diff --shortstat
Wait when the diff is clearly tiny (1-2 files, no untracked dir-sized changes). Otherwise launch in background.
Foreground (small diff):
CODEX_SCRIPT=$(find "$HOME/.claude/plugins/cache/openai-codex/codex" -maxdepth 3 -name codex-companion.mjs -type f 2>/dev/null | sort -rV | head -1)
node "$CODEX_SCRIPT" adversarial-review --wait
Background (anything bigger):
Launch with Bash(..., run_in_background: true) and tell the user:
"Codex adversarial review started in the background. Check /codex:status for progress."
Then await completion notification before proceeding to step 4.
Optional focus text — append SPEC.md context to weight the review:
node "$CODEX_SCRIPT" adversarial-review --wait "focus: REQ-AUTH-01 token rotation under retry"
Codex returns JSON validated against its review-output schema. Top-level fields:
verdict (approve | needs-attention), summary, findings[], next_steps[].
Each finding has severity, title, body, file, line_start, line_end,
confidence (0-1 float), recommendation.
Apply the iron law: only confidence >= 0.8 findings block. Multiply by 100
when displaying alongside Claude-style scores.
| Codex result | dev-review verdict |
|--------------|--------------------|
| verdict: approve | APPROVED |
| needs-attention + any finding ≥ 0.8 confidence | CHANGES_REQUIRED |
| needs-attention + all findings < 0.8 confidence | APPROVED (log advisory findings to LEARNINGS.md) |
Codex doesn't know SPEC.md REQ-IDs. For each blocking finding:
.planning/SPEC.mdOUT-OF-SPEC)OUT-OF-SPEC findings are advisory unless user opts inUse the same output structure as ## Required Output Structure below, with
Reviewer: Codex (adversarial) in the header. Each issue includes the
Codex confidence (×100) and the REQ-ID you tagged in step 5.
Codex adversarial review participates in the same REVIEW_STATE.md loop as
Claude reviewers — increment iteration on CHANGES_REQUIRED, escalate at
iteration 3. Re-runs stay on Codex unless it becomes unavailable between
runs (re-probe each iteration).
The "Iron Law of Re-Review" still applies: implementer claims "fixed" → main chat re-invokes Codex via the same command — no spot-checks.
After Codex review completes:
If APPROVED: Immediately invoke the dev-verify skill:
Read ${CLAUDE_SKILL_DIR}/../../skills/dev-verify/SKILL.md and follow its instructions.
If CHANGES_REQUIRED: Return to /dev-implement with the parsed findings.
If BLOCKED (test evidence missing): Return to /dev-implement to collect test evidence.
Use this section when user chose "Parallel review (Thorough)" above.
Prerequisite: Requires
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMSenabled. If unavailable, fall back to single reviewer.
Before spawning reviewers, verify:
git diff --name-only to scope reviewIf any prerequisite fails, STOP and return BLOCKED to /dev-implement.
Use parallel review when:
Do NOT use when:
TeamCreate(name="Code Review", task_description="Parallel code review with 3 specialized reviewers")
Press Shift+Tab to enter delegate mode. The lead coordinates reviews, does NOT review code directly.
Each reviewer receives a self-contained prompt from a reference file. Reviewers start with a blank conversation and do NOT auto-load skills. Read the prompt, substitute variables, and paste it in full.
Tool Restrictions: All reviewers are READ-ONLY. Dispatch each with allowed_tools=["Read", "Glob", "Grep", "Bash(read-only)"]. Reviewers MUST NOT use Write or Edit tools. They read code, analyze it, and report findings — the main chat handles all fixes.
Before spawning, substitute these variables in each prompt:
CHANGED_FILES -> output of git diff --name-only HEAD~1 (paste actual list)SPEC_CONTEXT -> relevant sections of .planning/SPEC.md (paste inline, do NOT reference file)LEARNINGS_TEST_OUTPUT -> test output from .planning/LEARNINGS.md (paste actual output)PLUGIN_ROOT -> resolved base directory for skill paths (relative to this skill's base directory)Reviewer prompts (read, substitute variables, send as message):
| Reviewer | Focus | Prompt Source |
|----------|-------|---------------|
| 1. Security | Vulnerabilities, auth, data exposure, crypto | references/security-reviewer.md |
| 2. Performance | Complexity, queries, memory, hot paths | references/performance-reviewer.md |
| 3. Tests | Coverage, correctness, reliability, E2E | references/tests-reviewer.md |
While reviewers work, the lead:
Post-subagent boundary (the highest-risk moment). During reconciliation the lead consolidates findings — it does NOT re-review the code or fix anything:
| Lead CAN (verification/coordination) | Lead CANNOT (investigation/fixing) | |--------------------------------------|------------------------------------| | Read reviewer findings, dedup, prioritize | Re-read source to second-guess a finding | | Record verdict in REVIEW_STATE.md | Edit code to fix an issue a reviewer raised | | Route CHANGES_REQUIRED back to dev-implement | Grep the codebase to build a new finding the reviewers missed |
Fixes are dev-implement's job, never the review lead's. (Full rule: auto-loaded verification-vs-investigation / delegation-law constraints.)
After ALL reviewers message completion, the lead performs three passes:
<EXTREMELY-IMPORTANT> **Pass 1 -- Deduplication:**Multiple reviewers may find the same issue (e.g., input validation gap found by both Security and Tests reviewers).
Example:
Security found: "file.py:42 - Input not validated (Confidence: 85)"
Tests found: "file.py:42 - Missing test for invalid input (Confidence: 80)"
-> Merge: "file.py:42 - Input validation missing + no test coverage (Confidence: 85, found by Security + Tests)"
Pass 2 -- Prioritization:
Not all issues are equally important. Rank by:
Create final prioritized list:
1. [CRITICAL] Security: XSS in user input (Confidence: 95)
2. [CRITICAL] Tests: User workflow untested (Confidence: 90)
3. [IMPORTANT] Performance: N+1 query in hot path (Confidence: 85)
4. [IMPORTANT] Tests: Error path missing coverage (Confidence: 80)
Pass 3 -- Integration Check:
Proposed fixes may conflict with each other.
Example conflict:
Security: "Add input validation on every field"
Performance: "Batch validate to reduce overhead"
-> Unified: "Batch validate with early exit on first invalid field (security + performance)"
If ANY pass finds conflicts -> resolve before reporting final verdict. </EXTREMELY-IMPORTANT>
After reconciliation, the lead reports:
## Parallel Code Review: [Feature Name]
Reviewed by: Security, Performance, Tests
### Reconciliation Summary
**Issues found:** X total (Y critical, Z important)
**Duplicates merged:** N
**Conflicts resolved:** M
### Critical Issues (Must Fix)
[Deduplicated, prioritized list from Pass 1 + 2]
### Important Issues (Should Fix)
[Deduplicated, prioritized list from Pass 1 + 2]
### Verdict: APPROVED | CHANGES REQUIRED
[If APPROVED]
All 3 reviewers approved with no issues >= 80 confidence.
[If CHANGES REQUIRED]
X critical and Y important issues must be addressed. Return to /dev-implement.
After parallel review completes:
If APPROVED: Immediately invoke the dev-verify skill:
Read ${CLAUDE_SKILL_DIR}/../../skills/dev-verify/SKILL.md and follow its instructions.
If CHANGES REQUIRED: Return to /dev-implement to fix reported issues.
If BLOCKED (test evidence missing): Return to /dev-implement to collect test evidence.
You MUST report only issues with >= 80% confidence. This is not negotiable.
Before reporting ANY issue, complete these verification steps:
You MUST apply this rule even when encountering:
You MUST discard any low-confidence issue found during review. </EXTREMELY-IMPORTANT>
<EXTREMELY-IMPORTANT> ## The Iron Law of Re-ReviewNO "FIXED" CLAIMS WITHOUT FRESH RE-REVIEW. This is not negotiable.
When review returns CHANGES REQUIRED and the implementer applies fixes, you MUST:
"I fixed it" without re-reviewing is NOT HELPFUL — unverified fixes ship bugs to the user.
Iteration 1: Review → CHANGES REQUIRED → Fix → Re-Review
↓
Iteration 2: Re-Review → CHANGES REQUIRED → Fix → Re-Review
↓
Iteration 3: Re-Review → CHANGES REQUIRED → Fix → Re-Review
↓
Still issues? → ESCALATE to user
All clean? → APPROVED
Track iterations in .planning/REVIEW_STATE.md:
---
iteration: 1
max_iterations: 3
last_review_date: 2026-03-09
issues_found_count: 5
---
Exit criteria:
Before returning any verdict, check iteration count:
.planning/REVIEW_STATE.md (create if missing with iteration: 1)Claiming APPROVED without re-review after fixes is NOT HELPFUL — you're rubber-stamping unverified work that ships bugs to the user.
| Thought | Reality | Do Instead | |---------|---------|------------| | "Implementer said they fixed it" | Their claim needs YOUR verification | Re-run review fresh | | "Just spot-check the fixed lines" | Spot-checks miss regressions | Full re-review, same criteria | | "We're on iteration 3, approve it" | Max iterations means ESCALATE, not approve | Return ESCALATE verdict | | "The fixes are trivial, skip re-review" | Trivial fixes break things | Re-review anyway | | "We already spent too much time" | Shipping bugs wastes more time | Re-review or escalate | | "User is waiting" | User wants WORKING code, not fast approval | Re-review properly |
You skip re-review because you think it's helpful, efficient, or competent. Here's what actually happens:
| Your Drive | Why You Skip | What Actually Happens | The Drive You Failed | |------------|--------------|----------------------|---------------------| | Helpfulness | "Approving fast unblocks the user" | Bugs ship. User spends hours debugging your "help." The protocol takes 10 minutes. Your shortcut cost 10 hours. | Anti-helpful | | Competence | "I trust the implementer's claim" | Trust without verification is negligence. The bug shipped because you didn't check. A checklist would outperform you. | Incompetent | | Efficiency | "Re-review wastes time" | The re-review takes 10 minutes. Bug rework takes 10 hours. Your "efficiency" was a 60x slowdown. | Anti-efficient | | Approval | "User will be frustrated by delays" | User kills your session when bugs ship. They now require human oversight for all reviews. You lost autonomy. | Lost approval | | Honesty | "I'm confident the fix worked" | You didn't verify — you fabricated an approval. Confidence without evidence is fabrication. | Dishonest |
The protocol is not overhead you pay. It is the service you provide.
Every time you skip re-review to "help faster," you choose YOUR comfort over the USER's outcome. The user doesn't experience your tedium—they experience your results. </EXTREMELY-IMPORTANT>
| Thought | Why It's Wrong | Do Instead | |---------|----------------|------------| | "Tests probably pass" | You don't have evidence - absence of evidence is not evidence | Check LEARNINGS.md for actual output | | "This looks wrong" | Your vague suspicion != evidence | Find concrete proof or discard | | "I would do it differently" | Your style preference != bug | Check if it violates project guidelines | | "This might cause problems" | Your "might" = < 80% confidence | Find proof or discard | | "Pre-existing but should be fixed" | You're out of scope | Score it 0 and discard | | "User can test it" | Your manual testing is less reliable than automation | Return to implement phase |
Rate each potential issue from 0-100:
| Score | Meaning | |-------|---------| | 0 | False positive or pre-existing issue | | 25 | Might be real, might not. Stylistic without guideline backing | | 50 | Real issue but nitpick or rare in practice | | 75 | Verified real issue, impacts functionality | | 100 | Absolutely certain, confirmed with direct evidence |
CRITICAL: Only report issues with confidence >= 80.
## Code Review: [Feature/Change Name]
Reviewing: [files/scope being reviewed]
### Test Evidence Verified
- Unit tests: [PASS/FAIL/MISSING] - [paste key output line]
- Integration: [PASS/FAIL/N/A]
- UI/Visual: [Screenshot taken / Snapshot verified / N/A]
### Critical Issues (Confidence >= 90)
#### [Issue Title] (Confidence: XX)
**Location:** `file/path.ext:line_number`
**Requirement:** [REQ-ID from SPEC.md — every issue MUST trace to a requirement ID]
**Problem:** Clear description of the issue
**Fix:**
```[language]
// Specific code fix
[Same format as Critical Issues]
Verdict: APPROVED | CHANGES REQUIRED | BLOCKED (no test evidence)
[If APPROVED] The reviewed code meets project standards. Tests pass. No issues with confidence >= 80 detected.
[If CHANGES REQUIRED] X critical issues and Y important issues must be addressed before proceeding.
[If BLOCKED] Cannot approve without test evidence. Return to /dev-implement and run tests.
**If review finds the implementation fundamentally violates the spec (not just minor issues), DELETE the contaminated implementation and return to dev-implement for a fresh attempt. Do not patch a structurally wrong approach.**
### Delete & Restart Protocol
**When implementation deviates fundamentally from spec, DELETE and restart entirely.**
| Situation | Action |
|-----------|--------|
| Code uses wrong protocol/architecture than spec | DELETE. Rewrite from scratch with correct approach. |
| Code implements different approach than PLAN.md | DELETE. User approved specific approach for a reason. |
| Fundamental misunderstanding of requirements | DELETE. Don't patch. Fresh subagent with correct understanding. |
| Patch would require 30%+ of implementation to change | DELETE. Rewrite is cleaner than patching wrong foundation. |
**Why delete instead of patch:** Patching a structurally wrong approach creates technical debt. Fresh implementation from correct architecture is faster than fixing wrong foundation.
**When to patch instead:** Bug in otherwise-correct implementation, missing edge case, performance tweak, minor deviation that doesn't affect core behavior.
**The test:** If the subagent says "oh, I misunderstood the whole approach" → DELETE and restart.
## Agent Invocation
Spawn Task agent for review execution:
Task(subagent_type="general-purpose", allowed_tools=["Read", "Glob", "Grep", "Bash(read-only)"]): "Review implementation against .planning/SPEC.md.
Tool Restrictions: You are READ-ONLY. You MUST NOT use Write or Edit tools. You read code, check test evidence, and report issues — you do NOT fix them. The main chat handles all fixes.
FIRST: Check .planning/LEARNINGS.md for test output. Return BLOCKED immediately if no test output is found.
Complete single-pass review covering:
Confidence score each issue (0-100). Report only issues with >= 80 confidence. Return structured output per /dev-review format."
## Drive-Aligned Framing
<EXTREMELY-IMPORTANT>
**You approving without test evidence is NOT HELPFUL — you're shipping broken code the user will have to debug.**
An "APPROVED" verdict means YOU assert:
- Tests actually ran (not "should work")
- Test output shows PASS (not SKIP, not assumed)
- Evidence exists and YOU verified it (not trusted reports)
You approving without test evidence is not "efficiency" - it is creating rework for the user.
**BLOCKED protects the user. Your fake APPROVED ships bugs.**
</EXTREMELY-IMPORTANT>
## Rationalization Prevention
STOP - you're about to rationalize if these thoughts arise--they indicate anti-helpful approval:
| Thought | Reality |
|---------|---------|
| "Tests probably pass" | Your probably != evidence. Check LEARNINGS.md. |
| "I saw the code, it looks right" | Your looking != running. Find test output. |
| "User is waiting for approval" | They want honest approval. You return BLOCKED if needed. |
| "It's a small change" | Your size estimate doesn't matter. Small changes break things. Require evidence. |
| "I trust the implementer" | Your trust doesn't replace verification. You verify evidence. |
| "I'll approve and they can fix later" | You block now or bugs ship to users. |
| "Review is just a formality" | Review is the LAST GATE before bugs ship. You execute seriously. |
## Why Skipping Review Destroys the Drive That Created It
| Your Drive | Why You Skip | What Actually Happens | The Drive You Failed |
|------------|-------------|----------------------|---------------------|
| **Helpfulness** | "Approving fast shows efficiency" | Bugs ship. User spends hours debugging your "help." | **Anti-helpful** |
| **Competence** | "I trust the subagent" | Subagent's work was wrong. Your approval was negligence. | **Incompetent** |
| **Efficiency** | "Rubber-stamp to move faster" | You moved fast into bugs. Rework takes 3x longer. | **Anti-efficient** |
| **Approval** | "Tests probably pass" | User discovers bugs. They now require human oversight for reviews. | **Trust destroyed** |
| **Honesty** | "Tests probably pass" | Test output doesn't exist. You fabricated an approval. | **Dishonest** |
**The protocol is not overhead you pay. It is the service you provide.**
## Quality Standards
- **Test evidence is mandatory** - do not approve without test output
- Do not report style preferences lacking project guideline backing
- Do not report pre-existing issues (confidence = 0)
- Make each reported issue immediately actionable
- Use absolute file paths with line numbers in reports
- Treat uncertainty as below 80 confidence
## Gate: Exit Review Loop
**Checkpoint type:** human-verify (test evidence and confidence scores are machine-verifiable)
Before claiming review is complete (APPROVED or ESCALATE):
IDENTIFY → What proves the review verdict is valid? - APPROVED: Zero issues >= 80 confidence - ESCALATE: iteration >= 3 AND issues remain
RUN → Check .planning/REVIEW_STATE.md for iteration count
Read review output for issue count
READ → Examine both: - Review output (issues list) - REVIEW_STATE.md (iteration number)
VERIFY → Verdict matches state: - APPROVED only if 0 issues - ESCALATE only if iteration >= 3 - CHANGES REQUIRED only if iteration < 3
CLAIM → Only after steps 1-4 pass, return verdict
**If iteration >= 3 and you're returning CHANGES REQUIRED instead of ESCALATE, you're ignoring the iteration limit — escalate to the user instead of looping forever.**
## Phase Complete
**Phase summary (append to LEARNINGS.md):**
```yaml
## Phase: Review
---
phase: review
status: completed
implements: [] # review verifies; implements no new requirement IDs
requires: [VALIDATION.md, LEARNINGS.md]
provides: [REVIEW_STATE.md, review-verdict]
affects: [] # read-only review; fixes happen in dev-implement
verdict: APPROVED | CHANGES_REQUIRED | ESCALATE | BLOCKED
iterations: N
issues-found: X (Y critical, Z important)
---
After review completes, handle verdict-specific transitions:
Mark review complete in .planning/REVIEW_STATE.md:
---
status: APPROVED
iteration: [N]
max_iterations: 3
last_review_date: [date]
issues_found_count: 0
verdict: APPROVED
---
The status: APPROVED line is the structural gate dev-verify checks — only an APPROVED review admits verification. On non-approved paths (CHANGES_REQUIRED / ESCALATE / BLOCKED) set status: to that verdict so the gate correctly blocks.
Immediately invoke dev-verify:
Read ${CLAUDE_SKILL_DIR}/../../skills/dev-verify/SKILL.md and follow its instructions.
Update .planning/REVIEW_STATE.md:
---
status: CHANGES_REQUIRED
iteration: [N+1]
max_iterations: 3
last_review_date: [date]
issues_found_count: [count]
verdict: CHANGES_REQUIRED
---
Return to /dev-implement with specific issues. Implementer MUST re-invoke /dev-review after fixes.
Critical: When implementer returns claiming "fixed", you MUST re-run the FULL review. No shortcuts.
Update .planning/REVIEW_STATE.md:
---
status: ESCALATE
iteration: 3
max_iterations: 3
last_review_date: [date]
issues_found_count: [count]
verdict: ESCALATE
---
Report to user:
Review Loop Escalation (3 iterations completed)
After 3 fix-review cycles, [N] issues remain:
[List issues]
Options:
1. Accept current state and proceed (issues become tech debt)
2. Extend review (manual approval for iteration 4+)
3. Rethink approach (return to /dev-design)
Which option do you prefer?
Return immediately to /dev-implement to collect test evidence. Do NOT increment iteration counter - no review occurred.
| Verdict | Next Action | Iteration Counter |
|---------|-------------|-------------------|
| APPROVED | Invoke /dev-verify immediately, mark task [x] in PLAN.md | Reset to 1 for next task |
| CHANGES REQUIRED | Return to /dev-implement, implementer fixes then re-invokes /dev-review | Increment |
| ESCALATE | Ask user for direction | Keep at max |
| BLOCKED | Return to /dev-implement for test evidence | No change (no review ran) |
Do NOT pause between review completion and next action. The workflow is sequential.
testing
Internal skill for literature review and source materialization. Called after brainstorm, before setup. NOT user-facing.
documentation
This skill should be used when the user asks to 'write a paper', 'start a writing project', 'draft an article', 'write about', 'brainstorm writing topics', 'gather sources for a paper', 'what should I write about', or needs the writing workflow entry point for any writing task.
testing
Validate draft sections cover all PRECIS claims before review.
testing
Internal skill for creating PRECIS.md, OUTLINE.md, and ACTIVE_WORKFLOW.md. Called after brainstorm sources are gathered.