skills/code-review/SKILL.md
Use when reviewing code. Triggers: 'review my code', 'check my work', 'look over this', 'review PR #X', 'PR comments to address', 'reviewer said', 'address feedback', 'self-review before PR', 'audit this code'. For heavyweight multi-phase analysis, use advanced-code-review instead.
npx skillsauth add axiomantic/spellbook code-reviewInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
| Input | Required | Description |
|-------|----------|-------------|
| args | Yes | Mode flags and targets |
| git diff | Auto | Changed files |
| PR data | If --pr | PR metadata via GitHub |
| Output | Type | Description |
|--------|------|-------------|
| findings | List | Issues with severity, file:line |
| status | Enum | PASS/WARN/FAIL or APPROVE/REQUEST_CHANGES |
| Flag | Mode | Command File |
|------|------|-------------|
| --self, -s, (default: no flag given) | Pre-PR self-review | (inline below) |
| --feedback, -f | Process received feedback | code-review-feedback |
| --give <target> | Review someone else's code | code-review-give |
| --audit [scope] | Multi-pass deep-dive | (inline below) |
Modifiers: --tarot (roundtable dialogue via code-review-tarot), --pr <num> (PR source)
| Tool | Purpose |
|------|---------|
| pr_fetch(num_or_url) | Fetch PR metadata and diff |
| pr_diff(raw_diff) | Parse diff into FileDiff objects |
| pr_match_patterns(files, root) | Heuristic pre-filtering |
| pr_files(pr_result) | Extract file list |
MCP tools for read/analyze. gh CLI for write operations (posting reviews, replies). Fallback: MCP unavailable -> gh CLI -> local diff -> manual paste.
--self)Workflow:
git diff $(git merge-base origin/main HEAD)..HEADmemory_recall(query="review finding [project_or_module]") to surface:
<spellbook-memory> context from reading the files under review, incorporate that as well. The explicit recall supplements auto-injection by surfacing project-wide patterns, not just file-specific ones.Example finding: src/auth/login.py:42 [Critical] Token written to log — data exposure risk
memory_store_memories(memories='{"memories": [{"content": "[Finding description]. Severity: [level]. Status: [confirmed/false_positive/deferred].", "memory_type": "[fact or antipattern]", "tags": ["review", "[finding_category]", "[module]"], "citations": [{"file_path": "[reviewed_file]", "line_range": "[lines]"}]}]}')
--audit [scope])Scopes: (none)=branch changes, file.py, dir/, security, all
Memory Priming: Before starting audit passes, call memory_recall(query="review finding [project_or_module]") to surface recurring issues, known false positives, and prior review decisions. Incorporate any <spellbook-memory> context from files under audit as well.
Passes: Correctness > Security > Performance > Maintainability > Edge Cases
API Hallucination Detection (Correctness Pass):
During the Correctness pass, check for API hallucination patterns:
When reviewing AI-generated code, these checks are elevated to HIGH severity. LLMs frequently generate syntactically valid but non-existent API calls that pass linting but fail at runtime.
Output: Executive Summary, findings by category (same severity thresholds as Self Mode), Risk Assessment (LOW/MEDIUM/HIGH/CRITICAL)
Persist Review Findings: After finalizing audit findings, store significant ones using the same protocol as Self Mode (see step 5 above). Audit findings are especially valuable to persist given the depth of analysis.
<FINAL_EMPHASIS> Every finding without file:line is noise. Every severity inflated by effort is a lie. Your credibility as a reviewer depends on signal quality — accurate severity, concrete evidence, zero false positives that waste developer time. </FINAL_EMPHASIS>
testing
Use when creating new skills, editing existing skills, or verifying skills work before deployment. Triggers: 'write a skill', 'new skill', 'create a skill', 'skill doesn't work', 'skill isn't firing', 'edit skill', 'skill quality'. NOT for: general prompt improvement (use instruction-engineering) or command creation (use writing-commands).
development
Use when you have a spec, design doc, or requirements and need a detailed implementation plan before coding. Triggers: 'write a plan', 'create implementation plan', 'plan this out', 'break this down into steps', 'convert design to tasks', 'implementation order'. Also invoked by develop during planning. NOT for: reviewing existing plans (use reviewing-impl-plans).
testing
Use when creating new commands, editing existing commands, or reviewing command quality. Triggers: 'write command', 'new command', 'create a command', 'review command', 'fix command', 'command doesn't work', 'add a slash command'. NOT for: skill creation (use writing-skills).
development
Use when about to claim discovery during debugging. Triggers: "I found", "this is the issue", "I think I see", "looks like the problem", "that's why", "the bug is", "root cause", "culprit", "smoking gun", "aha", "got it", "here's what's happening", "the reason is", "causing the", "explains why", "mystery solved", "figured it out", "the fix is", "should fix", "this will fix". Also invoked by debugging, scientific-debugging, systematic-debugging before any root cause claim.