Systematic Code Review Skill

Systematic 4-phase code review: UNDERSTAND changes, VERIFY claims against actual behavior, ASSESS security/performance/architecture risks, DOCUMENT findings with severity classification. Each phase has an explicit gate that must pass before proceeding because skipping phases causes missed context, incorrect conclusions, and incomplete risk assessment.

Reference Loading Table

| Signal | Load These Files | Why | |---|---|---| | implementation patterns | go-review-patterns.md | Loads detailed guidance from go-review-patterns.md. | | tasks related to this reference | receiving-feedback.md | Loads detailed guidance from receiving-feedback.md. | | tasks related to this reference | severity-classification.md | Loads detailed guidance from severity-classification.md. |

Instructions

Phase 1: UNDERSTAND

Goal: Map all changes and their relationships before forming any opinions.

Step 1: Read CLAUDE.md

Read and follow repository CLAUDE.md files first because project conventions override default review criteria and may define custom severity rules, approved patterns, or scope constraints.

Step 2: Read every changed file

Use Read tool on EVERY changed file completely because reviewing summaries or reading partial files misses dependencies between changes and leads to incorrect conclusions.
Map what each file does and how changes affect it.
Check affected dependencies and identify ripple effects because changes in one file can break consumers that aren't in the diff.

Step 3: Identify dependencies

Use Grep to find all callers/consumers of changed code.
Note any comments that make claims about behavior (these are claims to verify in Phase 2, not facts to trust).

Step 3a: Caller Tracing (mandatory when diff modifies function signatures, parameter semantics, or introduces sentinel/special values)

When the change modifies how a function/method is called or what parameters mean:

Find ALL callers — Grep for the function name with receiver syntax (e.g., .GetEvents( not just GetEvents) across the entire repo. For Go repos, prefer gopls go_symbol_references via ToolSearch("gopls") — it's type-aware and catches interface implementations.
Trace the VALUE SPACE — For each parameter source, classify what values can flow through:
- Query parameters (r.FormValue, r.URL.Query): user-controlled — ANY string including sentinel values like "*"
- Auth token fields: server-controlled (UUIDs, structured IDs)
- Constants/enums: fixed set
- Classify sentinel strings by their true source. If a value can originate from user input, treat it as reachable and verify the input path.
Verify each caller — For each call site, check that parameters are validated before being passed. Pay special attention to sentinel values (e.g., "*" meaning "all/unfiltered") that bypass security filtering.
Report unvalidated paths — Any caller that passes user input to a security-sensitive parameter without validation is a BLOCKING finding.
Verify callers independently — Treat the PR description as a lead, then confirm the caller set directly in the codebase. The PR author may have forgotten about callers, or new callers may have been added in other branches.

This step catches:

Unchecked paths where user input reaches a security-sensitive parameter
Callers the PR author forgot about or didn't mention
Interface implementations that don't enforce the same preconditions

Step 4: Document scope

PHASE 1: UNDERSTAND

Changed Files:
  - [file1.ext]: [+N/-M lines] [brief description of change]
  - [file2.ext]: [+N/-M lines] [brief description of change]

Change Type: [feature | bugfix | refactor | config | docs]

Scope Assessment:
  - Primary: [what's directly changed]
  - Secondary: [what's affected by the change]
  - Dependencies: [external systems/files impacted]

Caller Tracing (if signature/parameter semantics changed):
  - [function/method]: [N] callers found
    - [caller1:line] — parameter validated: [yes/no]
    - [caller2:line] — parameter validated: [yes/no]
  - Unvalidated paths: [list or "none"]

Questions for Author:
  - [Any unclear aspects that need clarification]

Gate: All changed files read (not just some — reading 2 of 5 files and saying "I get the gist" fails this gate), scope fully mapped, callers traced (if applicable). Proceed only when gate passes.

Phase 2: VERIFY

Opus 4.7 override: Opus 4.7 trades tool calls for reasoning by default. In verification, that default is wrong. Run the command. Do not reason about whether the command would pass. Do not summarize the expected output. Execute the check, paste the exit code, paste the relevant output. A verification phase that produces a verdict without an observed tool result is not a verification — it is a guess with a rigor aesthetic.

Goal: Validate all assertions in code, comments, and PR description against actual behavior.

Step 1: Run tests

Execute existing tests for changed files because review cannot approve without test execution — visual inspection misses runtime issues that tests catch.
Capture complete test output. Show the output rather than describing it because facts outweigh narrative.
Verify test coverage: confirm tests exist for the changed code paths because untested code paths are a SHOULD FIX finding.

Step 2: Verify claims

Check every comment claim against code behavior because comments frequently become outdated and developers may not understand what "thread-safe" actually means — never accept a comment as truth without inspecting the code that backs it.
Verify edge cases mentioned are actually handled.
Trace through critical code paths manually.

Step 3: Document verification

PHASE 2: VERIFY

Claims Verification:
  Claim: "[Quote from comment or PR description]"
  Verification: [How verified - test run, code trace, etc.]
  Result: VALID | INVALID | NEEDS-DISCUSSION

Test Execution:
  $ [test command]
  Result: [PASS/FAIL with summary]

Behavior Verification:
  - Expected: [what change claims to do]
  - Actual: [what code actually does]
  - Match: YES | NO | PARTIAL

Gate: All assertions in code/comments verified against actual behavior. Tests executed with output captured. Proceed only when gate passes.

Phase 3: ASSESS

Goal: Evaluate security, performance, and architectural risks specific to these changes.

Step 1: Security assessment

Never skip this step because security implications must be explicitly evaluated for every review, even when changes appear benign.
Evaluate OWASP top 10 against changes.
Explain HOW each vulnerability was ruled out (not just checkboxes) because a checkbox approach misses context-specific vulnerabilities and gives false confidence.
If optionally enabled: perform extended deep security analysis beyond standard checks.

Step 2: Performance assessment

Identify performance-critical paths and evaluate impact.
Check for N+1 queries, unbounded loops, unnecessary allocations.
If optionally enabled: benchmark affected code paths with profiling.

Step 3: Architectural assessment

Compare patterns to existing codebase conventions.
Assess breaking change potential.
If optionally enabled: check for similar past issues via historical analysis.

Step 4: Extraction severity escalation

If the diff extracts inline code into named helper functions, re-evaluate all defensive guards.
A missing check rated LOW as inline code (1 caller, "upstream validates") becomes MEDIUM as a reusable function (N potential callers).

Step 5: Document assessment

PHASE 3: ASSESS

Security Assessment:
  SQL Injection: [N/A | CHECKED - how verified | ISSUE - details]
  XSS: [N/A | CHECKED - how verified | ISSUE - details]
  Input Validation: [N/A | CHECKED - how verified | ISSUE - details]
  Auth: [N/A | CHECKED - how verified | ISSUE - details]
  Secrets: [N/A | CHECKED - how verified | ISSUE - details]
  Findings: [specific issues or "No security issues found"]

Performance Assessment:
  Findings: [specific issues or "No performance issues found"]

Architectural Assessment:
  Findings: [specific issues or "Architecturally sound"]

Risk Level: LOW | MEDIUM | HIGH | CRITICAL

Gate: Security, performance, and architectural risks explicitly evaluated (not skipped or hand-waved). Proceed only when gate passes.

Phase 4: DOCUMENT

Goal: Produce structured review output with clear verdict and rationale.

Report facts without self-congratulation. Show command output rather than describing it. Be concise but informative because the review consumer needs actionable findings, not commentary.

Only flag issues within the scope of the changed code because suggesting features outside PR scope is over-engineering — but DO flag all issues IN the changed code even if fixing them requires touching other files. No speculative improvements.

When classifying severity, use the Severity Classification Rules below and classify UP when in doubt because it is better to require a fix and have the author push back than to let a real issue slip through as "optional."

PHASE 4: DOCUMENT

Review Summary:
  Files Reviewed: [N]
  Lines Changed: [+X/-Y]
  Test Status: [PASS/FAIL/SKIPPED]
  Risk Level: [LOW/MEDIUM/HIGH/CRITICAL]

Findings (use Severity Classification Rules - when in doubt, classify UP):

BLOCKING (cannot merge - security/correctness/reliability):
  1. [Issue with file:line reference and category from rules]

SHOULD FIX (fix unless urgent - patterns/tests/debugging):
  1. [Issue with file:line reference and category from rules]

SUGGESTIONS (author's choice - purely stylistic):
  1. [Suggestion with benefit - only if genuinely optional]

POSITIVE NOTES:
  1. [Good practice observed]

Verdict: APPROVE | REQUEST-CHANGES | NEEDS-DISCUSSION

Rationale: [1-2 sentences explaining verdict]

After producing the review, remove any temporary analysis files, notes, or debug outputs created during review because only the final review document should persist.

Gate: Structured review output with clear verdict. Review is complete.

Reference Material

Trust Hierarchy

When conflicting information exists, trust in this order:

Running code (highest) - What tests show
HTTP/API requests - Verified external behavior
Grep results - What exists in codebase
Reading source - Direct file inspection
Comments/docs - Claims that need verification
PR description (lowest) - May be outdated or incomplete

Severity Classification Rules

Three tiers: BLOCKING (cannot merge — security, correctness, reliability), SHOULD FIX (fix unless urgent — patterns, tests, debugging), SUGGESTIONS (genuinely optional — style, naming, micro-optimizations). When in doubt, classify UP.

See references/severity-classification.md for full classification tables, the decision tree, and common misclassification examples.

Go-Specific Review Patterns

Watch for patterns that linters miss: type export design, concurrency patterns (batch+callback, loop variable capture, commit callbacks), resource management (defer placement, connection pool reuse), metrics pre-initialization, testing deduplication, and unnecessary function extraction.

For projects using shared organization libraries: check for manual SQL row iteration instead of helpers, incorrect assertion depth, raw sql.Open() in tests, dead migration files, and database-specific naming violations.

See references/go-review-patterns.md for full checklists and red flags.

Receiving Review Feedback

When receiving feedback: read completely, restate requirement to confirm understanding, verify against codebase, evaluate technical soundness, respond with reasoning or just fix it. Never performative agreement. Apply YAGNI check before implementing "proper" features. Stop and clarify before implementing anything unclear — items may be related.

See references/receiving-feedback.md for the full reception pattern, pushback examples, implementation order, and external vs internal reviewer handling.

Error Handling

Error: "Incomplete Information"

Cause: Missing context about the change or its purpose Solution:

Ask for clarification in Phase 1
Do not proceed past UNDERSTAND with unanswered questions
Document gaps in scope assessment

Error: "Test Failures"

Cause: Tests fail during Phase 2 verification Solution:

Document in Phase 2
Automatic BLOCKING finding in Phase 4
Cannot APPROVE with failing tests

Error: "Unclear Risk"

Cause: Cannot determine severity of an issue Solution:

Default to higher risk level (classify UP)
Document uncertainty in assessment
REQUEST-CHANGES if critical uncertainty exists

References

Examples

Example 1: Pull Request Review

User says: "Review this PR" Actions:

Read CLAUDE.md, then read all changed files, map scope and dependencies (UNDERSTAND)
Run tests, verify claims in comments and PR description (VERIFY)
Evaluate security/performance/architecture risks (ASSESS)
Produce structured findings with severity and verdict (DOCUMENT) Result: Structured review with clear verdict and rationale

Example 2: Pre-Merge Verification

User says: "Check this before we merge" Actions:

Read CLAUDE.md, then read all changes, identify breaking change potential (UNDERSTAND)
Run full test suite, verify backward compatibility claims (VERIFY)
Assess risk level for production deployment (ASSESS)
Document findings with APPROVE/REQUEST-CHANGES verdict (DOCUMENT) Result: Go/no-go decision with evidence

Systematic Code Review Skill

Reference Loading Table

Instructions

Phase 1: UNDERSTAND

Goal: Map all changes and their relationships before forming any opinions.

Step 1: Read CLAUDE.md

Read and follow repository CLAUDE.md files first because project conventions override default review criteria and may define custom severity rules, approved patterns, or scope constraints.

Step 2: Read every changed file

Use Read tool on EVERY changed file completely because reviewing summaries or reading partial files misses dependencies between changes and leads to incorrect conclusions.
Map what each file does and how changes affect it.
Check affected dependencies and identify ripple effects because changes in one file can break consumers that aren't in the diff.

Step 3: Identify dependencies

Use Grep to find all callers/consumers of changed code.
Note any comments that make claims about behavior (these are claims to verify in Phase 2, not facts to trust).

Step 3a: Caller Tracing (mandatory when diff modifies function signatures, parameter semantics, or introduces sentinel/special values)

When the change modifies how a function/method is called or what parameters mean:

Find ALL callers — Grep for the function name with receiver syntax (e.g., .GetEvents( not just GetEvents) across the entire repo. For Go repos, prefer gopls go_symbol_references via ToolSearch("gopls") — it's type-aware and catches interface implementations.
Trace the VALUE SPACE — For each parameter source, classify what values can flow through:
- Query parameters (r.FormValue, r.URL.Query): user-controlled — ANY string including sentinel values like "*"
- Auth token fields: server-controlled (UUIDs, structured IDs)
- Constants/enums: fixed set
- Classify sentinel strings by their true source. If a value can originate from user input, treat it as reachable and verify the input path.
Verify each caller — For each call site, check that parameters are validated before being passed. Pay special attention to sentinel values (e.g., "*" meaning "all/unfiltered") that bypass security filtering.
Report unvalidated paths — Any caller that passes user input to a security-sensitive parameter without validation is a BLOCKING finding.
Verify callers independently — Treat the PR description as a lead, then confirm the caller set directly in the codebase. The PR author may have forgotten about callers, or new callers may have been added in other branches.

This step catches:

Unchecked paths where user input reaches a security-sensitive parameter
Callers the PR author forgot about or didn't mention
Interface implementations that don't enforce the same preconditions

Step 4: Document scope

PHASE 1: UNDERSTAND

Changed Files:
  - [file1.ext]: [+N/-M lines] [brief description of change]
  - [file2.ext]: [+N/-M lines] [brief description of change]

Change Type: [feature | bugfix | refactor | config | docs]

Scope Assessment:
  - Primary: [what's directly changed]
  - Secondary: [what's affected by the change]
  - Dependencies: [external systems/files impacted]

Caller Tracing (if signature/parameter semantics changed):
  - [function/method]: [N] callers found
    - [caller1:line] — parameter validated: [yes/no]
    - [caller2:line] — parameter validated: [yes/no]
  - Unvalidated paths: [list or "none"]

Questions for Author:
  - [Any unclear aspects that need clarification]

Phase 2: VERIFY

Opus 4.7 override: Opus 4.7 trades tool calls for reasoning by default. In verification, that default is wrong. Run the command. Do not reason about whether the command would pass. Do not summarize the expected output. Execute the check, paste the exit code, paste the relevant output. A verification phase that produces a verdict without an observed tool result is not a verification — it is a guess with a rigor aesthetic.

Goal: Validate all assertions in code, comments, and PR description against actual behavior.

Step 1: Run tests

Execute existing tests for changed files because review cannot approve without test execution — visual inspection misses runtime issues that tests catch.
Capture complete test output. Show the output rather than describing it because facts outweigh narrative.
Verify test coverage: confirm tests exist for the changed code paths because untested code paths are a SHOULD FIX finding.

Step 2: Verify claims

Check every comment claim against code behavior because comments frequently become outdated and developers may not understand what "thread-safe" actually means — never accept a comment as truth without inspecting the code that backs it.
Verify edge cases mentioned are actually handled.
Trace through critical code paths manually.

Step 3: Document verification

PHASE 2: VERIFY

Claims Verification:
  Claim: "[Quote from comment or PR description]"
  Verification: [How verified - test run, code trace, etc.]
  Result: VALID | INVALID | NEEDS-DISCUSSION

Test Execution:
  $ [test command]
  Result: [PASS/FAIL with summary]

Behavior Verification:
  - Expected: [what change claims to do]
  - Actual: [what code actually does]
  - Match: YES | NO | PARTIAL

Gate: All assertions in code/comments verified against actual behavior. Tests executed with output captured. Proceed only when gate passes.

Phase 3: ASSESS

Goal: Evaluate security, performance, and architectural risks specific to these changes.

Step 1: Security assessment

Never skip this step because security implications must be explicitly evaluated for every review, even when changes appear benign.
Evaluate OWASP top 10 against changes.
Explain HOW each vulnerability was ruled out (not just checkboxes) because a checkbox approach misses context-specific vulnerabilities and gives false confidence.
If optionally enabled: perform extended deep security analysis beyond standard checks.

Step 2: Performance assessment

Identify performance-critical paths and evaluate impact.
Check for N+1 queries, unbounded loops, unnecessary allocations.
If optionally enabled: benchmark affected code paths with profiling.

Step 3: Architectural assessment

Compare patterns to existing codebase conventions.
Assess breaking change potential.
If optionally enabled: check for similar past issues via historical analysis.

Step 4: Extraction severity escalation

If the diff extracts inline code into named helper functions, re-evaluate all defensive guards.
A missing check rated LOW as inline code (1 caller, "upstream validates") becomes MEDIUM as a reusable function (N potential callers).

Step 5: Document assessment

PHASE 3: ASSESS

Security Assessment:
  SQL Injection: [N/A | CHECKED - how verified | ISSUE - details]
  XSS: [N/A | CHECKED - how verified | ISSUE - details]
  Input Validation: [N/A | CHECKED - how verified | ISSUE - details]
  Auth: [N/A | CHECKED - how verified | ISSUE - details]
  Secrets: [N/A | CHECKED - how verified | ISSUE - details]
  Findings: [specific issues or "No security issues found"]

Performance Assessment:
  Findings: [specific issues or "No performance issues found"]

Architectural Assessment:
  Findings: [specific issues or "Architecturally sound"]

Risk Level: LOW | MEDIUM | HIGH | CRITICAL

Gate: Security, performance, and architectural risks explicitly evaluated (not skipped or hand-waved). Proceed only when gate passes.

Phase 4: DOCUMENT

Goal: Produce structured review output with clear verdict and rationale.

Report facts without self-congratulation. Show command output rather than describing it. Be concise but informative because the review consumer needs actionable findings, not commentary.

PHASE 4: DOCUMENT

Review Summary:
  Files Reviewed: [N]
  Lines Changed: [+X/-Y]
  Test Status: [PASS/FAIL/SKIPPED]
  Risk Level: [LOW/MEDIUM/HIGH/CRITICAL]

Findings (use Severity Classification Rules - when in doubt, classify UP):

BLOCKING (cannot merge - security/correctness/reliability):
  1. [Issue with file:line reference and category from rules]

SHOULD FIX (fix unless urgent - patterns/tests/debugging):
  1. [Issue with file:line reference and category from rules]

SUGGESTIONS (author's choice - purely stylistic):
  1. [Suggestion with benefit - only if genuinely optional]

POSITIVE NOTES:
  1. [Good practice observed]

Verdict: APPROVE | REQUEST-CHANGES | NEEDS-DISCUSSION

Rationale: [1-2 sentences explaining verdict]

After producing the review, remove any temporary analysis files, notes, or debug outputs created during review because only the final review document should persist.

Gate: Structured review output with clear verdict. Review is complete.

Reference Material

Trust Hierarchy

When conflicting information exists, trust in this order:

Running code (highest) - What tests show
HTTP/API requests - Verified external behavior
Grep results - What exists in codebase
Reading source - Direct file inspection
Comments/docs - Claims that need verification
PR description (lowest) - May be outdated or incomplete

Severity Classification Rules

See references/severity-classification.md for full classification tables, the decision tree, and common misclassification examples.

Go-Specific Review Patterns

See references/go-review-patterns.md for full checklists and red flags.

Receiving Review Feedback

See references/receiving-feedback.md for the full reception pattern, pushback examples, implementation order, and external vs internal reviewer handling.

Error Handling

Error: "Incomplete Information"

Cause: Missing context about the change or its purpose Solution:

Ask for clarification in Phase 1
Do not proceed past UNDERSTAND with unanswered questions
Document gaps in scope assessment

Error: "Test Failures"

Cause: Tests fail during Phase 2 verification Solution:

Document in Phase 2
Automatic BLOCKING finding in Phase 4
Cannot APPROVE with failing tests

Error: "Unclear Risk"

Cause: Cannot determine severity of an issue Solution:

Default to higher risk level (classify UP)
Document uncertainty in assessment
REQUEST-CHANGES if critical uncertainty exists

References

Examples

Example 1: Pull Request Review

User says: "Review this PR" Actions:

Read CLAUDE.md, then read all changed files, map scope and dependencies (UNDERSTAND)
Run tests, verify claims in comments and PR description (VERIFY)
Evaluate security/performance/architecture risks (ASSESS)
Produce structured findings with severity and verdict (DOCUMENT) Result: Structured review with clear verdict and rationale

Example 2: Pre-Merge Verification

User says: "Check this before we merge" Actions:

Read CLAUDE.md, then read all changes, identify breaking change potential (UNDERSTAND)
Run full test suite, verify backward compatibility claims (VERIFY)
Assess risk level for production deployment (ASSESS)
Document findings with APPROVE/REQUEST-CHANGES verdict (DOCUMENT) Result: Go/no-go decision with evidence

Adoption

notque/systematic-code-review

$ install --global

SKILL.md

Systematic Code Review Skill

Reference Loading Table

Instructions

Phase 1: UNDERSTAND

Phase 2: VERIFY

Phase 3: ASSESS

Phase 4: DOCUMENT

Reference Material

Trust Hierarchy

Severity Classification Rules

Go-Specific Review Patterns

Receiving Review Feedback

Error Handling

Error: "Incomplete Information"

Error: "Test Failures"

Error: "Unclear Risk"

References

Examples

Example 1: Pull Request Review

Example 2: Pre-Merge Verification

Related Skills

notque/shell-config

notque/kubernetes

notque/swift

notque/php

notque/systematic-code-review

$ install --global

SKILL.md

Systematic Code Review Skill

Reference Loading Table

Instructions

Phase 1: UNDERSTAND

Phase 2: VERIFY

Phase 3: ASSESS

Phase 4: DOCUMENT

Reference Material

Trust Hierarchy

Severity Classification Rules

Go-Specific Review Patterns

Receiving Review Feedback

Error Handling

Error: "Incomplete Information"

Error: "Test Failures"

Error: "Unclear Risk"

References

Examples

Example 1: Pull Request Review

Example 2: Pre-Merge Verification

Related Skills

notque/shell-config

notque/kubernetes

notque/swift

notque/php