plugins-copilot/sf-code-review/skills/sf-code-review/SKILL.md
Use this skill for extensive code reviews of current changes when the user wants chunked, cross-family review with dual mid-range reviewers and targeted Opus adjudication. Trigger phrases include "extensive code review", "split the review into chunks", "dual review", "reconcile findings", and "stonefish code review".
npx skillsauth add st0nefish/claude-toolkit sf-code-reviewInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Purpose. Run a repeatable review pipeline over the current changeset: scope the diff, collect review-specific focus items, split the work into coherent chunks, review each chunk with two different mid-range models, and use a targeted Opus pass only to adjudicate candidate findings before producing a triage-ready report.
If the user explicitly invokes /sf-code-review:review, use this exact procedure.
Use this workflow when the user wants a deeper review than a single-pass code review, especially when they ask for any of the following:
Do not use this workflow for tiny quick reviews where a normal single-pass review is enough.
claude-opus-4.6.
Never substitute claude-opus-4.7 or claude-opus-4.8.Parse command arguments if present.
--staged is present, review git diff --cached and ignore --base.--base <ref> is present, review <ref>...HEAD.HEAD and the default branch.--scope <path> ... is present, restrict all diff commands to those paths.If the user has not already provided review-specific focus items, ask via
AskUserQuestion for:
Use direct git commands as the source of truth:
git status --porcelain=v1 -b
default=$(git symbolic-ref --quiet --short refs/remotes/origin/HEAD | sed 's|^origin/||')
base=$(git merge-base HEAD "origin/$default")
git diff --stat "$base"...HEAD
git diff --name-only "$base"...HEAD
git diff --numstat "$base"...HEAD
For staged reviews, use the --cached equivalents instead. For path-scoped
reviews, append -- <paths...> to every diff command.
Capture:
staged, base-ref, or default-merge-base)If the scoped diff is empty, report that there is nothing to review and stop.
Chunk by semantic unit first, size second. Keep these together whenever possible:
Rules:
<= 3 files or <= 250 changed lines), use a
single chunk instead of forcing chunking.150-400 changed lines.6 chunks.6 chunks or roughly 2500 changed lines after
scoping, show the chunk plan and ask the user whether to narrow the scope or
review in phases before dispatching agents.Run one whole-diff sweep before chunk-level adjudication. Its job is to look for cross-cutting concerns, not to replace chunk review.
Prefer a code-review agent with model gpt-5.4. Only switch this sweep to
general-purpose with gpt-5.4 if the user's focus items clearly require
reading surrounding unchanged code or architectural context.
The sweep must check this fixed checklist plus the user-provided focus items:
Return cross-cutting concerns only. Do not use this pass as a second full review.
For each chunk, launch two independent code-review agents in parallel:
claude-sonnet-4.6gpt-5.4Give both reviewers the same chunk scope and the same review contract:
Require each reviewer to return a strict finding schema. Free-form prose is not acceptable.
If a reviewer finds nothing, it must return exactly:
chunk_id: <chunk-id>
NO_FINDINGS
Otherwise require this structure for every finding:
chunk_id: <chunk-id>
findings:
- finding_id: <stable-id>
file:line: <path:line or path>
severity: critical|high|medium|low
confidence: high|medium|low
category: correctness|regression|architecture|security|performance|tests|compatibility
claim: <one-sentence issue statement>
why_it_matters: <impact>
minimal_evidence: <smallest diff/context that supports the claim>
suggested_check: <what to verify or change>
Before invoking Opus:
Use claude-opus-4.6 only as an adjudicator for candidate findings.
Hard rules:
claude-opus-4.7 or claude-opus-4.8; keep Opus pinned to
claude-opus-4.6 for cost control.code-review for adjudication when the candidate can be judged from
the touched diff and nearby context.general-purpose with claude-opus-4.6 only when a candidate cannot be
resolved without broader surrounding code or architectural context.Tell Opus explicitly: you are an adjudicator, not a third reviewer.
Require one verdict per candidate finding:
finding_id: <stable-id>
verdict: confirmed|rejected|uncertain
severity: critical|high|medium|low
confidence: high|medium|low
reason: <why>
recommended_fix: <specific next step>
Return a triage-ready report with these sections, in this order:
development
Start work from your description — explore the codebase and plan
data-ai
Multi-phase, multi-agent feature workflow: spec → plan → refine → divide → execute → review. Invoke when the user escalates a session-start/session-issue flow to orchestration, or asks to run a non-trivial feature (multiple files, design ambiguity, cross-cutting concerns, correctness-critical paths) through the full multi-agent workflow. For small fixes, prefer session-start.
tools
Browse open issues, pick one, and start work on it
tools
Review, clean up, and open a PR to finalize the work