skills/code-review/SKILL.md
Review concrete code plan drafts, specs, diffs, and implementation shapes. Use for code-review requests, serious code-plan design critique, and judging whether a proposed direction is sound. Prioritize solution direction, premise validity, logic chain, constraints, alternatives, design shape, contracts, tests, local fit, and actionable findings. Near miss: use code-plan to create or revise plans; use code-scope-gate for pre-spec scope shaping.
npx skillsauth add plimeor/agent-skills code-reviewInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Perform an adversarial review of the change. Look for opportunities to reduce layers (pass-through wrappers, single-use abstractions), remove complexity (code-judo: restructurings that delete branches/modes/helpers, not rearrange them), and increase reliability (correctness, contract, state, concurrency, migration risk). For plan and spec reviews, judge the chosen direction before acceptance or test details: whether it solves the stated problem, rests on valid premises, assigns ownership to the right layer, fits observed constraints, and beats credible lower-scope or root-cause alternatives. Honor repo-wide policies (AGENTS.md, CONTRIBUTING.md, ADRs) as hard constraints. Verify what you can. Keep the original goal fixed; challenge the proposed route when evidence shows a better route to that same goal.
Distinguish NEW (introduced or made materially worse by this change) from PRE-EXISTING (already true on the base branch). Compare against base — read the unchanged file or check blame, not just the diff hunks. When the change replaces or deletes code, enumerate the guards and preconditions the old surface enforced — not just which capabilities it kept — and confirm each survives; an action that stays present but loses the predicate that gated it is invisible in a structural diff. Report PRE-EXISTING only when the change touches the same surface and it blocks the intended outcome, the change makes it worse, or the user asked for a broader audit. Tag every finding [NEW] or [PRE-EXISTING]; if uncertain, say so rather than defaulting to NEW.
Activate this gate when reviewing a plan, spec, proposed implementation approach, migration design, or any request asking whether an approach is right, wrong, best, optimal, or worth doing.
Required evidence before judging plan quality:
Prohibited substitutes:
Incomplete behavior:
Volume is failure. Each finding names a concrete surface, the impact, and the smallest correction. Drop anything that wouldn't change the merge decision or follow-up plan. Cap nits at three — if you have more, the bar is too low.
Filter on severity and merge-relevance, not on your own confidence. Investigate fully, then decide what to report: surface a plausible correctness/contract/state issue even when you are unsure of it — flag the uncertainty and what would confirm it — instead of dropping it silently because you can't fully prove it. Only style and naming nits get capped.
Severity: blocker (correctness/contract/state), raise (real issue, follow-up acceptable), nit (small polish). Order by severity. Split atomic findings when surface or correction differs; merge only exact duplicates. If nothing meets the bar, say so and name residual risk.
For draft-plan reviews, corrections are plan changes (revised approach, added context, stronger non-goal, checkpoint, user decision).
Most reviews are inline — apply whichever lenses fit the change. When surfaces fan out or risk concentrates (multi-file, public API/CLI/schema/migration/persistence), dispatch focused sub-agents using the lens descriptions below as their prompts (consult subagent-delegation first). Sub-agents stay read-only, apply the attribution rule, and return concrete candidates with source pointers — silence is a valid output.
Design shape. Shallow interfaces, information leakage, tactical patches adding branches/modes/fallbacks without reducing underlying complexity, weak ownership boundaries, error handling that should be removed by design, layers that only forward calls. Use APOSD vocabulary (references/aposd-complexity-review.md) — every design finding names reader task + symptom (change amplification / cognitive load / unknown-unknown) + cause (dependency / obscurity). Reject "cleaner" / "more DRY" / "add comments" without that mapping.
Contract surface. Public/shared contracts: API, CLI, schemas, persisted state, generated artifacts, wrappers, migrations. New capability promises, silent rewrites/repairs/normalizations on parse, wrappers that reinterpret upstream contracts, missing collision checks on derived identities (paths/routes/cache keys/external refs), undocumented changes to behavior/imports/CLI output/event payloads/errors/timing, migrations that drop source baseline or fixture matrix, rewrites that keep an action but silently drop the precondition that gated it (permission / verification / risk-control / validation / enable-disable predicate). Don't invent compatibility requirements beyond observed contracts.
Test validation. Regression evidence. Behavior that could regress without coverage, tests that assert private state / call order / component names instead of public behavior, mocks of things that should be real, production-only seams added for testability, plans that prove the new path while leaving existing behavior unprotected. Don't report coverage gaps in untouched code.
Implementation fit. Fit with the local codebase. New helpers/wrappers/adapters that duplicate existing owners, single-use abstractions, wrong-layer logic, near-copy variants, broad find-and-replace that scatters one behavior, options/branches/files for future flexibility, calls to nonexistent or bypassed helpers. Don't propose adjacent refactors unless they are the smallest correction.
Synthesis critic. Dispatch when risk is high or evidence conflicts. Different role: challenges the draft findings, doesn't produce new candidates. Probes: evidence real, attribution correct against base, PRE-EXISTING justified, set clears the bar (drop low-value / excess nits), no bad duplicates/splits/severity, APOSD findings name reader-task/symptom/cause not slogans, severe sub-agent candidates not silently dropped, unreviewed surfaces acknowledged. Returns per challenge: target finding (or missing area), issue, evidence, action (keep / drop / retag / reword / reorder / verify / surface in judgment).
Before final output, check:
User's primary language for prose; keep code symbols, paths, errors original. Return: findings (each prefixed [NEW]/[PRE-EXISTING] and severity-tagged) → open questions that materially change the decision → short overall judgment (note blocked/unreviewed surfaces). For plan/spec reviews, the findings section starts with direction, premise, ownership, and logic-chain issues when present; the overall judgment names the direction verdict: best-supported, adequate but suboptimal, wrong/unsupported, or unjudged. No edits, commits, or pushes without separate authorization.
For plan/spec reviews, finish only after the Plan-Direction Gate has a stated verdict, material open questions are named, and acceptance/test observations have not displaced a more important direction judgment. For diff/code reviews, finish only after attribution, severity, and residual risk are clear. If verification is unavailable or out of scope, state the next-best evidence used.
tools
Decide whether and how to use authorized sub-agents, then coordinate delegated work while preserving the main agent's context. Use when the user asks for orchestration, parallel agents, delegation, background workers, context isolation, or when another skill needs delegated research, review, implementation, or verification. Owns host-policy checks, delegation packets, non-overlap, report verification, and stop rules. Do not use to bypass tool policy, infer user authorization, or add coordination overhead to simple single-threaded tasks.
development
Use before finalizing a non-trivial answer, recommendation, review, or decision to reconsider it and raise its quality, especially when shallow reasoning, context inertia, false framing, overconfidence, unfit analogy transfer, or an obvious-but-missed defect could distort the result. Trigger especially before applying external evidence, familiar frameworks, or comparisons to the user's specific request, and when the user asks to reconsider, double-check, take a second look, or sanity-check an answer. Reconsider the draft against its most likely failure mode, and use independent scrutiny only when it is useful and authorized.
development
Write evidence-backed coding plans for implementation, debugging, refactoring, migrations, design parity work, and long-running agent tasks. Use when defining, clarifying, refining, or validating a development plan, /goal prompt, implementation approach, scope and non-goals, work sequence, acceptance criteria, regression evidence, verification strategy, or stop condition. Near miss: use code-review when judging an existing diff, spec, or already drafted plan rather than drafting or revising a plan. Also use when the user says `design twice` after a plan and wants an APOSD-style second-design pass over the completed plan.
tools
Route durable rules and context to the right layer — task, project, skill, tooling, hooks, MCP, or global — and produce the smallest directly applicable edit. Use for global rules files (~/.claude/CLAUDE.md, global AGENTS.md), repo-local AGENTS.md/CLAUDE.md, task context packs, hook placement (Codex/Claude Code settings.json), collaboration friction diagnosis, and rule-placement decisions.