skills/review-loop/SKILL.md
Adversarial review→triage→fix loop until a cold verifier signs off. Fans out lens-specific reviewer subagents, verifies every finding against the code (killing false positives), auto-applies confirmed fixes as fixup commits, and repeats until a fresh verifier approves. Prefers a deterministic dynamic workflow when available; falls back to in-instance Task dispatch. Use when the user types /review-loop or asks to adversarially review-and-fix a change set, branch, or commit range until clean.
Run an adversarial review→triage→fix loop until a fresh cold verifier signs off.
Unlike /code-review (report-only), this loop verifies every finding against
the code, kills false positives, applies the confirmed fixes as fixup
commits, and repeats until acceptance. All reviewing happens in subagents,
so each reviewer burns its own context, not yours.
Target: $ARGUMENTS
This loop exists to defeat three failure modes that hit a single context window on long, adversarial tasks:
Because those guarantees depend on the orchestration actually running every
phase every time, the preferred execution path is a dynamic workflow (a
deterministic JavaScript harness), not model-driven dispatch. The workflow
encodes fan-out, triage, apply, loop, and verify as code that cannot drift or
cut corners. The in-instance path below is the fallback when the Workflow
tool is unavailable or the user passes --inline.
Do this in the first turn, before any dispatch, regardless of execution path.
Resolve scope into one concrete diff command and a stable description:
git branch --show-current
git show-ref --verify --quiet refs/heads/main && echo main || echo master
# Range given? use it. Branch given? <base>...<branch>.
# Else commits ahead of base? <base>..HEAD. Else uncommitted: git diff HEAD.
git diff <range> --stat ; git log <range> --oneline
Every finder and the verifier must review the same surface — record the exact diff command.
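The scope rules in the comments above can be sketched as a small pure function. This is only an illustration: `resolveDiffCmd` and its field names (`range`, `branch`, `base`, `commitsAhead`) are hypothetical and not part of the skill, and the caller is assumed to have already gathered those values with the git commands shown.

```javascript
// Hypothetical helper: turn the scope rules into one concrete diff command.
// All field names are illustrative assumptions, not part of the skill.
function resolveDiffCmd({ range, branch, base, commitsAhead }) {
  // An explicit range always wins.
  if (range) return `git diff ${range}`;
  // A named branch is compared against <base> with the three-dot form.
  if (branch) return `git diff ${base}...${branch}`;
  // Otherwise review the commits ahead of base, if there are any.
  if (commitsAhead > 0) return `git diff ${base}..HEAD`;
  // Nothing committed yet: review the uncommitted work.
  return "git diff HEAD";
}

console.log(resolveDiffCmd({ base: "main", branch: "feature-x" }));
// prints "git diff main...feature-x"
```

Whatever form the resolution takes, the point is that it produces a single string that can be recorded once and handed unchanged to every finder and to the verifier.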
Capture a pre-flight baseline so pre-existing breakage is not blamed on the change:
make build 2>&1 | tail -5 ; make test 2>&1 | tail -5 ; make lint 2>&1 | tail -5
Note what was already red (e.g. a toolchain/lint-config issue) for the brief.
Write a design brief to .review-loop/brief.md — this is what makes
triage accurate. Include: what the change does and why (approved intent);
hard constraints and environment/protocol semantics reviewers can't infer
from the diff; accepted tradeoffs and out-of-scope items; the pre-flight
baseline.
Pick lenses from the changed files and record them in
.review-loop/lenses.md. Always run the baseline adversarial panel; add
specialized lenses when trigger files are present:
| Lens | subagent_type | Trigger |
|---|---|---|
| Correctness | code-reviewer | always |
| Offensive security | security-auditor | always |
| Differential / blast radius | general-purpose + differential-review skill | always |
| Concurrency | general-purpose | goroutines, channels, mutexes, sync, atomics |
| Shell / config hardening | general-purpose | *.sh, Dockerfiles, CI YAML, hooks, settings |
| API safety & insecure defaults | general-purpose + sharp-edges/insecure-defaults | public interfaces, config, RPC/proto |
| Deep function analysis | audit-context-building:function-analyzer | crypto/auth, consensus, value-transfer |
| Spec compliance | spec-to-code-compliance:spec-compliance-checker | BIP/BOLT/protocol/spec references |
Run mkdir -p .review-loop and track the run with TodoWrite (one item per phase,
plus a per-round entry as the loop iterates).
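Some of the triggers in the lens table can be checked mechanically from the list of changed paths. The sketch below covers only the path-based rows; content-based triggers such as goroutines or mutexes need a look at the diff itself and are left out. The function name `selectLenses` and the regular expressions are assumptions for illustration, including the guess that "CI YAML" means files under `.github/workflows/`.

```javascript
// Lenses that run on every review (the "always" rows of the table).
const BASELINE = [
  "Correctness",
  "Offensive security",
  "Differential / blast radius",
];

// Path-based triggers only; the patterns are illustrative guesses.
const PATH_TRIGGERS = [
  {
    lens: "Shell / config hardening",
    // *.sh, Dockerfiles, and (assumed) GitHub Actions workflow YAML.
    test: (p) =>
      /\.sh$/.test(p) ||
      /(^|\/)Dockerfile[^/]*$/.test(p) ||
      /^\.github\/workflows\/[^/]+\.ya?ml$/.test(p),
  },
  {
    lens: "API safety & insecure defaults",
    // RPC/proto definitions.
    test: (p) => /\.proto$/.test(p),
  },
];

// Returns the baseline panel plus every specialized lens whose trigger
// matches at least one changed file.
function selectLenses(changedFiles) {
  const lenses = [...BASELINE];
  for (const { lens, test } of PATH_TRIGGERS) {
    if (changedFiles.some((p) => test(p))) lenses.push(lens);
  }
  return lenses;
}

console.log(selectLenses(["cmd/server/main.go", "scripts/release.sh"]));
```

The result is what would be recorded in .review-loop/lenses.md; a human or model pass can still add lenses the path patterns miss.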
When the Workflow tool is available and --inline was not passed, run the
loop as a deterministic harness. The bundled script
workflow/review-loop.js is a template: adapt it to the run (the chosen
lens set, cutoff, and max-iters) rather than assuming it must run verbatim.
Invoke it via the Workflow tool, passing the Phase 0 artifacts as args:
Workflow({
  scriptPath: "<this skill dir>/workflow/review-loop.js",
  args: {
    diffCmd: "<exact diff command>",
    base: "<base branch>",
    brief: "<contents of .review-loop/brief.md>",
    lenses: [ /* the selected lens descriptors */ ],
    cutoff: "medium",
    maxIters: 5,
  },
})
The workflow runs find→triage→apply→loop→verify and returns a structured summary (rounds, confirmed vs rejected per round, applied fixups, deferred follow-ups, verifier verdict). When it returns, the main loop does Phase 6 (finalize) below — autosquash offer and final green build — because those steps are interactive and side-effectful.
If the workflow hits maxIters without converging, it returns what remains
rather than looping forever; surface that and ask how to proceed.
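To make the control flow concrete, here is a minimal sketch of what such a harness might look like. It is not the bundled workflow/review-loop.js: the function `runReviewLoop`, the injected phase callbacks, and the shape of their return values are all assumptions, and the phases are plain synchronous functions only to keep the sketch short (a real harness would presumably dispatch subagents asynchronously). It also assumes that a RE-OPEN round counts against the iteration budget.

```javascript
// Hypothetical harness skeleton. find/triage/apply/verify are injected so the
// control flow can be read (and tested) without real subagents.
function runReviewLoop({ find, triage, apply, verify, maxIters }) {
  const rounds = [];
  let pending = null; // findings a RE-OPEN verdict fed back into triage

  for (let round = 1; round <= maxIters; round++) {
    // Use the verifier's findings if it re-opened, else run the finders.
    const raw = pending ?? find(round);
    pending = null;

    const { fixNow, followUp, rejected } = triage(raw);
    rounds.push({
      round,
      confirmed: fixNow.length,
      deferred: followUp.length,
      rejected: rejected.length,
    });

    if (fixNow.length > 0) {
      apply(fixNow); // fixup commits, then re-review the new diff
      continue;
    }

    // Clean round: hand the final diff to a fresh verifier.
    const verdict = verify();
    if (verdict.approved) return { converged: true, rounds };
    pending = verdict.findings; // RE-OPEN: back through triage
  }

  // Budget exhausted: report what remains instead of looping forever.
  return { converged: false, rounds };
}
```

Because the loop and its exit conditions live in code, no phase can be skipped and the only two outcomes are a verifier approval or an explicit non-converged result.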
Fallback: in-instance path (--inline or no Workflow tool)
Run the same phases with the Task tool. This is what we ran by hand; it works but relies on the orchestrator faithfully executing each phase.
Launch every selected lens in one message with parallel Task calls (or
run_in_background: true and collect notifications). Give each the same diff
surface and the design brief, with this adversarial skeleton:
You are an ADVERSARIAL reviewer. BREAK this change, do not grade it. Only
report findings you can argue concretely from the code.
Scope (review exactly this): <diff command>
Design brief: <.review-loop/brief.md>
Your lens: <lens + specific failure modes to hunt>
For each finding return: stable id, file:line, severity
(critical/high/medium/low/info), a concrete trigger SCENARIO, and a minimal fix
sketch. A verified "not a bug" is useful signal. Raw list, no pleasantries.
Write outputs to .review-loop/round-<N>/find-<lens>.md.
Spawn ONE general-purpose judge with all finder outputs + the brief + code
read access. It must verify each finding against the cited lines (reject
what it can't reproduce), dedup/merge, kill false positives with reasons,
and classify survivors into fix-now (≥ cutoff; with a repo-style fix
sketch), follow-up (deferrable; with an issue title), rejected (why).
Write to .review-loop/round-<N>/triage.md. If a fix-now item contradicts the
approved design, surface via AskUserQuestion before fixing.
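The bucketing step of triage is mechanical once the judge has verified each finding. The sketch below assumes each finding carries a `verified` boolean set by the judge and a `severity` from the finder prompt's scale; the function name `classify` and the field names are hypothetical. Deduplication and the judgment about whether a finding reproduces are not shown, since those are the judge's work.

```javascript
// Most to least severe, matching the scale in the finder prompt.
const SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"];

// Sort verified findings into fix-now / follow-up by the cutoff, and send
// anything unverified (or with an unknown severity) to rejected.
function classify(findings, cutoff) {
  const cutoffRank = SEVERITY_ORDER.indexOf(cutoff);
  if (cutoffRank === -1) throw new Error(`unknown cutoff: ${cutoff}`);

  const out = { fixNow: [], followUp: [], rejected: [] };
  for (const f of findings) {
    const rank = SEVERITY_ORDER.indexOf(f.severity);
    if (!f.verified || rank === -1) {
      out.rejected.push(f);
    } else if (rank <= cutoffRank) {
      out.fixNow.push(f); // at or above the cutoff
    } else {
      out.followUp.push(f); // real but deferrable
    }
  }
  return out;
}

const result = classify(
  [
    { id: "F1", severity: "high", verified: true },
    { id: "F2", severity: "low", verified: true },
    { id: "F3", severity: "critical", verified: false },
  ],
  "medium"
);
console.log(result.fixNow.map((f) => f.id)); // prints [ 'F1' ]
```

With a cutoff of "medium", F1 is fixed now, F2 becomes a follow-up, and F3 is rejected because the judge could not reproduce it, regardless of its claimed severity.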
For each fix-now finding in severity order: implement the minimal fix matching surrounding code; add/update tests when testable; build + relevant package tests must pass vs the Phase 0 baseline; commit as a fixup:
git add <files> ; git commit --fixup=<target-sha>
Use hunk stage for files mixing fix-now and deferred changes. Log to
.review-loop/round-<N>/applied.md.
Increment the round and re-run Phase 1 finders on the new diff. New
triage-confirmed fix-now findings → back to Phase 2/3. A clean round (zero new
fix-now) → Phase 5. Stop at --max-iters and report what remains.
Spawn ONE fresh code-reviewer that saw no prior round. Give it only the brief
and the full final diff (<base>..HEAD) and ask: APPROVE, or RE-OPEN with
concrete findings. APPROVE → Phase 6. RE-OPEN → feed findings into Phase 2
(subject to the same triage discipline).
hunk rebase autosquash --onto <base> --dry-run
Show the plan; on approval run for real. If fixups interleave with other
commits on the same lines (conflict risk), instead offer a single review:
commit via soft reset. Declined → leave fixups as-is.
/s-code-review) is the alternative when you want findings tracked in the
review system / web UI; this trades that for lower context cost.