skills/code-reviewer/SKILL.md
Run isolated code reviews for core algorithm or production code changes. Use when the user asks for a fresh-context reviewer, writer/reviewer separation, Spark pre-review, code review, implementation audit, review bundle, independent review, or review artifacts under `.agent/code-reviews/`.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add a-green-hand-jack/ml-research-skills code-reviewer
Run code review as an isolated artifact-driven workflow. The reviewer should judge the implemented change from the task contract, diff, writer summary, tests, and relevant files, not from the writer's conversation history.
<installed-skill-dir>/
├── SKILL.md
├── scripts/
│   └── prepare_review_bundle.py
├── references/
│   └── isolation-protocol.md
└── templates/
    ├── review.md
    └── fix-log.md
The reviewer must not inherit the writer's chat context. Use one of these patterns:
- gpt-5.3-codex-spark sidecar as a fast first-pass scanner, then let the main agent triage its findings.

The reviewer input is the bundle, not the writer conversation.

- gpt-5.3-codex-spark via codex exec --ephemeral.
- workspace-write only to write review artifacts; otherwise read-only plus -o.
- .agent/code-reviews/<change-id>/review.md and fix-log.md; optional sidecar telemetry under .agent/sidecars/<task-id>/.

python3 <installed-skill-dir>/scripts/prepare_review_bundle.py \
--repo . \
--base main \
--request "Implement <feature> with <acceptance criteria>" \
--writer-summary "Changed <files>; ran <tests>; known risks: <risks>"
python3 <installed-skill-dir>/scripts/prepare_review_bundle.py \
--repo . \
--working-tree \
--request-file .agent/code-reviews/<change-id>/request.md
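After either command, it can help to confirm that the bundle is complete before handing it to a reviewer. The sketch below assumes the bundle directory holds the five files this skill lists for review (request.md, writer-summary.md, diff.patch, test-output.md, reviewer-prompt.md). The helper name `missing_bundle_files` is hypothetical and is not part of prepare_review_bundle.py.

```python
from pathlib import Path

# File names taken from the bundle description in this skill; adjust them if
# the installed prepare_review_bundle.py writes a different set.
EXPECTED_BUNDLE_FILES = (
    "request.md",
    "writer-summary.md",
    "diff.patch",
    "test-output.md",
    "reviewer-prompt.md",
)


def missing_bundle_files(bundle_dir):
    """Return the expected bundle files that do not exist in bundle_dir."""
    bundle = Path(bundle_dir)
    return [name for name in EXPECTED_BUNDLE_FILES if not (bundle / name).is_file()]
```

Calling `missing_bundle_files(".agent/code-reviews/<change-id>")` with a real change id returns an empty list when every expected file is present.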
Use code-reviewer.
Review the bundle at .agent/code-reviews/<change-id>/.
Do not modify production code.
Write findings to .agent/code-reviews/<change-id>/review.md.
For automated strong isolation, prefer a one-shot CLI session instead of an in-process subagent.
Spark pre-review:
codex exec --ephemeral \
-m gpt-5.3-codex-spark \
-C . \
-s workspace-write \
-o .agent/code-reviews/<change-id>/spark-output.md \
"$(cat .agent/code-reviews/<change-id>/reviewer-prompt.md)"
Treat Spark output as a fast issue candidate list, not final approval. The main agent should copy accepted findings into review.md or record rejected findings in fix-log.md / decision.md. For high-risk changes, run a strong fresh reviewer after Spark and after fixes.
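When the Spark pre-review is launched from a script rather than typed by hand, the command above can be assembled as an argument list. This is a minimal sketch that only reuses the flags shown in this skill; the function name `build_spark_command` and the `demo-change` id are hypothetical, and the command is printed rather than executed.

```python
import shlex
from pathlib import PurePosixPath


def build_spark_command(change_id, prompt_text):
    """Build the argv for the Spark pre-review command shown above."""
    review_dir = PurePosixPath(".agent") / "code-reviews" / change_id
    return [
        "codex", "exec", "--ephemeral",
        "-m", "gpt-5.3-codex-spark",
        "-C", ".",
        "-s", "workspace-write",
        "-o", str(review_dir / "spark-output.md"),
        prompt_text,
    ]


if __name__ == "__main__":
    # "demo-change" is a placeholder change id used only for illustration.
    argv = build_spark_command("demo-change", "Review the bundle.")
    # Print a copy-pasteable command instead of running it.
    print(shlex.join(argv))
```

Passing the prompt as a single list element avoids shell quoting problems that `$(cat ...)` can hide when the prompt contains quotes or newlines.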
Codex:
codex exec --ephemeral \
-C . \
-s workspace-write \
"$(cat .agent/code-reviews/<change-id>/reviewer-prompt.md)"
Claude Code:
claude -p "$(cat .agent/code-reviews/<change-id>/reviewer-prompt.md)" \
--no-session-persistence \
--permission-mode acceptEdits
For stricter Claude Code scripting, add --bare only when the prompt explicitly supplies every needed context path, because bare mode skips automatic project and skill discovery:
claude -p "$(cat .agent/code-reviews/<change-id>/reviewer-prompt.md)" \
--no-session-persistence \
--bare \
--add-dir .
Do not use claude --continue, claude --resume, codex resume, or codex fork for a first-pass review. Those are useful for continuing work, but they weaken the reviewer/writer context boundary.
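A wrapper script can enforce this rule before launching a reviewer. The sketch below is an assumption-laden guard, not part of this skill: the function name `reuses_prior_context` is hypothetical, it only knows the flags and subcommands named above, and it does not cover short aliases or global options placed before a codex subcommand.

```python
# Flags and subcommands this skill says not to use for a first-pass review.
RESUMING_FLAGS = {"--continue", "--resume"}   # claude
RESUMING_SUBCOMMANDS = {"resume", "fork"}     # codex


def reuses_prior_context(argv):
    """Return True if argv would continue or fork an earlier session."""
    if not argv:
        return False
    tool, args = argv[0], argv[1:]
    if tool == "claude":
        # Compare whole arguments, also accepting the --flag=value form.
        return any(arg.split("=", 1)[0] in RESUMING_FLAGS for arg in args)
    if tool == "codex":
        # Only the first argument is treated as the subcommand, so a prompt
        # that merely contains the word "resume" is not flagged.
        return bool(args) and args[0] in RESUMING_SUBCOMMANDS
    return False
```

A launcher could refuse to start the reviewer whenever this returns True for the command it is about to run.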
The writer then reads review.md and any spark-output.md, fixes the code, and records responses in fix-log.md.
For high-risk changes, run a second fresh review after fixes.
Read references/isolation-protocol.md before reviewing.
Review only the change described by the bundle:
- request.md: task contract and acceptance criteria
- writer-summary.md: what changed, tests run, known risks
- diff.patch: stat and patch
- test-output.md: test commands and outputs
- reviewer-prompt.md: ready-to-use fresh reviewer prompt

Focus on:
Do not rewrite the implementation unless the user explicitly asks for reviewer-as-fixer mode. Default reviewer output is review.md.
Use this severity order:
- High: likely correctness bug, data corruption, invalid experiment result, security/privacy issue, or broken public API
- Medium: edge-case bug, missing test for risky behavior, fragile design, or confusing integration
- Low: maintainability nit, naming issue, small docs mismatch

Each finding must include:
End with one verdict:
- request changes
- acceptable with nits
- approve

The writer should update fix-log.md with:
If review findings change the task scope or algorithm contract, update the project memory or design docs before continuing. Update memory/claim-board.md when correctness claims are affected, memory/risk-board.md for newly identified technical risks, and memory/decision-log.md when an algorithm contract or design decision changes as a result of review.
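The severity order and the three verdicts used in review.md can be modelled with a small data structure. In this sketch the `Finding` class and both function names are hypothetical, and the mapping from severities to a verdict is an assumption made for illustration; the skill itself leaves the verdict to the reviewer's judgment.

```python
from dataclasses import dataclass

# Severity order from this skill, most severe first.
SEVERITY_ORDER = ("High", "Medium", "Low")


@dataclass
class Finding:
    severity: str  # one of SEVERITY_ORDER
    summary: str


def sort_findings(findings):
    """Order findings High, then Medium, then Low (stable within a level)."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER.index(f.severity))


def suggest_verdict(findings):
    """Suggest a verdict from the findings.

    Assumed mapping, not stated by the skill: any High or Medium finding
    suggests "request changes", only Low findings suggest
    "acceptable with nits", and no findings suggest "approve".
    """
    severities = {f.severity for f in findings}
    if severities & {"High", "Medium"}:
        return "request changes"
    if severities:
        return "acceptable with nits"
    return "approve"
```

A reviewer script might use `sort_findings` to order the findings section and treat `suggest_verdict` as a default that a human or the main agent can override.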