agents/dot-agents/skills/pr-review/SKILL.md
Deep, rigorous multi-model code review for PRs, branches, diffs, commits, or pre-merge checks. Use this when the user explicitly asks for a deep review, thorough review, rigorous review, pre-merge check, PR review, branch review, or wants maximum confidence. Do not use for quick/casual reviews; use quick-review instead. Treat AI output as a first draft: run independent Pi subagents on Claude Opus, GPT-5.5 extra-high reasoning, and Gemini; validate and synthesize severity-ranked findings before fixes.
Use this skill for rigorous code review inspired by Nolan Lawson’s “write better code more slowly” approach: AI-generated or human-generated code is only a first draft. The value comes from independent review, synthesis, severity ranking, and validation before fixes happen.
Do not rush to fixes. First understand the scope, gather context, run independent reviews, then synthesize. Avoid letting early findings bias later analysis.
The subagent pass is the heart of this skill. Distinct models catch different classes of issues, and independent agreement is a strong signal. Do not start false-positive research or synthesis until all reviewers have returned.
The user may provide:
- owner/repo#number

Determine review scope in this priority order:

1. owner/repo#number
2. git diff main...HEAD
3. git diff --staged
4. git show --stat --patch HEAD

For a PR, use gh to fetch the PR context when available:
If gh is unavailable or unauthenticated, fall back to git commands and clearly state the limitation.
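The priority order and the gh fallback can be sketched as a small selection function. This is a minimal sketch, not the skill's own logic: the function name, its parameters, and the conditions that trigger each scope (branch differs from main, staged changes present) are assumptions made for illustration. `gh pr view` and `gh pr diff` are real GitHub CLI subcommands, but which ones the skill runs is not stated above, and the local fallback assumes the PR branch is already checked out.

```python
from __future__ import annotations

from typing import Optional


def scope_commands(
    pr_ref: Optional[str] = None,            # e.g. "owner/repo#number"
    branch_differs_from_main: bool = False,  # assumed trigger for scope 2
    has_staged_changes: bool = False,        # assumed trigger for scope 3
    gh_available: bool = True,
) -> list[list[str]]:
    """Return the commands for the highest-priority scope that applies."""
    if pr_ref is not None:
        repo, _, number = pr_ref.partition("#")
        if gh_available:
            return [
                ["gh", "pr", "view", number, "--repo", repo],
                ["gh", "pr", "diff", number, "--repo", repo],
            ]
        # Assumed fallback: compare the checked-out branch against main.
        # The caller should tell the user that PR metadata is missing.
        return [["git", "diff", "main...HEAD"]]
    if branch_differs_from_main:
        return [["git", "diff", "main...HEAD"]]
    if has_staged_changes:
        return [["git", "diff", "--staged"]]
    # Lowest priority: review the most recent commit.
    return [["git", "show", "--stat", "--patch", "HEAD"]]
```

Returning argument lists rather than shell strings keeps the commands safe to hand to `subprocess.run` without quoting issues.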
Collect enough context for reviewers to judge whether the diff is correct, not merely whether it is tidy.
Include:
Pay special attention to spec fidelity: whether the diff actually implements the original issue, PRD, or user intent.
Spawn these three reviewers in parallel in a single subagent batch:
- pr-review-claude-opus: Claude Opus 4.8, extra-high thinking
- pr-review-gpt55-xhigh: GPT-5.5, extra-high reasoning
- pr-review-gemini: Gemini Pro, extra-high thinking

Use agentScope: "user" unless the user explicitly wants project-local agents too.
Example tool shape:
{
  "agentScope": "user",
  "tasks": [
    { "agent": "pr-review-claude-opus", "task": "<full review objective and scope context>", "cwd": "<repo root>" },
    { "agent": "pr-review-gpt55-xhigh", "task": "<full review objective and scope context>", "cwd": "<repo root>" },
    { "agent": "pr-review-gemini", "task": "<full review objective and scope context>", "cwd": "<repo root>" }
  ]
}
Pass the same full PR/scope context to all three reviewers. Do not spawn them sequentially; doing so risks biasing later reviewers.
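One way to guarantee that every reviewer sees identical context is to build the batch from a single objective string. The sketch below mirrors the tool shape shown above; the helper name `build_batch` is hypothetical, and how the batch is actually submitted depends on the subagent tool, which is not shown here.

```python
from __future__ import annotations

import json

# Agent names as listed in this skill.
REVIEWERS = ["pr-review-claude-opus", "pr-review-gpt55-xhigh", "pr-review-gemini"]


def build_batch(objective: str, repo_root: str, agent_scope: str = "user") -> dict:
    """Build one batch in which every reviewer receives the same task text."""
    return {
        "agentScope": agent_scope,
        "tasks": [
            {"agent": name, "task": objective, "cwd": repo_root}
            for name in REVIEWERS
        ],
    }


if __name__ == "__main__":
    batch = build_batch("<full review objective and scope context>", "<repo root>")
    print(json.dumps(batch, indent=2))
```

Because the objective is interpolated once, no reviewer can receive a task that was edited after another reviewer's output came back.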
If a subagent fails because its provider/model is unavailable, say so clearly and continue with the remaining reviewers only if at least two distinct models completed. If fewer than two reviewers complete, ask the user whether to retry, switch models, or proceed with a lower-confidence review.
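The two-reviewer rule can be expressed as a small check. This is a minimal sketch: the function name and the `"continue"` / `"ask_user"` return values are illustrative, not part of the skill.

```python
from __future__ import annotations

MIN_REVIEWERS = 2  # at least two distinct models must complete


def next_step(completed: dict[str, bool]) -> tuple[str, list[str]]:
    """Map reviewer completion status to an action plus the failed agents.

    `completed` maps an agent name to True if that reviewer returned a review.
    """
    failed = sorted(agent for agent, ok in completed.items() if not ok)
    succeeded = len(completed) - len(failed)
    if succeeded >= MIN_REVIEWERS:
        # Proceed, but the failed list should still be reported to the user.
        return "continue", failed
    # Ask whether to retry, switch models, or accept lower confidence.
    return "ask_user", failed
```

Returning the failed agents in both branches keeps the "say so clearly" requirement independent of whether the review proceeds.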
The objective for each reviewer should say:
- […] critical, high, medium, or low.
- […] .md files); all review output must stay in chat/subagent text.

A “bug” is not limited to crashes. Treat the following as valid review findings:
critical: A finding that should block merge immediately.
Examples:
high: A finding that should normally block merge until fixed.
Examples:
medium: A real issue worth fixing, but not necessarily merge-blocking.
Examples:
low: Cleanup, polish, or low-risk maintainability issue.
Examples:
Do not inflate severity. A good review is useful because it distinguishes merge blockers from nice-to-haves.
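The four levels form an ordered scale, which makes ranking and the merge-blocking distinction easy to encode. The following is a sketch under that reading; the `Severity` enum and helper names are illustrative.

```python
from __future__ import annotations

from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def blocks_merge(severity: Severity) -> bool:
    """critical blocks immediately; high normally blocks until fixed."""
    return severity >= Severity.HIGH


def rank(findings: list[tuple[Severity, str]]) -> list[tuple[Severity, str]]:
    """Sort (severity, title) pairs from most to least severe.

    Python's sort is stable, so findings of equal severity keep their order.
    """
    return sorted(findings, key=lambda finding: finding[0], reverse=True)
```

Keeping `blocks_merge` separate from the ranking makes it harder to inflate a finding just to move it up the report.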
After all reviewers return, synthesize their findings into a Markdown report in chat only. Never write the report to a file or create any review artifacts.
Validate findings before including them. Use code inspection, git/gh context, and targeted commands as needed. Remove or downgrade likely false positives.
Do not silently merge findings that sound similar but refer to different root causes.
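A simple way to avoid merging look-alike findings is to group by an explicit root-cause key instead of by title. The sketch below assumes each validated finding has been given such a key by the synthesizer; the `Finding` class and the `root_cause` field are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    reviewer: str
    title: str
    root_cause: str  # hypothetical key assigned during validation


def split_findings(
    findings: list[Finding],
) -> tuple[dict[str, list[Finding]], dict[str, list[Finding]]]:
    """Split findings into agreements (two or more reviewers) and unique ones."""
    by_cause: dict[str, list[Finding]] = defaultdict(list)
    for finding in findings:
        by_cause[finding.root_cause].append(finding)

    agreements: dict[str, list[Finding]] = {}
    unique: dict[str, list[Finding]] = {}
    for cause, group in by_cause.items():
        reviewers = {finding.reviewer for finding in group}
        target = agreements if len(reviewers) >= 2 else unique
        target[cause] = group
    return agreements, unique
```

Two findings with the same title but different `root_cause` values stay in separate groups, so a similar-sounding concern from one reviewer is not counted as agreement with another.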
Always produce a Markdown report with severity tags.
Use this structure:
# PR Review Report
## Summary
### Claude Opus reviewer
- Brief attributed summary.
### GPT-5.5 reviewer
- Brief attributed summary.
### Gemini reviewer
- Brief attributed summary.
## Agreements — high confidence
Findings flagged by at least two reviewers, especially when they cite the same root cause.
### [critical|high|medium|low] Title
- **Reviewers:** Claude Opus, GPT-5.5, Gemini as applicable
- **Evidence:** file/line/diff reference
- **Why it matters:** ...
- **Validation:** confirmed / partially confirmed / needs human context
- **Suggested direction:** brief guidance, not a full patch unless requested
## Disagreements
Findings where reviewers conflict or one reviewer’s concern appears weaker.
### [severity] Title
- **Reviewer views:** ...
- **My point of view:** which argument is stronger and why
- **Recommended disposition:** keep / downgrade / dismiss / needs discussion
## Unique Findings
Findings raised by only one reviewer but still credible.
### [severity] Title
- **Reviewer:** Claude Opus, GPT-5.5, or Gemini
- **Evidence:** ...
- **Why it matters:** ...
- **Validation:** ...
- **Suggested direction:** ...
## Final Recommendation
One of:
- **Approve**
- **Request changes**
- **Needs discussion**
Include a short rationale based on critical/high findings, confidence, and spec fidelity.
Use this guidance:
If there are so many criticals that the whole approach appears misguided, recommend abandoning or rethinking the PR rather than patching around it.
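The recommendation step can be sketched as a function of the validated severities. The mapping below is an assumption inferred from the severity definitions (critical and high findings block merge), not the skill's stated guidance; in particular `rethink_threshold` is a hypothetical stand-in for "so many criticals", which the text does not quantify.

```python
from __future__ import annotations


def recommend(
    severities: list[str],             # validated findings: "critical", "high", ...
    needs_human_context: bool = False,  # any finding that could not be confirmed
    rethink_threshold: int = 5,         # hypothetical cutoff, not from the skill
) -> str:
    """Return one of the three report outcomes, with a rethink note if needed."""
    criticals = severities.count("critical")
    if criticals >= rethink_threshold:
        return "Request changes (consider rethinking the approach)"
    if criticals > 0 or "high" in severities:
        return "Request changes"
    if needs_human_context:
        return "Needs discussion"
    return "Approve"
```

The returned label is only the headline; the report should still carry the short rationale covering confidence and spec fidelity.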
After the user reviews the report, guide the fix loop this way:
Only start implementing fixes after the user asks for them.
If the user asks for “grill me”, “quiz me”, “make sure I understand this PR”, or similar, run an understanding pass before approval.
In grill-me mode:
Example diagram format:
flowchart TD
    A[User action] --> B[New code path]
    B --> C[Database/API]
    C --> D[Result]
At the end of the report, offer to post findings as inline PR review comments via gh, but only after explicit confirmation.
Rules:
- […] COMMENT only.
- […] .md files or any other artifacts.
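Posting could look roughly like the sketch below, which prepares a single review through GitHub's "create a review for a pull request" REST endpoint via `gh api`. It assumes "COMMENT only" refers to the review event (as opposed to approving or requesting changes); the function name and argument shapes are illustrative. Nothing is built unless the user has confirmed, and the sketch itself does not run the command or write any files.

```python
from __future__ import annotations

import json
from typing import Optional


def build_review(
    owner: str,
    repo: str,
    number: str,
    summary: str,
    comments: list[tuple[str, int, str]],  # (file path, line in the diff, text)
    confirmed: bool,
) -> Optional[tuple[list[str], str]]:
    """Return (gh command, JSON body for stdin), or None without confirmation."""
    if not confirmed:
        return None
    payload = {
        "event": "COMMENT",  # never APPROVE or REQUEST_CHANGES
        "body": summary,     # the API requires a body for COMMENT reviews
        "comments": [
            {"path": path, "line": line, "side": "RIGHT", "body": text}
            for path, line, text in comments
        ],
    }
    command = [
        "gh", "api", "--method", "POST",
        f"repos/{owner}/{repo}/pulls/{number}/reviews",
        "--input", "-",  # read the JSON body from standard input
    ]
    return command, json.dumps(payload)
```

The body is passed on standard input, for example with `subprocess.run(command, input=body, text=True)`, so the review never touches the filesystem.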