skills/pr-review/SKILL.md
Decide whether a GitHub PR can be approved, then optionally write and submit the approval comment when the user explicitly authorizes it. Use when the user says "review this PR", "review PR
npx skillsauth add cvscarlos/ai-skills pr-reviewInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Use this skill when the user wants you to review a GitHub PR and tell them whether it is approvable. The user is the decision maker. You produce a clear verdict and the evidence behind it; the user reads, decides, and posts any comments themselves.
The default mode is review-only: produce a verdict and the evidence, but do not post comments or submit an approving review on your own. Approval is a separate, opt-in step — covered in the "Writing the Approval Comment" section near the end. Only act on it when the user explicitly tells you to approve.
A review is useful when it changes the user's confidence about merging — either by surfacing something they would have missed, or by confirming the PR is safe with concrete evidence. Anything that doesn't change their decision is noise.
The output is for a human reader who already knows the codebase. Skip generic process language ("I fetched the PR", "I read the diff"), skip rubber-stamp praise, skip checklists pasted verbatim. Lead with the verdict, then justify it.
Local state lies. The branch may have been force-pushed, comments may have been added, checks may have just failed. Before reviewing, refresh:
git fetch origin — make sure the local ref reflects the remote.git pull --ff-only (or compare local HEAD vs the PR head SHA from GitHub).gh CLI over the GitHub MCP whenever possible. The MCP returns large JSON blobs that burn context tokens quickly; gh lets you specify exact fields and pipe through jq or head to keep things lean. Reach for the MCP only when you need data gh doesn't expose cleanly — threaded review-comment metadata (isResolved, isOutdated), complex code/issue search, or GraphQL-level fields.Useful commands:
# Lean PR overview — name only the fields you need
gh pr view <num> --json title,body,state,mergeable,mergeStateStatus,additions,deletions,changedFiles,commits,reviews,statusCheckRollup,headRefOid
# General timeline comments
gh pr view <num> --comments
# Full diff
gh pr diff <num>
For threaded review comments where you need resolution status (isResolved, isOutdated), the gh CLI does not expose those fields cleanly — use the GitHub MCP get_review_comments tool for those specifically. When using gh --json, always specify only the fields you need; for large outputs, pipe through head or jq to keep context usage in check.
If the user gives you a PR URL but the branch isn't checked out locally, that's fine — review against gh pr diff plus targeted gh api reads for files. Don't refuse for lack of a local checkout.
For every comment thread on the PR, decide one of:
When listing outstanding comments back to the user, group them by severity and use this table format so the user can navigate each thread by appending the Discussion ID to the PR's base URL:
| Column | Description |
| --------------- | --------------------------------------------------------------------------------- |
| # | Sequential number |
| Discussion ID | The rXXXXXXXXXX identifier from the comment URL (used to build a browser link) |
| File | Filename (without full path) |
| Line | Line number |
| Issue | Brief description of the issue |
Example:
## 🔴 Blocking — 2 comments
| # | Discussion ID | File | Line | Issue |
| --- | ------------- | --------------- | ---- | ------------------- |
| 1 | `r1234567890` | `foo.js` | 180 | Missing null check |
| 2 | `r1234567891` | `bar.js` | 220 | Incorrect parameter |
**Base URL:** `https://github.com/{owner}/{repo}/pull/{number}#discussion_`
The user clicks any thread by appending its Discussion ID to the Base URL.
Tests are the most common place where a PR looks fine but isn't. For every new or modified test, ask the four questions below. A passing test suite is not the same as a tested change.
Check whether main or development is the default for this repo. If the test passes on both branches, it isn't actually testing the new behavior — it's decoration. Run it against the default branch when feasible, or read the diff of the code-under-test and reason about whether the assertion depends on the change.
If the same code path is covered at another layer with different inputs, an additional same-shape test adds maintenance cost without catching new bugs. Flag duplicates.
If the mock dictates the answer the assertion checks, the test only proves the mock was wired up. The classic shape: mock a function to return X, then assert the caller returns X. That proves nothing about the production code path.
A test that makes sense today, with full PR context, may be a maintenance burden once that context is gone. The author knows why every assertion exists right now; the maintainer reading it next year will not. Flag tests that look like they will age badly:
The lens to apply: imagine the PR's context is gone. A maintainer six months from now reads only the test name and body. Do they understand what behavior is being protected, and is that behavior worth protecting? If not, flag it.
Use these as internal discipline, not as a checklist to recite back to the user.
When you list findings, label each one so the user can decide which deserve a comment. Adopt these consistently — they map cleanly to how reviewers think.
Don't label nits as blocking. Don't bury blockers in a "suggestions" list. The label is the user's signal for what to actually post.
Return the review in this order. Skip sections that are empty rather than padding them.
One line. One of:
2–4 bullets of concrete evidence. Specific files, behaviors, code paths, test results, or CI status. No generic praise. If you say "the change is correct", say which change in which file and what makes it correct.
Use the comment table format from the "Triage Existing Comments" section above. Group by severity if more than a few.
For each 🔴 blocker, write the exact text the user can paste into GitHub. Match the user's PR comment style: terse, concrete, no filler. The user posts these — you do not.
A short, concrete alternative if there is one worth raising — even on an approve. Skip if the current approach is fine. Examples worth raising: a simpler structure, a different boundary, a name that would age better, a missing test that's cheap to add. Don't list rewrites for the sake of having an opinion.
If the verdict is Approve or Approve with non-blocking notes and the user explicitly tells you to approve, follow the "Writing the Approval Comment" section below. Otherwise stop after the verdict — the user posts any comments themselves.
Only do any of this once the user has explicitly authorized the approval after seeing the verdict. The default mode is still review-only.
Before drafting the body, internally confirm:
Do not paste this list into the approval body — it's only there to decide whether approval is appropriate.
Write for the PR author and human reviewers, not as a report that an AI followed review instructions.
The author spent days in this code; you spent minutes. They chose the boundaries and made the trade-offs deliberately, so a bullet that endorses their decision — "catching it here is the right boundary", "the split maps cleanly to the two failure modes", "suppressing under ?draft=true is intended" — hands them nothing they don't already have. It reads as grading their homework. And "is intended" or "is the right call" is you inferring intent you can't see: in review you know what the author wrote, not what they were thinking.
What earns a place in the body is work the author couldn't have done for themselves:
[email protected]", "the onError signature in [email protected] takes a 4th arg, so meta resolves", "cross-checked against the handleServerError helper", "matches upstream PR #1674".The test for every bullet: could the author have written this without me? If yes — it's their decision, their rationale, or a restatement of their diff — cut it. If it took going outside the diff to know it (running it, checking a dependency version, reading adjacent code, hitting the live app), keep it.
Before — every bullet validates a decision the author already made (all four are cuttable):
Approved.
- Catching the config-load errors in the `config` property is the right boundary — exactly where these escaped before.
- Splitting the two error types maps cleanly to the two real failure modes.
- Suppressing the notification under `?draft=true` is intended: an admin previewing their own draft shouldn't be paged.
- Regression tests exercise each distinct branch.
After — each bullet reports something checked, not something endorsed:
Approved.
- Ran the 5 new regression tests locally; all pass, each covering a distinct branch (unpublished, invalid payload, lazy-settings failure, draft-skip, registration route).
- Confirmed the escaped errors reached the error tracker unhandled before this — the new catch sits on the only path that hits them (`setup()`, before `dispatch`).
- Cross-checked the draft-skip: `?draft=true` is the sole caller that suppresses, so non-draft failures still notify as before.
The PR author reads the approval cold — they don't see the chat history that led to the verdict. A bullet that only makes sense in that hidden context reads as filler, or as preemptive defense of a concern nobody raised.
Pass each bullet through this test:
If someone reads only the PR (no chat history with you), does this sentence give them new information that justifies the approval?
If no, cut it. Specifically:
<div> inside <label> was an issue and decided it wasn't, don't pre-empt that on the PR — readers will wonder why you brought it up.When in doubt: did anyone on the PR raise this, or is it leaking from your private investigation? If the latter, it doesn't belong in the body.
The approval body justifies the approval. It is not a place to enumerate every non-blocking observation.
If a bullet starts with "minor:", "nit:", "as a follow-up:", or "non-blocking:" — move it to a line comment on the relevant file, then approve. The default for non-blocking notes is line comment, not approval body. This applies even when the observation is genuinely useful; usefulness is the bar for posting it, not for putting it in the approval body.
Exceptions that genuinely belong in the body:
Anything else that's "optional, suggestion, or feedback" belongs on a file line, not in the approval body.
When you post a line comment as part of an approval flow, start the body with AI suggestion: on its own line, then a blank line, then the actual comment text. Don't lead the first sentence with the phrase. This keeps AI-authored suggestions visually distinct from human review feedback so the author can weigh them accordingly.
AI suggestion:
This fires for every `option.is_active`, but the target only exists for the currently selected option …
The prefix applies to line comments only — the approval body is already attributed to the reviewer in the GitHub UI.
Reviewers skim. A wall of text hides the conclusion and makes the evidence hard to verify at a glance.
Default structure, always: a one-line lead, a blank line, then bullets — never a single flowing paragraph. Even a short approval takes the lead-plus-bullets shape. A prose paragraph that strings five facts together with commas and "and" is the single most common way an otherwise-good approval becomes unreadable — if you catch yourself writing one, that's the signal to split it into bullets. And break to a new line after every sentence-ending ., ?, or !, because GitHub collapses consecutive sentences into one wrapped block and the structure you intended disappears.
— because… clause, or an "and" stitching two facts together to make the point, it's too deep for the approval body — cut it to its core claim or move the detail to a line comment. The author can open the diff for the rest; the body exists to state the verdict and the one or two things that justify it.code is for one short identifier. If you'd need three or more inline-code spans in a sentence, switch to bullets (one identifier per bullet) or a fenced block.., ?, or !. GitHub markdown renders consecutive sentences as one wrapped block — hard to scan. (Periods inside identifiers like option.is_active, abbreviations like e.g., or version numbers don't count — only sentence-ending punctuation.)Anti-pattern (wall of text):
Approved. The mechanical swap from `/regex/.test(input)` to `isValidFoo()` is consistent across all interfaces, and the divergent paths are intentional per the PR description: acme keeps its existing auth carve-out, while globex intentionally always validates…
Better (lead + bullets):
Approved.
- Mechanical swap from `/regex/.test(input)` to `isValidFoo()` is consistent across all interfaces.
- Divergent paths are intentional per the PR description: acme keeps the auth carve-out, globex intentionally always validates.
- Non-blocking JSDoc cleanup can land as a follow-up.
Each bullet is a headline — trim the paragraph down to its claim. A reviewer who wants the mechanism opens the diff.
Too deep (each bullet is a multi-sentence paragraph — the most common way an approval gets bloated):
Approved.
- The validation swap is the correct fix: `validateInput()` now runs on every interface, so malformed payloads are rejected before they reach the mapper, while the legacy `/regex/` path is fully removed. Confirmed no callers still reference the old `rawInputCheck` helper.
- Legacy records are handled by normalize-on-read rather than a backfill. The check only fires on the old payload shape, and the writer now emits the canonical shape, so new writes can't re-trigger it.
- Non-blocking, follow-up: `normalizeLegacyShape` runs on every read path (4 call sites), and the records carry no version marker to self-heal. Worth normalizing once at load.
Better (headlines; the follow-up moves to a line comment):
Approved.
- Validation swap is the canonical fix: `validateInput()` rejects malformed payloads before the mapper.
- Legacy records normalize-on-read; the new write shape can't re-trigger the old path.
- Swap is complete — no callers reference the old `rawInputCheck` helper.
Short approval with evidence:
Approved.
- <specific behavior/design> is correct and fits the existing flow.
- <targeted test/check> passes.
- <known unrelated warning/failure> is pre-existing.
Approval with a coordination or material caveat:
Approved.
- <specific reason this is safe to approve>.
- <coordination/material caveat: e.g. merge after #1234, related migration in #1235, pre-existing red check>.
Approval after deeper investigation:
Approved. Verified <risk area> against <source of truth>:
- <evidence 1: behavior, data, or code path>
- <evidence 2: unaffected flow or edge case>
- <evidence 3: checks/tests>
Genuine one-liner — only when there's truly one point:
LGTM — the new validation is scoped to authenticated routes; the public endpoints are unchanged.
Padded with non-blocking nits (anti-pattern — move these to line comments instead):
Approved.
- Core change looks correct.
- Targeted test passes.
- Minor: JSDoc could be tidied as a follow-up.
- Nit: variable name on line 42 could be clearer.
- Non-blocking: consider extracting the helper into a util.
Too process-heavy (anti-pattern):
Approved. I reviewed the latest PR head from GitHub, checked existing review threads, and verified tests.
Too generic (anti-pattern):
Approved. The changes look consistent with the PR goal, and I do not see any blocking issues. Nice to move forward.
If line comments are part of the flow, post them first so the approval review wraps everything together. Then submit:
gh pr review <num> --approve --body "$(cat <<'EOF'
Approved.
- <bullet 1>
- <bullet 2>
EOF
)"
Use --body-file <path> instead of inline heredoc when the body lives in a file already. Never submit the approval before the user has clearly authorized it for this PR — prior authorization on a different PR does not carry over.
These are the patterns that cause a PR review to mislead:
resolved threads — the marker means someone clicked a button, not that the fix is correct.mock.return_value = X; assert result == X is not a test.A clean approve:
**Verdict:** Approve.
**Why:**
- The mapper change in `src/foo/mapper.ts` correctly handles nested input fields and the new test exercises the previously-broken path.
- Existing flow is unchanged; verified by reading `src/foo/handler.ts` against the diff.
- CI green; no unrelated failures.
**What I'd do differently:** Nothing material — the structure fits the existing pattern.
Hold for clarification:
**Verdict:** Hold.
**Why:**
- The new `acceptedFoo` filter changes the calculation in `barService.ts:209-218`, but it's unclear whether partial-acceptance entries should still be eligible. The PR description doesn't say.
**Critical issues to surface:**
> What's the intended behavior when `acceptedQty < expectedQty`? The current code treats partial as full — was that intentional?
Don't approve:
**Verdict:** Don't approve yet.
**Why:**
- 🔴 The `lookupFoo` test mocks the SQL client to return the exact shape the assertion checks, so the test passes regardless of the production query (`tests/lookup.test.ts:42`).
- 🔴 Existing comment at `r1234567890` about null-handling at `foo.ts:180` is not addressed in the current diff.
**Critical issues to surface:**
> The new test at `tests/lookup.test.ts:42` is tautological — the mock dictates the assertion. Suggest replacing with an integration test that hits the real query path, or removing the test if coverage already exists at the integration layer.
> The null-handling concern from r1234567890 still applies — `item.barQty` can be `undefined` for child entries, and `foo.ts:180` will throw. The diff doesn't change that path.
tools
Writes human-friendly Markdown documentation for non-developer audiences — Implementation, product, project, CS, support, QA, ops, business analysts. Explains how a feature works, how to set it up or configure it for a client, how it ties to business rules and the product roadmap — for technical readers who don't read code. Code blocks appear only when the reader will copy, paste, send, or recognize them (embed snippets, config values, sample payloads) — not to explain how the system is built. Use when asked to "document this for the implementation team", "write a guide for product / PM / CS", "explain how X works for non-devs", or for setup guides, runbooks, FAQs, hand-off docs aimed at internal non-dev teams. Also use when adding or editing a Markdown file inside the repo's `docs/` folder for a non-developer audience. Do NOT use for API reference, developer READMEs, code-level docs (JSDoc, docstrings), or framework / testing guides — those are different docs.
development
Run a Socket.dev supply-chain check before installing, updating, or executing any npm package. Triggers on `npm install/i/add/update/upgrade <pkg>`, `yarn add/upgrade/dlx`, `pnpm add/install/update/dlx`, `bun add/install`, `npx`, `pnpx`, `bunx`, or any phrase that adds/bumps a Node dependency ("let's use <pkg>", "install X", "bump <pkg>"), including direct `package.json` edits. Use this skill EVEN IF the user did not ask — npm typosquats, malware, and malicious postinstall scripts are common, and one extra check beats days of cleanup. Checks Socket score, malware verdicts, install scripts, capabilities, CVEs, and maintainer trust, then decides PROCEED, WARN, or ABORT.
development
Generates a comprehensive handover document for another AI agent to seamlessly continue work on a task without prior context. Use when asked to "handover", "save state", "switch providers", "switch to claude/codex/opencode/cursor", or prepare a summary for the next AI session. Also use when moving work between directories of the same repository.
testing
Create, edit, improve, or audit AgentSkills. Use when creating a new skill from scratch or when asked to improve, review, audit, tidy up, or clean up an existing skill or SKILL.md file. Also use when editing or restructuring a skill directory (moving files to references/ or scripts/, removing stale content, validating against the AgentSkills spec). Triggers on phrases like "create a skill", "author a skill", "tidy up a skill", "improve this skill", "review the skill", "clean up the skill", "audit the skill".