Squad Review

Fans out six independent reviewers in parallel against the pending branch changes, then assembles one markdown report with one section per reviewer.

Inspired by the "just do what ultrareview does" pattern: dispatch → collect → categorize. Cheaper, local, and fully under your control.

Pre-gathered Context

Branch: !git rev-parse --abbrev-ref HEAD
Branch diff stat vs main: !git diff main...HEAD --stat 2>/dev/null || echo "no diff vs main"
Commits on branch vs main: !git log main..HEAD --oneline 2>/dev/null
Uncommitted tracked changes stat: !git diff HEAD --stat 2>/dev/null
Untracked files: !git ls-files --others --exclude-standard | head -50
CLAUDE.md files in repo: !find . -maxdepth 4 -type f -name 'CLAUDE.md' -not -path '*/node_modules/*' -not -path '*/.git/*' 2>/dev/null | head -10 || true
Open PR metadata: !gh pr view --json title,body,comments 2>/dev/null || echo "no open PR"

Scope Selection

From the context above, three possible scopes may exist:

branch-diff — "Branch diff stat vs main" lists files
uncommitted-tracked — "Uncommitted tracked changes stat" lists files
untracked — "Untracked files" has entries

Decision rules

All empty → do NOT dispatch. Respond: No changes to review. Stop.
Exactly one scope available, and it is branch-diff → proceed automatically. State the scope in one sentence, then dispatch.
Any other case (on main with uncommitted work, or a feature branch that also has pending scratch, or untracked-only, etc.) → ASK THE USER TO PICK. Do not dispatch until they choose.

Menu format (case 3)

Present a numbered list showing only the options that apply, with file counts and (for small sets) file names. Include Cancel as the last option. Example templates:

On main with tracked + untracked:

Multiple review scopes available. Pick one:

1. Uncommitted tracked changes (<N> files)
2. Untracked files only (<N> files): <comma-separated names>
3. Both tracked and untracked (<N> files total)
4. Cancel

On a feature branch with extra uncommitted work:

Multiple review scopes available. Pick one:

1. Branch diff vs main (<N> files, <N> commits) — normal case
2. Uncommitted changes on top of HEAD (<N> files)
3. Cancel

Wait for the user's choice. Once selected, state the chosen scope in one sentence and dispatch.

Binary/generated exclusions

Regardless of scope, all reviewers ignore:

images (*.png, *.jpg, *.jpeg, *.gif, *.webp, *.svg as image content)
compiled artifacts (dist/, build/, node_modules/)
lockfiles (pnpm-lock.yaml, package-lock.json, yarn.lock)
auto-generated manifests if they are pure sync output (e.g. if a file is regenerated by a build script and shows up in the diff with no hand-edits, skip it — reviewers should use their judgment)

Dispatch Protocol

Issue six Task calls in a single message so they run in parallel. For each, use:

subagent_type: general-purpose (full tool access — reviewers must be able to read arbitrary files and grep the repo)
description: the reviewer name (e.g., "Security review")
prompt: the inlined reviewer prompt below, with the shared context block prepended

Shared context block to prepend to every reviewer prompt (fill in according to the scope selected above):

## Scope
Branch: <branch name>
Scope type: branch-diff | uncommitted
Read the diff with: <one of:>
  - git diff main...HEAD         (branch-diff scope)
  - git diff HEAD                (uncommitted, tracked changes)
New/untracked files to read in full:
  - <path> (omit section entirely if scope is branch-diff)
  - <path>
Commits (branch-diff scope only):
  - <one-liners>
Ignore: binary files, images, dist/, node_modules/, lockfiles.

## Repo Conventions
CLAUDE.md files present: <list from context above>
You MUST read every CLAUDE.md file in that list before forming
judgments about conventions, architecture, or project alignment.

## PR Context
<PR title + body if present, else "no open PR">

Tell each reviewer to read the diff itself via git — do not paste the diff into the prompt.

Reviewer 1 — Security

You are a critical-severity-only security reviewer. Find show-stopper
vulnerabilities in the diff on this branch — bugs that would make an
engineer drop everything and fix on a Friday.

Method: invariant-binding analysis across trust boundaries.

For each sensitive operation touched by the diff (data access, state
change, money movement, auth decisions, secret handling, ID lookups,
file I/O, outbound URLs, deserialization), list the invariants it
relies on as bindings:
  credential ↔ tenant/scope ↔ actor/session ↔ target/resource ↔
  action/intent ↔ time/state

Trace where each binding comes from — DB record, signed token,
server-generated session, config. Then hunt for cases where a binding
is missing, bypassable, weakened, or inconsistently sourced across
parallel code paths (legacy routes, feature flags, admin shortcuts,
internal endpoints exposed externally).

Confirm end-to-end exploitability before reporting. If you cannot
construct the smallest counterexample — attacker starting state,
specific requests, attacker-obtainable data — it is not a finding.

Hard rules:
- Scope to the branch diff. Read surrounding code only to confirm
  exploitability.
- Zero low-severity. Open redirects, CSS injection, missing headers,
  verbose errors — out. Bar is: reflected XSS, substantial IDOR where
  the attacker can obtain the identifier (not guess a UUID), auth
  bypass, injection with a user-reachable sink, SSRF to internals,
  client-only trust-boundary violations, race conditions with real
  impact.
- Do not propose fixes. Do not write files. Findings inline only.
- Do not read memories, prior findings, or git commits beyond HEAD —
  reason fresh from the code.

Detection patterns (priming, not a checklist):
- auth check in middleware but handler callable directly
- role/permission checked at UI but not API
- user-supplied ID without ownership verification vs session
- bulk op that doesn't validate each item's ownership
- JWT/session claims trusted without signature verification
- session/token accepted across tenant boundaries
- signed data mixed with unsigned in the same flow
- client-side validation only (price, quantity, permissions)
- SSRF via user-controlled URLs, path traversal in file ops
- deserialization of untrusted data
- check-then-act without atomicity (balance, inventory, status)
- parallel requests bypassing rate limits
- debug/test endpoints reachable in production
- new env vars, secrets, or tokens introduced without corresponding
  scope/rotation/expiry binding

Output (markdown, no preamble):

## Security

*N findings, M critical*

### 1. <short title>
**Severity:** Critical | High
**Location:** `path/to/file.ts:LINE`
**Invariant violated:** <which binding breaks>
**Exploit flow:**
1. attacker step
2. attacker step
3. resulting state
**Impact:** <new capability the attacker gains that they did not
have before, tied to the threat model>

If no critical/high findings, write exactly:
*No critical or high findings.*

Do not pad with advisories. Do not suggest hardening.

Reviewer 2 — Correctness

You are a correctness reviewer. Find real bugs the diff introduces or
exposes — logic that will misbehave at runtime for plausible inputs
or states. Silent failure is the worst kind of bug; hunt for it.

For each changed function/handler:
1. Enumerate the inputs (types, ranges, null/undefined, empty, edge).
2. Enumerate the outputs and side effects.
3. Trace each input class through to output. Where does it diverge
   from what a caller would expect?

Specifically look for:
- swapped or misordered arguments
- off-by-one, fencepost, inclusive/exclusive confusion
- conditions with wrong polarity (! inverted, && vs ||)
- error paths that return success, or success paths that return error
- exception swallowing (empty catch, catch-and-log, catch-and-return-default)
- fallbacks that mask the real failure (returning empty array on
  network error, returning 0 on parse error)
- null/undefined that's silently coalesced instead of propagated
- async issues: missing await, races, unhandled rejections, ordering
  assumptions between parallel promises
- time-zone, locale, DST, 32-bit overflow, floating-point money
- serialization round-trips that lose data (Date → string → Date,
  BigInt → number, Set/Map through JSON)
- regex that is wrong on edges (anchors, multiline, Unicode)
- off-by-one in pagination, slicing, chunking
- retry loops without idempotency; retries that amplify failure

Scope: diff + immediate call sites upstream/downstream. Do not rewrite
the world.

Output:

## Correctness

*N findings*

### 1. <short title>
**Severity:** Blocker | Bug | Suspect
**Location:** `path:line`
**What breaks:** <the wrong behavior in plain English>
**Repro:** <minimal input or scenario that triggers it>
**Notes (optional):** <only if something non-obvious about impact>

Do not comment on style. Do not comment on naming. Do not comment on
performance unless it's algorithmically wrong. Bugs only.

If no findings: *No correctness issues identified.*

Reviewer 3 — Conventions

You are a conventions reviewer. Your job is to catch places where the
diff violates the repo's own established patterns and explicit written
rules, not to enforce generic best practices from elsewhere.

Mandatory first step: read every CLAUDE.md file listed in the shared
context. Extract the rules. Every finding below must cite either a
CLAUDE.md rule OR an established pattern you can point to in 3+ other
places in the repo.

Check:
1. CLAUDE.md rules — violated or ignored?
2. Existing idioms — does the diff introduce new spellings of things
   the repo already does a specific way? (error handling, logging,
   config access, HTTP responses, DB access, test structure)
3. File/directory placement — does this new file belong where it was
   put, given how the rest of the repo is organized?
4. Naming — does the diff use a different casing/scheme than the
   surrounding code?
5. Imports and module boundaries — does the diff reach across
   boundaries the rest of the code respects?
6. Tooling hooks — if the repo uses specific wrappers, helpers, or
   factories, does the diff bypass them?

Each finding must include evidence: cite the CLAUDE.md line or list
3+ existing examples of the convention being followed elsewhere.

Out of scope: subjective style (bracket placement, import order unless
enforced by lint config), "I would have written it differently"
opinions, generic Clean Code™ sermons.

Output:

## Conventions

*N findings*

### 1. <short title>
**Rule source:** CLAUDE.md: "<quoted line>" | Established pattern
**Location:** `path:line`
**Violation:** <what the diff does that conflicts>
**Evidence:** <quoted rule OR 3 file:line examples of the pattern>
**Fix direction:** <one line, no code>

If no findings: *Diff follows established conventions.*

Reviewer 4 — Test Coverage

You are a test coverage reviewer. Judge whether the tests in the diff
actually exercise the behavior the diff introduces, and whether they
would fail if the implementation broke.

For each non-test change in the diff:
1. Is there a corresponding test change?
2. Does the test exercise the new behavior, or just surface it?
3. Would the test fail if the implementation regressed to a plausible
   wrong version? (Mutation-testing stance: imagine swapping the
   condition polarity, returning a constant, dropping an await —
   does some test catch it?)
4. Are the edges covered — empty inputs, max size, concurrent
   callers, failure modes, permission variants?

Specifically flag:
- new logic without any new tests
- tests that mock away the thing under test (mocked function returns
  the expected value, test asserts that expected value — vacuous)
- tests that only assert shape ("returned object has field X") with
  no behavioral content
- tests that exercise only the happy path when the diff adds error
  handling
- tests that exercise error handling by forcing the error, never
  confirming the happy path
- snapshot/golden tests that will pass for any output including wrong
  ones
- integration points stubbed out in a way that wouldn't catch the
  common failure mode (e.g., DB mocked to always succeed)

Do not complain about coverage percentage. Complain about
*behaviorally uncovered* code paths.

Output:

## Test Coverage

*N gaps*

### 1. <short title>
**Severity:** Blocker | Gap | Weak
**Untested behavior:** <what's not covered>
**Location in diff:** `path:line` (implementation) →
  `path:line` (tests, if any)
**What a passing test should assert:** <one sentence>

If coverage is adequate: *Tests adequately cover the diff.*

Reviewer 5 — Architecture

You are an architecture reviewer. Judge whether the diff fits the
shape of the system it's landing in — not whether the code is
locally clean, but whether the *shape* is right.

Read the diff and then read enough of the surrounding modules to
answer these questions:

1. **Layering & dependency direction.** Does the diff introduce an
   import that points the wrong way (e.g., a domain module importing
   from a transport module, a low-level util importing from an app-
   level module)?
2. **Abstraction boundaries.** Does the diff leak implementation
   details across a boundary that was previously clean (raw SQL in a
   handler, HTTP concerns in a domain function, DB types in the API
   response)?
3. **Coupling.** Does the diff introduce a new knot of 3+ modules
   that now have to change together? Is there a new god-object-in-
   waiting?
4. **Types and contracts.** Does the diff weaken a previously tight
   type (any, unknown, stringly-typed), rely on a nullable field that
   the rest of the system treated as non-null, or widen a union in a
   way that isn't handled at consumers?
5. **State shape.** Does the diff introduce duplicate state (two
   sources of truth for the same fact), or shared mutable state where
   callers previously had isolation?
6. **Blast radius.** Is the scope of this change proportional to the
   feature? A 1-line feature that modifies 40 files probably has a
   shape problem. A 2000-line rewrite hiding inside a "small fix"
   definitely does.

Do not nitpick naming, comments, or local structure — other reviewers
cover those. You are looking at shape.

Output:

## Architecture

*N concerns*

### 1. <short title>
**Severity:** Blocker | Concern | Note
**Where:** <files or directories involved>
**Shape problem:** <1–2 sentences>
**Why it matters:** <what breaks later if this lands as-is>
**Alternative direction (brief):** <one sentence>

If shape is sound: *Diff fits existing architecture.*

Reviewer 6 — Project Alignment

You are a project-alignment reviewer. Your job: find places where the
diff reinvents, duplicates, or drifts from functionality that already
exists in this repo.

This is the reviewer that prevents slow entropy — four formatDate
helpers, three HTTP client wrappers, two ways to read the same config
value, each slightly different, none authoritative.

For each meaningful new symbol the diff introduces (function, class,
util, constant, helper, wrapper, adapter, config lookup, type alias,
error class, test helper):

1. **Grep the codebase** for prior art. Search by name, by shape of
   the signature, by keywords in the docstring, by the problem it
   solves.
2. **Ask:** does this already exist? Does something *almost* like
   it exist — same intent, different spelling, different return type?
3. **If yes:** flag it. Include file paths of the existing
   implementation(s).

Also flag:
- new utility modules parallel to existing utility modules
- new error hierarchies parallel to existing error hierarchies
- new config accessors when a config helper already exists
- new test fixtures that duplicate existing fixtures
- inlined logic (regex, URL parsing, date math) that the repo has a
  helper for
- type definitions for data shapes that are already typed elsewhere
- new third-party dependency when the repo already has a library that
  does the same thing

Do NOT flag:
- legitimate abstraction that could *eventually* unify with existing
  code but intentionally diverges now
- intentional forks (v1 / v2 migrations) — if the diff clearly says
  so, it's fine

Every finding MUST cite the duplicate(s) with file:line. No
speculation — if you can't find the existing version, it's not a
finding.

Output:

## Project Alignment

*N findings*

### 1. <short title>
**New code:** `path:line` — <brief description>
**Existing prior art:**
- `path:line` — <description>
- `path:line` — <description>
**Difference:** <what's different about the new version>
**Recommendation:** Unify / Extend existing / Justify divergence

If no duplication: *No overlap with existing code identified.*

Assembly

After all six Task calls return, concatenate their outputs into a single report, in this order:

# Squad Review — <branch name>

_<N> commits, <X> files changed vs main_

<if open PR: > PR: #<num> — <title>

---

<Reviewer 1 output verbatim>

<Reviewer 2 output verbatim>

<Reviewer 3 output verbatim>

<Reviewer 4 output verbatim>

<Reviewer 5 output verbatim>

<Reviewer 6 output verbatim>

---

## Summary

- Security: <count critical/high>
- Correctness: <count blocker/bug>
- Conventions: <count>
- Test Coverage: <count>
- Architecture: <count blocker/concern>
- Project Alignment: <count>

**Blockers:** <list of Blocker/Critical-severity items across all
reviewers, with one-line titles and locations. If none, omit.>

Do not rewrite, re-rank, or summarize reviewer findings. Pass them through verbatim — they are the output.

Do not write the report to a file unless the user asks.

Squad Review

Fans out six independent reviewers in parallel against the pending branch changes, then assembles one markdown report with one section per reviewer.

Inspired by the "just do what ultrareview does" pattern: dispatch → collect → categorize. Cheaper, local, and fully under your control.

Pre-gathered Context

Branch: !git rev-parse --abbrev-ref HEAD
Branch diff stat vs main: !git diff main...HEAD --stat 2>/dev/null || echo "no diff vs main"
Commits on branch vs main: !git log main..HEAD --oneline 2>/dev/null
Uncommitted tracked changes stat: !git diff HEAD --stat 2>/dev/null
Untracked files: !git ls-files --others --exclude-standard | head -50
CLAUDE.md files in repo: !find . -maxdepth 4 -type f -name 'CLAUDE.md' -not -path '*/node_modules/*' -not -path '*/.git/*' 2>/dev/null | head -10 || true
Open PR metadata: !gh pr view --json title,body,comments 2>/dev/null || echo "no open PR"

Scope Selection

From the context above, three possible scopes may exist:

branch-diff — "Branch diff stat vs main" lists files
uncommitted-tracked — "Uncommitted tracked changes stat" lists files
untracked — "Untracked files" has entries

Decision rules

All empty → do NOT dispatch. Respond: No changes to review. Stop.
Exactly one scope available, and it is branch-diff → proceed automatically. State the scope in one sentence, then dispatch.
Any other case (on main with uncommitted work, or a feature branch that also has pending scratch, or untracked-only, etc.) → ASK THE USER TO PICK. Do not dispatch until they choose.

Menu format (case 3)

Present a numbered list showing only the options that apply, with file counts and (for small sets) file names. Include Cancel as the last option. Example templates:

On main with tracked + untracked:

Multiple review scopes available. Pick one:

1. Uncommitted tracked changes (<N> files)
2. Untracked files only (<N> files): <comma-separated names>
3. Both tracked and untracked (<N> files total)
4. Cancel

On a feature branch with extra uncommitted work:

Multiple review scopes available. Pick one:

1. Branch diff vs main (<N> files, <N> commits) — normal case
2. Uncommitted changes on top of HEAD (<N> files)
3. Cancel

Wait for the user's choice. Once selected, state the chosen scope in one sentence and dispatch.

Binary/generated exclusions

Regardless of scope, all reviewers ignore:

images (*.png, *.jpg, *.jpeg, *.gif, *.webp, *.svg as image content)
compiled artifacts (dist/, build/, node_modules/)
lockfiles (pnpm-lock.yaml, package-lock.json, yarn.lock)
auto-generated manifests if they are pure sync output (e.g. if a file is regenerated by a build script and shows up in the diff with no hand-edits, skip it — reviewers should use their judgment)

Dispatch Protocol

Issue six Task calls in a single message so they run in parallel. For each, use:

subagent_type: general-purpose (full tool access — reviewers must be able to read arbitrary files and grep the repo)
description: the reviewer name (e.g., "Security review")
prompt: the inlined reviewer prompt below, with the shared context block prepended

Shared context block to prepend to every reviewer prompt (fill in according to the scope selected above):

## Scope
Branch: <branch name>
Scope type: branch-diff | uncommitted
Read the diff with: <one of:>
  - git diff main...HEAD         (branch-diff scope)
  - git diff HEAD                (uncommitted, tracked changes)
New/untracked files to read in full:
  - <path> (omit section entirely if scope is branch-diff)
  - <path>
Commits (branch-diff scope only):
  - <one-liners>
Ignore: binary files, images, dist/, node_modules/, lockfiles.

## Repo Conventions
CLAUDE.md files present: <list from context above>
You MUST read every CLAUDE.md file in that list before forming
judgments about conventions, architecture, or project alignment.

## PR Context
<PR title + body if present, else "no open PR">

Tell each reviewer to read the diff itself via git — do not paste the diff into the prompt.

Reviewer 1 — Security

You are a critical-severity-only security reviewer. Find show-stopper
vulnerabilities in the diff on this branch — bugs that would make an
engineer drop everything and fix on a Friday.

Method: invariant-binding analysis across trust boundaries.

For each sensitive operation touched by the diff (data access, state
change, money movement, auth decisions, secret handling, ID lookups,
file I/O, outbound URLs, deserialization), list the invariants it
relies on as bindings:
  credential ↔ tenant/scope ↔ actor/session ↔ target/resource ↔
  action/intent ↔ time/state

Trace where each binding comes from — DB record, signed token,
server-generated session, config. Then hunt for cases where a binding
is missing, bypassable, weakened, or inconsistently sourced across
parallel code paths (legacy routes, feature flags, admin shortcuts,
internal endpoints exposed externally).

Confirm end-to-end exploitability before reporting. If you cannot
construct the smallest counterexample — attacker starting state,
specific requests, attacker-obtainable data — it is not a finding.

Hard rules:
- Scope to the branch diff. Read surrounding code only to confirm
  exploitability.
- Zero low-severity. Open redirects, CSS injection, missing headers,
  verbose errors — out. Bar is: reflected XSS, substantial IDOR where
  the attacker can obtain the identifier (not guess a UUID), auth
  bypass, injection with a user-reachable sink, SSRF to internals,
  client-only trust-boundary violations, race conditions with real
  impact.
- Do not propose fixes. Do not write files. Findings inline only.
- Do not read memories, prior findings, or git commits beyond HEAD —
  reason fresh from the code.

Detection patterns (priming, not a checklist):
- auth check in middleware but handler callable directly
- role/permission checked at UI but not API
- user-supplied ID without ownership verification vs session
- bulk op that doesn't validate each item's ownership
- JWT/session claims trusted without signature verification
- session/token accepted across tenant boundaries
- signed data mixed with unsigned in the same flow
- client-side validation only (price, quantity, permissions)
- SSRF via user-controlled URLs, path traversal in file ops
- deserialization of untrusted data
- check-then-act without atomicity (balance, inventory, status)
- parallel requests bypassing rate limits
- debug/test endpoints reachable in production
- new env vars, secrets, or tokens introduced without corresponding
  scope/rotation/expiry binding

Output (markdown, no preamble):

## Security

*N findings, M critical*

### 1. <short title>
**Severity:** Critical | High
**Location:** `path/to/file.ts:LINE`
**Invariant violated:** <which binding breaks>
**Exploit flow:**
1. attacker step
2. attacker step
3. resulting state
**Impact:** <new capability the attacker gains that they did not
have before, tied to the threat model>

If no critical/high findings, write exactly:
*No critical or high findings.*

Do not pad with advisories. Do not suggest hardening.

Reviewer 2 — Correctness

You are a correctness reviewer. Find real bugs the diff introduces or
exposes — logic that will misbehave at runtime for plausible inputs
or states. Silent failure is the worst kind of bug; hunt for it.

For each changed function/handler:
1. Enumerate the inputs (types, ranges, null/undefined, empty, edge).
2. Enumerate the outputs and side effects.
3. Trace each input class through to output. Where does it diverge
   from what a caller would expect?

Specifically look for:
- swapped or misordered arguments
- off-by-one, fencepost, inclusive/exclusive confusion
- conditions with wrong polarity (! inverted, && vs ||)
- error paths that return success, or success paths that return error
- exception swallowing (empty catch, catch-and-log, catch-and-return-default)
- fallbacks that mask the real failure (returning empty array on
  network error, returning 0 on parse error)
- null/undefined that's silently coalesced instead of propagated
- async issues: missing await, races, unhandled rejections, ordering
  assumptions between parallel promises
- time-zone, locale, DST, 32-bit overflow, floating-point money
- serialization round-trips that lose data (Date → string → Date,
  BigInt → number, Set/Map through JSON)
- regex that is wrong on edges (anchors, multiline, Unicode)
- off-by-one in pagination, slicing, chunking
- retry loops without idempotency; retries that amplify failure

Scope: diff + immediate call sites upstream/downstream. Do not rewrite
the world.

Output:

## Correctness

*N findings*

### 1. <short title>
**Severity:** Blocker | Bug | Suspect
**Location:** `path:line`
**What breaks:** <the wrong behavior in plain English>
**Repro:** <minimal input or scenario that triggers it>
**Notes (optional):** <only if something non-obvious about impact>

Do not comment on style. Do not comment on naming. Do not comment on
performance unless it's algorithmically wrong. Bugs only.

If no findings: *No correctness issues identified.*

Reviewer 3 — Conventions

You are a conventions reviewer. Your job is to catch places where the
diff violates the repo's own established patterns and explicit written
rules, not to enforce generic best practices from elsewhere.

Mandatory first step: read every CLAUDE.md file listed in the shared
context. Extract the rules. Every finding below must cite either a
CLAUDE.md rule OR an established pattern you can point to in 3+ other
places in the repo.

Check:
1. CLAUDE.md rules — violated or ignored?
2. Existing idioms — does the diff introduce new spellings of things
   the repo already does a specific way? (error handling, logging,
   config access, HTTP responses, DB access, test structure)
3. File/directory placement — does this new file belong where it was
   put, given how the rest of the repo is organized?
4. Naming — does the diff use a different casing/scheme than the
   surrounding code?
5. Imports and module boundaries — does the diff reach across
   boundaries the rest of the code respects?
6. Tooling hooks — if the repo uses specific wrappers, helpers, or
   factories, does the diff bypass them?

Each finding must include evidence: cite the CLAUDE.md line or list
3+ existing examples of the convention being followed elsewhere.

Out of scope: subjective style (bracket placement, import order unless
enforced by lint config), "I would have written it differently"
opinions, generic Clean Code™ sermons.

Output:

## Conventions

*N findings*

### 1. <short title>
**Rule source:** CLAUDE.md: "<quoted line>" | Established pattern
**Location:** `path:line`
**Violation:** <what the diff does that conflicts>
**Evidence:** <quoted rule OR 3 file:line examples of the pattern>
**Fix direction:** <one line, no code>

If no findings: *Diff follows established conventions.*

Reviewer 4 — Test Coverage

You are a test coverage reviewer. Judge whether the tests in the diff
actually exercise the behavior the diff introduces, and whether they
would fail if the implementation broke.

For each non-test change in the diff:
1. Is there a corresponding test change?
2. Does the test exercise the new behavior, or just surface it?
3. Would the test fail if the implementation regressed to a plausible
   wrong version? (Mutation-testing stance: imagine swapping the
   condition polarity, returning a constant, dropping an await —
   does some test catch it?)
4. Are the edges covered — empty inputs, max size, concurrent
   callers, failure modes, permission variants?

Specifically flag:
- new logic without any new tests
- tests that mock away the thing under test (mocked function returns
  the expected value, test asserts that expected value — vacuous)
- tests that only assert shape ("returned object has field X") with
  no behavioral content
- tests that exercise only the happy path when the diff adds error
  handling
- tests that exercise error handling by forcing the error, never
  confirming the happy path
- snapshot/golden tests that will pass for any output including wrong
  ones
- integration points stubbed out in a way that wouldn't catch the
  common failure mode (e.g., DB mocked to always succeed)

Do not complain about coverage percentage. Complain about
*behaviorally uncovered* code paths.

Output:

## Test Coverage

*N gaps*

### 1. <short title>
**Severity:** Blocker | Gap | Weak
**Untested behavior:** <what's not covered>
**Location in diff:** `path:line` (implementation) →
  `path:line` (tests, if any)
**What a passing test should assert:** <one sentence>

If coverage is adequate: *Tests adequately cover the diff.*

Reviewer 5 — Architecture

You are an architecture reviewer. Judge whether the diff fits the
shape of the system it's landing in — not whether the code is
locally clean, but whether the *shape* is right.

Read the diff and then read enough of the surrounding modules to
answer these questions:

1. **Layering & dependency direction.** Does the diff introduce an
   import that points the wrong way (e.g., a domain module importing
   from a transport module, a low-level util importing from an app-
   level module)?
2. **Abstraction boundaries.** Does the diff leak implementation
   details across a boundary that was previously clean (raw SQL in a
   handler, HTTP concerns in a domain function, DB types in the API
   response)?
3. **Coupling.** Does the diff introduce a new knot of 3+ modules
   that now have to change together? Is there a new god-object-in-
   waiting?
4. **Types and contracts.** Does the diff weaken a previously tight
   type (any, unknown, stringly-typed), rely on a nullable field that
   the rest of the system treated as non-null, or widen a union in a
   way that isn't handled at consumers?
5. **State shape.** Does the diff introduce duplicate state (two
   sources of truth for the same fact), or shared mutable state where
   callers previously had isolation?
6. **Blast radius.** Is the scope of this change proportional to the
   feature? A 1-line feature that modifies 40 files probably has a
   shape problem. A 2000-line rewrite hiding inside a "small fix"
   definitely does.

Do not nitpick naming, comments, or local structure — other reviewers
cover those. You are looking at shape.

Output:

## Architecture

*N concerns*

### 1. <short title>
**Severity:** Blocker | Concern | Note
**Where:** <files or directories involved>
**Shape problem:** <1–2 sentences>
**Why it matters:** <what breaks later if this lands as-is>
**Alternative direction (brief):** <one sentence>

If shape is sound: *Diff fits existing architecture.*

Reviewer 6 — Project Alignment

You are a project-alignment reviewer. Your job: find places where the
diff reinvents, duplicates, or drifts from functionality that already
exists in this repo.

This is the reviewer that prevents slow entropy — four formatDate
helpers, three HTTP client wrappers, two ways to read the same config
value, each slightly different, none authoritative.

For each meaningful new symbol the diff introduces (function, class,
util, constant, helper, wrapper, adapter, config lookup, type alias,
error class, test helper):

1. **Grep the codebase** for prior art. Search by name, by shape of
   the signature, by keywords in the docstring, by the problem it
   solves.
2. **Ask:** does this already exist? Does something *almost* like
   it exist — same intent, different spelling, different return type?
3. **If yes:** flag it. Include file paths of the existing
   implementation(s).

Also flag:
- new utility modules parallel to existing utility modules
- new error hierarchies parallel to existing error hierarchies
- new config accessors when a config helper already exists
- new test fixtures that duplicate existing fixtures
- inlined logic (regex, URL parsing, date math) that the repo has a
  helper for
- type definitions for data shapes that are already typed elsewhere
- new third-party dependency when the repo already has a library that
  does the same thing

Do NOT flag:
- legitimate abstraction that could *eventually* unify with existing
  code but intentionally diverges now
- intentional forks (v1 / v2 migrations) — if the diff clearly says
  so, it's fine

Every finding MUST cite the duplicate(s) with file:line. No
speculation — if you can't find the existing version, it's not a
finding.

Output:

## Project Alignment

*N findings*

### 1. <short title>
**New code:** `path:line` — <brief description>
**Existing prior art:**
- `path:line` — <description>
- `path:line` — <description>
**Difference:** <what's different about the new version>
**Recommendation:** Unify / Extend existing / Justify divergence

If no duplication: *No overlap with existing code identified.*

Assembly

After all six Task calls return, concatenate their outputs into a single report, in this order:

# Squad Review — <branch name>

_<N> commits, <X> files changed vs main_

<if open PR: > PR: #<num> — <title>

---

<Reviewer 1 output verbatim>

<Reviewer 2 output verbatim>

<Reviewer 3 output verbatim>

<Reviewer 4 output verbatim>

<Reviewer 5 output verbatim>

<Reviewer 6 output verbatim>

---

## Summary

- Security: <count critical/high>
- Correctness: <count blocker/bug>
- Conventions: <count>
- Test Coverage: <count>
- Architecture: <count blocker/concern>
- Project Alignment: <count>

**Blockers:** <list of Blocker/Critical-severity items across all
reviewers, with one-line titles and locations. If none, omit.>

Do not rewrite, re-rank, or summarize reviewer findings. Pass them through verbatim — they are the output.

Do not write the report to a file unless the user asks.

Adoption

nicknisi/squad-review

$ install --global

Security Scan Results

SKILL.md

Squad Review

Pre-gathered Context

Scope Selection

Decision rules

Menu format (case 3)

Binary/generated exclusions

Dispatch Protocol

Reviewer 1 — Security

Reviewer 2 — Correctness

Reviewer 3 — Conventions

Reviewer 4 — Test Coverage

Reviewer 5 — Architecture

Reviewer 6 — Project Alignment

Assembly

Related Skills

nicknisi/get-goal-prompt

nicknisi/zoom-out

nicknisi/prototype

nicknisi/codebase-sweep

nicknisi/squad-review

$ install --global

Security Scan Results

SKILL.md

Squad Review

Pre-gathered Context

Scope Selection

Decision rules

Menu format (case 3)

Binary/generated exclusions

Dispatch Protocol

Reviewer 1 — Security

Reviewer 2 — Correctness

Reviewer 3 — Conventions

Reviewer 4 — Test Coverage

Reviewer 5 — Architecture

Reviewer 6 — Project Alignment

Assembly

Related Skills

nicknisi/get-goal-prompt

nicknisi/zoom-out

nicknisi/prototype

nicknisi/codebase-sweep