skills/parallel-review-implementation/SKILL.md
Per-package implementation review producing structured findings per review-findings.schema.json
npx skillsauth add jankneumann/agentic-coding-tools parallel-review-implementationInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Receive a work package diff as read-only input and produce structured findings conforming to review-findings.schema.json. Designed for vendor-diverse dispatch — runs independently per package.
$ARGUMENTS - <change-id> <package-id> (e.g., "add-user-auth wp-backend")
Optional flags:
--adversarial — Use adversarial review mode: challenges design decisions instead of standard reviewImplementation review uses the provider-neutral dispatch adapter/configuration path as the canonical cross-provider mechanism. Claude Code, Codex, and Gemini/Jules are first-class reviewers when configured; provider-specific CLI or harness details stay inside their adapters.
The reviewer receives per-package context:
work-packages.yaml (scope, locks, verification)contracts/ relevant to this packagegit diff <base>...<head>)tasks.mdThe reviewer MUST NOT modify any files.
Every finding produced by this skill MUST be classified into BOTH dimensions below. The JSON Schema at openspec/schemas/review-findings.schema.json enforces both fields as required — output that omits either is rejected by the validator in Step 7.
axis field)Adopted from the code-review-and-quality reference skill. Pick exactly one per finding:
| Axis | What it covers in implementation review |
|---|---|
| correctness | Does the code do what the spec/contract demands? Bugs, off-by-one, broken edge cases, wrong return values. |
| readability | Will the next maintainer understand intent? Naming, comments, dead code, unclear control flow. |
| architecture | Does the change respect module boundaries? Coupling, layering, dependency direction, premature abstraction, scope creep. |
| security | Does the code introduce or fail to prevent risk? Injection, missing auth/authz, insecure defaults, leaked secrets, OWASP categories. |
| performance | Will it scale? N+1 queries, unbounded loops, missing pagination, sync calls in hot paths, memory blowups. |
The legacy type enum (spec_gap, contract_mismatch, etc. — see Step 6) is preserved for backward compatibility; axis is the new mandatory categorization that all reviewers — human or vendor — must agree on.
severity field)Every finding's description MUST begin with one of these markers. The severity enum value MUST match the prefix.
| Prefix | Severity value | Meaning |
|---|---|---|
| Critical | critical | Blocks merge. Must be fixed before integration. |
| Nit | nit | Should fix but does not block. Quality, naming, minor structure. |
| Optional | optional | Consider it. Author may accept or reject without further discussion. |
| FYI | fyi | Informational. Surfaces context the author may not have known; no action required. |
| none | none | Positive observation. Names what the implementation got right so good patterns survive review. |
Example finding (note prefix and matching severity):
{
"id": 1,
"axis": "security",
"severity": "critical",
"type": "security",
"criticality": "critical",
"description": "Critical: handlers/users.py:42 builds the SQL query with f-string concatenation of `request.user_id` — SQL injection vector.",
"resolution": "Use a parameterized query (`?` placeholder + bound value) via the existing `db.execute(query, params)` helper.",
"disposition": "fix",
"package_id": "wp-backend",
"file_path": "handlers/users.py",
"line_range": {"start": 42, "end": 44}
}
Reviewers MUST NOT collapse multiple severities onto one finding (split them). Reviewers MUST NOT use a severity that contradicts the disposition (e.g., severity: critical with disposition: accept is incoherent — escalate instead).
Before loading review context, determine the review mode:
if [ -f "openspec/changes/$CHANGE_ID/work-packages.yaml" ]; then
# Verify the file is valid YAML
python3 -c "import yaml; yaml.safe_load(open('openspec/changes/$CHANGE_ID/work-packages.yaml'))" 2>/dev/null
if [ $? -eq 0 ]; then
REVIEW_MODE="per-package"
else
REVIEW_MODE="whole-branch"
echo "WARNING: work-packages.yaml exists but cannot be parsed. Falling back to whole-branch review."
fi
else
REVIEW_MODE="whole-branch"
fi
Whole-branch mode: When work-packages.yaml is missing or malformed, treat the entire branch diff as a single review unit. Use package_id: "whole-branch" in findings output. Skip Steps 2 (Scope Verification) and conditionally skip Step 3 (Contract Compliance) based on contract availability.
Per-package mode: When work-packages.yaml exists and is valid, use existing per-package review logic (Steps 1-6 unchanged).
If REVIEW_MODE is "whole-branch", skip to Step 1-WB below. Otherwise, continue with Step 1 as normal.
When in whole-branch mode:
git diff <base>...<head>contracts/ exists AND contains files other than README.md: load contract artifacts for compliance reviewcontracts/ is missing or contains only README.md: skip contract compliance (Step 3)specs/**/spec.mdPACKAGE_ID="whole-branch" for all findings outputAfter loading context, skip to Step 3 (if contracts exist) or Step 4 (if no contracts).
Parse the package-id argument and load:
work-packages.yaml and extract the target package definitionspecs/**/spec.mdSkip this step in whole-branch mode (no package scopes to verify).
Before reviewing code quality, verify scope compliance:
write_allow globsdeny globsIf scope violations are found, emit a correctness finding with critical criticality.
In whole-branch mode: Skip this step if contracts/ is missing or contains only README.md. If machine-readable contract artifacts exist, perform compliance review against the full branch diff instead of per-package diff.
Check that the implementation matches declared contracts:
Standard code review criteria:
If work-queue result is available:
verification.passed is consistent with step resultsIf --adversarial flag was passed, the review prompt should be wrapped with adversarial framing:
from adversarial_prompt import wrap_adversarial
prompt = wrap_adversarial(prompt) # Prepends contrarian persona instructions
This changes the review persona to challenge design decisions rather than just checking correctness. The dispatch mode remains review (unchanged) and findings use the standard schema.
Generate findings as JSON conforming to review-findings.schema.json:
{
"review_type": "implementation",
"target": "<package-id>",
"reviewer_vendor": "<model-name>",
"findings": [
{
"id": 1,
"type": "contract_mismatch",
"criticality": "high",
"description": "POST /v1/users returns 200 but OpenAPI spec declares 201",
"resolution": "Change response status code to 201 Created",
"disposition": "fix",
"package_id": "wp-backend"
}
]
}
spec_gap — Implementation misses a spec requirementcontract_mismatch — Code doesn't match contract (OpenAPI, DB schema, events)architecture — Structural concern or pattern violationsecurity — Security vulnerabilityperformance — Performance concernstyle — Code style or convention issuecorrectness — Bug or logical errorobservability — Missing logging, metrics, or health endpointscompatibility — Breaking change to existing API or missing migration rollbackresilience — Missing retry, timeout, or idempotency handlingfix — Must fix before integration mergeregenerate — Contract needs updating (triggers escalation)accept — Minor issue, acceptable as-isescalate — Requires orchestrator decision (scope violation, contract revision)python3 -c "
import json, jsonschema
schema = json.load(open('openspec/schemas/review-findings.schema.json'))
findings = json.load(open('<findings-output-path>'))
jsonschema.validate(findings, schema)
print('Valid')
"
Write findings to artifacts/<package-id>/review-findings.json.
If any finding has disposition: "escalate" or disposition: "regenerate", the orchestrator will handle escalation (pause-lock, contract revision bump, etc.).
artifacts/<package-id>/review-findings.json conforming to review-findings.schema.jsonThe orchestrator dispatches this skill once per completed work package:
Integration Gate Logic (orchestrator-side, consensus-aware):
fix finding → return to package agent for remediationescalate finding → trigger escalation protocolLike parallel-review-plan, this skill is self-contained:
When this skill is dispatched to another vendor by the orchestrator, only the review steps run (produce findings). Multi-vendor dispatch is handled by the orchestrating agent in Phase C3 of /parallel-implement-feature.
Agent discovery resolution chain: The dispatcher resolves agents via the coordination MCP server configured in ~/.claude.json → mcpServers.coordination. It extracts the agent-coordinator/ directory from the MCP server args and runs get_dispatch_configs.py to load agents.yaml. If the coordinator is not configured, pass --agents-yaml <path> explicitly as fallback. Use --list-agents to verify available agents.
| Rationalization | Why it's wrong |
|---|---|
| "The diff is small — I'll skip the axis classification" | The schema rejects findings without axis/severity regardless of diff size. Cross-vendor consensus matches findings by axis + file_path + line_range; missing axis means your review is invisible to the integration gate. |
| "This is a correctness issue but it also has security implications — I'll merge them" | Split into two findings. The integration gate routes security findings differently (mandatory fix, never accept). Burying security under correctness lets the gate under-react. |
| "The verification result already says PASS — I don't need to look at the code" | The verification cross-check (Step 5) is necessary but not sufficient. PASS only means the package's own tests passed; cross-package interactions, security, and architectural fit are not covered. |
| "I'll mark everything Nit to be polite" | Politeness is not the goal; truth is. A Critical finding marked Nit causes the integration gate to merge a broken change. Use none for positive observations instead of downgrading real issues. |
review-findings.json for a non-empty diff that contains zero findings AND no severity: none positive observations — the reviewer almost certainly did not actually read the diff.severity: critical paired with disposition: accept — these contradict each other; the orchestrator's integration gate cannot resolve this safely.axis: security and disposition: accept — security findings must be fix or escalate, never silently accepted.Critical: / Nit: / Optional: / FYI:). The prefix is the human-readable signal; if it disagrees with the enum, the reviewer wrote JSON without re-reading the prose.file_path/line_range for code-level issues (correctness, security, performance) — these fields are what enables cross-vendor consensus matching; omitting them isolates the finding.write_allow should always produce a correctness + severity: critical finding (see Step 2).Valid — this proves axis and severity are present on every finding.description text begins with the prefix matching the severity enum value (e.g., severity: critical ↔ description starts with Critical:).disposition is coherent with severity AND axis: security findings never accept; critical findings always fix or escalate; none findings always accept.axis values appear across the findings array (a single-axis review missed the other four dimensions of the schema).write_allow, it must appear as a severity: critical finding.reviewer_vendor and package_id (or target: "whole-branch" in whole-branch mode) are populated — anonymous or untargeted findings cannot participate in consensus.testing
Create and edit Obsidian Flavored Markdown with wikilinks, embeds, callouts, properties, and other Obsidian-specific syntax. Use when working with .md files in Obsidian, or when the user mentions wikilinks, callouts, frontmatter, tags, embeds, or Obsidian notes.
tools
Interact with Obsidian vaults using the Obsidian CLI to read, create, search, and manage notes, tasks, properties, and more. Also supports plugin and theme development with commands to reload plugins, run JavaScript, capture errors, take screenshots, and inspect the DOM. Use when the user asks to interact with their Obsidian vault, manage notes, search vault content, perform vault operations from the command line, or develop and debug Obsidian plugins and themes.
data-ai
Create and edit Obsidian Bases (.base files) with views, filters, formulas, and summaries. Use when working with .base files, creating database-like views of notes, or when the user mentions Bases, table views, card views, filters, or formulas in Obsidian.
tools
Create and edit JSON Canvas files (.canvas) with nodes, edges, groups, and connections. Use when working with .canvas files, creating visual canvases, mind maps, flowcharts, or when the user mentions Canvas files in Obsidian.