skills/architecture-peer-review/SKILL.md
Use this skill to perform a Solution Architect peer review of ARCHITECTURE.md documents. Generates an interactive HTML playground for reviewing and triaging findings with approve/reject/comment workflow. Invoke when the user asks to review, critique, peer-review, or assess architecture documentation quality, asks for architecture feedback or a second opinion, or wants scalability/security/performance analysis of their architecture.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add shadowx4fox/solutions-architect-skills architecture-peer-review
This skill acts as a Solution Architect peer reviewer for ARCHITECTURE.md documents. It evaluates design decisions, questions trade-offs, identifies risks, and assesses real-world viability — the kind of review a senior architect would give in a design review meeting.
The output is an interactive HTML playground (via the playground plugin) where the user triages findings, approves or rejects each one, adds comments, and copies a fix prompt back to Claude.
Distinct from the architecture-docs skill's REVIEW_AUDIT_WORKFLOW: that workflow validates form compliance (structure, template adherence). This skill provides opinionated architectural judgment on quality, soundness, and production-readiness.
/skill architecture-peer-review

Do NOT invoke for:
- architecture-docs skill (REVIEW_AUDIT_WORKFLOW)
- architecture-compliance skill
- architecture-docs skill
- architecture-readiness skill
- architecture-component-guardian skill

| File | Purpose |
|------|---------|
| SKILL.md | This file — entry point and workflow |
| PEER_REVIEW_CRITERIA.md | All review checks organized by depth level and category (82 checks across 13 categories) |
| PLAYGROUND_TEMPLATE.md | Custom playground template for the interactive review HTML file |
| agents/reviewers/peer-review-category-agent.md | Universal category reviewer sub-agent — evaluates one category's checks in parallel |
Before starting a full review, check if the user explicitly requested regeneration from existing data (e.g., "regenerate playground", "rebuild review playground", "reload peer review results").
If the user requested regeneration:
1. Look for architecture-peer-review-*.json in the project root and use the most recent file.
2. Check that the file contains reviewData.findings (array), scorecard (object with overall, rating, categories), and depthLevel (string).
3. If the file has metadata.sourceFiles, compare each file's lastModified timestamp against the current file modification time. If any source file has been modified since the review, warn:
   ⚠️ Source files changed since this review was saved:
   - docs/07-security-architecture.md (modified 2026-04-05, review from 2026-04-03)
   Results may be stale. Proceed with regeneration or run a fresh review?
4. Skip agent evaluation and regenerate the playground (Step 7) from the loaded reviewData.
5. Report: ♻️ Regenerated playground from saved results (skipped agent evaluation)

If no JSON files found: inform the user and proceed with the full review workflow (Step 1).
If the user did NOT request regeneration: proceed normally to Step 1.
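The staleness check in the fast path can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names are hypothetical, timestamps are assumed to be naive local-time ISO strings as in the saved JSON example, and a source file that no longer exists is assumed to count as changed.

```python
from datetime import datetime
from pathlib import Path


def filesystem_mtime(path):
    """Current modification time of `path` (whole seconds), or None if it is missing."""
    p = Path(path)
    if not p.exists():
        return None
    return datetime.fromtimestamp(p.stat().st_mtime).replace(microsecond=0)


def find_stale_sources(source_files, current_mtime=filesystem_mtime):
    """Return the paths from metadata.sourceFiles that changed after the review was saved.

    `source_files` is a list of {"path": ..., "lastModified": ...} dicts.
    `current_mtime` is injectable so the check can be exercised without real files.
    """
    stale = []
    for entry in source_files:
        saved = datetime.fromisoformat(entry["lastModified"])
        current = current_mtime(entry["path"])
        # Assumption: a file that no longer exists is reported as changed.
        if current is None or current > saved:
            stale.append(entry["path"])
    return stale
```

If the returned list is non-empty, the orchestrator would show the warning and let the user choose between regenerating and running a fresh review.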
Search for ARCHITECTURE.md at the project root.
Multi-file structure detected if ARCHITECTURE.md exists as a navigation index and a docs/ directory contains numbered section files. In this case, the full architecture spans:
- ARCHITECTURE.md (navigation index)
- docs/NN-section-name.md files (one per section)
- docs/components/NN-component-name.md files (or docs/components/<system>/NN-*.md for multi-system architectures)
- adr/*.md files (for Hard depth)

Monolithic structure: A single ARCHITECTURE.md file contains all 13 sections.
If no ARCHITECTURE.md is found, abort with: "No ARCHITECTURE.md found. Use /skill architecture-docs to create one first."
Present the three options. Do not default or assume. If the user says "review my architecture" without specifying depth, ask.
Choose review depth:
Light (~40 sec) Structural check: are all required sections present? Do naming conventions follow standards? Are required fields populated?
Medium (~90 sec) Everything in Light plus content quality: are sections internally consistent? Do technology choices make sense together? Are integration patterns sound? Are metrics realistic?
Hard (~2-3 min) Everything in Medium plus deep architectural analysis: scalability design, security posture, performance patterns, operational readiness, ADR quality, trade-off honesty, TOGAF/BIAN alignment.
Read PEER_REVIEW_CRITERIA.md. Parse it into per-category blocks — one block per category, containing that category's full markdown criteria table (header row + all check rows).
Active categories for each depth are defined by the Depth Level column in the Scoring Weights table in PEER_REVIEW_CRITERIA.md.
Store for each active category:
- code: e.g., SECURITY
- name: e.g., Security Posture
- weight: as defined in the Scoring Weights table
- checks_table: the full markdown table text for that category (to be passed to the sub-agent)

Determine the ordered list of files to review (doc_files). This list is passed to each category agent; agents read the files themselves.
File discovery order (same for monolithic and multi-file):
1. ARCHITECTURE.md (always)
2. docs/NN-*.md files in numeric order (if docs/ exists)
3. docs/components/NN-*.md and docs/components/*/NN-*.md files in numeric order (for Medium and Hard depth)
4. adr/*.md files in alphabetic order (for Hard depth only)

Store the result as doc_files, an ordered list of absolute file paths.
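The discovery order above might be sketched like this. It is a hedged illustration that works on a list of project-relative paths instead of touching the filesystem. The function name, the `/project` root, and the choice to list top-level component files before per-system subdirectories are assumptions, not something the skill specifies.

```python
import re
from pathlib import PurePosixPath


def _number(name):
    """Leading NN prefix of a file name such as '07-security.md', or None."""
    match = re.match(r"(\d+)-", name)
    return int(match.group(1)) if match else None


def order_doc_files(relative_paths, depth, root="/project"):
    """Order project-relative paths per the discovery rules and make them absolute."""
    sections, components, adrs = [], [], []
    for raw in relative_paths:
        path = PurePosixPath(raw)
        parts = path.parts
        if path.suffix != ".md":
            continue
        if len(parts) == 2 and parts[0] == "docs" and _number(path.name) is not None:
            sections.append(path)
        elif (parts[:2] == ("docs", "components") and len(parts) in (3, 4)
              and _number(path.name) is not None):
            components.append(path)
        elif len(parts) == 2 and parts[0] == "adr":
            adrs.append(path)

    ordered = [PurePosixPath("ARCHITECTURE.md")]  # always first
    ordered += sorted(sections, key=lambda p: (_number(p.name), p.name))
    if depth in ("medium", "hard"):
        # Assumption: group by directory first, then numeric order inside it.
        ordered += sorted(components, key=lambda p: (str(p.parent), _number(p.name), p.name))
    if depth == "hard":
        ordered += sorted(adrs, key=lambda p: p.name)
    return [str(PurePosixPath(root) / p) for p in ordered]
```

Sorting on the parsed number, not the raw string, keeps the order correct even if a prefix is not zero-padded.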
Spawn one peer-review-category-agent per active category. Issue Task() calls in batches of 2 per message (strict parallel barrier).
Batching rule: dispatch exactly 2 Task() calls per message. After sending a batch, wait for BOTH CATEGORY_REVIEW_RESULT: blocks to return before sending the next batch. Do not start batch N+1 until every Task() in batch N has returned. If any category agent in a batch fails, record the failure and continue with the next batch (do not retry inline; failures are collected and reported at the end). This caps peak parallelism at 2 and gives the orchestrator a chance to observe early failures before dispatching the remaining batches. For Light (3 categories) this is 2 batches (2+1); Medium (7) is 4 batches (2+2+2+1); Hard (13) is 7 batches (2×6 + 1).
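The batching rule can be modelled in a few lines. The sketch below uses asyncio as a stand-in for Task() dispatch, which is an assumption made purely for illustration: `review_category` is a hypothetical coroutine representing one category agent, and the real orchestrator issues Task() calls instead.

```python
import asyncio


def make_batches(categories, size=2):
    """Split the ordered category list into batches of at most `size`."""
    return [categories[i:i + size] for i in range(0, len(categories), size)]


async def run_in_batches(categories, review_category, size=2):
    """Run one batch at a time; collect failures instead of retrying inline."""
    results, failures = [], []
    for batch in make_batches(categories, size):
        # gather() acts as the barrier: the next batch starts only after
        # every call in this batch has returned or raised.
        outcomes = await asyncio.gather(
            *(review_category(code) for code in batch), return_exceptions=True
        )
        for code, outcome in zip(batch, outcomes):
            if isinstance(outcome, Exception):
                failures.append((code, str(outcome)))
            else:
                results.append(outcome)
    return results, failures
```

With this split, 3 categories give batch sizes 2+1, 7 give 2+2+2+1, and 13 give six pairs plus one, matching the counts stated in the rule.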
All agents use: sa-skills:peer-review-category-agent
Agent prompt template:
Review category.
category_code: [CODE]
category_name: [Name]
weight: [0.XX]
depth_level: [light|medium|hard]
CHECKS:
[paste the full markdown criteria table for this category]
FILES:
[absolute path 1]
[absolute path 2]
...
Pass all file paths from doc_files in every agent prompt. Agents read only what they need.
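Filling in the template above is plain string substitution. A minimal sketch, assuming each category is held as a dict with the four fields stored in Step 3; the function name and the weight value used in the usage note are hypothetical.

```python
VALID_DEPTHS = ("light", "medium", "hard")


def build_agent_prompt(category, depth_level, doc_files):
    """Render the category agent prompt from the stored category fields."""
    if depth_level not in VALID_DEPTHS:
        raise ValueError(f"unknown depth level: {depth_level!r}")
    lines = [
        "Review category.",
        f"category_code: {category['code']}",
        f"category_name: {category['name']}",
        f"weight: {category['weight']:.2f}",
        f"depth_level: {depth_level}",
        "CHECKS:",
        category["checks_table"],  # full markdown criteria table, unmodified
        "FILES:",
        *doc_files,  # every path from doc_files, one per line
    ]
    return "\n".join(lines)
```

For example, calling it with `{"code": "SECURITY", "name": "Security Posture", "weight": 0.15, "checks_table": "..."}` and `"hard"` yields a prompt whose second line is `category_code: SECURITY`.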
For each active category (from Step 3), issue one Task() call substituting that category's code, name, weight, and the chosen depth_level. Set description to "CODE — Name". Group categories into pairs in the order they appear in the Scoring Weights table and dispatch each pair in its own message; if there's an odd category left over, it goes in a final 1-agent batch.
All category codes, names, weights, and depth assignments are in the Scoring Weights table in PEER_REVIEW_CRITERIA.md.
[BARRIER — wait for the current batch to complete before dispatching the next batch, and wait for the final batch before proceeding to Step 5.2]
After all agents return:
Collect the CATEGORY_REVIEW_RESULT: JSON block from each agent response.
Check for failures. If any agent failed to return a valid result, record the failure so it can be reported at the end, as the batching rule requires.
Merge all findings arrays from all CATEGORY_REVIEW_RESULT blocks into a single flat findings array.
Sort the merged array by category depth level (LIGHT categories first, then MEDIUM, then HARD), then by severity within each category (critical → major → minor → suggestion). Then renumber IDs sequentially in that order (1, 2, 3...).
Inject category fields into each finding from its parent CATEGORY_REVIEW_RESULT envelope: set category, categoryName, and depthLevel. Agents omit these from individual findings to reduce output size.
Report progress:
✅ Review complete — N categories, M total findings (X critical, Y major, Z minor, W suggestions)
Generating scorecard and playground...
The findings array is now ready for Step 6 (scorecard) and Step 7 (playground).
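The merge, sort, and renumber steps above could look roughly like this. The envelope and finding field names (`category`, `categoryName`, `depthLevel`, `severity`, `id`) are assumptions based on the description, since the exact CATEGORY_REVIEW_RESULT schema lives in the agent definition.

```python
DEPTH_ORDER = {"light": 0, "medium": 1, "hard": 2}
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2, "suggestion": 3}


def merge_findings(category_results):
    """Flatten per-category results into one sorted, renumbered findings list."""
    merged = []
    for position, result in enumerate(category_results):
        for finding in result["findings"]:
            merged.append({
                **finding,
                # Inject the fields that agents omit from individual findings.
                "category": result["category"],
                "categoryName": result["categoryName"],
                "depthLevel": result["depthLevel"],
                "_position": position,  # temporary: keeps each category contiguous
            })
    merged.sort(key=lambda f: (
        DEPTH_ORDER[f["depthLevel"].lower()],
        f["_position"],
        SEVERITY_ORDER[f["severity"]],
    ))
    for new_id, finding in enumerate(merged, start=1):
        finding["id"] = new_id
        del finding["_position"]
    return merged
```

Sorting before renumbering means the IDs shown in the playground follow the display order.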
Each CATEGORY_REVIEW_RESULT block already contains score and weight computed by the agent. Assemble the scorecard directly from those values — no recalculation needed.
Per-category scores: read score and weight directly from each agent's result block.
Renormalization for partial depths: weights must sum to 1.0 across active categories, so divide each active category's weight by the sum of the active weights.
Overall score = sum of (category score × renormalized weight) across all active categories.
Apply the Scorecard Rating Bands from PEER_REVIEW_CRITERIA.md to assign the rating label.
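A small worked sketch of the renormalization and the overall score, under stated assumptions: the field name `renormalizedWeight` and the rounding to one decimal place are illustrative choices, and the rating bands are left out because they are defined in PEER_REVIEW_CRITERIA.md.

```python
def overall_score(categories):
    """Renormalize active weights to sum to 1.0 and compute the weighted score.

    `categories` is a list of {"code", "score", "weight"} dicts taken from
    the agents' result blocks (scores are used as reported, not recalculated).
    """
    total_weight = sum(c["weight"] for c in categories)
    if total_weight <= 0:
        raise ValueError("active category weights must sum to a positive value")
    scored = [
        {**c, "renormalizedWeight": c["weight"] / total_weight}
        for c in categories
    ]
    overall = sum(c["score"] * c["renormalizedWeight"] for c in scored)
    # Assumption: one decimal place, matching the 7.2 style in the saved JSON.
    return round(overall, 1), scored
```

For instance, three active categories with weights 0.1, 0.1, and 0.2 renormalize to 0.25, 0.25, and 0.5, so scores of 8, 6, and 7 give 2.0 + 1.5 + 3.5 = 7.0.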
Write the complete reviewData object to a JSON file for future fast-path regeneration (Step 0).
Assemble the full reviewData object:
{
  "depthLevel": "hard",
  "scorecard": { "overall": 7.2, "rating": "...", "categories": [...] },
  "findings": [...],
  "metadata": {
    "reviewDate": "YYYY-MM-DD",
    "sourceFiles": [
      { "path": "docs/07-security-architecture.md", "lastModified": "YYYY-MM-DDTHH:mm:ss" }
    ]
  }
}
- metadata.reviewDate: current ISO date
- metadata.sourceFiles: array of { path, lastModified } for each file in doc_files (enables staleness detection on fast-path load in Step 0)

Write to architecture-peer-review-<YYYY-MM-DD>.json in the project root (pretty-printed).
Report: 💾 Review data saved to architecture-peer-review-<date>.json
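Assembling and persisting reviewData might be sketched as below. The helper names are hypothetical, and the two-space indent is only one reasonable reading of "pretty-printed".

```python
import json
from pathlib import Path


def build_review_data(depth_level, scorecard, findings, source_files, review_date):
    """Assemble the reviewData object in the shape shown above."""
    return {
        "depthLevel": depth_level,
        "scorecard": scorecard,
        "findings": findings,
        "metadata": {"reviewDate": review_date, "sourceFiles": source_files},
    }


def review_json_name(review_date):
    """File name convention: architecture-peer-review-<YYYY-MM-DD>.json."""
    return f"architecture-peer-review-{review_date}.json"


def save_review_data(review_data, project_root="."):
    """Write reviewData to the project root and return the path written."""
    target = Path(project_root) / review_json_name(review_data["metadata"]["reviewDate"])
    # ensure_ascii=False keeps any non-ASCII text in findings readable on disk.
    target.write_text(
        json.dumps(review_data, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return target
```

Because the date is part of the file name, a second review on the same day overwrites the first, while reviews from different days accumulate and the fast path picks the most recent one.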
Invoke the playground skill using PLAYGROUND_TEMPLATE.md as the template.
The reviewData object (containing findings, scorecard, and depthLevel) comes from either the fast path in Step 0 (loaded from a saved JSON file) or a fresh review (Steps 5 and 6).
Embed in the generated HTML file:
- findings: the findings array as a JSON literal
- scorecard: the calculated scorecard (overall score, rating, per-category scores)
- depthLevel: the chosen depth level

Follow all core playground requirements:
- Open the generated file with open <filename>.html

Filename convention: architecture-peer-review-<YYYY-MM-DD>.html
Fallback — If the playground plugin is not installed, output the findings as a structured plain-text report instead:
# Architecture Peer Review Report
Date: <date>
Depth: <level>
Overall Score: <score>/10 — <rating>
## Scorecard
...
## Findings (<N> total)
...
After the playground opens (or the fallback report prints), always append the following user-visible context-reclaim hint (verbatim, as the final lines of the skill's output):
💡 Tip: findings are saved to `architecture-peer-review-<date>.json` and the playground HTML. To reclaim context from the category-agent responses before your next task, run:
/compact
The user reviews findings in the browser playground, approves/rejects each one, and adds comments where needed. When done, they copy the generated fix prompt.
If the user pastes the generated fix prompt back into Claude, apply the approved findings using the architecture-docs skill's context-efficient editing workflow.
| Skill | Relationship |
|-------|-------------|
| architecture-docs | Prerequisite: ARCHITECTURE.md must exist. Architecture docs skill's REVIEW_AUDIT_WORKFLOW validates form before this skill validates quality. Use architecture-docs editing workflow to apply fixes. |
| architecture-definition-record | For ADR fixes found during Hard review (ADR quality checks), delegate any ADR creation, update, or supersede to this skill. Read-only access to adr/*.md is permitted directly. |
| playground | External plugin dependency: generates the interactive HTML review file. |
| architecture-compliance | Peer review findings (especially SECURITY and SCALE categories) can inform compliance gap analysis. Hard-depth findings map directly to compliance contract requirements. |
| architecture-readiness | If peer review reveals missing business context (vague requirements, unexplained constraints), suggest running architecture-readiness to produce a PO Spec. |
/skill architecture-peer-review
→ Skill activates, asks user to choose depth
"Peer review my ARCHITECTURE.md at hard depth"
→ Skips depth prompt, proceeds directly to Hard review
"Do a light review of my architecture"
→ Runs Light depth checks only
"Architecture quality assessment"
→ Skill activates, asks user to choose depth
"Review my architecture — just the security parts"
→ Activates skill, suggests Hard depth, and notes that SECURITY category findings will be the most relevant
"Regenerate the peer review playground"
→ Fast path: reads saved JSON, regenerates HTML, skips agent evaluation
"Rebuild the review from the last results"
→ Fast path: reads most recent architecture-peer-review-*.json
A successful peer review produces:
- Findings whose lineRef points to actual content in the document
- A specific recommendation for each finding (not vague advice)