skills/review-adr/SKILL.md
Review an Architecture Decision Record for completeness, correctness, and quality. Use when user types /review-adr or asks to review an ADR.
npx skillsauth add jamesc/skills review-adrInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
A three-pass architecture document review that validates an ADR's technical accuracy, completeness, and reasoning quality. Each pass uses a different lens.
Key Philosophy: ADRs are the institutional memory of the project. A weak ADR is worse than no ADR — it enshrines bad reasoning. Review rigorously, fix inline, and ensure the ADR would be useful to a developer encountering this decision in 6 months.
/review-adr # Review ADR files in current branch diff
/review-adr docs/ADR/0015-*.md # Review specific ADR file
Scale based on ADR scope:
| ADR Scope | Criteria | Passes | Rationale | |-----------|----------|--------|-----------| | Narrow | Single component, ≤1 alternative | Pass 1 only | Straightforward decision | | Medium | 2-3 components, multiple alternatives | Pass 1 + Pass 2 | Cross-component reasoning needs verification | | Broad | System-wide, new patterns, language design | All 3 passes | Full adversarial challenge needed |
Always run all 3 passes for:
Verify the ADR is complete, internally consistent, and factually accurate.
Check every section from docs/ADR/TEMPLATE.md is present and substantive:
| Section | Check | Common Failures | |---------|-------|-----------------| | Status | Has status + date | Missing date | | Context | Problem statement clear to an outsider | Assumes reader knows the current discussion | | Decision | Unambiguous — a developer knows what to implement | Vague or hand-wavy on specifics | | Prior Art | ≥3 reference languages compared | Only compares to Smalltalk, ignores BEAM ecosystem | | User Impact | All 4 personas addressed (newcomer, Smalltalk dev, BEAM dev, operator) | Missing personas or superficial treatment | | Steelman Analysis | Best argument FOR each rejected alternative, from each cohort | Strawman arguments, missing cohorts | | Alternatives | ≥2 alternatives with code examples and rejection rationale | Alternatives are obviously weak (setup to be rejected) | | Consequences | Honest negatives, not just benefits | No real negatives listed | | Implementation | Phases with effort estimates and affected components | Too vague ("update the runtime") | | Migration Path | Present if breaking change; absent with justification if not | Missing for a breaking change | | References | Links to issues, ADRs, docs, external resources | No external references |
CRITICAL: Every code example must be verified against the actual codebase. ADRs with wrong code examples are actively harmful.
For each code example in the ADR:
"Current state" examples — Verify the code actually exists and works as described:
# Search for the pattern in the codebase
grep -rn "<pattern from ADR>" <expected file>
"Proposed" examples — Verify they're syntactically valid and consistent:
docs/beamtalk-language-features.md?:load not @load)?Counts and claims — Verify quantitative claims:
# "20+ call sites" — actually count them
grep -rn "error(Error" runtime/apps/beamtalk_runtime/src/*.erl | wc -l
For issues found in Pass 1:
Evaluate the quality of the decision-making process. Is the reasoning sound? Are there gaps?
For each alternative considered:
For the steelman analysis:
For each user persona:
Lexer → Parser → AST → Semantic Analysis → Codegen → Runtime → REPL
If the decision affects codegen, does the implementation mention codegen files?For issues found in Pass 2:
Challenge the decision from outside the author's perspective. Use a different model family for fresh eyes.
Use the task tool with agent_type: "general-purpose" and a different model family:
You are a skeptical principal architect reviewing an ADR. Your job is to find
weaknesses in the reasoning that the author missed. Be adversarial but constructive.
Review this ADR: <paste full ADR content>
Focus on:
1. HIDDEN ASSUMPTIONS: What does this decision assume that might not hold?
- "This assumes errors are always #beamtalk_error{} — but what about raw Erlang exceptions?"
- "This assumes the class hierarchy is static — what about dynamic class loading?"
2. SECOND-ORDER EFFECTS: What consequences aren't in the Consequences section?
- "If errors are objects at signal time, what happens to crash log tooling?"
- "If we add 3 new stdlib classes, what's the compile time impact?"
3. ALTERNATIVES NOT CONSIDERED: What obvious approaches were missed?
- "Why not a protocol/behavior approach instead of class hierarchy?"
- "Why not defer the hierarchy and just fix the REPL display?"
4. FUTURE CONFLICTS: Will this decision conflict with planned features?
- Check against active epics and ADRs in the codebase
- "ADR 0006 (Unified Dispatch) assumes X — does this conflict?"
5. WEAKEST SECTION: Which section of the ADR is least convincing? Why?
- "The steelman for Alternative A is weak — a BEAM veteran would actually argue..."
- "The migration path ignores test fixtures that pattern-match on #beamtalk_error{}"
6. OVER-ENGINEERING: Is this decision more complex than needed?
- "Could you get 80% of the value with 20% of the complexity?"
- "Do you really need 4 phases, or could phases 1-2 be one?"
DO NOT comment on: writing style, markdown formatting, or section ordering.
ONLY flag issues that affect the quality of the decision or its implementation.
Check for conflicts or redundancy with existing decisions:
ls docs/ADR/*.md | grep -v README | grep -v TEMPLATE
For each related ADR:
Verify the ADR meets the project's DevEx checklist:
Can you demonstrate the feature in 1-2 lines of REPL code?
What does the error look like?
Is it discoverable?
:help, or reflection?For each finding from the adversarial review:
## ADR Review Summary: ADR NNNN — <Title>
### Pass 1: Structural & Factual
- [x] All template sections present and substantive
- [x] Code examples verified against codebase
- ⚠️ Fixed: [description of correction]
- [x] Quantitative claims verified
- ⚠️ Fixed: "20+ call sites" → actual count is 49
- [x] Internal consistency verified
### Pass 2: Reasoning & Completeness
- [x] Problem framing: [assessment]
- [x] Prior art: [sufficient/needs work]
- [x] Alternatives: [genuine/strawman concern]
- [x] Steelman quality: [assessment]
- [x] User impact: [complete/gaps]
- [x] Implementation feasibility: [realistic/concerns]
### Pass 3: Adversarial
- [x] Hidden assumptions: [findings]
- [x] Second-order effects: [findings]
- [x] Missing alternatives: [findings]
- [x] Conflict with existing ADRs: [none/found]
- [x] DevEx validation: [pass/fail]
### Fixes Applied
1. [description] — [what was changed and why]
2. [description] — [what was changed and why]
### Open Questions for Author
1. [question that needs domain expertise]
2. [design choice that could go either way]
### Assessment
- **Ready to accept:** Yes / Needs revision
- **Strengths:** [key positive aspects of the ADR]
- **Weaknesses:** [remaining concerns]
- **Recommendation:** [Accept / Revise section X / Needs discussion on Y]
✅ Strong ADRs have:
❌ Weak ADRs have:
| Anti-Pattern | Symptom | Fix | |-------------|---------|-----| | Rubber stamp | Only benefits listed, no real negatives | Add honest consequences section | | Foregone conclusion | Alternatives are obviously worse | Find or construct a genuinely compelling alternative | | Scope creep | ADR decides 3+ things at once | Split into focused ADRs, one decision each | | Missing the user | All technical, no user impact | Add REPL examples and error messages | | Phantom prior art | "Smalltalk does X" without verification | Verify with web search or source code | | Implementation masquerading as decision | ADR is really a design doc / implementation plan | Separate the "what" (ADR) from the "how" (issues) | | Anchoring bias | First option described in most detail, alternatives are thin | Give equal depth to all alternatives |
tools
Find the next logical piece of work. Use when user types /whats-next or asks what they should work on next, or wants recommendations for the next task.
development
Use when navigating code, finding references, looking up definitions, understanding types, or tracing call hierarchies in TypeScript, Rust, or Beamtalk (.bt) files. Prefer LSP over Grep/Glob for any navigation task where symbol semantics matter.
data-ai
Find and update Linear issues that need labels, blocking relationships, or metadata. Use when user says '/update-issues' with criteria like 'no labels', 'missing agent-ready', 'needs size', etc.
data-ai
Sync modified skills and agents back to the repo and create a PR. Use when user types /sync-skills or wants to save in-session skill improvements.