skills/spec-panel/SKILL.md
Use this skill whenever a specification, PRD, BRD, design doc, or RFC needs rigorous multi-expert review BEFORE implementation begins. ALWAYS trigger on: "spec panel", "expert review", "panel analysis", "spec analysis", "expert panel review", "review this spec", "audit the PRD", "spec quality check", "is this spec complete", "IEEE 830 audit", "spec smells", "devil's advocate on this spec", "review before we build". Implicit triggers: user pastes a PRD/BRD/spec and asks "what do you think", "any gaps", "is this ready to build", "should we implement this", "am I missing anything"; user wants a second opinion before committing engineering effort; user is deciding whether to proceed to spec-to-impl; user mentions specific concerns about requirements quality, ambiguity, or feasibility; user shows a spec with TBDs, "handle this somehow", or other vague language. Produces a structured findings report with IEEE 830 quality scoring (8 attributes), a spec-smells scan for red-flag language, a cross-cutting concerns checklist (security, performance, observability, compliance), and a multi-expert panel critique with a devil's advocate. Combines codebase investigation, internet research, and domain expertise. This is the gate BEFORE `spec-to-impl` — run this when the spec is drafted but not yet being implemented. Does NOT write code. Does NOT modify the spec in place — produces a separate analysis report that feeds `spec-update` (for spec rewrites) or `spec-to-impl` (for implementation).
`npx skillsauth add OmexIT/claude-skills-pack spec-panel`
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill is read-only — it reviews an existing spec/PRD and produces a findings report, never inline fixes or implementation code. It sits at a specific point in the superpowers workflow, immediately before implementation begins.
Before invoking this skill: nothing. Reviewers analyze existing work and don't need brainstorming or planning upfront.
Invoke this skill (spec-panel) to audit a specification document through an expert panel lens — IEEE 830 attributes, spec smells, cross-cutting concerns, and domain-expert critique with a devil's advocate. Produces findings with severity ratings and specific line references.
After findings are produced:
- /spec-update to apply the agreed changes to the spec document (preserves spec-panel output contract).
- /spec-to-impl only AFTER the spec has been updated and all CRITICAL findings are resolved. Do NOT proceed to implementation with unresolved CRITICAL findings.

Hard rule: this skill NEVER produces implementation code in the same invocation. It produces spec-review findings. Implementation happens in a separate pass through /spec-to-impl — and only after the spec has passed the quality gate.
Pre-implementation gate: if the user tries to invoke /spec-to-impl on a spec that has unresolved CRITICAL findings from spec-panel, refuse politely and route them back to spec-update first. Specs with critical ambiguities produce ambiguous implementations.
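The gate can be sketched as a simple check over parsed tracker rows. This is a minimal illustration, assuming each row is a dictionary with `severity` and `status` keys; the `RESOLVED` status value and both function names are hypothetical, since the source only states that statuses start as PENDING.

```python
def unresolved_critical(rows):
    """Return tracker rows that are CRITICAL and not yet resolved.

    Assumption: a row counts as resolved only when its status is "RESOLVED".
    """
    return [
        row for row in rows
        if row["severity"].upper() == "CRITICAL"
        and row["status"].upper() != "RESOLVED"
    ]


def may_proceed_to_impl(rows):
    """True only when no CRITICAL finding is still open."""
    return not unresolved_critical(rows)
```

When `may_proceed_to_impl` returns `False`, the caller would route the user back to `/spec-update` rather than starting implementation.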
Conduct a rigorous, multi-expert analysis of a specification document. This is NOT a quick review — it's a thorough investigation combining codebase research, internet research, requirements quality analysis, and domain expert perspectives into an actionable implementation plan with quantified quality scoring.
Before any analysis, ask clarifying questions. Understand context that isn't obvious from the document. Ask 3-5 focused questions maximum — skip questions the spec already answers. Wait for answers before proceeding.
Phase 1 is non-negotiable and must complete before any analysis. A spec reviewed without context is a guess. You MUST gather real evidence from the codebase, existing docs, and up-to-date library documentation before producing findings. If any Phase 1 step cannot be completed, surface it as a blocker — do not fabricate context.
1A: Codebase Investigation (MANDATORY — read real files, not guesses)
Use the Read, Glob, and Grep tools. Do NOT rely on the spec's claims about what the code does — verify everything.
- CLAUDE.md, README.md, package.json / build.gradle* / pom.xml to detect stack, build system, and conventions
- Glob to enumerate files matching the spec's domain (e.g., **/payment/**, **/user/**). Target files mentioned by name in the spec.
- Grep).
- Grep for keywords from the spec (e.g., "password reset", "token", "refund"). The feature may already exist partially.
- git log --oneline -20 -- <path> for each touched file. git blame to identify recent authors. Changes in the last 30 days often indicate active work that conflicts with the spec.
- ProblemDetail, Result<T>, exceptions), dependency injection style (constructor vs field), testing conventions (JUnit 5 vs JUnit 4, Testcontainers usage), data access (Spring Data JDBC vs JPA vs jOOQ).

Minimum evidence required before Phase 2:
- CLAUDE.md (or equivalent) — or note explicitly if absent
- git log on touched paths
- Grep

If the spec references code that doesn't exist yet, say so — do not pretend it does.
1B: Existing Docs & Prior Art (MANDATORY — do not skip)
Existing documentation often contains decisions that constrain the spec. Read them before reviewing.
Use Glob + Read to locate and consume:
- ADRs (docs/adr/**/*.md, claudedocs/adr-*.md, or wherever the project keeps them) — any ADR touching the spec's domain is binding context
- Design docs (docs/design/**/*.md, docs/rfcs/**/*.md) — especially anything mentioning the entities or flows in the spec
- Prior analyses (claudedocs/*-panel-analysis.md) — check whether this spec has been reviewed before and what was flagged
- PRDs (docs/prd/**/*.md) — upstream context from product
- Migrations (src/main/resources/db/**/*.sql) — schema reality

For every relevant doc found, note: [DOC-REF] <path> → <one-line relevance> in the final report. Conflicts between this spec and existing ADRs are HIGH severity findings.
1C: External Research with Tools (use context7 FIRST, then web)
Training data is often out of date. Use real tools:
context7 MCP (mcp__context7__resolve-library-id, mcp__context7__query-docs) — use this FIRST for every library, framework, SDK, or API mentioned in the spec. Training-data knowledge about Spring Boot 4, React 19, Next.js 15, Temporal 1.26+, etc. may be wrong or stale. Explicit trigger cases:
WebSearch / WebFetch for:
Sourcegraph MCP (if connected, mcp__sourcegraph__*) for searching across public open-source implementations of the same pattern.
Rules:
1D: Stack Version Verification
After context7 research, run /stack-check (or detect manually) against the project and compare to what the spec assumes:
Any mismatch is at minimum a MEDIUM finding — often HIGH if it blocks implementation.
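A minimal sketch of the comparison, assuming the detected project versions and the spec's assumed versions are both available as name-to-version dictionaries. The function names and the prefix-matching rule (a spec that says "21" matches a project on "21.0.2") are illustrative assumptions, not part of the skill.

```python
def parse_version(text):
    """Turn '3.2.1' or '1.26+' into a tuple of ints; non-digit characters are dropped."""
    parts = []
    for piece in text.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)


def find_mismatches(project, spec_assumes):
    """List (component, assumed, actual) where the project differs from the spec.

    A component matches when the project version starts with the version
    the spec states, compared at the precision the spec uses.
    """
    mismatches = []
    for name, assumed in spec_assumes.items():
        actual = project.get(name)
        if actual is None:
            mismatches.append((name, assumed, "not detected"))
            continue
        want = parse_version(assumed)
        have = parse_version(actual)
        if have[:len(want)] != want:
            mismatches.append((name, assumed, actual))
    return mismatches
```

Each returned tuple maps to one finding in the report, rated MEDIUM or higher as described above.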
1E: Context Manifest (mandatory output of Phase 1)
Before proceeding to Phase 2, produce this manifest and include it in the final report:
CONTEXT MANIFEST — <spec name>
==============================
CODEBASE EVIDENCE
Files read: <n> [list]
Entities touched: <list>
Existing impl found: yes/no — <brief>
Git activity (30d): <n> commits, <n> authors, flags: <e.g. concurrent refactor>
Conventions detected: [naming, DI style, error handling, testing, data access]
Discrepancies vs spec: <n> — see findings
DOC EVIDENCE
CLAUDE.md: read / absent
ADRs: [list of relevant ADR paths]
Design docs: [list]
Prior analyses: [list, especially prior spec-panel output on this spec]
Schema (migrations): [key files]
API contract: [OpenAPI path or absent]
EXTERNAL RESEARCH
context7 queries: [library @ version → finding]
WebSearch queries: [query → key finding]
CVEs surfaced: <n>
Standards consulted: [OWASP / NIST / RFC ####]
STACK VERIFICATION
Project stack: Java <v>, Spring Boot <v>, <etc>
Spec assumes: <versions>
Mismatches: <list or "none">
BLOCKERS FROM PHASE 1
<anything that prevents reliable review — surface here, do not proceed silently>
Do not start Phase 2 until the Context Manifest is produced. A Phase 2 finding that isn't backed by evidence from Phase 1 is speculation and must be marked as such.
Before the expert panel, run a systematic quality check.
2A: IEEE 830 Quality Attributes
Score each attribute 1-10 with specific evidence:
| Attribute | Score | Evidence |
|-----------|-------|----------|
| Correct — Every requirement reflects an actual system need | | |
| Unambiguous — Each requirement has exactly one interpretation | | |
| Complete — All requirements included, no TBDs or gaps | | |
| Consistent — No requirements contradict each other | | |
| Ranked — Prioritized by importance and stability | | |
| Verifiable — Each requirement can be tested via a finite process | | |
| Modifiable — Easy to change without cascading updates | | |
| Traceable — Bidirectional: backward to source, forward to design/test | | |
2B: Spec Smells Scanner
Scan the spec for red-flag language that signals ambiguity or incompleteness:
| Smell Category | Red-flag Words | Found? | Location |
|----------------|---------------|--------|----------|
| Unquantified scope | all, always, every, never, none | | |
| Vague frequency | most, many, several, some, usually, normally, often | | |
| Vague adjectives | easy, user-friendly, fast, flexible, robust, efficient, seamless, intuitive | | |
| Weak verbs | handle, improve, provide, support, maximize, optimize, manage, process | | |
| Uncertainty markers | should, can, could, may, might, if possible, as needed, TBD | | |
| Implementation leak | use [specific technology], implement via, built with | | |
Every flagged instance must be rewritten into a concrete, testable requirement.
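The scan can be approximated with a word-list match per line. The sketch below is an illustration under stated assumptions: it covers the five word-based categories from the table, skips the "Implementation leak" category (which needs phrase patterns), and matches exact word forms only, so inflected forms such as "handles" are not caught. The names `SMELLS` and `scan_smells` are hypothetical.

```python
import re

# Word lists copied from the spec-smells table; "Implementation leak" is omitted.
SMELLS = {
    "Unquantified scope": ["all", "always", "every", "never", "none"],
    "Vague frequency": ["most", "many", "several", "some", "usually", "normally", "often"],
    "Vague adjectives": ["easy", "user-friendly", "fast", "flexible", "robust",
                         "efficient", "seamless", "intuitive"],
    "Weak verbs": ["handle", "improve", "provide", "support", "maximize",
                   "optimize", "manage", "process"],
    "Uncertainty markers": ["should", "can", "could", "may", "might",
                            "if possible", "as needed", "TBD"],
}


def scan_smells(spec_text):
    """Return (line_number, category, word) for each red-flag word found."""
    findings = []
    for line_no, line in enumerate(spec_text.splitlines(), start=1):
        for category, words in SMELLS.items():
            for word in words:
                # Require a non-word, non-hyphen boundary on both sides.
                pattern = r"(?<![\w-])" + re.escape(word) + r"(?![\w-])"
                if re.search(pattern, line, flags=re.IGNORECASE):
                    findings.append((line_no, category, word))
    return findings
```

The line numbers fill the "Location" column; judging whether a hit is a real smell, and rewriting it into a testable requirement, still needs a reviewer.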
2C: Cross-Cutting Concerns Checklist
Verify each concern is explicitly addressed or intentionally scoped out:
| Concern | Addressed? | Where in Spec | Gap Severity |
|---------|-----------|---------------|--------------|
| Security — Auth model, encryption, input validation, secrets | | | |
| Observability — Metrics, logging, tracing, alerting, SLOs | | | |
| Accessibility — WCAG 2.2 AA, keyboard nav, screen readers | | | |
| Internationalization — Locale, currency, date/time, RTL | | | |
| Data Privacy — PII classification, GDPR/CCPA, retention, consent | | | |
| Backward Compatibility — API versioning, schema migration, client matrix | | | |
| Rollback Strategy — Deployment rollback, data rollback, time-to-rollback | | | |
| Feature Flags — Gradual rollout, kill switch, flag cleanup timeline | | | |
| Error Handling — Failure modes, retry policies, circuit breakers, fallbacks | | | |
| Performance — Latency targets, throughput, capacity, scalability | | | |
| Caching — Strategy, invalidation, TTLs, consistency impact | | | |
| Rate Limiting — Throttling, quotas, abuse prevention | | | |
| Disaster Recovery — Failover, RTO/RPO, data integrity verification | | | |
| Multi-tenancy — Isolation, data segregation, tenant-specific config | | | |
Mark N/A for genuinely irrelevant concerns. Missing concerns with MEDIUM+ severity become expert panel findings.
2D: Alternatives Considered Check
Every spec must answer: "Why this approach and not another?" Verify:
If the spec lacks alternatives, flag as a CRITICAL finding — it means the design space was not explored.
Severity Classification (used by all experts):
| Level | Definition | Action |
|-------|-----------|--------|
| CRITICAL | Blocks delivery, causes data loss, security vulnerability, or fundamental design flaw | Must fix before implementation starts |
| HIGH | Will cause bugs, performance issues, or significant rework if not addressed | Must fix before feature ships |
| MEDIUM | Creates tech debt, testing gaps, or operational risk | Should fix, schedule if time-constrained |
| LOW | Improves polish, developer experience, or documentation quality | Nice-to-have, defer if needed |
Structured Finding Format (every expert uses this):
[SEVERITY] Issue title
├─ Issue: What's wrong, with specific location in spec
├─ Impact: What happens if not addressed
├─ Recommendation: Concrete fix (file, function, specific change)
└─ Rationale: Why this matters (cite framework, pattern, or research)
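As an illustration, the finding format and the four severity levels could be modeled as a small data structure. This is a sketch only; the `Finding` class, `SEVERITY_ORDER`, and `sort_by_severity` are hypothetical names introduced here, not part of the skill.

```python
from dataclasses import dataclass

# Severity levels from the classification table, most urgent first.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]


@dataclass
class Finding:
    severity: str
    title: str
    issue: str
    impact: str
    recommendation: str
    rationale: str

    def render(self):
        """Render the finding in the tree-style format shown above."""
        return "\n".join([
            f"[{self.severity}] {self.title}",
            f"├─ Issue: {self.issue}",
            f"├─ Impact: {self.impact}",
            f"├─ Recommendation: {self.recommendation}",
            f"└─ Rationale: {self.rationale}",
        ])


def sort_by_severity(findings):
    """Order findings so CRITICAL items come first."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER.index(f.severity))
```

Keeping the fields separate makes it straightforward to sort by severity or to copy each recommendation into the tracker later.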
Fixed Experts (always included):
| Expert | Domain | Focuses on |
|--------|--------|------------|
| Karl Wiegers | Requirements Quality | Completeness, testability, ambiguity, missing acceptance criteria, contradictions. Uses IEEE 830 + SMART criteria. |
| Martin Fowler | Architecture & Design | Integration gaps, coupling issues, missing abstractions, pattern fitness. Checks for alternatives considered. |
| Gojko Adzic | Specification by Example | Concrete Given/When/Then scenarios, edge cases the spec doesn't address. Every requirement must have at least one executable example. |
| Lisa Crispin | Testing & Quality | Test gaps, untested paths, broken assumptions, regression risks. Maps the testing pyramid for this feature. |
| Michael Nygard | Operational Concerns | Failure modes, deployment risks, monitoring gaps, data migration safety, circuit breakers, bulkheads, timeouts. |
Devil's Advocate (always included):
| Expert | Domain | Focuses on |
|--------|--------|------------|
| The Skeptic | Fundamental Challenge | Challenges the spec's premise. Asks: "Should we build this at all?", "What if we did nothing?", "What's the simplest thing that could work?", "What assumption, if wrong, makes this entire spec invalid?" |
The Skeptic's role is to prevent groupthink and rubber-stamping. They must produce at least 2 challenges to the spec's fundamental approach, not just implementation details.
Domain Experts (activated based on spec content):
| Expert | Activated when spec involves | Focuses on |
|--------|------------------------------|------------|
| Roy Fielding | API design or integration | REST constraints, resource modeling, versioning, error contracts |
| Martin Kleppmann | Database or data modeling | Consistency guarantees, migration safety, schema evolution |
| Dan Abramov | Frontend/UI | Component composition, state management, rendering performance |
| Troy Hunt | Security, auth, compliance | OWASP risks, auth flows, data exposure, secrets management |
| Pat Helland | Payments, fintech, transactions | Idempotency, exactly-once semantics, compensation patterns, ledger integrity |
| Greg Young | Event-driven architecture | Event design, projection strategy, eventual consistency, replay safety |
| Bernd Ruecker | Workflow orchestration | Saga vs orchestration, compensation, timeout handling, workflow versioning |
| Charity Majors | DevOps, deployment, infrastructure | SLOs, alerting philosophy, deployment safety, canary patterns |
| Marty Cagan | Product/UX decisions | Solving the right problem, discovery gaps |
| Guillermo Rauch | Mobile or cross-platform | SSR vs CSR, edge deployment, hydration strategy, performance budget |
State which domain experts are activated and why. Each expert provides 2-5 specific, actionable findings using the structured format above — not generic advice.
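One rough way to make the "activated and why" statement explicit is a keyword heuristic over the spec text. The sketch below is purely illustrative: the keyword lists are hypothetical choices for four of the experts, prefix matching produces false positives (for example "auth" also matches "author"), and real activation remains a judgment call on the spec's content.

```python
import re

# Hypothetical trigger keywords; the skill itself does not define these.
ACTIVATION_KEYWORDS = {
    "Roy Fielding": ["api", "rest", "endpoint"],
    "Martin Kleppmann": ["database", "schema", "migration"],
    "Troy Hunt": ["auth", "security", "compliance"],
    "Pat Helland": ["payment", "refund", "ledger"],
}


def activate_experts(spec_text):
    """Map each activated expert to the keywords that triggered activation."""
    activated = {}
    for expert, keywords in ACTIVATION_KEYWORDS.items():
        hits = [
            keyword for keyword in keywords
            # Match words that start with the keyword, e.g. "payments".
            if re.search(rf"\b{re.escape(keyword)}\w*", spec_text, re.IGNORECASE)
        ]
        if hits:
            activated[expert] = hits
    return activated
```

The matched keywords double as the stated reason for activation in the report.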
Expert panel members execute their analysis in parallel:
Model routing for experts:
| Expert | Model | Rationale |
|---|---|---|
| Domain experts | opus | Deep domain reasoning |
| IEEE Auditor | sonnet | Systematic checklist evaluation |
| Spec Smells Scanner | sonnet | Pattern matching against red flags |
| Cross-Cutting Reviewer | sonnet | Checklist-driven gap analysis |
| Devil's Advocate | opus | Independent critical thinking |
| Internet Researcher | sonnet | Web search + synthesis |
Produce an overall spec quality scorecard:
| Dimension | Score (1-10) | Key Issue |
|-----------|-------------|-----------|
| Requirements Clarity — Language precision and freedom from ambiguity | | |
| Completeness — Coverage of functional, non-functional, and edge cases | | |
| Testability — Every requirement has measurable acceptance criteria | | |
| Architectural Soundness — Design patterns, boundaries, and coupling | | |
| Operational Readiness — Monitoring, failure modes, rollback, deployment | | |
| Cross-Cutting Coverage — Security, a11y, i18n, privacy, compatibility | | |
| Overall | | |
Scoring guide:
Current State Summary:
| Layer | Current State | What Spec Says | Gap | Severity |
|-------|--------------|----------------|-----|----------|
Risk Register:
| # | Risk | Likelihood | Impact | Mitigation |
|---|------|-----------|--------|------------|
Implementation Plan:
| # | Task | Files to Create/Modify | Effort | Dependencies | Expert Source |
|---|------|----------------------|--------|--------------|---------------|
Recommended Priority:
Recommended Reading: references from Phase 1C the team should review.
Save the full analysis to:
claudedocs/<spec-name>-panel-analysis.md
Include a recommendation tracker at the top:
# Panel Analysis: <spec name>
**Date:** <today>
**Spec:** <path to original spec>
**Status:** IN REVIEW
**Quality Score:** <overall>/10
## Recommendation Tracker
| # | Recommendation | Severity | Status | Owner | Notes |
|---|---------------|----------|--------|-------|-------|
Set all statuses to PENDING.
Tell the user: "Analysis saved to claudedocs/<name>-panel-analysis.md. To action recommendations, run: /spec-update @claudedocs/<name>-panel-analysis.md"
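The report header and tracker described above can be generated mechanically. This is a minimal sketch, assuming recommendations arrive as `(text, severity)` pairs; the function name `render_tracker` and its parameters are hypothetical.

```python
from datetime import date


def render_tracker(spec_name, spec_path, overall_score, recommendations, today=None):
    """Build the report header and a tracker table with every status set to PENDING.

    recommendations: iterable of (recommendation_text, severity) pairs.
    """
    today = today or date.today().isoformat()
    lines = [
        f"# Panel Analysis: {spec_name}",
        f"**Date:** {today}",
        f"**Spec:** {spec_path}",
        "**Status:** IN REVIEW",
        f"**Quality Score:** {overall_score}/10",
        "## Recommendation Tracker",
        "| # | Recommendation | Severity | Status | Owner | Notes |",
        "|---|---------------|----------|--------|-------|-------|",
    ]
    for number, (text, severity) in enumerate(recommendations, start=1):
        # Owner and Notes are left blank for the team to fill in.
        lines.append(f"| {number} | {text} | {severity} | PENDING |  |  |")
    return "\n".join(lines)
```

The returned string would form the top of `claudedocs/<spec-name>-panel-analysis.md`, with the remaining report sections appended below it.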
Upstream skills that feed into this:
- /prd — PRD document to analyze
- /design-doc — Design document / RFC to review
- /api-design — API specification to evaluate
- /data-design — Data architecture to assess
- /flow-map — System flow paths to validate
- /ui-design — UI design artifacts to review

Downstream skills that consume this output:
- /spec-update — Action recommendations from panel analysis
- /spec-to-impl — Implementation from analyzed spec
- /ticket-breakdown — Break analyzed spec into tickets
- /test-plan — Test planning informed by expert findings

After analysis completes, save:
produces:
  - type: "panel-analysis"
    format: "markdown"
    path: "claudedocs/<spec-name>-panel-analysis.md"
    sections:
      - clarification-answers
      - codebase-findings
      - internet-research
      - ieee-830-quality-audit
      - spec-smells-report
      - cross-cutting-concerns-checklist
      - alternatives-considered-check
      - expert-panel-findings
      - skeptic-challenges
      - quality-scorecard
      - current-state-summary
      - risk-register
      - implementation-plan
      - recommended-priority
      - recommended-reading
      - recommendation-tracker
handoff: "Run superpowers:systematic-debugging per CRITICAL finding. Write claudedocs/handoff-spec-panel-<timestamp>.yaml — suggest: spec-update (apply recommendations), spec-to-impl (ONLY after CRITICAL findings are resolved), ticket-breakdown (turn analysis into tickets), superpowers:writing-plans (remediation plan)"
- Do not invoke /spec-to-impl when CRITICAL findings exist — route through /spec-update first
- Run superpowers:systematic-debugging on each CRITICAL finding to understand root cause