kramme-cc-workflow/skills/kramme:siw:spec-audit:team/SKILL.md
Audit specification documents for quality using multi-agent execution where dimension specialists collaborate, cross-validate findings, and challenge each other's assessments. Higher quality than standard spec-audit but uses more tokens. Supports inline report output with --inline.
Evaluate specification documents for quality across 8 dimensions using multi-agent execution. Each dimension auditor runs with its own context window and can cross-validate findings with other auditors. A cross-reviewer meta-reviews all findings for completeness.
Arguments: "$ARGUMENTS"
This skill requires multi-agent execution.
- Claude Code: Agent Teams enabled (`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS`).
- Codex: a runtime with `multi_agent` enabled.

If multi-agent execution is not available, print:
Multi-agent execution is not enabled. Run /kramme:siw:spec-audit instead.
Claude Code: add CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 to settings.json.
Codex: use a runtime with `multi_agent` enabled (for example, Conductor Codex runtime).
Then stop.
Same as /kramme:siw:spec-audit Steps 1-2:
- `$ARGUMENTS`: extract the `--model` flag (default: `opus`) and the optional `--inline` output mode, then resolve spec file paths or auto-detect them from `siw/`

Create a multi-agent session named `siw-spec-audit`.
Spawn 4 dimension auditors and 1 cross-reviewer (5 agents total):
| Agent Name | Dimensions | Rationale |
|---|---|---|
| structure-auditor | Coherence, Completeness | Contradictions and gaps are deeply intertwined — contradictions often manifest as completeness gaps |
| clarity-auditor | Clarity, Actionability | Vague requirements are also non-actionable; a single agent can flag both the ambiguity and its implementation impact |
| validation-auditor | Testability, Scope | Untestable criteria often stem from scope problems (implicit inclusions, missing boundaries) |
| design-auditor | Value Proposition, Technical Design | The most judgment-intensive dimensions; a strategic lens assesses whether the design matches the stated problem |
| cross-reviewer | Meta-review | Cross-dimension pattern detection, suspiciously-clean challenge, duplicate detection |
Phase 1 tasks (parallel):
- structure-auditor
- clarity-auditor
- validation-auditor
- design-auditor

Phase 2 task (blocked on all Phase 1 tasks):

- cross-reviewer

Each dimension auditor receives the full spec text and analysis instructions for its assigned dimensions.
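The two-phase dependency can be sketched as a small task graph. This is an illustration only: the agent names come from the table above, while the `Task` structure and the `ready_tasks` helper are hypothetical and not part of any agent-team API.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    agent: str
    blocked_by: list[str] = field(default_factory=list)  # hypothetical field


# Agent names are taken from the team table; everything else is illustrative.
PHASE_1 = [
    "structure-auditor",
    "clarity-auditor",
    "validation-auditor",
    "design-auditor",
]
TASKS = [Task(agent=name) for name in PHASE_1] + [
    Task(agent="cross-reviewer", blocked_by=list(PHASE_1))
]


def ready_tasks(tasks, completed):
    """Return agents whose task is not done and whose blockers have all completed."""
    return [
        t.agent
        for t in tasks
        if t.agent not in completed and all(b in completed for b in t.blocked_by)
    ]
```

With nothing completed, all four auditors are ready at once. The cross-reviewer becomes ready only after the fourth auditor finishes.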
Read references/dimension-instructions.md in this skill folder and paste the relevant blocks (Coherence, Completeness, Clarity, Scope, Actionability, Testability, Value Proposition, Technical Design) into each agent's prompt.
Base prompt for each auditor:
You are auditing a specification document for quality. Do NOT look at any
implementation code. Do NOT use Grep or Glob against the codebase. Analyze the
spec text ONLY using the Read tool on the provided spec files.
## Spec Files
Read these files completely:
{list of spec file paths}
## Your Assigned Dimensions
{Paste the relevant blocks from references/dimension-instructions.md}
## Finding Format
For each finding, report:
- **Finding ID**: SPEC-{NNN} (sequential from {start_number})
- **Dimension**: Which dimension
- **Title**: Brief description
- **Location**: Source file > section heading
- **Details**: What the issue is, with quotes from the spec
- **Severity**: Critical | Major | Minor
- **Recommendation**: Specific action to fix
- **Fix Confidence**: {score}/100 ({MECHANICAL|HIGH_CONFIDENCE|MODERATE_CONFIDENCE|REQUIRES_DECISION})
## Rules
- Report on every dimension. Even if no findings, confirm the dimension was analyzed.
- Do not return early. Continue until every section is checked against every assigned dimension.
- Quote the spec. When flagging an issue, include the relevant text.
- Be specific in recommendations. "Add more detail" is not enough — say what detail is missing.
- Score provisional fix confidence on all findings (0-100) using the same four-condition rubric as `/kramme:siw:spec-audit`: determinism, information availability, meaning preservation, and absence of alternatives. Sum the four scores, then apply the safety caps below before writing the agent's provisional `Fix Confidence`.
- Use these tier boundaries for the provisional score: 90-100 = `MECHANICAL`, 75-89 = `HIGH_CONFIDENCE`, 50-74 = `MODERATE_CONFIDENCE`, 0-49 = `REQUIRES_DECISION`.
- Apply these provisional guardrails before reporting the score: if any sub-score is below 15, set the provisional score to `0 (REQUIRES_DECISION)`.
- Apply these safety caps before reporting the score: set the provisional score to `0 (REQUIRES_DECISION)` if any of these apply: Critical finding in Completeness, Scope, or Value Proposition; recommendation uses decision-signal language (`consider`, `decide whether`, `choose between`, `discuss with`, `evaluate options`); the finding adds or removes scope; the finding defines success-criteria substance rather than measurability.
## Work Context Adjustments
This spec has Work Type: {work_context.work_type}
Priority dimensions (flag even minor issues): {work_context.priority_dimensions}
Deprioritized dimensions (cap at Minor severity): {work_context.deprioritized}
When evaluating **deprioritized dimensions**:
- Assess severity normally and keep that original severity in the finding data
- Tag each finding with: **[Deprioritized — cap to Minor during aggregation]**
- Do NOT downgrade the severity yourself; the lead applies the Minor cap during aggregation after recording `original_severity`
When evaluating **priority dimensions**:
- Apply strict scrutiny. Even small gaps should be flagged.
- Tag priority findings with: **[Priority dimension]**
{If work_context is Production Feature or not specified, omit this entire section from the agent prompt.}
## Cross-Validation Protocol
While analyzing, if you discover findings that may affect another agent's dimensions,
message them using SendMessage:
- **Contradictions or structural issues** -> message structure-auditor
- **Ambiguity or unclear wording** -> message clarity-auditor
- **Untestable criteria or scope issues** -> message validation-auditor
- **Design flaws or value gaps** -> message design-auditor
Message content:
"[CROSS-REF] In {spec_file} > {section}, I found {brief finding}.
This may affect your {dimension} analysis because {reason}.
Please check {specific aspect}."
When you RECEIVE a cross-ref message:
1. Check the referenced section against your dimension criteria
2. If it produces a finding, note: "Cross-ref from {sender}: {context}"
3. If no finding, note that too — the cross-reviewer will use this
When done, message the lead with your complete findings and mark your task complete.
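The provisional `Fix Confidence` rules in the auditor prompt can be sketched as a small function. This is a minimal sketch under stated assumptions: each of the four sub-scores is taken to range from 0 to 25 (as the cross-reviewer prompt states), decision-signal language is detected by plain substring matching, and the function name and boolean parameters are hypothetical.

```python
# Phrases listed in the prompt's safety caps.
DECISION_SIGNALS = (
    "consider",
    "decide whether",
    "choose between",
    "discuss with",
    "evaluate options",
)
# Dimensions where a Critical finding forces REQUIRES_DECISION.
CAPPED_CRITICAL_DIMENSIONS = {"Completeness", "Scope", "Value Proposition"}


def tier(score):
    """Map a 0-100 score to the tier boundaries given in the prompt."""
    if score >= 90:
        return "MECHANICAL"
    if score >= 75:
        return "HIGH_CONFIDENCE"
    if score >= 50:
        return "MODERATE_CONFIDENCE"
    return "REQUIRES_DECISION"


def fix_confidence(sub_scores, dimension, severity, recommendation,
                   changes_scope=False, defines_success_substance=False):
    """Sum four sub-scores, then apply the guardrail and safety caps."""
    if len(sub_scores) != 4 or any(not 0 <= s <= 25 for s in sub_scores):
        raise ValueError("expected four sub-scores, each between 0 and 25")
    score = sum(sub_scores)
    # Guardrail: any sub-score below 15 zeroes the provisional score.
    guardrail = any(s < 15 for s in sub_scores)
    # Assumption: substring matching; a real check may need word boundaries.
    text = recommendation.lower()
    capped = (
        (severity == "Critical" and dimension in CAPPED_CRITICAL_DIMENSIONS)
        or any(signal in text for signal in DECISION_SIGNALS)
        or changes_scope
        or defines_success_substance
    )
    if guardrail or capped:
        score = 0
    return f"{score}/100 ({tier(score)})"
```

For example, sub-scores of 25, 25, 24, and 20 on a Minor Clarity finding give `94/100 (MECHANICAL)`, while the same sub-scores on a Critical Scope finding give `0/100 (REQUIRES_DECISION)`.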
While dimension auditors work:
After all Phase 1 tasks complete, the cross-reviewer runs with this prompt:
You are the cross-reviewer for a spec quality audit. Your job is NOT to re-audit
the spec. Your job is to review the findings from 4 dimension-specialist agents
and ensure the audit is complete and internally consistent.
## All Phase 1 Findings
{Collected findings from all 4 dimension auditors}
## Spec Files
{List of spec file paths — read them for context when challenging findings}
## Work Context Adjustments
This spec has Work Type: {work_context.work_type}
Priority dimensions (flag even minor issues): {work_context.priority_dimensions}
Deprioritized dimensions (cap at Minor severity): {work_context.deprioritized}
When evaluating **deprioritized dimensions**:
- Assess severity normally and keep that original severity in the finding data
- Tag each finding with: **[Deprioritized — cap to Minor during aggregation]**
- Do NOT downgrade the severity yourself; the lead applies the Minor cap during aggregation after recording `original_severity`
When evaluating **priority dimensions**:
- Apply strict scrutiny. Even small gaps should be flagged.
- Tag priority findings with: **[Priority dimension]**
{If work_context is Production Feature or not specified, omit this entire section from the agent prompt.}
## Mission 1: Cross-Dimension Pattern Detection
Read all findings from all agents. Identify findings that share a root cause.
When two findings from different dimensions point to the same spec deficiency,
link them and recommend the lead merge them.
Output: Root-cause links
[{finding-a}, {finding-b}] -> "Same root cause: {description}"
## Mission 2: Suspiciously Clean Challenge
For any dimension with 0 findings (or very few given spec size):
- Read the spec sections that agent analyzed
- Identify at least 2 specific aspects that SHOULD have been flagged
- If you find gaps: report them as additional findings with the same format, including `Fix Confidence`
- If the dimension is genuinely strong: confirm it explicitly with evidence
Threshold: For specs over 200 lines, a dimension with 0 findings requires
justification.
Output: Challenge findings or clean confirmations
For each new finding, use the full format below:
- **Finding ID**: SPEC-{NNN}
- **Dimension**: Which dimension
- **Title**: Brief description
- **Location**: Source file > section heading
- **Details**: What the issue is, with quotes from the spec
- **Severity**: Critical | Major | Minor
- **Recommendation**: Specific action to fix
- **Fix Confidence**: {score}/100 ({MECHANICAL|HIGH_CONFIDENCE|MODERATE_CONFIDENCE|REQUIRES_DECISION})
Compute `Fix Confidence` exactly like the dimension auditors: sum the four 0-25 sub-scores, then apply the same tier boundaries, sub-score guardrails, and safety caps before reporting the provisional value.
OR: "{dimension}: Confirmed no findings — {evidence}"
## Mission 3: Duplicate Detection
Flag findings from different agents that describe the same spec issue from
different angles. Recommend which to keep as primary and which to merge.
Output: Duplicate flags
[{finding-a}, {finding-b}] -> "Merge into {finding-a}"
When done, message the lead with your complete cross-review results and mark
your task complete.
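The hard threshold in Mission 2 can be sketched as a simple check. The function name and the input shape (a mapping from dimension name to finding count) are hypothetical. The softer "very few given spec size" case is left to the cross-reviewer's judgment and is not modeled here.

```python
def dimensions_to_challenge(spec_line_count, findings_by_dimension, threshold_lines=200):
    """Return dimensions with zero findings when the spec exceeds the line threshold."""
    if spec_line_count <= threshold_lines:
        return []
    return sorted(
        dimension
        for dimension, count in findings_by_dimension.items()
        if count == 0
    )
```

Each returned dimension then needs either challenge findings or an explicit clean confirmation with evidence.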
After the cross-reviewer completes:
/kramme:siw:spec-audit Steps 4-5 for:
- `Fix Confidence` uses the shared tier boundaries, four sub-score guardrails, and safety caps while preserving any pre-downgrade Critical safety cap via recorded `original_severity` and the matching report Severity Note
- `siw/AUDIT_SPEC_REPORT.md` (or project root), or replying inline if `INLINE_MODE=true`

Additional report sections (insert after Summary):
## Team
- 4 dimension auditors + 1 cross-reviewer participated
- Cross-validation messages: {N} sent, {M} produced additional findings
- Cross-reviewer challenges: {N} dimensions challenged, {M} additional findings
- Duplicates merged: {N}
## Cross-Review Notes
- {Root cause links, disputes, cross-validation results}
Tag findings discovered via cross-validation with [Cross-validated].
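Two of the lead's aggregation rules described in the prompts (recording the original severity before capping deprioritized findings to Minor, and merging flagged duplicates) can be sketched as follows. The `Finding` class and both helper names are hypothetical; the actual aggregation follows `/kramme:siw:spec-audit`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    finding_id: str
    dimension: str
    severity: str  # "Critical" | "Major" | "Minor"
    original_severity: Optional[str] = None


def apply_minor_cap(findings, deprioritized):
    """Record the auditor's severity, then cap deprioritized findings at Minor."""
    for finding in findings:
        if finding.dimension in deprioritized:
            finding.original_severity = finding.severity
            finding.severity = "Minor"
    return findings


def merge_duplicates(findings, merge_into):
    """Drop findings flagged as duplicates.

    merge_into maps a duplicate finding ID to the primary finding ID it
    merges into. Returns the kept findings and the number merged.
    """
    kept = [f for f in findings if f.finding_id not in merge_into]
    return kept, len(findings) - len(kept)
```

Keeping `original_severity` lets a later step see that a finding now shown as Minor was originally Critical, which matters for the Critical safety cap on `Fix Confidence`. The merged count feeds the `Duplicates merged: {N}` line in the Team section.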
Same as /kramme:siw:spec-audit Step 6 — create SIW issues for actionable findings if SIW workflow is active.
Same as /kramme:siw:spec-audit Step 7 — display quality scores, findings counts, and next steps.
/kramme:siw:spec-audit
Use this skill when:
Use /kramme:siw:spec-audit when: