skills/report-format/SKILL.md
Unified review report format for all finding-producing agents. Load when emitting or consuming review findings.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf): npx skillsauth add lklimek/claudius report-format
Unified format for all review findings. Schema: schemas/review-report.schema.json (v3.0.0).
Hard cutover: schema versions 1.x and 2.x are no longer accepted. Producers and consumers must use v3.0.0.
Agents emit a JSON array of finding_section objects:
```json
[
  {
    "title": "Section Title",
    "category": "security|project|code_quality|call_tree|dependencies|documentation|pr_comments|pr_promises",
    "findings": [
      {
        "id": "PREFIX-001",
        "risk": 0.6,
        "impact": 0.7,
        "scope": 1.0,
        "title": "Short finding title",
        "tags": ["A03 Injection", "CWE-79"],
        "location": "src/auth.rs:42-56",
        "description": "What the issue is and why it matters",
        "impact_description": "What could go wrong (Markdown narrative)",
        "recommendation": "How to fix it",
        "code_snippets": [
          {"language": "rust", "caption": "auth.rs:42", "content": "let user = unwrap_token(&hdr);"}
        ]
      }
    ],
    "positives": "Optional positive observations"
  }
]
```
This is the producer-emitted shape. Integer severity and float overall_severity are not listed — the coordinator's derive pass adds them from risk/impact/scope (see "Coordinator-derived / validator-owned fields" below). The example validates against the v3 schema as-is because those derived fields are optional; producer skills can call validate_report.py on their own output before consolidation.
| Field | Type | Description |
|-------|------|-------------|
| id | string | PREFIX-NNN -- see ID Prefixes below |
| risk | float | 0.0–1.0, OWASP Likelihood normalized (see severity skill) |
| impact | float | 0.0–1.0, OWASP Impact normalized (see severity skill) |
| scope | float | 0.0–1.0, PR relevance (1.0 direct, 0.5 indirect, 0.0 unrelated) |
| title | string | Short finding title |
| location | string | Full file path with lines: src/auth.rs:42-56 -- never bare line numbers |
| description | string | What the issue is and why it matters |
| recommendation | string | How to fix it |
Producers MUST emit risk, impact, and scope — the schema rejects findings missing any of them. The coordinator computes overall_severity from those floats and derives integer severity via the band table in the severity skill. The validate-findings skill is the only documented path to re-estimate floats post-hoc when a producer's partial output reaches the coordinator without them.
Optional: tags (OWASP, CWE, etc.), impact_description (Markdown impact narrative; pairs with the numeric impact float), code_snippets (when the producer captured exact source during analysis — never invent one).
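The required-field rules above can be checked locally before a report reaches the coordinator. The following is a minimal sketch, not the logic of validate_report.py: the function name `check_finding` and the bare-line-number pattern are illustrative assumptions, and the schema remains the authority.

```python
import re

# Required fields, taken from the table above.
REQUIRED_STRINGS = ("id", "title", "location", "description", "recommendation")
REQUIRED_FLOATS = ("risk", "impact", "scope")

# Assumed shape of a "bare line number" location such as "42" or "42-56".
BARE_LINES = re.compile(r"^\d+(-\d+)?$")


def check_finding(finding: dict) -> list:
    """Return a list of problems; an empty list means this sketch found none."""
    problems = []
    for key in REQUIRED_STRINGS:
        value = finding.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty string field: {key}")
    for key in REQUIRED_FLOATS:
        value = finding.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"missing numeric field: {key}")
        elif not 0.0 <= value <= 1.0:
            problems.append(f"{key} out of range 0.0-1.0: {value}")
    location = finding.get("location")
    if isinstance(location, str) and BARE_LINES.match(location.strip()):
        problems.append("location is a bare line number; include the file path")
    return problems
```

An empty result does not guarantee schema validity; it only catches the omissions this section calls out.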
Producers must NOT set these; they are populated downstream:

- overall_severity -- Python-computed mean of risk/impact/scope
- location_permalink -- Python-constructed GitHub blob/<sha>/<path>#L<n> URL
- metadata.repository -- coordinator derives from git remote get-url origin
- ai_assessment, ai_verdict, ai_verdict_confidence -- owned by the validate-findings skill
- severity when emitting floats -- the coordinator overrides

These fields are Markdown by default -- agents emit Markdown markup, renderers parse it as CommonMark:

- description
- impact_description
- recommendation
- ai_assessment
- executive_summary.summary_text, executive_summary.verdict_text

Use Markdown -- renderers handle formatting; you write content. Single-line fields (title, severity, category, location, etc.) stay plain text.
Markdown style for agents: separate lists, code blocks, and headings from preceding text with a blank line (CommonMark requires this for parsing).
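As an illustration of the blank-line rule, the sketch below assembles a description from a lead sentence and a bullet list. The helper name `with_list` and the sample text are hypothetical and not part of any project script.

```python
def with_list(lead: str, items: list) -> str:
    """Join a lead paragraph and a bullet list, separated by one blank line."""
    bullets = "\n".join(f"- {item}" for item in items)
    # The empty line between the paragraph and the list is the point of the rule.
    return f"{lead.rstrip()}\n\n{bullets}"


description = with_list(
    "The token is unwrapped without validation:",
    ["expired tokens are accepted", "malformed headers panic"],
)
```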
For consumers: parse long-text fields as CommonMark Markdown. Reference renderer: scripts/generate_review_report.py — HTML uses the markdown Python package sanitised through bleach, PDF walks the parsed HTML to ReportLab mini-XML. Markdown output passes through verbatim.
When writing findings to a file, ALWAYS use the Write tool — never use Bash commands like cat > file, tee, heredoc redirects, or inline python3 scripts for file creation. The Write tool is allowed in all CI environments; Bash file-writing commands are typically blocked by tool allowlists.
| Prefix | Category | Used by |
|--------|----------|---------|
| SEC- | security | security-engineer-smythe |
| QA- | code_quality | qa-engineer-marvin |
| PROJ- | project | project-reviewer-adams |
| CODE- | code_quality | developer-bilby (generic) |
| RUST- | code_quality | developer-bilby (Rust) |
| PY- | code_quality | developer-bilby (Python) |
| GO- | code_quality | developer-bilby (Go) |
| FE- | code_quality | developer-bilby (frontend) |
| DOC- | documentation | technical-writer-trillian |
| CMT- | pr_comments | check-pr-comments |
| PPM- | pr_promises | review-pr (Pass C: promise verification) |
| DEP- | dependencies | review-dependency |
| CALL- | call_tree | reviewer call-tree inspection pass |
IDs are provisional -- the consolidation step deduplicates and reassigns final IDs.
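The prefix table can be turned into a simple lookup, for instance to confirm that a finding's ID matches the category of the section it sits in. This is a sketch under assumptions: the function name is illustrative, and the pattern assumes NNN means exactly three digits, which the table does not state outright.

```python
import re
from typing import Optional

# Prefix-to-category mapping copied from the table above.
PREFIX_CATEGORY = {
    "SEC": "security",
    "QA": "code_quality",
    "PROJ": "project",
    "CODE": "code_quality",
    "RUST": "code_quality",
    "PY": "code_quality",
    "GO": "code_quality",
    "FE": "code_quality",
    "DOC": "documentation",
    "CMT": "pr_comments",
    "PPM": "pr_promises",
    "DEP": "dependencies",
    "CALL": "call_tree",
}

# Assumption: IDs look like "SEC-001" (uppercase prefix, three digits).
ID_PATTERN = re.compile(r"^([A-Z]+)-(\d{3})$")


def category_for_id(finding_id: str) -> Optional[str]:
    """Return the category implied by a finding ID, or None if unrecognised."""
    match = ID_PATTERN.match(finding_id)
    if match is None:
        return None
    return PREFIX_CATEGORY.get(match.group(1))
```

Because IDs are provisional, a check like this is only meaningful on producer output, before consolidation reassigns them.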
Agents may add context to description and tags per their domain:

- tags, CVE references and evidence in description
- description
- reviewer, comment_id, comment_url, thread_id, verdict fields (schema-defined)
- location is a synthetic string (no file:line) -- use PR-title, PR-body:summary-bullet-N, or PR-body:out-of-scope-item-N. Renderers leave it as plain text (no permalink). Example:

```json
{
  "id": "PPM-001",
  "risk": 0.6, "impact": 0.5, "scope": 1.0,
  "title": "Title claims PDF fix, diff is gRPC tests",
  "location": "PR-title",
  "description": "Title says `fix: PDF rendering` but diff touches only `tests/grpc/`.",
  "recommendation": "Rename to `test(grpc): add coverage for retry path` or move the gRPC changes to a separate PR."
}
```
Rationale for the example values: scope: 1.0 because a title/body mismatch is by definition about THIS PR. location is the synthetic string PR-title because the finding has no commit-relative file:line target — renderers leave it as plain text and skip the permalink. risk: 0.6 reflects moderate likelihood the next reviewer is misled (the title is the densest hint in the UI); impact: 0.5 covers reviewer-time cost plus risk of approving unintended changes. The coordinator computes overall_severity and integer severity from these floats per claudius:severity.
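The arithmetic behind that last sentence can be sketched as follows, assuming overall_severity is the plain arithmetic mean of the three floats (this document describes it only as a "mean"). The function below is illustrative rather than the coordinator's code, and the integer band lookup is left out because the band table lives in the severity skill.

```python
def overall_severity(risk: float, impact: float, scope: float) -> float:
    """Arithmetic mean of the three producer floats (assumed formula)."""
    for name, value in (("risk", risk), ("impact", impact), ("scope", scope)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within 0.0-1.0, got {value}")
    return (risk + impact + scope) / 3


# Values from the PPM-001 example: (0.6 + 0.5 + 1.0) / 3
print(round(overall_severity(0.6, 0.5, 1.0), 2))  # prints 0.7
```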
| Tool | Purpose | Usage |
|------|---------|-------|
| scripts/validate_report.py | Validate report JSON against schema | python3 ${CLAUDE_SKILL_DIR}/../../scripts/validate_report.py report.json |
| scripts/consolidate_reports.py | Merge multiple agent reports, deduplicate findings | python3 ${CLAUDE_SKILL_DIR}/../../scripts/consolidate_reports.py agent1.json agent2.json -o consolidated.json |
| scripts/generate_review_report.py | Render consolidated report as Markdown/HTML | python3 ${CLAUDE_SKILL_DIR}/../../scripts/generate_review_report.py consolidated.json |
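One way to chain the three tools is to validate each agent report, consolidate, and then render. The sketch below only builds the argument lists from the usage strings in the table; the function name and the fixed `consolidated.json` output name are assumptions for illustration.

```python
from pathlib import Path


def pipeline_commands(skill_dir: str, agent_reports: list) -> list:
    """Build argv lists: validate each report, consolidate, then render."""
    # Mirrors ${CLAUDE_SKILL_DIR}/../../scripts from the usage column.
    scripts = Path(skill_dir) / ".." / ".." / "scripts"
    commands = [
        ["python3", str(scripts / "validate_report.py"), report]
        for report in agent_reports
    ]
    commands.append(
        ["python3", str(scripts / "consolidate_reports.py"), *agent_reports,
         "-o", "consolidated.json"]
    )
    commands.append(
        ["python3", str(scripts / "generate_review_report.py"), "consolidated.json"]
    )
    return commands
```

Each entry can then be passed to `subprocess.run(cmd, check=True)` so that a failed validation stops the chain before consolidation.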
For complete reports (grumpy-review, check-pr-comments), wrap finding sections in:
```json
{
  "schema_version": "3.0.0",
  "metadata": {
    "project": "claudius",
    "date": "YYYY-MM-DD",
    "commit": "<full 40-char SHA from `git rev-parse @{u}` (fall back to `git rev-parse HEAD` when the branch has no upstream)>"
  },
  "executive_summary": { "overall_assessment": "..." },
  "summary_statistics": { "total_findings": 0, "severity_counts": {} },
  "findings": []
}
```
metadata.commit must be a full 40-character SHA when present (the coordinator builds permalinks from it). Both metadata.commit and metadata.repository are optional — omit them for non-git directories; permalinks are silently skipped and everything else renders normally.
See schemas/review-report.schema.json for complete envelope schema.
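A minimal sketch of assembling the envelope in Python is shown below. It is not the project's own code: `build_envelope` is a hypothetical helper, it assumes the top-level `findings` key holds the array of finding sections, that `total_findings` counts findings across all sections, and that a full SHA is 40 lowercase hex characters. The executive summary placeholder is copied from the example above and `severity_counts` is left empty.

```python
import re
from datetime import date
from typing import Optional

# Assumption: a full SHA is 40 lowercase hexadecimal characters.
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def build_envelope(project: str, sections: list, commit: Optional[str] = None) -> dict:
    """Wrap finding sections in a v3.0.0 envelope; commit is omitted when None."""
    metadata = {"project": project, "date": date.today().isoformat()}
    if commit is not None:
        if not FULL_SHA.match(commit):
            raise ValueError("metadata.commit must be a full 40-character SHA")
        metadata["commit"] = commit
    total = sum(len(section.get("findings", [])) for section in sections)
    return {
        "schema_version": "3.0.0",
        "metadata": metadata,
        "executive_summary": {"overall_assessment": "..."},
        "summary_statistics": {"total_findings": total, "severity_counts": {}},
        "findings": sections,
    }
```

Leaving `commit` as `None` corresponds to the non-git case described above, where permalinks are skipped.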