.claude/skills/execute-review-findings/SKILL.md
Use when you have code review findings, PR comments, or review reports that need to be systematically addressed — especially when there are multiple findings across different files and severities
npx skillsauth add ahrav/gossip-rs execute-review-findingsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Systematically convert review findings into tracked tasks and execute them in priority waves with parallel agents where files don't overlap.
Core principle: Normalize findings → create self-contained beads tasks → analyze concurrency → execute in priority waves with parallel dispatch.
/review-dispatch produces a ranked findings report/execute-review-findings <source>
Source options:
pr or pr:<number> — fetch PR comments via gh apireview — use the most recent /review-dispatch output in conversation<file-path> — read a markdown review document from diskParse findings from any source into a uniform list. Each normalized finding has:
| Field | Description |
|-------|-------------|
| id | Sequential number (F1, F2, ...) |
| severity | CRITICAL / SHOULD FIX / CONSIDER / NIT |
| type | bug, performance, safety, documentation, design, complexity |
| file | File path(s) affected |
| line | Line number(s) or range |
| summary | One-line description |
| detail | Full finding with current → desired behavior |
| specialists | Which reviewers flagged it (if from /review-dispatch) |
Parsing rules by source:
/review-dispatch output: Parse the ranked tables directly. Map importance 9-10 → CRITICAL, 7-8 → SHOULD FIX, 5-6 → CONSIDER, 1-4 → NIT.gh api repos/{owner}/{repo}/pulls/{number}/comments. Categorize each: bug claim → bug type, style suggestion → design/complexity, question → skip (reply only).Reviewers use different severity vocabularies. Normalize all external labels to the four canonical levels before filtering:
| External Label | Canonical Severity | |----------------|-------------------| | CRITICAL, P0, Blocker, MUST FIX | CRITICAL | | HIGH, P1, Major, SHOULD FIX | SHOULD FIX | | MEDIUM, P2, Moderate, CONSIDER | CONSIDER | | LOW, P3, Minor, NIT, Trivial, Style, INFO | NIT |
When a finding lacks an explicit severity label, infer from type:
Present the normalized table to the user for confirmation before proceeding.
After normalization, discard NIT-severity findings by default. Only CRITICAL, SHOULD FIX, and CONSIDER proceed to Phase 2.
In reviewer terms: CRITICAL and HIGH are always addressed. MEDIUM is addressed (maps to CONSIDER). LOW, Minor, NIT, Trivial, Style, and INFO are skipped — these are cosmetic or informational and dilute agent focus.
Discarded findings are listed in the summary report (Phase 6) with status "Skipped (below threshold)".
Override: Use --include=nit to opt in to NIT-severity findings when the user
explicitly wants them addressed.
Create one bd create per finding. Each task description must be fully self-contained — a fresh agent can work it without reading the original review.
## Finding: {summary}
**Severity**: {severity}
**File(s)**: {file}:{line}
**Type**: {type}
**Flagged by**: {specialists}
### Problem
{detail — full finding including current behavior and why it matters}
### Resolution Steps
{type-specific steps — see below}
### Acceptance Criteria
- [ ] {specific, verifiable conditions}
- [ ] All existing tests pass: `cargo test`
- [ ] Code compiles clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`
Bug (TDD mandatory + test-consolidate):
1. Analyze existing tests using /test-consolidate principles BEFORE writing any test:
a. Read the test module for the affected file
b. Identify existing test clusters that cover the same function/method
c. Determine the right test form for the new test:
- If existing tests for this function use rstest: ADD A NEW #[case] to the
existing parameterized test rather than writing a standalone #[test]
- If existing tests use proptest: check if the bug reveals a property violation
— if so, tighten the property assertion rather than adding a separate test
- If existing tests use table-driven: add the bug's input/expected to the table
- If no existing tests or tests are standalone: write a new test, but prefer
rstest #[case] format if 3+ similar tests already exist for the same function
d. NEVER duplicate test structure that already exists — extend, don't clone
2. Write the failing test in the form determined by step 1
- Test name: descriptive of the behavior, NOT the review
- Place in the appropriate test module for the file
3. Run test, confirm it FAILS: `cargo test <test_name> -- --nocapture`
4. Fix the production code — minimal change to pass the test
5. Run full suite: `cargo test`
6. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`
Performance:
1. Establish baseline: run relevant benchmark or add one if none exists
- Use `/bench-compare` if Criterion benchmarks cover this path
2. Implement the optimization
3. Re-benchmark and compare against baseline
4. Verify no regressions: `cargo test`
5. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`
Safety (unsafe code):
1. Write a test exercising the unsafe path with edge-case inputs
2. Fix the safety issue (bounds checks, invariant enforcement, etc.)
3. Add or update `// SAFETY:` comment documenting invariants
4. Run tests: `cargo test`
5. If Miri-compatible: `cargo +nightly miri test <test_name>`
6. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`
Documentation:
1. Read the code the docs describe — understand actual behavior
2. Write or update documentation to match reality
3. Check AGENTS.md consistency table — if touched file is listed, update corresponding docs
4. Verify doc tests compile: `cargo test --doc`
5. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`
Design / Complexity:
1. Read surrounding code to understand existing patterns
2. Refactor to address the finding while preserving behavior
3. Run full test suite to confirm no regressions: `cargo test`
4. Verify clean: `cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings`
bd create| Severity | bd priority | |----------|-------------| | CRITICAL | 1 | | SHOULD FIX | 2 | | CONSIDER | 3 | | NIT | 4 |
bd create --title="Fix off-by-one in window boundary check" --type=bug --priority=1
Then update the description with the full self-contained template using bd update <id> --description="...".
Determine which tasks can run in parallel vs. must run sequentially.
For each task, list every file it will read or write:
| Task | Writes | Reads | |------|--------|-------| | F1 | crates/scanner-engine/src/engine/core.rs | crates/scanner-engine/src/engine/mod.rs | | F2 | crates/scanner-engine/src/engine/scratch.rs | - | | F3 | crates/scanner-engine/src/engine/core.rs | crates/scanner-engine/src/api.rs |
Two tasks conflict if they write to the same file. Read-read and read-write of different files are fine.
Within each severity wave, group non-conflicting tasks:
Wave 1 (CRITICAL):
Group A (parallel): F1, F4 — no file overlap
Group B (sequential after A): F3 — conflicts with F1 on core.rs
Wave 2 (SHOULD FIX):
Group C (parallel): F5, F6, F7 — no file overlap
bd dep add <F3-id> <F1-id> # F3 depends on F1 (same file)
Execute findings wave by wave: CRITICAL → SHOULD FIX → CONSIDER. NIT and INFO findings are skipped by default (see Phase 1 severity filter).
Dispatch parallel agents for each non-conflicting group using the Task tool with subagent_type=general-purpose. Each agent gets the full self-contained task description from Phase 2.
Agent prompt structure:
You are fixing a code review finding. Follow the resolution steps exactly.
{full task description from Phase 2}
IMPORTANT:
- Follow the resolution steps in order
- For bugs: BEFORE writing any test, read the existing test module and apply
/test-consolidate principles:
* If the function already has rstest parameterized tests, add a #[case] — do NOT
create a new standalone test function
* If the function has proptest coverage, tighten the property or add a targeted
prop_assert — do NOT duplicate with a unit test
* If similar tests exist as a table-driven loop, add your case to the table
* Only create a new standalone #[test] when no consolidation opportunity exists
* NEVER create a test that duplicates coverage already provided by existing tests
- For bugs: write the failing test BEFORE fixing code
- Run all verification commands listed in acceptance criteria
- Report back: what you changed, test results, any issues encountered
Collect results from all agents in the group.
Sequential groups: After a parallel group completes, dispatch the next group that depended on it.
Close completed tasks: bd close <id> for each successfully resolved finding.
Before moving to the next wave, run:
cargo fmt --all && cargo check && cargo clippy --all-targets --all-features -- -D warnings
cargo test
If anything fails, fix it before proceeding. Do not let failures from Wave 1 propagate into Wave 2.
--plan-only ModeIf invoked with --plan-only, stop after Phase 3. Present the task list, dependency graph, and execution plan without dispatching agents. The user can then:
--plan-onlyAfter all waves pass the quality gate, dispatch a separate /doc-verify agent on every
source file modified during execution. This catches documentation drift introduced by the fixes.
Gather the list of all .rs files that were modified across all waves. Use git diff --name-only
(unstaged) to identify them. Filter to .rs files only — skip test-only files and Cargo.toml.
Launch a fresh Task agent with subagent_type="general-purpose". The agent prompt must:
/doc-verify Phase 2 verification protocol:
You are a documentation verifier. You have zero prior context about why these files
were changed — verify only what the documentation says against what the code does.
Files to verify:
{list of modified .rs files}
Follow the /doc-verify Phase 2 protocol exactly:
- Step A: Extract all testable claims from doc comments
- Step B: Verify code-level claims against actual implementation
- Step C: Identify external claims
- Step D: Verify external claims (skip with --code-only if time-constrained)
- Step E: Produce findings report
Output the standard doc-verify report format with BLOCK/WARN/INFO findings.
--skip-doc-verify flag is set, skip this phase entirely.After all waves complete, present:
## Review Findings Execution Summary
**Source**: {source description}
**Total findings**: N
**Executed**: X resolved, Y skipped, Z failed
### Results
| # | Finding | Severity | Status | Task ID | Notes |
|---|---------|----------|--------|---------|-------|
| F1 | Off-by-one in window check | CRITICAL | Resolved | beads-xxx | TDD: test added + fix |
| F2 | Missing capacity hint | SHOULD FIX | Resolved | beads-yyy | 12% alloc reduction |
| F3 | Unclear doc comment | CONSIDER | Resolved | beads-zzz | Updated doc |
| F4 | Rename variable | NIT | Skipped | - | User opted out |
### Verification
- All tests passing: yes/no
- Clippy clean: yes/no
- New tests added: N
- Tests consolidated (extended existing rstest/proptest): N
- Files modified: [list]
### Doc Verification (Phase 5)
- Files verified: N
- Claims checked: N
- BLOCKs: N (list if any)
- WARNs: N
- Verdict: PASS / PASS WITH WARNINGS / FAIL
| Anti-Pattern | Why It's Wrong | Do This Instead |
|--------------|----------------|-----------------|
| Fixing a bug without a failing test first | You don't know if the fix works or if the bug was real | TDD: failing test → fix → green |
| Putting multiple findings in one task | Agents lose focus, partial completion is messy | One bd create per finding |
| Skipping documentation findings | Doc debt compounds silently | Documentation findings are never optional |
| Dispatching agents that write to the same file | Merge conflicts and lost work | Concurrency analysis in Phase 3 |
| Task description says "see review for details" | Fresh agent can't work it — context is lost | Self-contained descriptions with full context |
| Running all severities in parallel | A CRITICAL might invalidate a NIT | Execute in priority waves |
| Skipping the quality gate between waves | Broken state cascades into subsequent fixes | cargo test + clippy between every wave |
| Writing a new standalone test when rstest cases exist | Test proliferation, maintenance burden, inconsistent patterns | Add a #[case] to the existing rstest instead |
| Ignoring existing proptest coverage for a bug | Duplicates coverage, misses the property violation | Tighten the property assertion or add a targeted prop_assert |
| Skipping doc-verify after fixes | Fixes can invalidate doc comments, creating silent drift | Always run Phase 5 doc-verify on modified files |
| Flag | Effect |
|------|--------|
| --plan-only | Stop after Phase 3 — show tasks and execution plan, don't execute |
| --wave=N | Execute only wave N (1=CRITICAL, 2=SHOULD FIX, 3=CONSIDER) |
| --include=nit | Include NIT findings (skipped by default) |
| --include=nit,info | Include both NIT and INFO findings |
| --skip=consider | Also skip CONSIDER findings (only CRITICAL and SHOULD FIX) |
| --dry-run | Parse and normalize findings without creating beads tasks |
| --skip-doc-verify | Skip Phase 5 doc verification |
/review-dispatch — produces the findings this skill consumes/pr-comment-response — TDD verify-first pattern for individual PR comments/bench-compare — baseline/comparison benchmarks for performance findings/test-strategy — choose appropriate test type (unit, property, fuzz, Kani)/test-consolidate — referenced in bug resolution steps to avoid test duplication/doc-verify — runs as Phase 5 to catch documentation drift from fixesdevelopment
Deep first-principles code explanation that builds real understanding through phased walkthroughs with diagrams. Covers algorithms, data structures, memory layout, concurrency patterns, and performance tricks — especially for systems code in Rust. Use whenever the user asks to explain, walk through, break down, deep dive into, or understand code. Trigger on "how does this work", "what's happening here", "teach me about this", "why is it done this way", or when the user references a file with @ and wants to understand it. Proactively use when examining code involving lock-free algorithms, atomics/CAS, memory ordering,
development
Use when creating implementation-ready beads tasks that need testing strategy, optimal implementation approach, and documentation requirements baked in — composes /create-task with parallel enrichment agents that analyze the codebase and produce concrete test specifications, algorithm/data-structure guidance, and doc quality standards so implementing agents don't need to re-research
development
--- name: autoresearch description: Autonomous Goal-directed Iteration. Apply Karpathy's autoresearch principles to ANY task. Loops autonomously — modify, verify, keep/discard, repeat. Supports bounded iteration via Iterations: N inline config. version: 1.9.11 --- # Claude Autoresearch — Autonomous Goal-directed Iteration Inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch). Applies constraint-driven autonomous iteration to ANY work — not just ML research. **Core id
development
Use when implementing a new feature and assessing coverage gaps, during periodic test hygiene, when test suites feel bloated, or before merging code that changes coordination or hot paths. Two-phase assess-then-improve testing pipeline.