skills/mine.audit/SKILL.md
Use when the user says: "audit the codebase", "find tech debt", or "health check". Systematic codebase health audit — surfaces aging code, brittle designs, missing tests, and accumulated debt, ranked by impact.
Install this skill globally with one command: `npx skillsauth add NodeJSmith/Claudefiles mine.audit`. Works with Claude Code, Cursor, and Windsurf.
Systematic assessment of a codebase's health. Finds the problems worth fixing — not everything that's imperfect, but the things that are actively hurting: code that's aged poorly, designs that have become brittle, abstractions that leak, areas with no test safety net.
$ARGUMENTS — optional scope narrowing. Can be:
- `/mine.audit src/services/`
- `/mine.audit "test coverage"` or `/mine.audit "error handling"`
- `/mine.audit "what's the riskiest part of this codebase?"`

Read the code and reason about it directly. Subagents should use Read, Grep, and Glob to examine files. Do NOT write or execute Python/shell scripts to perform analysis — no AST parsers, no custom complexity calculators, no throwaway scripts to count imports or measure coupling. You can read code and identify these patterns yourself.
The only commands to execute during analysis are:
- `git log` / `git diff` / `git shortlog` — for churn, age, and history data
- `pytest --cov` or equivalent — for actual test coverage numbers
- Linters (`ruff`, `eslint`) — for existing lint output
- `wc -l` or similar — for quick file size counts when scanning many files
- `agnix .` — if auditing a Claudefiles-style repo (`agents/`, `skills/`, `commands/`)

Everything else — identifying smells, mapping dependencies, assessing coupling, spotting duplication — comes from reading the files.
Identify the top-level modules and determine review units (e.g. `src/api/`, `src/services/`, `src/models/`).

Launch parallel Explore subagents — one per review unit identified in Phase 1. Each subagent assesses ALL concerns for its directory:
Each subagent returns a structured summary for its directory. This is faster and produces better results than concern-based slicing because each subagent sees the full context of its directory.
Launch a single Explore subagent that reads ALL per-directory findings plus the full file manifest. This subagent looks for problems that only emerge at the boundary between directories:
Don't just dump raw data. Synthesize the per-directory and cross-scope results into a prioritized assessment.
Rank findings by impact — how much this problem is likely to cause bugs, slow down development, or resist change:
| Signal | Why it matters |
|--------|---------------|
| High churn + high complexity | Changed often but hard to change safely — the most dangerous combination |
| High fan-in + no tests | Many things depend on it but there's no safety net |
| Large + old + still active | Written long ago, never cleaned up, still critical path |
| Inconsistent patterns | Developers can't build intuition — each area works differently |
| Missing error handling on boundaries | Silent failures, data corruption, hard-to-debug production issues |
| Tight coupling clusters | Can't change A without breaking B, C, and D |
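To make the signal combinations concrete, here is a small illustrative sketch of how two of the rows might combine into a ranking. It is not part of the skill, which asks for judgment from reading the code rather than a scoring script. The `FileSignals` type, the thresholds, the use of line count as a stand-in for complexity, and the sample numbers are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class FileSignals:
    """Hypothetical per-file facts an auditor has already gathered."""
    path: str
    changes_last_quarter: int  # e.g. read off `git log`
    line_count: int            # e.g. from `wc -l`; a crude proxy for complexity
    coverage_pct: float        # e.g. from `pytest --cov`
    fan_in: int                # number of importers, found by reading the code


def matched_signals(f: FileSignals) -> list:
    """Return the table rows this file matches. Thresholds are illustrative only."""
    hits = []
    if f.changes_last_quarter >= 20 and f.line_count >= 400:
        hits.append("High churn + high complexity")
    if f.fan_in >= 10 and f.coverage_pct < 20:
        hits.append("High fan-in + no tests")
    return hits


# Made-up sample data, not measurements from any real repository.
files = [
    FileSignals("src/utils/strings.py", 2, 80, 90.0, 3),
    FileSignals("src/api/routes.py", 9, 680, 10.0, 23),
    FileSignals("src/services/payment.py", 47, 520, 12.0, 12),
]

# Files matching more signals sort first.
ranked = sorted(files, key=lambda f: len(matched_signals(f)), reverse=True)
for f in ranked:
    print(f.path, matched_signals(f))
```

A file that matches several rows at once is usually the one to lead the report with, which is why the sketch sorts by the number of matched signals rather than by any single metric.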
Apply the Validity Assessment protocol from ${CLAUDE_HOME:-~/.claude}/skills/mine.challenge/findings-protocol.md: findings are valid by default; flagging one as likely invalid requires a concrete evidence trail (claim vs. what the code actually does). Read the relevant source files directly to verify claims — do not rely solely on the per-directory summaries.
Likely-invalid findings are excluded from the narrative summary and placed in the ## Likely Invalid section of the findings file per ${CLAUDE_HOME:-~/.claude}/skills/mine.challenge/findings-protocol.md. Always include the **Likely-invalid:** N count in the findings file header and in the narrative summary.
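The valid-by-default rule can be sketched as a simple partition: a finding moves to the likely-invalid bucket only when it carries a concrete evidence trail. This is a minimal illustration, not the protocol itself, which lives in `findings-protocol.md`. The `CandidateFinding` type and its `counter_evidence` field are hypothetical names, and `validate_request()` is an invented function used only as sample evidence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateFinding:
    title: str
    # Claim vs. what the code actually does; None means no evidence against it.
    counter_evidence: Optional[str] = None


def split_by_validity(findings):
    """Findings are valid by default; only those with an evidence trail are set aside."""
    valid = [f for f in findings if f.counter_evidence is None]
    likely_invalid = [f for f in findings if f.counter_evidence is not None]
    return valid, likely_invalid


findings = [
    CandidateFinding("payment.py swallows exceptions"),
    CandidateFinding(
        "routes.py has no validation",
        # validate_request() is a made-up name for this example.
        counter_evidence="claim: no validation; code: validate_request() runs in each handler",
    ),
]

valid, likely_invalid = split_by_validity(findings)
header_line = f"**Likely-invalid:** {len(likely_invalid)}"
print(header_line)
```

Only the `valid` list would feed the narrative summary; the count in `header_line` appears in both the findings file header and the narrative.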
Before entering the findings flow, present the findings as a narrative organized by severity so the user can orient:
## Codebase Audit: [project name]
**Likely-invalid:** N
### Critical (high impact, fix soon)
1. **src/services/payment.py** (520 lines, 47 changes in 3 months, 12% test coverage)
The most frequently changed file in the codebase has almost no test coverage. It handles payment processing and has 3 broad `except Exception` blocks that silently swallow errors.
2. **Circular dependency: models ↔ services ↔ utils**
These three directories have 14 circular import paths. Adding anything to models/ requires understanding how services/ and utils/ will react. This is the main reason features take longer than expected.
### Concerning (accumulating risk)
3. **src/api/routes.py** (680 lines, mixes routing + business logic + validation)
God file that 23 other modules import from. Every API change requires modifying this single file. Should be split by domain.
4. **No tests for src/integrations/** (4 files, 1,200 lines)
External API integrations with zero test coverage. These modules do have error handling but it's untested — if an API changes behavior, you'll find out in production.
### Worth noting (low urgency)
5. **Inconsistent error handling** — src/api/ uses custom exceptions, src/services/ returns error tuples, src/utils/ raises ValueError for everything
6. **8 TODO/FIXME comments older than 6 months** — may be stale or forgotten
Run get-skill-tmpdir mine-audit and write <tmpdir>/audit-results.md using the findings file format:
# Audit Findings
**Target:** [project name or scope]
**Date:** [today's date]
**Format-version:** 3
**Likely-invalid:** <count>
## Finding 1: [concise title]
**Severity:** CRITICAL | **Type:** Test Gap | **Raised-by:** Audit Analysis (1/1)
**Resolution:** User-directed
**Problem:** [specific description with evidence — file names, line counts, churn data]
**Why-it-matters:** [concrete consequence — what breaks, what slows down]
**Recommendation:** Option A
**Options:**
- **A** *(recommended)*: Build the fix via `/mine.build`
- **B**: File as issue — track in GitHub for future work
- **C**: Skip — noted, no action this session
**Why A:** [one-sentence rationale specific to this finding]
Use finding types from this vocabulary: Test Gap | Structural | Coupling | Tech Debt | Pattern Drift
The (1/1) in Raised-by is the single-source convention for non-critic-panel callers — audit has one analyst, not a critic panel.
Follow ${CLAUDE_HOME:-~/.claude}/skills/mine.challenge/findings-protocol.md for the findings file format and status field definitions.
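As a rough illustration of the template above, the following sketch renders the header and one finding block from structured data. It mirrors only the fields shown in this document; `findings-protocol.md` remains the authority on the format. The `AuditFinding` type and `render_findings` function are hypothetical, the ISO date format is an assumption (the template only says "today's date"), and the sample finding text is invented.

```python
from dataclasses import dataclass
from datetime import date

# Finding-type vocabulary listed in this document.
FINDING_TYPES = {"Test Gap", "Structural", "Coupling", "Tech Debt", "Pattern Drift"}


@dataclass
class AuditFinding:
    title: str
    severity: str
    type: str
    problem: str
    why_it_matters: str
    why_a: str


def render_findings(target, findings, likely_invalid, today):
    """Render the findings file text, one line per template field."""
    lines = [
        "# Audit Findings",
        f"**Target:** {target}",
        f"**Date:** {today.isoformat()}",  # ISO format is an assumption
        "**Format-version:** 3",
        f"**Likely-invalid:** {likely_invalid}",
    ]
    for n, f in enumerate(findings, start=1):
        if f.type not in FINDING_TYPES:
            raise ValueError(f"unknown finding type: {f.type}")
        lines += [
            f"## Finding {n}: {f.title}",
            f"**Severity:** {f.severity} | **Type:** {f.type} | "
            "**Raised-by:** Audit Analysis (1/1)",
            "**Resolution:** User-directed",
            f"**Problem:** {f.problem}",
            f"**Why-it-matters:** {f.why_it_matters}",
            "**Recommendation:** Option A",
            "**Options:**",
            "- **A** *(recommended)*: Build the fix via `/mine.build`",
            "- **B**: File as issue — track in GitHub for future work",
            "- **C**: Skip — noted, no action this session",
            f"**Why A:** {f.why_a}",
        ]
    return "\n".join(lines) + "\n"


example = AuditFinding(
    title="No tests for src/integrations/",
    severity="CRITICAL",
    type="Test Gap",
    problem="4 files, 1,200 lines, zero test coverage.",
    why_it_matters="API behavior changes surface in production first.",
    why_a="Small, isolated modules make tests cheap to add.",
)
print(render_findings("example-project", [example], 0, date(2025, 1, 15)))
```

Rejecting an unknown type up front keeps the file inside the fixed vocabulary instead of letting ad hoc labels creep in.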
Audit findings use the User-directed model with explicit option letters (A/B/C). Present each finding one at a time via AskUserQuestion:
- **A** (or fix) — invoke `/mine.build` with the finding's description as the argument. For structural/architectural problems, `/mine.build` will assess complexity and route to direct implementation or the full caliper workflow.
- **B** — create a GitHub issue via `gh-issue create` for this finding
- **C** (or skip) — noted in session summary, no action

Multiple findings selected for build: if several findings are being addressed via `/mine.build`, suggest an order of attack — highest impact first, dependency-aware (e.g., fix the circular dependency before refactoring the modules caught in the cycle).
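One way to think about "highest impact first, dependency-aware" is as a topological sort with a priority tie-break. The sketch below uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The function name, the numeric impact scores, and the finding names are invented for illustration; the skill itself only asks for a suggested order, not a computed one.

```python
import heapq
from graphlib import TopologicalSorter


def order_of_attack(impact, blocked_by):
    """Order findings so prerequisites come first, preferring higher impact.

    impact: finding name -> score (higher means more urgent)
    blocked_by: finding name -> set of findings that must be fixed first
    """
    sorter = TopologicalSorter({name: blocked_by.get(name, set()) for name in impact})
    sorter.prepare()  # raises graphlib.CycleError if the prerequisites form a cycle

    order, ready = [], []
    while sorter.is_active():
        # Collect findings whose prerequisites are all done.
        for name in sorter.get_ready():
            heapq.heappush(ready, (-impact.get(name, 0), name))
        # Take the highest-impact unblocked finding next.
        _, name = heapq.heappop(ready)
        order.append(name)
        sorter.done(name)
    return order


# Hypothetical findings and scores.
impact = {"add-payment-tests": 10, "refactor-services": 9, "break-import-cycle": 8}
blocked_by = {"refactor-services": {"break-import-cycle"}}

print(order_of_attack(impact, blocked_by))
```

Here `refactor-services` scores higher than `break-import-cycle` but still comes after it, because the cycle has to be fixed before the modules caught in it can be refactored safely.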
After findings are resolved, offer to save the audit to the repo. Audits are backward-looking snapshots that feed into future design docs and refactors.
Recommended convention — date-stamped directory under design/audits/:
design/audits/
└── YYYY-MM-DD-topic-name/
├── audit.md Narrative summary + findings list
└── ...
Create design/audits/ if it doesn't exist. If the project already saves audits elsewhere, follow the existing convention.
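The recommended convention can be sketched as a small helper that builds the date-stamped path and writes `audit.md` into it. The function names and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) are assumptions for this example; a project with an existing convention should follow that instead.

```python
import re
from datetime import date
from pathlib import Path


def audit_dir(repo_root: Path, topic: str, today: date) -> Path:
    """Build design/audits/YYYY-MM-DD-topic-name/ under the repo root."""
    # Slug rule is an assumption: lowercase, runs of other characters become "-".
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return repo_root / "design" / "audits" / f"{today.isoformat()}-{slug}"


def save_audit(repo_root: Path, topic: str, narrative: str, today: date) -> Path:
    """Create the audit directory if needed and write the narrative summary."""
    target = audit_dir(repo_root, topic, today)
    target.mkdir(parents=True, exist_ok=True)  # also creates design/audits/
    out = target / "audit.md"
    out.write_text(narrative, encoding="utf-8")
    return out


print(audit_dir(Path("."), "Payment Services", date(2025, 1, 15)))
```

Passing `today` in explicitly, rather than calling `date.today()` inside the helper, keeps the path predictable when the same audit is saved or tested later.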