skills/codebase-audit/SKILL.md
--- name: codebase-audit description: Long-running comprehensive adversarial audit of an entire codebase. Orchestrates partition-by-partition adversarial reviews by delegating to /adversarial-reviewer --codebase <partition>, then synthesizes a cross-partition risk register and a written remediation report. Unlike /adversarial-reviewer --codebase (which samples strategically from the whole repo in one pass), this skill methodically covers the full scope over a long session, is resumable after cra
Long-running comprehensive adversarial audit. Orchestrates per-partition reviews through /adversarial-reviewer --codebase, then synthesizes findings into a written report. Designed for methodical full coverage, not strategic sampling — use when the user wants the whole thing reviewed and is willing to spend the tokens and time.
| Skill | What it does | Coverage | Output | When |
|-------|-------------|----------|--------|------|
| /code-review | Per-phase quality gate | Changed files | Verdict + notes | During implementation |
| /adversarial-reviewer (diff) | Hostile pre-merge review | Changed files | BLOCK/CONCERNS/CLEAN | Before a merge |
| /adversarial-reviewer --codebase | Adversarial sample of repo | 5-10 files per persona | HIGH/MEDIUM/LOW-RISK + most-concerning area | Quick whole-repo sanity check |
| /codebase-audit | Methodical full audit | Every partition | Written report + risk register | Onboarding / due diligence / tech-debt review |
| /code-quality-audit | Metrics: coverage, complexity, cycles, mutation | Whole repo (metric-based) | Numeric gate | Quantitative health |
Pair codebase-audit with code-quality-audit for a complete picture: this skill gives you the qualitative adversarial read; code-quality-audit gives you the quantitative one.
/codebase-audit # Audit CWD
/codebase-audit src/ # Scope to a subtree
/codebase-audit --resume # Pick up from last checkpoint
/codebase-audit --only api # Re-run a specific partition
/codebase-audit --force # Ignore existing partition reports (re-review everything)
Expect 200K-500K tokens and 30-60 minutes of wall clock for a typical ~500-file repo. This is a deliberate investment. Confirm with the user before kicking off if the cost would surprise them.
Determine the scope root (path argument, or CWD). Build the map:
- `git ls-files | wc -l`: total file count (or `find . -type f` if not a git repo).
- `git ls-files | head -500` plus `tree -L 3`: structural overview.
- `README.md`, `CLAUDE.md`, `package.json` / `pyproject.toml` / `Cargo.toml` / `go.mod`, top-level `*.config.*`.
- `git log --pretty=format: --name-only --since=180.days | sort | uniq -c | sort -rn | head -50`: recent churn.
- Monorepo markers (`packages/`, `apps/`, workspace configs): these are natural partition boundaries.

Write a one-page codebase map summarizing what you learned. This goes into the audit directory so future steps and the user can reference it.
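The churn signal in the list above can also be tallied in a few lines of Python. This is a minimal sketch, not part of the skill: `churn_counts` is a hypothetical helper, and it assumes you have already captured the output of `git log --pretty=format: --name-only --since=180.days` as a string.

```python
from collections import Counter


def churn_counts(git_log_output: str, top: int = 50) -> list[tuple[str, int]]:
    """Count how often each path appears in `git log --name-only` output.

    Hypothetical helper: blank separator lines between commits are ignored.
    """
    paths = (line.strip() for line in git_log_output.splitlines())
    counts = Counter(p for p in paths if p)
    # most_common returns (path, count) pairs, highest count first.
    return counts.most_common(top)


if __name__ == "__main__":
    sample = "src/api/auth.py\nsrc/api/routes.py\n\nsrc/api/auth.py\nREADME.md\n"
    for path, count in churn_counts(sample, top=3):
        print(count, path)
```

Dropping the blank lines matters because `--pretty=format:` leaves an empty line between commits, which would otherwise show up as the most frequent "path".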
Create the audit workspace:
docs/audits/YYYY-MM-DD-<repo-name>/
├── map.md # from Step 1
├── plan.md # from Step 2 (this step)
├── partitions/ # per-partition reports (populated in Step 3)
└── REPORT.md # final synthesis (produced in Step 5)
Propose a partition plan in plan.md. Partition strategy, in priority order:
1. Workspaces (`packages/*`, `apps/*`): each workspace is one partition.
2. Top-level source directories (`src/api`, `src/domain`, `src/ui`, etc.): each is one partition.

Each partition should be 50-150 files. Too small = wasted orchestration overhead. Too large = exceeds `/adversarial-reviewer --codebase`'s sampling sweet spot (which expects ~300-500 files max).
Record each partition with:
Then stop and show the plan to the user. This is the token-commitment checkpoint. Ask:
"Here's the partition plan: N partitions, estimated cost ~XYZ tokens. Approve, modify, or narrow scope?"
Do not proceed to Step 3 without explicit user approval. A user who sees 15 partitions and wanted 3 should catch it here, not after the audit is half-done.
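The partition sizing guidance above can be checked mechanically before the plan goes to the user. The sketch below is illustrative only: `propose_partitions` and `size_warnings` are hypothetical names, grouping by the first two path components is an assumption about the repo layout, and the 50-150 band is the one stated above.

```python
from collections import defaultdict
from pathlib import PurePosixPath

MIN_FILES, MAX_FILES = 50, 150  # size band from the partition guidance


def propose_partitions(files: list[str], depth: int = 2) -> dict[str, list[str]]:
    """Group paths by their first `depth` directory components (assumed layout)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for f in files:
        parts = PurePosixPath(f).parts
        # Files that sit above the grouping depth go to a catch-all partition.
        key = "/".join(parts[:depth]) if len(parts) > depth else "(root)"
        groups[key].append(f)
    return dict(groups)


def size_warnings(partitions: dict[str, list[str]]) -> dict[str, str]:
    """Flag partitions that fall outside the recommended size band."""
    warnings = {}
    for name, members in partitions.items():
        if len(members) < MIN_FILES:
            warnings[name] = "too small: consider merging"
        elif len(members) > MAX_FILES:
            warnings[name] = "too large: consider splitting"
    return warnings


if __name__ == "__main__":
    files = [f"src/api/f{i}.py" for i in range(60)] + ["README.md"]
    plan = propose_partitions(files)
    print({name: len(members) for name, members in plan.items()})
    print(size_warnings(plan))
```

The warnings are prompts for the human reviewing plan.md, not automatic merges or splits; the user approval gate still decides the final plan.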
Use TaskCreate to register one task per partition. This makes progress visible and resumable.
For each partition (honor TaskList ordering):
Check for existing report at docs/audits/.../partitions/<partition-name>.md. If it exists and --force was not passed, skip this partition (resume behavior) — mark task complete, move on.
Mark task in_progress via TaskUpdate.
Invoke the adversarial-reviewer skill on the partition path:
/adversarial-reviewer --codebase <partition-path>
Capture the full output (three persona sections + synthesis + verdict + most-concerning area).
Write the partition report to docs/audits/.../partitions/<partition-name>.md. The file should contain:
Mark task completed.
If a partition invocation fails (timeout, tool error), mark the task failed, write an error stub to the partition file, and continue with the next partition. Don't let one partition kill the whole audit.
Do not batch the reviews in parallel. Sequential is safer here: adversarial-reviewer itself spawns three persona subagents, so parallelism is already happening at that level. Stacking parallelism on top can exhaust rate limits and cause contention on subagent dispatch.
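The per-partition loop described above (skip existing reports, run sequentially, isolate failures) can be sketched as follows. This is a hedged illustration, not the skill's implementation: `run_partitions` is a hypothetical function, and the `review` callable stands in for whatever actually invokes `/adversarial-reviewer --codebase <partition-path>`.

```python
from pathlib import Path
from typing import Callable


def run_partitions(
    partitions: dict[str, str],
    out_dir: Path,
    review: Callable[[str], str],
    force: bool = False,
) -> dict[str, str]:
    """Review partitions one at a time and return a status per partition.

    `review` is a stand-in (assumption) for the delegated reviewer call.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    status: dict[str, str] = {}
    for name, path in partitions.items():  # strictly sequential, no fan-out
        report = out_dir / f"{name}.md"
        if report.exists() and not force:
            status[name] = "skipped"  # resume behavior: report already on disk
            continue
        try:
            report.write_text(review(path), encoding="utf-8")
            status[name] = "completed"
        except Exception as exc:
            # Error stub so the failure is visible; the audit keeps going.
            report.write_text(f"# {name}\n\nFAILED: {exc}\n", encoding="utf-8")
            status[name] = "failed"
    return status
```

One consequence of this sketch: an error stub counts as an existing report, so a failed partition is skipped on a plain re-run and needs `--only` or `--force` to be retried.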
After all partitions are complete, read every partitions/*.md file and look for patterns that no single partition could see:
- Systemic issues: the same finding appearing in several partitions, such as api, webhook, and cli. Systemic = highest severity.
- Use `grep` / import analysis to find these.

Write a synthesis.md with the findings grouped by the patterns above.
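Spotting findings that recur across partitions is essentially an inversion of the per-partition data. The sketch below assumes each partition report has already been reduced to a set of normalized finding labels; that normalization step, and the name `systemic_findings`, are hypothetical.

```python
from collections import defaultdict


def systemic_findings(
    findings_by_partition: dict[str, set[str]],
    min_partitions: int = 2,
) -> dict[str, list[str]]:
    """Map each recurring finding to the sorted list of partitions it appears in."""
    seen: dict[str, set[str]] = defaultdict(set)
    for partition, findings in findings_by_partition.items():
        for finding in findings:
            seen[finding].add(partition)
    recurring = {f: sorted(p) for f, p in seen.items() if len(p) >= min_partitions}
    # Widest spread first, then alphabetical for a stable listing.
    return dict(sorted(recurring.items(), key=lambda kv: (-len(kv[1]), kv[0])))


if __name__ == "__main__":
    data = {
        "api": {"unvalidated input", "missing timeouts"},
        "webhook": {"unvalidated input"},
        "cli": {"unvalidated input", "missing timeouts"},
    }
    for finding, where in systemic_findings(data).items():
        print(finding, "->", ", ".join(where))
```

In practice the hard part is the labeling: two personas may describe the same defect in different words, so expect to merge labels by hand before running anything like this.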
Produce REPORT.md. This is the deliverable — assume the reader won't read the per-partition files. Structure:
# Codebase Audit Report: <repo-name>
**Date:** YYYY-MM-DD
**Scope:** <path> (N files across K partitions)
**Overall health:** HEALTHY / CONCERNING / AT-RISK / DEGRADED
## Executive Summary
3-5 sentences. Single most important thing to fix. Single biggest risk. Overall shape.
## Risk Register
Prioritized table. One row per distinct finding:
| # | Severity | Area | Finding | Detected in | Recommended action |
|---|----------|------|---------|-------------|-------------------|
| 1 | CRITICAL | auth | SQL injection in session middleware | api, admin | Parameterize queries |
| 2 | HIGH | data | No migration rollback procedure | data, worker | Add reversible migrations |
...
Order by severity, then by breadth (systemic issues above single-partition).
## Systemic Issues
Findings that appeared in multiple partitions, with the list of partitions they appeared in.
## Architectural Themes
Cross-cutting concerns reviewed at the whole-repo level.
## Per-partition Index
| Partition | Verdict | Critical | Warnings | Report |
|-----------|---------|----------|----------|--------|
| api | HIGH-RISK | 3 | 7 | partitions/api.md |
...
## Remediation Roadmap
Prioritized, with rough effort estimates:
1. **Immediate (this week):** items that block safe operation
2. **Short-term (this quarter):** items that degrade the codebase if ignored
3. **Medium-term (this year):** tech-debt paydown
4. **Defer or accept:** items that aren't worth fixing given current priorities
## Methodology Note
One paragraph explaining how the audit was conducted (partitions, `/adversarial-reviewer --codebase` per partition, synthesis pass) so the reader can judge the evidence.
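The Risk Register ordering rule in the template (severity first, then breadth) maps onto a two-part sort key. A minimal sketch, with assumptions flagged: the `Finding` dataclass is hypothetical, and only CRITICAL and HIGH appear in the template, so the MEDIUM and LOW ranks are assumed.

```python
from dataclasses import dataclass

# CRITICAL and HIGH come from the template; MEDIUM and LOW are assumed.
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


@dataclass
class Finding:
    severity: str
    area: str
    summary: str
    detected_in: tuple[str, ...]


def order_register(findings: list[Finding]) -> list[Finding]:
    """Sort by severity, then by breadth (more partitions ranks higher)."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f.severity], -len(f.detected_in)),
    )


if __name__ == "__main__":
    register = order_register([
        Finding("HIGH", "data", "No migration rollback procedure", ("data", "worker")),
        Finding("CRITICAL", "auth", "SQL injection in session middleware", ("api", "admin")),
    ])
    for number, f in enumerate(register, start=1):
        print(f"| {number} | {f.severity} | {f.area} | {f.summary} | {', '.join(f.detected_in)} |")
```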
Tell the user:
- The report location: docs/audits/.../REPORT.md
- The per-partition reports: partitions/
- /code-quality-audit if they want the quantitative complement

| Rating | Criteria |
|--------|----------|
| HEALTHY | No critical findings, few warnings, tests present, no systemic issues |
| CONCERNING | No criticals but clear warning clusters, inconsistencies, or thin tests |
| AT-RISK | 1-2 CRITICALs, or cross-partition systemic issues, or significant test gaps |
| DEGRADED | Multiple CRITICALs, clear decay patterns, broken architectural boundaries |
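The rating criteria can be approximated as a first-pass decision function, with a human still making the final call. This is a rough sketch under loud assumptions: `overall_health` is a hypothetical name, it treats more than two CRITICALs as "multiple", it uses an arbitrary threshold for "few warnings", and it ignores the qualitative criteria (test presence, decay patterns, architectural boundaries) entirely.

```python
def overall_health(
    criticals: int,
    systemic_issues: int,
    warnings: int,
    few_warnings: int = 5,  # assumed cutoff; the criteria do not give a number
) -> str:
    """First-pass rating from counts only; qualitative criteria are not modeled."""
    if criticals > 2:  # assumption: "multiple" means more than the 1-2 of AT-RISK
        return "DEGRADED"
    if criticals >= 1 or systemic_issues > 0:
        return "AT-RISK"
    if warnings > few_warnings:
        return "CONCERNING"
    return "HEALTHY"


if __name__ == "__main__":
    print(overall_health(criticals=1, systemic_issues=0, warnings=4))
```

Treat the result as a starting point for the Executive Summary: a repo with no criticals but no tests at all, for instance, would need to be downgraded by hand.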
The audit is resumable by default. A re-run of /codebase-audit (or explicit --resume) will:
- Find an existing `docs/audits/.../plan.md` → skip Step 1 and Step 2.
- For each partition, check `partitions/<name>.md`: skip if it exists.

If the user passes --force, ignore existing partition reports and re-review everything. This is for when the codebase has changed significantly since the last audit.
If the user passes --only <partition-name>, re-review just that partition (useful after fixes, or if a partition's first review was flaky).
The audit directory name includes the date, so re-runs the next day create a fresh audit rather than muddying yesterday's. Users who specifically want to continue yesterday's audit should pass --resume <audit-dir-name> (e.g., --resume 2026-04-23-myapp).
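The flag behavior above reduces to a small selection function plus a date-stamped directory name. A minimal sketch, assuming the names `audit_dir_name` and `partitions_to_run` (both hypothetical) and assuming `--only` takes precedence when combined with other flags, which the text does not specify.

```python
from datetime import date
from typing import Optional


def audit_dir_name(repo_name: str, today: date) -> str:
    """Build the YYYY-MM-DD-<repo-name> directory name."""
    return f"{today.isoformat()}-{repo_name}"


def partitions_to_run(
    planned: list[str],
    existing_reports: set[str],
    force: bool = False,
    only: Optional[str] = None,
) -> list[str]:
    """Decide which partitions need a review on this run."""
    if only is not None:
        if only not in planned:
            raise ValueError(f"unknown partition: {only}")
        return [only]  # assumed: --only wins over --force and resume
    if force:
        return list(planned)  # re-review everything
    # Default resume behavior: skip partitions that already have a report.
    return [p for p in planned if p not in existing_reports]


if __name__ == "__main__":
    print(audit_dir_name("myapp", date(2026, 4, 23)))
    print(partitions_to_run(["api", "worker", "ui"], {"api"}))
```

Because the directory name is derived from the run date, a plain re-run on a later day sees no existing reports; continuing an older audit requires naming its directory explicitly, as described above.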
| Anti-pattern | Why it's wrong |
|-------------|----------------|
| Kicking off a 20-partition audit without user approval | Hundreds of thousands of tokens is not a "background task." The user must opt in after seeing the plan. |
| Running partitions in parallel | adversarial-reviewer already spawns three subagents per call. Layering parallelism causes rate-limit pressure and concurrency bugs in subagent dispatch. |
| Skipping the synthesis step and shipping the per-partition files as the report | The systemic findings are the whole point. A stack of per-partition reports is evidence, not a deliverable. |
| Reusing persona logic inline instead of delegating | Two copies of the persona briefs drift. Always invoke /adversarial-reviewer --codebase <partition>. |
| Dumping the map into the partition-level briefs | /adversarial-reviewer --codebase builds its own map for its scope. Don't pre-feed it; keep partitions clean. |
| Treating the final report as a merge decision | This is a health assessment for a codebase, not a gate on a change. Use HEALTHY/CONCERNING/AT-RISK/DEGRADED, not BLOCK/CLEAN. |
| Letting one failed partition abort the audit | Mark the partition failed, move on. The report's methodology note should disclose any skipped partitions. |
- adversarial-reviewer (one call per partition, --codebase mode)
- code-quality-audit: qualitative (this skill) + quantitative (that skill) gives you the complete picture
- codebase-research: that skill explains how the code works; this skill assesses what's wrong with it
- code-review or security-review: per-change quality gates still belong in their own skills

adversarial-reviewer --codebase does strategic sampling: 5-10 files per persona, one parallel pass, done in minutes. That's the right shape for a quick "is this repo okay?" answer.
Comprehensive audit is a different shape: multi-stage (scope → partition → per-partition → synthesis → report), long-running (tens of minutes), resumable (checkpoints on disk), and produces a written document rather than a verdict paragraph. Stuffing this into adversarial-reviewer would have bloated that skill past usefulness and confused its single-unit-of-review purpose.
This skill is an orchestrator — it does nothing the delegated skill can't do; it arranges the work so that full coverage becomes feasible, which no single adversarial-reviewer invocation can achieve.