skills/nih-audit/SKILL.md
Scan for custom code that duplicates well-supported libraries, then recommend migrations with effort estimates. Detects hand-rolled utilities, retry logic, validation, date handling, and DIY parsers. Use when the user mentions reinventing the wheel, asks "is there a library/crate for this", wants a build vs buy audit, says "what are we maintaining that we shouldn't be", or "should we just use lodash for this". Do NOT use for code-quality review (/age) or dead-code removal (/simplify or /ghostbuster).
npx skillsauth add paulnsorensen/dotfiles nih-auditInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Find code reinventing the wheel. Recommend libraries. Score with evidence.
Scope: $ARGUMENTS (or repo root if blank)
Goal: Know what's already installed so we never recommend existing deps.
Glob: **/package.json
Glob: **/Cargo.toml
Glob: **/pyproject.toml
Glob: **/go.mod
Glob: **/Gemfile
Glob: **/requirements.txt
Glob: **/composer.json
Glob: **/build.gradle
Glob: **/pom.xml
Glob: **/mix.exs
Filter out manifests inside node_modules/, vendor/, .git/, build/.
For each manifest, extract dependency names into a flat set:
| Manifest | Extract command |
|----------|----------------|
| package.json | jq -r '(.dependencies + .devDependencies) // {} \| keys[]' |
| Cargo.toml | yq -r '.dependencies \| keys[]' or parse [dependencies] section |
| pyproject.toml | yq -r '.project.dependencies[]' or [tool.poetry.dependencies] |
| go.mod | grep '^require' + parse module paths |
| requirements.txt | line-by-line package names |
Store as depManifest:
{
"workspaces": {
".": { "ecosystem": "node", "deps": ["express", "zod", "uuid"] }
},
"primaryLanguages": ["typescript"]
}
Infer from manifest types + file extensions in scope. This determines which ast-grep patterns the scanner will run.
Tool budget: ~5 calls.
Goal: Find code that smells like reinvented wheels.
Spawn the nih-scanner agent:
Agent(
subagent_type="nih-scanner",
model="sonnet",
prompt="Scan for NIH patterns.
Languages: <detected languages>
Scope: <$ARGUMENTS or repo root>
depManifest: <JSON>
Slug: <slug>",
run_in_background=false
)
The scanner returns the full JSON candidate list inline in its response, along with a summary (file count, candidate count, categories).
Parse the candidates from the response. If 0 candidates, report clean and stop.
Tool budget: ~30 calls (in sub-agent).
Goal: For each NIH candidate category, find the library that already does this.
Deduplicate candidates that share a category. Group into research queries:
| Category | Research query shape | |----------|---------------------| | RETRY | "best retry/backoff library for {language}" | | UUID | "recommended UUID library for {language}" | | VALIDATION | "best validation library for {language}" | | DATE | "best date/time library for {language}" | | DEBOUNCE | "debounce/throttle library for {language}" | | CLONE | "deep clone library for {language}" | | ARGPARSE | "argument parsing library for {language}" | | STRING | "string manipulation library for {language}" | | HTTP | "HTTP client library for {language}" | | SERIALIZATION | "serialization library for {language}" | | ERROR | "error handling library for {language}" | | CRYPTO | "password hashing library for {language}" | | SECURITY | "HTML sanitization library for {language}" | | FORMAT | "number/currency formatting library for {language}" | | COMPARE | "deep equality library for {language}" |
For each category group (max 5 parallel), spawn a general-purpose agent with
focused MCP access. Library lookup primarily uses Context7 for API surface
and gh CLI for repo stats — not a full /briesearch call.
Agent(
subagent_type="general-purpose",
model="sonnet",
prompt="Find well-maintained open-source libraries for: <category description>
Language: <lang>
Already installed (DO NOT recommend): <depManifest deps>
Use ONLY these tools (do NOT use WebSearch or WebFetch):
- mcp__context7__resolve-library-id and mcp__context7__query-docs
- `gh search repos` / `gh repo view` for GitHub stats
For each library found, return:
- Name and latest version
- License (flag GPL, prefer MIT/Apache-2.0/BSD)
- Weekly downloads or crates.io downloads
- GitHub stars
- Last commit date
- Contributor count
- Whether it's stdlib, micro-library, or framework
- One-sentence API example showing how it replaces the NIH code",
run_in_background=true
)
Wait for all research agents. For each candidate:
Tool budget: ~15 calls.
Goal: Check if NIH code is intentional or if a library covers planned work too.
Search for spec directories across the repo, not just .claude/specs/:
Glob: **/specs/*.md
Filter out specs inside node_modules/, vendor/, .git/, build/.
Read each spec's first 100 lines (summary, requirements, goals sections).
For each candidate, search for signals that NIH was deliberate:
In specs: Look for mentions of the candidate's concept + words like "intentionally", "we chose to build", "build vs buy", "don't use", "avoid dependency on".
In code: Search candidate files for comments indicating intent:
Grep: "intentionally|deliberately|don't use|avoid|instead of|rather than|we chose|NOTE:|DECISION:"
Scoped to files containing NIH candidates only.
For each recommended library, check if it covers features described in specs that aren't built yet. A library that handles current NIH code AND future planned features is a stronger recommendation.
| Signal | Modifier | |--------|----------| | Spec explicitly chose NIH | -30 | | Code comment explains NIH choice | -20 | | Library covers planned spec features | +10 | | No spec or comment context | +0 |
Tool budget: ~10 calls.
Goal: Apply 4-step confidence scoring, produce actionable recommendations.
For each candidate with a library recommendation, apply the full 4-step chain:
| Type | Base | Cap | When | |------|------|-----|------| | REPLACE_WITH_STDLIB | 55 | 100 | stdlib function does the same thing | | REPLACE_WITH_MICRO_LIB | 45 | 95 | small focused library (<5 deps) | | REPLACE_WITH_FRAMEWORK | 35 | 85 | large framework (lodash, Django, etc.) | | EXTRACT_TO_EXISTING_DEP | 50 | 95 | already-installed dep has this feature |
| Evidence | Modifier | |----------|----------| | Serena-verified usage count (exact caller list) | +15 | | Library has >10K weekly downloads + MIT/Apache | +20 | | ast-grep pattern match + code read confirms NIH | +15 | | NIH code has recent bug fixes (git blame) | +10 | | NIH code >100 LOC for what library does in 1 call | +10 | | Generic pattern match, code does more than pattern suggests | -15 | | Recommended library is unmaintained (last commit >1yr) | hard cap at 40 |
| Signal | Modifier | |--------|----------| | Spec explicitly chose NIH | -30 | | Code comment explains NIH choice | -20 | | Library covers planned spec features | +10 | | NIH code is in a git hotspot (many recent changes) | +10 | | NIH code is isolated (1 file, clear boundary) | +5 | | NIH code is deeply coupled (referenced from >10 files) | -5 |
For EVERY candidate (not just borderlines):
| Criteria | Size | |----------|------| | 1 file, <50 LOC, <=3 call sites | S | | 2-5 files, <200 LOC, <=10 call sites | M | | >5 files, >200 LOC, or >10 call sites | L |
Build the full report in memory. Do NOT write to $TMPDIR or any file — return
everything inline in the summary response.
For EVERY finding (no threshold filtering — show all candidates):
### Finding #N: <Title> (Score: NN) [AMBIGUOUS if passes diverge >20]
**NIH Code**: `file:line-line` (N LOC)
**Category**: CATEGORY
**Pattern**: <what was detected>
**Recommended Alternative**: `library-name` (version)
- License: MIT/Apache-2.0/BSD
- Downloads: N/week | Stars: N | Last commit: YYYY-MM-DD
- Contributors: N
**Code Touchpoints**:
- `file:line` — implementation (DELETE or REPLACE)
- `file:line` — import (UPDATE)
- ...
**Effort**: S/M/L (N files, N call sites)
**Migration Path**:
1. Install: `npm install library` / `cargo add library` / etc.
2. Replace: specific code change description
3. Clean up: remove old files/tests
**Scoring**:
- Pass 1: NN (base NN + evidence NN + context NN)
- Pass 2: NN (base NN + evidence NN + context NN)
- Final: NN (average)
**Why do it**: <concrete benefits — maintenance burden removed, bugs already
fixed upstream, stdlib means zero new deps, covers planned features, etc.>
**Why not**: <concrete reasons to keep NIH — trivial code not worth a dep,
hot path where you need control, intentional design choice, library adds
transitive deps you don't want, coupling risk, etc.>
Return everything inline — no temp files. Include the summary table, specs consulted, and the full detailed findings (one ### Finding block per recommendation above threshold):
## NIH Audit: <scope>
### Summary
- Files scanned: N
- NIH candidates found: N
- Already using best option: N (filtered out)
- Ambiguous (scoring passes diverge >20): N
### All Findings (sorted by score, descending)
| # | Score | P1 | P2 | Category | NIH Code | Replace With | Effort |
|---|-------|----|----|----------|----------|-------------|--------|
| 1 | 92 | 90 | 94 | UUID | src/utils/uuid.ts:12 | crypto.randomUUID() (stdlib) | S |
| 2 | 42 | 45 | 39 | COLOR | theme/generate.sh:68 | pastel (cargo) | S |
### Specs Consulted
- spec-name: <relevant finding or "no NIH justifications">
<detailed findings inline — one ### Finding block per candidate, ALL included>
Tool budget: ~10 calls.
run_in_background=true. Wait for all before Phase 4.clearTimeout + setTimeout combo isn't always a debounce. The orchestrator's scoring step (Phase 4) catches generic matches via the -15 modifier.crypto.randomUUID() replacing a hand-rolled UUID is a no-brainer (no new dep). Always score these highest.deepClone might have done so intentionally (bundle size). The spec/comment check catches this.packages/api/ might be NIH in that workspace but the library is installed in packages/web/. Each workspace's depManifest is independent.tools
Reconstruct what a past coding-agent session was doing so you can resume it — goal, files touched, last verified state, and the next step — by querying the session logs. Use when the user says "what was I working on", "recover that session", "reconstruct where I left off", "resume my last session", "what did that session change", "rebuild context from logs", or invokes /work-recovery. Report-only — it never scores or judges. Do NOT use for usage scoring (that is /skill-improver, /tool-efficiency, /prompt-analytics) or one-off interactive log queries (that is /session-analytics).
development
Curate this repo's hallouminate wiki (.hallouminate/wiki/, the repo:dotfiles:wiki corpus) — add or update architecture pages, per-harness docs, and gotchas. Use when the user says "update the wiki", "document this in the wiki", "refresh the harness docs", "add a wiki page", "curate the wiki", "the wiki is stale", or invokes /wiki-curator. Also use at session end to write back a non-obvious decision or gotcha worth preserving. Grounds the existing wiki first, follows one-topic-per-file conventions, verifies every external doc URL before writing, and reindexes. Do NOT use for general code search (that is cheez-search) or for editing AGENTS.md command reference.
tools
Audit how a tool, command, or MCP server is actually used across coding-agent sessions and produce calibrated recommendations — tool-vs-task fit, error forensics, fix recommendations, permission friction, MCP health, and token economics. Use when the user says "tool efficiency", "am I using X efficiently", "audit tool usage", "why does X keep failing", "how do I fix this error", "what should I change", "permission friction", "is this MCP worth it", "tool error rate", "fix recommendations", or invokes /tool-efficiency. Do NOT use for auditing a skill or agent definition (that is /skill-improver) or for one-off interactive log queries (that is /session-analytics).
tools
Analyze how prompts and skill routing behave across coding-agent sessions and produce calibrated recommendations — prompt-pattern analysis, routing accuracy, and knowledge gaps. Use when the user says "analyze my prompts", "prompt patterns", "is routing working", "which skill should have fired", "knowledge gaps", "what do I keep asking", or invokes /prompt-analytics. Do NOT use for auditing a single skill/agent definition (that is /skill-improver), tool/MCP efficiency (that is /tool-efficiency), or one-off interactive log queries (that is /session-analytics).