skills/codebase-analyzer/SKILL.md
Statistical rule discovery from Go codebase patterns.
npx skillsauth add notque/claude-code-toolkit codebase-analyzer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Statistical rule discovery through measurement of Go codebases. Python scripts count patterns to avoid LLM training bias, then statistics are interpreted to derive confidence-scored rules. The core principle is Measure First, Interpret Second -- what IS in the code is the local standard, not what an LLM thinks "should be" there.
Load these files when the corresponding signals appear:
| Signal | Load |
|--------|------|
| Understanding the three lenses (Consistency, Signature, Idiom) | references/three-lenses.md |
| Worked examples, phase banners, error catalog, reconciliation matrix | references/phase-details.md |
| Full 100-metric catalog across 25 categories | references/metrics-catalog.md |
| Additional real-world analysis workflows | references/examples.md |
Goal: Validate target and select analyzer variant.
Read and follow the repository's CLAUDE.md before doing anything else -- project instructions override default behaviors.
Step 1: Validate the target
Step 2: Select cartographer variant
| Variant | Script | Metrics | Use When |
|---------|--------|---------|----------|
| Omni (recommended) | cartographer_omni.py | 100 across 25 categories | Full codebase profiling |
| Basic | cartographer.py | ~15 categories | Quick pattern overview |
| Ultimate | cartographer_ultimate.py | 6 focused categories | Performance pattern detection |
Step 3: Verify environment
See references/phase-details.md for the CONFIGURE banner template.
Gate: Target directory exists, contains 50+ Go files, variant selected. Proceed only when gate passes.
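The gate above can be sketched as a small check. This is a minimal illustration, not part of the skill's scripts: `count_go_files`, `configure_gate`, and the variant names `omni`, `basic`, and `ultimate` as string values are hypothetical. Only the 50-file threshold comes from the gate itself.

```python
import os
from pathlib import Path
from typing import List, Optional

MIN_GO_FILES = 50  # threshold stated in the gate
KNOWN_VARIANTS = {"omni", "basic", "ultimate"}  # hypothetical identifiers


def count_go_files(root: str) -> int:
    """Count every .go file under root (hypothetical helper, no filtering applied)."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.endswith(".go"))
    return total


def configure_gate(root: str, variant: Optional[str]) -> List[str]:
    """Return the list of gate failures; an empty list means the gate passes."""
    if not Path(root).is_dir():
        return [f"target directory does not exist: {root}"]
    failures = []
    found = count_go_files(root)
    if found < MIN_GO_FILES:
        failures.append(f"only {found} Go files found, need {MIN_GO_FILES}+")
    if variant not in KNOWN_VARIANTS:
        failures.append(f"no valid variant selected: {variant!r}")
    return failures
```

Returning the failures rather than a bare boolean keeps the reason for a blocked gate visible in the report.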
Goal: Run statistical analysis scripts. Pure measurement -- no interpretation yet.
This phase is strictly mechanical. Scripts count and measure; keep interpretation separate from data collection. Combining measurement with interpretation introduces LLM training bias -- the model reports what "should be" instead of what IS. Run scripts first, interpret the numbers second, always as separate steps.
Automatically filter vendor/, testdata/, and generated code (files with "Code generated by..." markers) to avoid polluting statistics with external patterns.
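One way such filtering could work is sketched below. The helper names are hypothetical, and the cartographer scripts may detect generated code differently. The regular expression follows the Go convention for generated-file headers (`// Code generated ... DO NOT EDIT.`), and for simplicity this sketch searches the whole file rather than only the leading comment block.

```python
import re
from pathlib import PurePosixPath

EXCLUDED_DIRS = {"vendor", "testdata"}

# Go convention for marking generated files; assumed here, not taken from the scripts.
GENERATED_RE = re.compile(r"^// Code generated .* DO NOT EDIT\.$", re.MULTILINE)


def is_excluded_path(rel_path: str) -> bool:
    """True if any directory component of the path is vendor/ or testdata/."""
    directories = PurePosixPath(rel_path).parts[:-1]
    return any(part in EXCLUDED_DIRS for part in directories)


def is_generated(source: str) -> bool:
    """True if the file carries a generated-code marker line."""
    return GENERATED_RE.search(source) is not None


def should_measure(rel_path: str, source: str) -> bool:
    """Keep only hand-written Go files outside the excluded directories."""
    return (
        rel_path.endswith(".go")
        and not is_excluded_path(rel_path)
        and not is_generated(source)
    )
```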
Step 1: Execute the cartographer
python3 ${CLAUDE_SKILL_DIR}/scripts/cartographer_omni.py /path/to/go/repo
# Or for quick overview: python3 ${CLAUDE_SKILL_DIR}/scripts/cartographer.py /path/to/go/repo
Always run the cartographer scripts for measurement; reserve LLM interpretation for Phase 3. When an LLM sees return err it may report "not wrapping errors properly" even if that IS the local standard. The scripts produce deterministic, reproducible counts; the LLM's role begins at interpretation in Phase 3.
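To make "deterministic counts" concrete, here is a rough sketch of counting bare `return err` statements against `fmt.Errorf` calls that wrap with `%w`. It is a regex approximation written for illustration. The function name and result keys are hypothetical, and the actual cartographer scripts may measure this differently.

```python
import re
from typing import Dict

# A return statement whose last value is the bare identifier err.
BARE_RETURN_RE = re.compile(
    r"^[ \t]*return[ \t]+(?:[^\n]*,[ \t]*)?err[ \t]*$", re.MULTILINE
)
# An fmt.Errorf call whose format string contains the %w wrapping verb.
WRAPPED_RE = re.compile(r"fmt\.Errorf\([^)]*%w")


def count_error_returns(source: str) -> Dict[str, int]:
    """Count both styles without judging either one."""
    return {
        "bare_return_err": len(BARE_RETURN_RE.findall(source)),
        "wrapped_with_w": len(WRAPPED_RE.findall(source)),
    }
```

The function only reports counts. Whether bare returns are acceptable is decided later, from the ratio observed across the whole codebase.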
Step 2: Verify output integrity
Step 3: Check for data quality issues
See references/phase-details.md for the MEASURE banner template.
Gate: Script completed without errors, JSON output is valid, file count is reasonable. Proceed only when gate passes.
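A sketch of how this gate might be checked follows. The JSON schema of the cartographer output is not described here, so the `files_analyzed` key is purely hypothetical, as is treating "reasonable" as at least the 50 files required by the configure gate.

```python
import json
from typing import List


def measure_gate(exit_code: int, raw_output: str, min_files: int = 50) -> List[str]:
    """Return the list of gate failures; an empty list means the gate passes."""
    failures = []
    if exit_code != 0:
        failures.append(f"script exited with code {exit_code}")
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        failures.append(f"invalid JSON output: {exc.msg}")
        return failures
    # "files_analyzed" is an assumed key name, not the scripts' documented schema.
    files = data.get("files_analyzed") if isinstance(data, dict) else None
    if not isinstance(files, int) or files < min_files:
        failures.append(f"file count looks wrong: {files!r}")
    return failures
```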
Goal: Derive rules from statistics. This is where LLM interpretation happens -- AFTER measurement is complete.
Show complete statistics rather than describing them, and report facts without editorializing about code quality -- the numbers speak for themselves.
Step 1: Review the three lenses
| Lens | Question | Measures |
|------|----------|----------|
| Consistency (Frequency) | "How often do they use X?" | Imports, test frameworks, logging, modern features |
| Signature (Structure) | "How do they name/structure things?" | Constructors, receivers, parameter order, variables |
| Idiom (Implementation) | "How do they implement patterns?" | Error handling, control flow, context usage, defer |
For detailed lens explanations, see references/three-lenses.md.
Step 2: Extract rules by confidence
Only derive rules from patterns with sufficient consistency. Forcing rules from weak patterns causes false positives in reviews and may impose standards the team has not organically adopted.
| Confidence | Threshold | Action | Example |
|------------|-----------|--------|---------|
| HIGH | >85% consistency | Extract as enforceable rule | "96% use err not e" -> MUST use err |
| MEDIUM | 70-85% consistency | Extract as recommendation | "78% guard clauses" -> SHOULD prefer guards |
| Not extracted | <70% consistency | Report as observation only | "55% single-letter receivers" -> No rule |
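The thresholds in the table can be expressed as a short function. This is a minimal sketch with hypothetical names (`dominant_pattern`, `classify_confidence`). It assumes that exactly 85% falls in the MEDIUM band, since HIGH is stated as strictly greater than 85%.

```python
from typing import Dict, Optional, Tuple


def dominant_pattern(counts: Dict[str, int]) -> Tuple[str, float]:
    """Return the most frequent pattern and its share of all occurrences, in percent."""
    total = sum(counts.values())
    name = max(counts, key=counts.get)
    return name, 100 * counts[name] / total


def classify_confidence(consistency_pct: float) -> Tuple[Optional[str], Optional[str]]:
    """Map a consistency percentage to (confidence, rule keyword)."""
    if consistency_pct > 85:
        return ("HIGH", "MUST")
    if consistency_pct >= 70:
        return ("MEDIUM", "SHOULD")
    return (None, None)  # below 70%: observation only, no rule


name, pct = dominant_pattern({"err": 96, "e": 4})
print(name, pct, classify_confidence(pct))
```

Applied to the table's first example, 96 uses of `err` against 4 of `e` gives 96% consistency, which lands in the HIGH band.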
Step 3: Review Style Vector (Omni only)
Step 4: Cross-reference lenses
Gate: Rules extracted with evidence and confidence levels. Style Vector reviewed. Proceed only when gate passes.
Goal: Produce actionable output artifacts.
Step 1: Save statistical report
cartography_data/{repo_name}_cartography.json
Step 2: Generate derived rules document
derived_rules/{repo_name}_rules.md
Rule and Style Vector formats, plus the DELIVER banner template, live in references/phase-details.md.
Step 3: Summarize Style Vector (Omni only) -- see references/phase-details.md
Step 4: Recommend next steps
Gate: JSON report saved, rules document generated, next steps documented. Analysis complete.
Load references/phase-details.md for phase banners, the reconciliation matrix, examples, and error handling.

- ${CLAUDE_SKILL_DIR}/references/three-lenses.md: Detailed explanation of the three analysis lenses
- ${CLAUDE_SKILL_DIR}/references/examples.md: Real-world analysis examples and workflows
- ${CLAUDE_SKILL_DIR}/references/metrics-catalog.md: Complete 100-metric catalog across 25 categories
- ${CLAUDE_SKILL_DIR}/references/phase-details.md: Phase banners, reconciliation matrix, examples, error handling