skills/exhaustive-systems-analysis/SKILL.md
Perform evidence-driven, multi-subsystem audits of real codebases to find correctness bugs, race conditions, security gaps, stale documentation, dead code, and production-readiness risks. Use when asked to audit a system end-to-end, verify agent-written code before shipping, analyze a subsystem for correctness across multiple modules, or produce a structured risk report for a real implementation. Prefer other skills for a single isolated bug, a proposal or document review, or a dedicated dead-code cleanup.
Install this skill globally with one command: `npx skillsauth add petekp/claude-code-setup exhaustive-systems-analysis`. Works with Claude Code, Cursor, and Windsurf.
Use this skill for full-system correctness work. The job is to map the system, identify the highest-risk behaviors, prove or refute concrete failure hypotheses, and leave behind a report another engineer can act on without re-reading the whole codebase.
- chat-first output. Return findings inline unless the user asks for docs or the audit clearly needs multi-session artifacts.
- artifact mode for large or resumable audits. Use docs/audit/ or .claude/docs/audit/, matching the repo's existing conventions.

Before reading deeply, write a one-screen scope brief using the template in references/templates.md.
Capture:
- chat-first or artifact mode

If the request is broad, narrow it to the modules that can actually change user outcomes or ship readiness.
Read only the materials that establish intended behavior:
- README, CLAUDE.md, architecture docs, ADRs
- TODO, FIXME, HACK, and "known issues"

Extract:
Map the system into subsystems before deep analysis. Use the coverage ledger template in references/templates.md.
For each subsystem record:
- planned | in_progress | done | follow_up

Prioritize by user impact first, then by side effects, concurrency, privilege, and recent churn. Folder structure alone is not a priority system.
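The prioritization rule above can be sketched as a sort key. This is only an illustration under stated assumptions: the `LedgerEntry` field names, the 0-3 scoring scale, and the choice to sum the secondary factors are all hypothetical, and the real ledger layout is whatever references/templates.md defines.

```python
from dataclasses import dataclass

# Status values as listed in the ledger description.
STATUSES = ("planned", "in_progress", "done", "follow_up")


@dataclass
class LedgerEntry:
    """One row of a hypothetical coverage ledger (field names are illustrative)."""
    subsystem: str
    user_impact: int   # assumed 0-3 scale; higher means more user-visible
    side_effects: int  # assumed 0-3 scale
    concurrency: int   # assumed 0-3 scale
    privilege: int     # assumed 0-3 scale
    recent_churn: int  # assumed 0-3 scale
    status: str = "planned"

    def __post_init__(self) -> None:
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


def priority_key(entry: LedgerEntry) -> tuple:
    # User impact is compared first. The remaining factors are summed as a
    # tie-breaker, which is one possible reading of "then by ...".
    secondary = (
        entry.side_effects
        + entry.concurrency
        + entry.privilege
        + entry.recent_churn
    )
    return (-entry.user_impact, -secondary)


def review_order(entries: list[LedgerEntry]) -> list[str]:
    """Return subsystem names in the order they would be reviewed."""
    return [e.subsystem for e in sorted(entries, key=priority_key)]
```

Note that the key never looks at file paths, which matches the point that folder structure alone is not a priority system.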
For each high- or medium-risk subsystem, write 2-3 concrete hypotheses before diving in. Good hypotheses are falsifiable and tied to a behavior boundary.
Examples:
Update or discard hypotheses as evidence comes in. This step prevents aimless scanning.
Read the subsystem end-to-end:
Select only the relevant checklist sections from references/checklists.md. Do not load every checklist if the subsystem only needs one or two.
When subagents are available, assign one bounded subsystem per subagent with disjoint files and ask for:
Every finding must separate observation from inference.
Required fields:
- Severity: Critical | High | Medium | Low
- Status: Confirmed | Likely | Needs follow-up
- Confidence: High | Medium | Low
- Type: Bug | Race condition | Security | Stale docs | Dead code | Design flaw | Reliability
- Location: exact file path and line or function
- Impacted behavior: the user-visible workflow, invariant, or contract at risk
- Observed evidence: code citation, command output, test result, log, or search result
- Inference: why that evidence implies the reported problem
- What I checked: searches, tests, docs, commits, or alternate explanations ruled out
- Recommendation: the smallest credible next action
- Next verification step: required when status is Needs follow-up

Use Confirmed only when the bug is directly demonstrated by code, a failing test, a repro path, or a hard contradiction. Use Likely when the reasoning is strong but not directly reproduced. Use Needs follow-up when something is suspicious but the evidence is incomplete.
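The required fields can be checked mechanically before a finding goes into a report. The sketch below is one possible shape, not the skill's own tooling: the `Finding` class, its attribute names, and the `problems` method are hypothetical, while the allowed values are copied from the field list above.

```python
from dataclasses import dataclass

SEVERITIES = {"Critical", "High", "Medium", "Low"}
STATUSES = {"Confirmed", "Likely", "Needs follow-up"}
CONFIDENCES = {"High", "Medium", "Low"}
TYPES = {
    "Bug", "Race condition", "Security", "Stale docs",
    "Dead code", "Design flaw", "Reliability",
}


@dataclass
class Finding:
    """Hypothetical container for one audit finding."""
    severity: str
    status: str
    confidence: str
    type: str
    location: str
    impacted_behavior: str
    observed_evidence: str
    inference: str
    what_i_checked: str
    recommendation: str
    next_verification_step: str = ""

    def problems(self) -> list[str]:
        """Return a list of schema violations; an empty list means valid."""
        issues = []
        for name, allowed in (
            ("severity", SEVERITIES),
            ("status", STATUSES),
            ("confidence", CONFIDENCES),
            ("type", TYPES),
        ):
            if getattr(self, name) not in allowed:
                issues.append(f"{name} has unsupported value {getattr(self, name)!r}")
        # Observation and inference are separate fields and both must be filled.
        for name in (
            "location", "impacted_behavior", "observed_evidence",
            "inference", "what_i_checked", "recommendation",
        ):
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        if self.status == "Needs follow-up" and not self.next_verification_step.strip():
            issues.append("next_verification_step is required for Needs follow-up")
        return issues
```

A check like this only enforces the shape of a finding. Whether the status is justified by the evidence still has to be judged by the reviewer.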
After subsystem reviews:
Prefer stronger evidence over more words. From strongest to weakest:
Static reasoning alone can still be valuable, but it should usually produce Likely, not Confirmed.
For dead code or stale docs, always show what you searched and why you believe the code or documentation is obsolete. A dead-code claim without a consumer search is incomplete.
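A consumer search can be as simple as scanning the repository for the symbol outside its defining file. The helper below is a minimal sketch: `find_consumers`, its parameters, and the default suffix list are hypothetical, and a plain text search of this kind will miss dynamic or generated references.

```python
import re
from pathlib import Path


def find_consumers(root, symbol, definition_file, suffixes=(".py", ".ts", ".md")):
    """List (path, line number, line) for each mention of `symbol`
    outside the file that defines it."""
    # Word boundaries keep `old_helper` from matching `old_helper_v2`.
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    definition = Path(definition_file).resolve()
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        if path.resolve() == definition:
            continue  # the definition itself is not a consumer
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_no, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), line_no, line.strip()))
    return hits
```

Recording the exact arguments used and the resulting hit list in the finding's "What I checked" field shows the reader what was searched. Because this is static evidence only, an empty result would usually support Likely rather than Confirmed.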
Use the templates in references/templates.md for:
Use single-session mode for small audits. For large audits or when context is tight, create a lightweight control plane:
- 00-plan.md for the scope brief and coverage ledger
- SUMMARY.md for consolidated findings and fix order
- HANDOFF.md if work will continue later

A good handoff includes:

- Needs follow-up

The audit is complete when: