petekp/exhaustive-systems-analysis

Name: exhaustive-systems-analysis
Author: petekp

skills/exhaustive-systems-analysis/SKILL.md

npx skillsauth add petekp/claude-code-setup exhaustive-systems-analysis

Exhaustive Systems Analysis

Use this skill for full-system correctness work. The job is to map the system, identify the highest-risk behaviors, prove or refute concrete failure hypotheses, and leave behind a report another engineer can act on without re-reading the whole codebase.

Failure Modes To Prevent

Surface-level audits that scan files without following behavior end-to-end
False certainty: reporting suspicions as bugs without enough evidence
Context drift across large audits with many subsystems
Cosmetic reviews that miss the real correctness and ship-readiness risks

Operating Mode

Default to chat-first output. Return findings inline unless the user asks for docs or the audit clearly needs multi-session artifacts.
Switch to artifact mode for large or resumable audits. Use docs/audit/ or .claude/docs/audit/, matching the repo's existing conventions.
Do not start fixing code while auditing unless the user explicitly asks for fixes. This skill is for diagnosis, proof, and prioritization.

Workflow

0. Calibrate The Audit

Before reading deeply, write a one-screen scope brief using the template in references/templates.md.

Capture:

system or area under review
user-visible workflows or contracts that matter most
likely high-risk surfaces: state, side effects, concurrency, auth, persistence, external integrations
out-of-scope areas
output mode: chat-first or artifact mode

If the request is broad, narrow it to the modules that can actually change user outcomes or ship readiness.

1. Load Intent Before Code

Read only the materials that establish intended behavior:

README, CLAUDE.md, architecture docs, ADRs
tests that describe user-visible or contract behavior
recent commits touching the target area
TODO, FIXME, and HACK markers, plus any "known issues" notes
incident notes, bug reports, or issue tracker items if available

Extract:

critical workflows
external surfaces
hotspots and recent churn
manual-only surfaces that cannot be fully verified from code alone

2. Build The Coverage Ledger

Map the system into subsystems before deep analysis. Use the coverage ledger template in references/templates.md.

For each subsystem, record:

name
entrypoints
files or directories in scope
invariants or promised behaviors
side effects
risk level
status: planned | in_progress | done | follow_up

Prioritize by user impact first, then by side effects, concurrency, privilege, and recent churn. Folder structure alone is not a priority system.
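As a rough illustration of the ledger fields and the prioritization rule above, the sketch below models one ledger entry and an ordering function. The three-level risk scale, the numeric `user_impact` score, and the example subsystems are assumptions made for the example; the actual ledger template lives in references/templates.md and may look different.

```python
from dataclasses import dataclass, field

# Status values are taken from the ledger field list above.
STATUSES = {"planned", "in_progress", "done", "follow_up"}
# Assumption: a three-level risk scale; the skill only says "risk level".
RISK_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass
class LedgerEntry:
    name: str
    entrypoints: list[str]
    scope: list[str]          # files or directories in scope
    invariants: list[str]     # invariants or promised behaviors
    side_effects: list[str] = field(default_factory=list)
    risk: str = "medium"
    status: str = "planned"
    user_impact: int = 0      # hypothetical 0-3 score; higher means more user-visible

    def __post_init__(self) -> None:
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        if self.risk not in RISK_ORDER:
            raise ValueError(f"unknown risk level: {self.risk!r}")


def audit_order(ledger: list[LedgerEntry]) -> list[LedgerEntry]:
    """Sort by user impact first, then risk level, then number of side effects."""
    return sorted(
        ledger,
        key=lambda e: (-e.user_impact, RISK_ORDER[e.risk], -len(e.side_effects)),
    )


# Hypothetical subsystems, for illustration only.
ledger = [
    LedgerEntry("docs-site", ["build.py"], ["docs/"], ["links resolve"],
                risk="low", user_impact=1),
    LedgerEntry("checkout", ["POST /orders"], ["src/orders/"], ["no double charge"],
                side_effects=["payment API", "orders table"], risk="high", user_impact=3),
]
print([entry.name for entry in audit_order(ledger)])  # ['checkout', 'docs-site']
```

Note that the sort key never looks at file paths, which matches the point that folder structure alone is not a priority system.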

3. Generate Hypotheses Before The Deep Pass

For each high- or medium-risk subsystem, write 2-3 concrete hypotheses before diving in. Good hypotheses are falsifiable and tied to a behavior boundary.

Examples:

"A failure between write A and write B can leave persisted state inconsistent."
"The retry path duplicates a side effect because idempotence is not enforced."
"The docs promise behavior X, but the implementation falls through to Y on invalid input."

Update or discard hypotheses as evidence comes in. This step prevents aimless scanning.

4. Audit One Subsystem At A Time

Read the subsystem end-to-end:

start at entrypoints and trace the happy path
trace error paths, cleanup paths, cancellation or shutdown, and retries
compare implementation to tests, docs, types, and public contracts
run targeted searches, commands, or tests when they strengthen the evidence
record exact commands, searches, and scopes when they support a finding

Select only the relevant checklist sections from references/checklists.md. Do not load every checklist if the subsystem only needs one or two.

When subagents are available, assign one bounded subsystem per subagent with disjoint files and ask for:

hypotheses checked
findings with exact citations
coverage gaps
suggested next verification step

5. Classify Findings With Evidence, Status, And Confidence

Every finding must separate observation from inference.

Required fields:

Severity: Critical | High | Medium | Low
Status: Confirmed | Likely | Needs follow-up
Confidence: High | Medium | Low
Type: Bug | Race condition | Security | Stale docs | Dead code | Design flaw | Reliability
Location: exact file path and line or function
Impacted behavior: the user-visible workflow, invariant, or contract at risk
Observed evidence: code citation, command output, test result, log, or search result
Inference: why that evidence implies the reported problem
What I checked: searches, tests, docs, commits, or alternate explanations ruled out
Recommendation: the smallest credible next action
Next verification step: required when status is Needs follow-up

Use Confirmed only when the bug is directly demonstrated by code, a failing test, a repro path, or a hard contradiction. Use Likely when the reasoning is strong but not directly reproduced. Use Needs follow-up when something is suspicious but the evidence is incomplete.
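One way to keep findings honest is to treat the required fields as a record with a small validator. The sketch below is an assumption about how that could look, not the skill's own finding template (which is in references/templates.md); the example path and wording are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values copied from the required-fields list above.
SEVERITIES = ("Critical", "High", "Medium", "Low")
STATUSES = ("Confirmed", "Likely", "Needs follow-up")
CONFIDENCES = ("High", "Medium", "Low")
TYPES = ("Bug", "Race condition", "Security", "Stale docs",
         "Dead code", "Design flaw", "Reliability")


@dataclass
class Finding:
    severity: str
    status: str
    confidence: str
    type: str
    location: str
    impacted_behavior: str
    observed_evidence: str
    inference: str
    what_i_checked: str
    recommendation: str
    next_verification_step: Optional[str] = None

    def problems(self) -> list[str]:
        """Return rule violations; an empty list means the finding is well-formed."""
        issues = []
        for label, value, allowed in (
            ("severity", self.severity, SEVERITIES),
            ("status", self.status, STATUSES),
            ("confidence", self.confidence, CONFIDENCES),
            ("type", self.type, TYPES),
        ):
            if value not in allowed:
                issues.append(f"{label} must be one of {allowed}, got {value!r}")
        for name in ("location", "impacted_behavior", "observed_evidence",
                     "inference", "what_i_checked", "recommendation"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is required")
        if self.status == "Needs follow-up" and not self.next_verification_step:
            issues.append("next_verification_step is required for 'Needs follow-up'")
        return issues


# Hypothetical finding, for illustration only.
finding = Finding(
    severity="High",
    status="Needs follow-up",
    confidence="Medium",
    type="Race condition",
    location="src/orders/retry.py: submit_with_retry",
    impacted_behavior="An order may be charged twice after a timeout",
    observed_evidence="The retry loop calls the payment client with no idempotency key",
    inference="A timed-out but successful first call would be repeated",
    what_i_checked="Searched for idempotency handling in src/orders/; found none",
    recommendation="Add a targeted test that simulates a timeout after a successful charge",
)
print(finding.problems())
```

Keeping `observed_evidence` and `inference` as separate required fields is what enforces the rule that every finding separates observation from inference.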

6. Run A Convergence Pass

After subsystem reviews:

deduplicate cross-cutting findings
re-rank by severity and user impact
run a final residue sweep for stale docs, deprecated names, orphaned helpers, temp flags, TODO or FIXME clusters, and risky APIs
record exact residue queries and counts if they matter to the conclusion
list coverage gaps explicitly instead of pretending the audit was complete where it was not
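The first two convergence steps (deduplicate, then re-rank) can be sketched as a small merge over finding records. The `root_cause` key and the numeric `user_impact` score are assumptions introduced for this example; the skill does not prescribe how root causes are identified.

```python
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def converge(findings):
    """Merge findings that share a root cause, then rank by severity and user impact."""
    merged = {}
    for f in findings:
        kept = merged.get(f["root_cause"])
        if kept is None:
            merged[f["root_cause"]] = {**f, "locations": [f["location"]]}
            continue
        kept["locations"].append(f["location"])
        # Keep the most severe rating and the highest impact seen for this root cause.
        if SEVERITY_RANK[f["severity"]] < SEVERITY_RANK[kept["severity"]]:
            kept["severity"] = f["severity"]
        kept["user_impact"] = max(kept["user_impact"], f["user_impact"])
    return sorted(
        merged.values(),
        key=lambda f: (SEVERITY_RANK[f["severity"]], -f["user_impact"]),
    )


# Hypothetical findings from two subsystem reviews.
findings = [
    {"root_cause": "retry-not-idempotent", "severity": "Medium",
     "user_impact": 2, "location": "worker.py: retry"},
    {"root_cause": "stale-readme", "severity": "Low",
     "user_impact": 1, "location": "README.md"},
    {"root_cause": "retry-not-idempotent", "severity": "High",
     "user_impact": 3, "location": "api.py: submit"},
]
ranked = converge(findings)
for f in ranked:
    print(f["severity"], f["root_cause"], f["locations"])
```

Merging only on a shared root cause keeps related but distinct findings separate, consistent with the reporting rules.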

Evidence Standard

Prefer stronger evidence over more words. From strongest to weakest:

failing or targeted test
reproducible path with exact steps
direct code contradiction with exact citations
logs, telemetry, or command output
scoped search results with counts
static reasoning

Static reasoning alone can still be valuable, but it should usually produce Likely, not Confirmed.
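The evidence ranking can be read as a ceiling on the status a finding may claim. The sketch below encodes that reading. The short evidence labels are made up for the example, and treating logs and scoped search results as supporting at most Likely is an assumption; the skill only states this explicitly for static reasoning.

```python
# Ordered strongest to weakest, mirroring the list above (labels are shorthand).
EVIDENCE_KINDS = [
    "test",                # failing or targeted test
    "repro",               # reproducible path with exact steps
    "code_contradiction",  # direct code contradiction with exact citations
    "logs",                # logs, telemetry, or command output
    "search",              # scoped search results with counts
    "static_reasoning",
]
# Kinds that directly demonstrate the problem and can justify Confirmed.
DIRECT = {"test", "repro", "code_contradiction"}


def strongest(evidence):
    """Return the strongest kind of evidence present."""
    return min(evidence, key=EVIDENCE_KINDS.index)


def max_status(evidence):
    """Return the strongest status the given evidence kinds can justify."""
    unknown = set(evidence) - set(EVIDENCE_KINDS)
    if unknown:
        raise ValueError(f"unknown evidence kinds: {sorted(unknown)}")
    if not evidence:
        return "Needs follow-up"
    if DIRECT & set(evidence):
        return "Confirmed"
    return "Likely"  # assumption: indirect evidence never reaches Confirmed


print(max_status(["static_reasoning"]))      # Likely
print(max_status(["search", "test"]))        # Confirmed
print(strongest(["search", "repro"]))        # repro
```

The function returns a ceiling, not a verdict: weak or incomplete evidence of any kind can still warrant Needs follow-up.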

For dead code or stale docs, always show what you searched and why you believe the code or documentation is obsolete. A dead-code claim without a consumer search is incomplete.
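A consumer search with recorded counts might look like the following minimal sketch. The function name, the `**/*.py` glob, and the `legacy_export` symbol are hypothetical; in practice a tool such as grep or ripgrep would do the same job, and the point is only that the query, scope, and counts end up in the finding.

```python
import re
import tempfile
from pathlib import Path


def count_matches(root, pattern, glob="**/*.py"):
    """Return {relative_path: match_count} for files under root that match pattern."""
    regex = re.compile(pattern)
    counts = {}
    for path in sorted(Path(root).glob(glob)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = len(regex.findall(text))
        if hits:
            counts[str(path.relative_to(root))] = hits
    return counts


# Demo on a throwaway tree: the symbol appears only where it is defined.
with tempfile.TemporaryDirectory() as root:
    Path(root, "export.py").write_text("def legacy_export():\n    pass\n", encoding="utf-8")
    Path(root, "app.py").write_text("print('no callers here')\n", encoding="utf-8")
    hits = count_matches(root, r"\blegacy_export\b")

print(hits)  # {'export.py': 1}
```

A result showing only the defining file supports a dead-code claim for the searched scope; it says nothing about dynamic lookups or callers outside that scope, which belong under "What I checked" or coverage gaps.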

Reporting Rules

Lead with findings, not the methodology recap.
Prefer user-impacting correctness issues over stylistic cleanup.
Keep related findings separate unless they share the same root cause.
If nothing serious is wrong, say so directly and still report residual risk and unverified surfaces.
Do not write "looks wrong" or "might be an issue" without saying what you checked and what would prove or disprove it.

Use the templates in references/templates.md for:

scope brief
coverage ledger
finding format
chat-first summary
artifact-mode audit directory

Session Management

Use single-session mode for small audits. For large audits or when context is tight, create a lightweight control plane:

00-plan.md for the scope brief and coverage ledger
one file per subsystem only if the audit is large enough to justify it
SUMMARY.md for consolidated findings and fix order
HANDOFF.md if work will continue later

A good handoff includes:

what was covered
what is now believed to be true
what remains unverified
current blockers
exact next steps
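For artifact mode, the control-plane files can be created up front. This is a sketch under stated assumptions: the `NN-name.md` naming for per-subsystem files and the placeholder file contents are invented for the example, while 00-plan.md, SUMMARY.md, HANDOFF.md, and the handoff headings come from the lists above.

```python
import tempfile
from pathlib import Path

HANDOFF_SECTIONS = [
    "What was covered",
    "What is now believed to be true",
    "What remains unverified",
    "Current blockers",
    "Exact next steps",
]


def init_audit_dir(audit_dir, subsystems=(), handoff=False):
    """Create the audit files that do not exist yet and return their names."""
    audit = Path(audit_dir)
    audit.mkdir(parents=True, exist_ok=True)

    files = {"00-plan.md": "Scope brief\n\nCoverage ledger\n"}
    for index, name in enumerate(subsystems, start=1):
        # Assumption: numbered per-subsystem files; only worthwhile for large audits.
        files[f"{index:02d}-{name}.md"] = f"Subsystem: {name}\n"
    files["SUMMARY.md"] = "Consolidated findings\n\nFix order\n"
    if handoff:
        files["HANDOFF.md"] = "\n\n".join(HANDOFF_SECTIONS) + "\n"

    created = []
    for filename, body in files.items():
        path = audit / filename
        if not path.exists():  # never overwrite notes from an earlier session
            path.write_text(body, encoding="utf-8")
            created.append(filename)
    return created


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "docs" / "audit"
    created = init_audit_dir(target, subsystems=["checkout"], handoff=True)
    rerun = init_audit_dir(target, subsystems=["checkout"], handoff=True)

print(created)  # ['00-plan.md', '01-checkout.md', 'SUMMARY.md', 'HANDOFF.md']
print(rerun)    # []
```

Skipping files that already exist makes the helper safe to run again when an audit resumes in a later session.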

Anti-Patterns

scanning directories without identifying entrypoints or invariants
reporting every code smell as a finding
calling something dead code without a consumer search
calling something a bug without showing the broken behavior or violated contract
collapsing multiple subsystems into one giant writeup
hiding uncertainty instead of marking Needs follow-up

Completion Criteria

The audit is complete when:

high-risk subsystems have a ledger entry and a final status
every reported finding has evidence, confidence, and a concrete location
the final report includes fix order plus unverified surfaces
cross-cutting and residue findings have been consolidated
the report is honest about what was not proven
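The mechanically checkable parts of these criteria can be expressed as a gap list. In this sketch the dictionary keys are hypothetical, and treating `done` and `follow_up` as the "final" ledger statuses is an assumption. Whether findings were consolidated and whether the report is honest about what was not proven still need human judgment.

```python
FINAL_STATUSES = {"done", "follow_up"}  # assumption: these count as final


def completion_gaps(ledger, findings, report):
    """Return the reasons the audit cannot yet be called complete."""
    gaps = []
    for entry in ledger:
        if entry.get("risk") == "high" and entry.get("status") not in FINAL_STATUSES:
            gaps.append(f"high-risk subsystem {entry['name']!r} has no final status")
    for number, finding in enumerate(findings, start=1):
        missing = [key for key in ("observed_evidence", "confidence", "location")
                   if not finding.get(key)]
        if missing:
            gaps.append(f"finding {number} is missing: {', '.join(missing)}")
    for key in ("fix_order", "unverified_surfaces"):
        if key not in report:
            gaps.append(f"report has no {key} section")
    return gaps


# Hypothetical audit state, for illustration only.
ledger = [
    {"name": "checkout", "risk": "high", "status": "in_progress"},
    {"name": "docs-site", "risk": "low", "status": "planned"},
]
findings = [{"observed_evidence": "test_retry fails", "confidence": "High", "location": ""}]
report = {"fix_order": ["checkout retry"]}

gaps = completion_gaps(ledger, findings, report)
for gap in gaps:
    print(gap)
```

An empty `unverified_surfaces` list passes the check as long as the section exists, so a report can state explicitly that nothing was left unverified.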

Perform evidence-driven, multi-subsystem audits of real codebases to find correctness bugs, race conditions, security gaps, stale documentation, dead code, and production-readiness risks. Use when asked to audit a system end-to-end, verify agent-written code before shipping, analyze a subsystem for correctness across multiple modules, or produce a structured risk report for a real implementation. Prefer other skills for a single isolated bug, a proposal or document review, or a dedicated dead-code cleanup.

Stars: 35
Category: development
Updated: Apr 29, 2026

Install this skill globally with one command:

npx skillsauth add petekp/claude-code-setup exhaustive-systems-analysis

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 22, 2026, 3:50 PM (326.1s, 4 files scanned)

Install manually

# Clone the repo
git clone https://github.com/petekp/claude-code-setup.git

# Copy into Claude Code skills folder (global)
cp -r claude-code-setup/skills/exhaustive-systems-analysis ~/.claude/skills/


Repository: petekp/claude-code-setup (35 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT