b-open-io/agent-auditor

Name: agent-auditor
Author: b-open-io

skills/agent-auditor/SKILL.md

npx skillsauth add b-open-io/prompts agent-auditor

Agent Auditor

Systematic audit methodology for evaluating the health, quality, and consistency of agents and skills across the plugin ecosystem. Produces actionable findings with severity ratings and recommended fixes.

Audit Checklist

Every audit evaluates skills across seven dimensions. For each skill, score pass/warn/fail per dimension.

1. Scope & Invocation

Verify the invocation control fields are set correctly.

Check against the invocation matrix:

| Scenario | user-invocable | disable-model-invocation |
|----------|-----------------|---------------------------|
| Default (user + Claude can invoke) | omit (default true) | omit (default false) |
| Agent-only (hidden from / menu) | false | omit |
| User-only (Claude cannot auto-invoke) | omit | true |
| Agent-only + no auto-invoke | false | true |

Checks:

Does the skill require user interaction (OTP, confirmation, subjective input)? If yes, needs disable-model-invocation: true
Does the skill have irreversible side effects (sends money, publishes, deploys)? If yes, needs disable-model-invocation: true
Would a user ever type /skill-name directly? If no, needs user-invocable: false
Is this purely internal agent plumbing? If yes, needs user-invocable: false
Cross-reference: which agents list this skill in their tools: frontmatter? Does that match the intended audience?

Common failure: Skills that are agent-internal but missing user-invocable: false, cluttering the user's / menu.
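
The invocation matrix can be read as a small decision function. The following is a minimal sketch, assuming the frontmatter has already been parsed into a dict with boolean values; the function name and the returned labels are illustrative, not part of the skill.

```python
def classify_invocation(frontmatter: dict) -> str:
    """Map the two invocation fields to a row of the invocation matrix."""
    # Defaults follow the matrix: user-invocable defaults to true,
    # disable-model-invocation defaults to false.
    user_invocable = frontmatter.get("user-invocable", True)
    no_auto_invoke = frontmatter.get("disable-model-invocation", False)

    if user_invocable and not no_auto_invoke:
        return "default"
    if not user_invocable and not no_auto_invoke:
        return "agent-only"
    if user_invocable and no_auto_invoke:
        return "user-only"
    return "agent-only + no auto-invoke"


if __name__ == "__main__":
    print(classify_invocation({"user-invocable": False}))
```

An auditor would compare the returned row against the intended audience of the skill; the function only reports what the fields currently say.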

2. Location & Cross-Client

Skill lives in the correct plugin repo (bopen-tools, bsv-skills, gemskills, 1sat-skills, product-skills, etc.)
Directory name matches the name field in frontmatter exactly
No spaces, underscores, or capitals in directory name (kebab-case only)
File is named exactly SKILL.md (case-sensitive)
No README.md inside the skill folder (all docs go in SKILL.md or references/)
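
Most of these location checks are mechanical, so they could be scripted. This is a rough sketch, assuming the `name` value has already been read from the frontmatter; `check_location` and the problem messages are hypothetical, and the "correct plugin repo" check is left to the auditor because it needs judgment.

```python
import re
from pathlib import Path

# Lowercase letters and digits separated by single hyphens.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def check_location(skill_dir: Path, frontmatter_name: str) -> list[str]:
    """Return a list of problems found in the skill folder (empty means pass)."""
    problems = []
    if not KEBAB_CASE.match(skill_dir.name):
        problems.append(f"directory '{skill_dir.name}' is not kebab-case")
    if skill_dir.name != frontmatter_name:
        problems.append("directory name does not match frontmatter name")
    # Compare against the stored file names so the check stays case-sensitive.
    entries = {p.name for p in skill_dir.iterdir()}
    if "SKILL.md" not in entries:
        problems.append("missing SKILL.md (exact case)")
    if "README.md" in entries:
        problems.append("README.md found; move docs to SKILL.md or references/")
    return problems
```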

3. Description Quality

The description is the single most important field -- it determines whether Claude loads the skill.

Structure: [What it does] + [When to use it] + [Key capabilities]

Checks:

Uses third-person format ("This skill should be used when..." not "Use when...")
Includes specific trigger phrases users would actually say
Under 1024 characters
No XML angle brackets (< or >)
Not too vague ("Helps with projects" = fail)
Not too technical ("Implements the X entity model" = fail)
Includes negative triggers if the skill is easily confused with similar skills
Mentions relevant file types if applicable

Test the description: Ask Claude "When would you use the [skill name] skill?" -- Claude should quote the description back accurately. If it can't, the triggers are weak.
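
The objective parts of this list (length, angle brackets, third-person phrasing) can be pre-screened before the manual review. A minimal sketch follows; `check_description` is a hypothetical helper, and the third-person test is only a heuristic that looks for the exact phrase quoted in the checklist.

```python
def check_description(description: str) -> list[str]:
    """Return problems for the mechanically checkable description rules."""
    problems = []
    if len(description) >= 1024:
        problems.append("description is not under 1024 characters")
    if "<" in description or ">" in description:
        problems.append("description contains XML angle brackets")
    # Heuristic only: other third-person phrasings would need a human look.
    if "This skill should be used when" not in description:
        problems.append("description may not use third-person format")
    return problems
```

Vagueness, over-technical wording, and trigger quality still need a reviewer; the sketch only filters out the failures that do not require judgment.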

4. Structure & Progressive Disclosure

Skills use a three-level system to minimize token usage:

First level (frontmatter): Always in system prompt. Just enough to decide relevance.
Second level (SKILL.md body): Loaded when skill is invoked. Core instructions.
Third level (references/): Additional detail Claude navigates to as needed.

Checks:

SKILL.md body is under 2,000 words (ideally 1,500). Run wc -w to verify.
Detailed documentation lives in references/, not inline
No duplication between SKILL.md body and reference files
Scripts for deterministic tasks live in scripts/
Instructions are specific and actionable, not vague ("validate the data before proceeding" = fail)
Critical instructions appear at the top, not buried at the bottom
Uses bullet points and numbered lists over long prose paragraphs
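
The word-count check can also be done in Python instead of `wc -w`. The sketch below assumes the frontmatter is delimited by `---` lines at the top of the file, and the mapping of 1,500 and 2,000 words onto pass/warn/fail is one possible reading of the checklist, not something the skill prescribes.

```python
def body_word_count(skill_md: str) -> int:
    """Count whitespace-separated words in the body, skipping leading frontmatter."""
    lines = skill_md.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                lines = lines[i + 1:]  # keep only what follows the closing fence
                break
    return len(" ".join(lines).split())


def score_length(words: int) -> str:
    """Assumed thresholds: up to 1,500 is ideal, under 2,000 is tolerated."""
    if words <= 1500:
        return "pass"
    if words < 2000:
        return "warn"
    return "fail"
```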

5. Testing & Benchmarks

Checks:

Has evals/evals.json with trigger and functional test cases
Trigger tests: 10 should-trigger prompts + 10 should-not-trigger prompts (realistic, not contrived)
Functional assertions: 3-5 per eval, specific and verifiable
Assertions target skill-specific knowledge, not generic model capability
Has baseline comparison data (pass_rate vs baseline_pass_rate)
Delta is positive (skill helps vs hurts)

Consult references/testing-strategies.md for the full testing methodology.

6. Agent Equipment

Agents that create or modify skills should have access to the right toolkit:

| Required Skill | Purpose |
|---------------|---------|
| Skill(skill-creator:skill-creator) | Interactive skill creation workflow |
| Skill(plugin-dev:skill-development) | Skill writing best practices |
| Skill(bopen-tools:benchmark-skills) | Eval/benchmark harness |
| Skill(bopen-tools:agent-auditor) | This audit skill |

Check the agent's tools: frontmatter to verify these are listed.
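
As an illustration, the equipment check reduces to a set difference. This sketch assumes the agent's `tools:` value is a single comma-separated string; that format, and the name `missing_equipment`, are assumptions made for the example.

```python
REQUIRED_SKILLS = [
    "Skill(skill-creator:skill-creator)",
    "Skill(plugin-dev:skill-development)",
    "Skill(bopen-tools:benchmark-skills)",
    "Skill(bopen-tools:agent-auditor)",
]


def missing_equipment(tools_field: str) -> list[str]:
    """Return the required skills that are absent from a tools: value."""
    listed = {tool.strip() for tool in tools_field.split(",")}
    return [skill for skill in REQUIRED_SKILLS if skill not in listed]
```

An empty result means the agent lists all four skills; anything returned is a candidate finding for this dimension.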

7. Generative UI Awareness

If the agent's domain involves UI generation, rendering, or cross-platform output, check for generative UI readiness.

Checks:

Does the agent have Skill(bopen-tools:generative-ui) in tools?
If the agent works with React/Next.js, does it know about json-render?
If the agent works with React Native, does it know about @json-render/react-native?
If the agent produces visual assets, does it have relevant gemskills?
Does the agent understand when to use generative UI vs static components?

Applicable agents: designer, agent-builder, nextjs, mobile, integration-expert

Not applicable (skip this dimension): code-auditor, documentation-writer, researcher, devops, database, payments

Audit Workflow

Step 1: Enumerate & Classify (via subagent)

Delegate enumeration and classification to a subagent to keep the main context clean:

Agent(prompt: "Enumerate and classify all skills in the target plugin.

1. Run: ls skills/*/SKILL.md and count total
2. For each skill, read the YAML frontmatter and classify:
   - Type: agent-only (user-invocable: false), user-only (disable-model-invocation: true), or default
   - Plugin it belongs in
   - Which agents reference it (grep agents/*.md for Skill(name))
3. Return a table: | Skill | Type | Referenced By | Notes |

Target directory: skills/",
subagent_type: "general-purpose")
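
For reference, the enumeration and cross-reference steps in that prompt could look roughly like this in Python. It is a sketch under assumptions: the `skills/` and `agents/` layout is taken from the prompt, while the optional `plugin:` prefix inside `Skill(...)` is inferred from the identifiers used elsewhere in this document.

```python
import re
from pathlib import Path


def enumerate_skills(skills_dir: Path) -> list[str]:
    """List skill names, one per folder that contains a SKILL.md."""
    return sorted(p.parent.name for p in skills_dir.glob("*/SKILL.md"))


def referencing_agents(skill_name: str, agents_dir: Path) -> list[str]:
    """Find agent files that mention Skill(name) or Skill(plugin:name)."""
    pattern = re.compile(r"Skill\((?:[\w-]+:)?" + re.escape(skill_name) + r"\)")
    return sorted(
        agent.stem
        for agent in agents_dir.glob("*.md")
        if pattern.search(agent.read_text(encoding="utf-8"))
    )
```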

Step 2: Run Dimension Checks (via parallel subagents)

For multi-plugin audits, dispatch one subagent per plugin in parallel. For single-plugin audits, dispatch one subagent per batch of 5-10 skills:

Agent(prompt: "Audit these skills against the seven-dimension checklist:
<list of skills from Step 1>

For each skill, evaluate: Scope & Invocation, Location & Cross-Client, Description Quality, Structure, Testing, Agent Equipment, Generative UI.

Score each dimension as pass/warn/fail. Return findings in the report format.",
subagent_type: "general-purpose")

The main context receives only the formatted audit report, not raw skill file contents.

Step 3: Score Each Dimension

Record per dimension:

Pass: Meets criteria
Warn: Minor issue, non-blocking
Fail: Must fix before publishing
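
The summary counts need a single status per skill. One way to derive it, sketched here as an assumption rather than a documented rule, is to let the worst dimension win and to ignore skipped dimensions.

```python
# Higher number means more severe.
SEVERITY = {"pass": 0, "warn": 1, "fail": 2}


def overall_status(dimensions: dict[str, str]) -> str:
    """Return the most severe status across a skill's scored dimensions."""
    scored = [status for status in dimensions.values() if status != "skip"]
    return max(scored, key=SEVERITY.__getitem__, default="pass")
```

Under this rule a single fail blocks publishing, which matches the meaning given to Fail above.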

Step 4: Generate Report

Format findings as:

## Audit Report: [plugin-name]

### Summary
- Total skills: N
- Pass: N | Warn: N | Fail: N

### Findings

#### [skill-name]
| Dimension | Status | Notes |
|-----------|--------|-------|
| Scope & Invocation | pass/warn/fail | details |
| Location & Cross-Client | pass/warn/fail | details |
| Description Quality | pass/warn/fail | details |
| Structure | pass/warn/fail | details |
| Testing | pass/warn/fail | details |
| Agent Equipment | pass/warn/fail | details |
| Generative UI | pass/warn/fail/skip | details |

**Recommended fixes:**
1. [specific, actionable fix]
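
A per-skill findings block in this format could be generated rather than typed by hand. The sketch below assumes each dimension result is a `(status, notes)` pair; `render_skill_findings` is a hypothetical helper, not part of the skill.

```python
DIMENSIONS = [
    "Scope & Invocation",
    "Location & Cross-Client",
    "Description Quality",
    "Structure",
    "Testing",
    "Agent Equipment",
    "Generative UI",
]


def render_skill_findings(
    skill: str, results: dict[str, tuple[str, str]], fixes: list[str]
) -> str:
    """Render one skill's findings table and numbered fixes as markdown."""
    lines = [
        f"#### {skill}",
        "| Dimension | Status | Notes |",
        "|-----------|--------|-------|",
    ]
    for dimension in DIMENSIONS:
        status, notes = results[dimension]  # every dimension must be present
        lines.append(f"| {dimension} | {status} | {notes} |")
    lines += ["", "**Recommended fixes:**"]
    lines += [f"{i}. {fix}" for i, fix in enumerate(fixes, start=1)]
    return "\n".join(lines)
```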

Step 5: Fix & Re-audit

Apply fixes, then re-run the audit on changed skills only. Use the evaluator-optimizer loop from references/workflow-patterns.md for iterative improvement.

Workflow Patterns

For multi-plugin audits, use parallelization -- dispatch one subagent per plugin. See references/workflow-patterns.md for:

Sequential audit pipeline (single plugin)
Parallel dispatch (multiple plugins)
Evaluator-optimizer loop (iterative fixes)

Testing Strategy

See references/testing-strategies.md for:

Trigger testing methodology (should-trigger / should-not-trigger)
Functional testing with evals.json assertions
Performance comparison (with-skill vs without-skill baselines)
Quantitative and qualitative metrics
Description optimization loops

Reference Files

| File | When to Consult |
|------|----------------|
| references/skill-quality-guide.md | Writing or reviewing description, structure, and instructions |
| references/workflow-patterns.md | Planning multi-plugin audits or iterative fix cycles |
| references/testing-strategies.md | Creating evals, running benchmarks, measuring effectiveness |

Comprehensive audit skill for agents and skills across the plugin ecosystem. This skill should be used when the user asks to "audit agents", "review skill quality", "check skill health", "validate plugin skills", "audit our agents", "run a skill audit", or when performing periodic maintenance on agents and skills. Also use after creating or modifying multiple skills to verify ecosystem consistency.

Stars: 13
Category: tools
Updated: Jul 10, 2026

npx skillsauth add b-open-io/prompts agent-auditor

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---------|-------------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: Jul 10, 2026, 3:52 AM (294.0s, 4 files scanned)

SKILL.md

name: agent-auditor
version: 0.1.2
description: >-
user-invocable: false

Install manually

# Clone the repo
git clone https://github.com/b-open-io/prompts.git

# Copy into Claude Code skills folder (global)
cp -r prompts/skills/agent-auditor ~/.claude/skills/

Repository: b-open-io/prompts (13 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT