Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

athola/skills-eval

Name: skills-eval
Author: athola

plugins/abstract/skills/skills-eval/SKILL.md

npx skillsauth add athola/claude-night-market skills-eval

Skills Evaluation and Improvement

When NOT To Use

Writing a new skill (use abstract:skill-authoring)
Evaluating hooks (use abstract:hooks-eval)
Evaluating rules (use abstract:rules-eval)

Table of Contents

Overview
Quick Start
Evaluation Workflow
Evaluation and Optimization
Resources

Overview

This framework audits Claude skills against quality standards to improve performance and reduce token consumption. Automated tools analyze skill structure, measure context usage, and identify specific technical improvements. Run verification commands after each audit to confirm fixes work correctly.

The skills-auditor provides structural analysis, while the improvement-suggester ranks fixes by impact. Compliance is verified through the compliance-checker. Runtime efficiency is monitored by tool-performance-analyzer and token-usage-tracker.

Quick Start

Basic Audit

Run a full audit of all skills or target a specific file to identify structural issues.

# Audit all skills
make audit-all

# Audit specific skill
make audit-skill TARGET=path/to/skill/SKILL.md

Analysis and Optimization

Use skill_analyzer.py for complexity checks and token_estimator.py to verify the context budget.

make analyze-skill TARGET=path/to/skill/SKILL.md
make estimate-tokens TARGET=path/to/skill/SKILL.md
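
To make the idea of a context-budget check concrete, here is a minimal sketch. It is not how token_estimator.py works (the source does not show that script). The names estimate_tokens and within_budget are hypothetical, the 4-characters-per-token ratio is a crude assumption rather than a real tokenizer, and the default budget of 1800 is simply borrowed from the estimated_tokens value listed for this skill.

```python
import math

# Assumption: roughly 4 characters per token. This is a crude heuristic,
# not a real tokenizer and not the logic of token_estimator.py.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length (hypothetical helper)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def within_budget(text: str, budget: int = 1800) -> bool:
    """Return True when the estimated token count fits the budget."""
    return estimate_tokens(text) <= budget


if __name__ == "__main__":
    sample = "Evaluate Claude skill quality through auditing."
    print(estimate_tokens(sample), within_budget(sample))
```

For real measurements, prefer the make estimate-tokens target; the heuristic above is only useful for a quick sanity check.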

Improvements

Generate a prioritized plan and verify standards compliance using improvement_suggester.py and compliance_checker.py.

make improve-skill TARGET=path/to/skill/SKILL.md
make check-compliance TARGET=path/to/skill/SKILL.md

Evaluation Workflow

Start with make audit-all to inventory skills and identify high-priority targets. For each skill requiring attention, run analysis with analyze-skill to map complexity. Generate an improvement plan, apply fixes, and run check-compliance to verify the skill meets project standards. Finalize by checking the token budget for efficiency.
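
The ordering described above can be sketched as a small helper that builds the command sequence without executing it. The make targets come from the Quick Start section; the function names workflow_commands and full_workflow are hypothetical, and the manual "apply fixes" step is represented only by a comment.

```python
def workflow_commands(target: str) -> list[list[str]]:
    """Return the per-skill make invocations in the order the workflow describes."""
    return [
        ["make", "analyze-skill", f"TARGET={target}"],
        ["make", "improve-skill", f"TARGET={target}"],
        # Fixes are applied by hand between the plan and the compliance check.
        ["make", "check-compliance", f"TARGET={target}"],
        ["make", "estimate-tokens", f"TARGET={target}"],
    ]


def full_workflow(targets: list[str]) -> list[list[str]]:
    """Start with the inventory audit, then add the steps for each target."""
    commands = [["make", "audit-all"]]
    for target in targets:
        commands.extend(workflow_commands(target))
    return commands


if __name__ == "__main__":
    for cmd in full_workflow(["path/to/skill/SKILL.md"]):
        print(" ".join(cmd))
```

The sketch only prints the commands. To run them, each list could be passed to subprocess.run(cmd, check=True) so that a failing step stops the sequence.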

Evaluation and Optimization

Quality assessments use the skills-auditor and improvement-suggester to generate detailed reports. Performance analysis focuses on token efficiency through the token-usage-tracker and tool performance via tool-performance-analyzer. For standards compliance, the compliance-checker automates common fixes for structural issues.

Scoring and Prioritization

We evaluate skills across five dimensions: structure compliance, content quality, token efficiency, activation reliability, and tool integration. Scores above 90 represent production-ready skills, while scores below 50 indicate critical issues requiring immediate attention.
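
As an illustration of these thresholds, the sketch below maps a total score to a band. The function name classify_score is hypothetical, and the label for scores from 50 through 90 is an assumption, since the text names only the two outer bands.

```python
def classify_score(score: float) -> str:
    """Map a 0-100 audit score to a band using the thresholds stated above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score > 90:
        return "production-ready"
    if score < 50:
        return "critical"
    # Assumption: the source does not name the middle band.
    return "needs-review"


if __name__ == "__main__":
    for value in (95, 72, 40):
        print(value, classify_score(value))
```

Note that the comparisons are strict, matching "above 90" and "below 50", so scores of exactly 90 or 50 fall into the middle band.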

Improvements are prioritized by impact. Critical issues include security vulnerabilities or broken functionality. High-priority items cover structural flaws that hinder discoverability. Medium and low priorities focus on best practices and minor optimizations.
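
The priority ordering can be sketched as a simple sort. The finding structure here (a dict with "priority" and "summary" keys) and the name sort_findings are hypothetical; the actual output format of improvement_suggester.py is not shown in the source.

```python
# Lower rank sorts first: critical > high > medium > low.
PRIORITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def sort_findings(findings: list[dict]) -> list[dict]:
    """Order findings by priority; sorted() is stable, so ties keep input order."""
    return sorted(findings, key=lambda f: PRIORITY_RANK[f["priority"]])


if __name__ == "__main__":
    findings = [
        {"priority": "low", "summary": "minor optimization"},
        {"priority": "critical", "summary": "broken functionality"},
        {"priority": "high", "summary": "structural flaw hinders discoverability"},
    ]
    for finding in sort_findings(findings):
        print(finding["priority"], "-", finding["summary"])
```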

Structural Patterns

Deprecated: skills/shared/modules/ directories. Shared modules must be relocated into the consuming skill's own modules/ directory. The evaluator flags any remaining skills/shared/ as a structural warning.

Current: Each skill owns its modules at skills/<skill-name>/modules/. Cross-skill references use relative paths (e.g., ../skill-authoring/modules/anti-rationalization.md).
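
A check for the deprecated layout might look like the sketch below. It is an assumption about how such a scan could work, not the evaluator's actual implementation; find_deprecated_shared is a hypothetical name.

```python
from pathlib import Path


def find_deprecated_shared(root: str) -> list[Path]:
    """Return every directory named 'shared' that sits directly under a 'skills' directory."""
    return sorted(
        path
        for path in Path(root).rglob("shared")
        if path.is_dir() and path.parent.name == "skills"
    )


if __name__ == "__main__":
    for path in find_deprecated_shared("."):
        print(f"structural warning: deprecated shared directory at {path}")
```

Matching on the parent directory name keeps unrelated folders that happen to be called shared (for example under docs/) out of the warning list.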

Resources

Shared Modules: Cross-Skill Patterns

Anti-Rationalization Patterns: See anti-rationalization.md
Enforcement Language: See enforcement-language.md
Trigger Patterns: See trigger-patterns.md

Skill-Specific Modules

Trigger Isolation Analysis: See modules/trigger-isolation-analysis.md
Authoring Checklist: See modules/authoring-checklist.md
Evaluation Workflows: See modules/evaluation-workflows.md
Advanced Tool Use Analysis: See modules/advanced-tool-use-analysis.md
Evaluation Framework: See modules/evaluation-framework.md
Integration Patterns: See modules/integration.md
Troubleshooting: See modules/troubleshooting.md
Pressure Testing: See modules/pressure-testing.md
Integration Testing: See modules/integration-testing.md
Performance Benchmarking: See modules/performance-benchmarking.md

Tools and Automation

Tools: Executable analysis utilities in the scripts/ directory.
Automation: Setup and validation scripts in scripts/automation/.

Exit Criteria

[ ] Every audited skill receives a score across all five dimensions (structure compliance, content quality, token efficiency, activation reliability, tool integration) summing to 100.
[ ] Any skill with a deprecated skills/shared/ module reference is listed as a structural warning in the audit output.
[ ] make check-compliance TARGET=<skill> exits 0 for skills reported as passing, confirming the compliance-checker agrees with the audit score.
[ ] An improvement plan is produced for any skill scoring below 75, with findings ordered by priority: critical > high > medium > low.
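
Two of these criteria lend themselves to an automated check, sketched below. The audit record shape (a dict with "scores" and "has_plan"), the dimension keys, and the name exit_criteria_failures are all hypothetical. The sketch also assumes that "summing to 100" means the dimension points add up to a total out of 100, which the source does not spell out.

```python
# Hypothetical keys for the five dimensions named in the exit criteria.
DIMENSIONS = (
    "structure_compliance",
    "content_quality",
    "token_efficiency",
    "activation_reliability",
    "tool_integration",
)
PLAN_THRESHOLD = 75  # skills scoring below this need an improvement plan


def exit_criteria_failures(audit: dict) -> list[str]:
    """Return a list of unmet criteria for one audit record (empty means pass)."""
    failures = []
    missing = [d for d in DIMENSIONS if d not in audit["scores"]]
    if missing:
        failures.append("missing dimensions: " + ", ".join(missing))
    # Assumption: the overall score is the sum of the dimension points.
    total = sum(audit["scores"].values())
    if total < PLAN_THRESHOLD and not audit["has_plan"]:
        failures.append(f"score {total} is below {PLAN_THRESHOLD} but no improvement plan exists")
    return failures


if __name__ == "__main__":
    record = {"scores": {d: 10 for d in DIMENSIONS}, "has_plan": False}
    print(exit_criteria_failures(record))
```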

Evaluate Claude skill quality through auditing. Use when reviewing or auditing skills.

323 stars

testing

Updated Jul 15, 2026

$ install --global

skillsauth

npx skillsauth add athola/claude-night-market skills-eval

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Status: Clean (95%)
Semgrep: Static code analysis for vulnerabilities. Status: Clean (95%)
mcp-scan (Snyk): Model Context Protocol security validation. Status: Clean (95%)
Snyk (dep): Open source security scanning. Status: Skipped (50%)
Socket.dev: Supply chain security analysis. Status: Skipped (50%)
VirusTotal: Multi-engine malware detection. Status: Skipped (50%)
CrowdStrike: Advanced threat intelligence. Status: Skipped (50%)
OSV-Scanner: Open Source Vulnerability database check. Status: Skipped (50%)
OWASP Dep-Check. Status: Skipped (50%)

Last scanned: Jul 15, 2026, 5:26 AM (136.7s, 16 files scanned)

SKILL.md

name: skills-eval
description: Evaluate Claude skill quality through auditing. Use when reviewing or auditing skills.
alwaysApply: false
category: skill-management
tools: []
estimated_tokens: 1800
complexity: advanced
model_hint: deep
structure_compliance: 25
metadata_quality: 20
token_efficiency: 25
tool_integration: 20
claude_sdk_compliance: 10
role: entrypoint

Related Skills

athola/architecture-paradigm-domain-driven

data-ai

Verified · Trusted · Community

Models a business in its own language. Use when the domain has real business rules to capture.

323 stars · SKILL.md · Updated Jul 15, 2026

athola/ideate

research

Verified · Trusted · Community

Generate diverse solution candidates with category-spanning ideation methods and rotation. Use when stuck on a design or fighting repetitive LLM output.

323 stars · SKILL.md · Updated Jun 8, 2026

athola/validate-pr

development

Verified · Trusted · Community

Generates and self-executes a diff-derived test plan for a PR. Use when validating PR changes before merge. Do not use for code review; use sanctum:pr-review.

323 stars · SKILL.md · Updated Jun 8, 2026

athola/graduated-implementation

development

Verified · Trusted · Community

Ramps implementation ambition a notch only after the prior increment is understood. Use when building a feature you must understand, not just ship.

323 stars · SKILL.md · Updated Jun 8, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

# Clone the repo
git clone https://github.com/athola/claude-night-market.git

# Copy into Claude Code skills folder (global)
cp -r claude-night-market/plugins/abstract/skills/skills-eval ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

athola/claude-night-market

323 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
