
vuralserhat86/skill_evaluator

Name: skill_evaluator
Author: vuralserhat86

skills/skill_evaluator/SKILL.md

npx skillsauth add vuralserhat86/antigravity-agentic-skills skill_evaluator


Skill Evaluator (WIP)

Evaluates skills against Anthropic's official best practices for agent skill authoring. Produces structured evaluation reports with scores and actionable recommendations.

Quick Start

1. Read the skill's SKILL.md and understand its purpose
2. Run automated validation: scripts/validate_skill.py <skill-path>
3. Perform manual evaluation against criteria below
4. Generate evaluation report with scores and recommendations

Evaluation Workflow

Step 1: Automated Validation

Run the validation script first:

scripts/validate_skill.py <path/to/skill>

This checks:

- SKILL.md exists with valid YAML frontmatter
- Name follows conventions (lowercase, hyphens, max 64 chars)
- Description is present and under 1024 chars
- Body is under 500 lines
- File references are one-level deep
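The checks listed above could be sketched roughly as follows. This is not the actual scripts/validate_skill.py, whose contents are not shown here. The function names, the flat `key: value` frontmatter parsing, and the inclusive treatment of the limits are all assumptions made for illustration, and the reference-depth check is left out.

```python
import re

# Limits taken from the list above; treating them as inclusive is an assumption.
MAX_NAME_CHARS = 64
MAX_DESCRIPTION_CHARS = 1024
MAX_BODY_LINES = 500
NAME_RE = re.compile(r"[a-z0-9-]+")


def split_frontmatter(text):
    """Return (metadata dict, body) for a SKILL.md string, or (None, text).

    Only flat `key: value` pairs are handled (hypothetical simplification);
    a real script would more likely use a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        # No closing delimiter found.
        return None, text
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


def validate_skill_text(text):
    """Return a list of problems; an empty list means every check passed."""
    meta, body = split_frontmatter(text)
    if meta is None:
        return ["missing or unterminated YAML frontmatter"]
    problems = []
    name = meta.get("name", "")
    if not NAME_RE.fullmatch(name) or len(name) > MAX_NAME_CHARS:
        problems.append("name breaks the naming conventions")
    description = meta.get("description", "")
    if not description:
        problems.append("description is missing")
    elif len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("description is too long")
    if len(body.splitlines()) > MAX_BODY_LINES:
        problems.append("body is too long")
    return problems
```

Returning a list of problems rather than raising on the first failure lets a report show every issue in one pass.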

Step 2: Manual Evaluation

Evaluate each dimension and assign a score (1-5):

A. Naming (Weight: 10%)

| Score | Criteria |
|-------|----------|
| 5 | Gerund form (-ing), clear purpose, memorable |
| 4 | Descriptive, follows conventions |
| 3 | Acceptable but could be clearer |
| 2 | Vague or misleading |
| 1 | Violates naming rules |

Rules: Max 64 chars, lowercase + numbers + hyphens only, no reserved words (anthropic, claude), no XML tags.

Good: processing-pdfs, analyzing-spreadsheets, building-dashboards

Bad: pdf, my-skill, ClaudeHelper, anthropic-tools
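The hard naming rules above are mechanical, so they can be checked in code. The sketch below is one possible reading of them. The function name `naming_violations` is hypothetical, and matching reserved words as substrings is an assumption.

```python
import re

# Reserved words named in the rules above.
RESERVED_WORDS = ("anthropic", "claude")


def naming_violations(name):
    """Return the hard naming rules that `name` breaks (empty list if none)."""
    violations = []
    if len(name) > 64:
        violations.append("longer than 64 characters")
    if re.search(r"<[^>]*>", name):
        violations.append("contains an XML tag")
    if not re.fullmatch(r"[a-z0-9-]+", name):
        violations.append("uses characters other than lowercase letters, numbers, hyphens")
    lowered = name.lower()
    for word in RESERVED_WORDS:
        # Substring matching is an assumption; the rules do not say how to match.
        if word in lowered:
            violations.append(f"contains reserved word '{word}'")
    return violations
```

Note that `pdf` and `my-skill` from the "Bad" list pass these checks. They are bad because they are vague, which a reviewer still has to judge against the scoring table.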

B. Description (Weight: 20%)

| Score | Criteria |
|-------|----------|
| 5 | Clear functionality + specific activation triggers + third person |
| 4 | Good description with some triggers |
| 3 | Adequate but missing triggers or vague |
| 2 | Too brief or unclear purpose |
| 1 | Missing or unhelpful |

Must include: What the skill does AND when to use it.

Good: "Extracts text from PDFs. Use when working with PDF documents for text extraction, form parsing, or content analysis."

Bad: "A skill for PDFs." or "Helps with documents."

C. Content Quality (Weight: 30%)

| Score | Criteria |
|-------|----------|
| 5 | Concise, assumes Claude intelligence, actionable instructions |
| 4 | Generally good, minor verbosity |
| 3 | Some unnecessary explanations or redundancy |
| 2 | Overly verbose or confusing |
| 1 | Bloated, explains obvious concepts |

Ask: "Does Claude really need this explanation?" Remove anything Claude already knows.

D. Structure & Organization (Weight: 25%)

| Score | Criteria |
|-------|----------|
| 5 | Excellent progressive disclosure, clear navigation, optimal length |
| 4 | Good organization, appropriate file splits |
| 3 | Acceptable but could be better organized |
| 2 | Poor organization, missing references, or bloated SKILL.md |
| 1 | No structure, everything dumped in SKILL.md |

Check:

- SKILL.md under 500 lines
- References are one-level deep (no nested chains)
- Long reference files (>100 lines) have table of contents
- Uses forward slashes in all paths
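Most of the structure checks above can be approximated mechanically. The following is a rough sketch, assuming references are written as Markdown links and that link targets are relative to the skill root. The helper names and the "contains the word contents" heuristic for spotting a table of contents are illustrative assumptions, not part of the skill.

```python
import re

# Matches Markdown links of the form [label](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def local_links(text):
    """Return relative link targets in Markdown text, skipping URLs and anchors."""
    targets = []
    for target in LINK_RE.findall(text):
        if "://" in target or target.startswith("#"):
            continue
        targets.append(target.split("#", 1)[0])
    return targets


def structure_issues(files):
    """Check a skill given as {relative path: file text}, with 'SKILL.md' as entry point."""
    issues = []
    skill = files.get("SKILL.md", "")
    if len(skill.splitlines()) > 500:
        issues.append("SKILL.md exceeds 500 lines")
    for target in local_links(skill):
        if "\\" in target:
            issues.append(f"backslash in path: {target}")
        ref = files.get(target)
        if ref is None:
            continue
        # A reference that links onward to another local file forms a nested chain.
        if local_links(ref):
            issues.append(f"nested reference chain via {target}")
        # Crude heuristic: a long file should mention "contents" somewhere.
        if len(ref.splitlines()) > 100 and "contents" not in ref.lower():
            issues.append(f"{target} is long but has no table of contents")
    return issues
```

Passing the files as a dictionary keeps the sketch independent of the file system, so it can be exercised without a real skill directory.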

E. Degrees of Freedom (Weight: 10%)

| Score | Criteria |
|-------|----------|
| 5 | Perfect match: high freedom for flexible tasks, low for fragile operations |
| 4 | Generally appropriate freedom levels |
| 3 | Acceptable but could be better calibrated |
| 2 | Mismatched: too rigid or too loose |
| 1 | Completely wrong freedom level for the task type |

Guideline:

- High freedom (text): Multiple valid approaches, context-dependent
- Medium freedom (parameterized): Preferred pattern exists, some variation OK
- Low freedom (specific scripts): Fragile operations, exact sequence required

F. Anti-Pattern Check (Weight: 5%)

Deduct points for each anti-pattern found:

- [ ] Too many options without clear recommendation (-1)
- [ ] Time-sensitive information with date conditionals (-1)
- [ ] Inconsistent terminology (-1)
- [ ] Windows-style paths (backslashes) (-1)
- [ ] Deeply nested references (more than one level) (-2)
- [ ] Scripts that punt error handling to Claude (-1)
- [ ] Magic numbers without justification (-1)
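The deductions above can be turned into a score for this dimension. The text does not say what the deductions are subtracted from, so this sketch assumes a starting score of 5 and a floor of 1 to stay on the 1-5 scale. The dictionary keys are hypothetical labels for the seven checklist items.

```python
# Points deducted per anti-pattern, copied from the checklist above.
DEDUCTIONS = {
    "too_many_options": 1,
    "time_sensitive_info": 1,
    "inconsistent_terminology": 1,
    "windows_paths": 1,
    "nested_references": 2,
    "scripts_punt_errors": 1,
    "magic_numbers": 1,
}


def anti_pattern_score(found):
    """Return the 1-5 anti-pattern score for the given anti-pattern labels.

    Starting at 5 and clamping at 1 are assumptions, not stated in the rubric.
    Each anti-pattern is counted once even if listed several times.
    """
    total = sum(DEDUCTIONS[name] for name in set(found))
    return max(1, 5 - total)
```

For example, a skill with Windows-style paths and deeply nested references loses 3 points and scores 2 under these assumptions.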

Step 3: Generate Report

Use this template:

# Skill Evaluation Report: [skill-name]

## Summary
- **Overall Score**: X.X/5.0
- **Recommendation**: [Ready for publication / Needs minor improvements / Needs major revision]

## Dimension Scores
| Dimension | Score | Weight | Weighted |
|-----------|-------|--------|----------|
| Naming | X/5 | 10% | X.XX |
| Description | X/5 | 20% | X.XX |
| Content Quality | X/5 | 30% | X.XX |
| Structure | X/5 | 25% | X.XX |
| Degrees of Freedom | X/5 | 10% | X.XX |
| Anti-Patterns | X/5 | 5% | X.XX |
| **Total** | | 100% | **X.XX** |

## Strengths
- [List 2-3 things done well]

## Areas for Improvement
- [List specific issues with actionable fixes]

## Anti-Patterns Found
- [List any anti-patterns detected]

## Recommendations
1. [Priority 1 fix]
2. [Priority 2 fix]
3. [Priority 3 fix]

## Pre-Publication Checklist
- [ ] Description is specific with activation triggers
- [ ] SKILL.md under 500 lines
- [ ] One-level-deep file references
- [ ] Forward slashes in all paths
- [ ] No time-sensitive information
- [ ] Consistent terminology
- [ ] Concrete examples provided
- [ ] Scripts handle errors explicitly
- [ ] All configuration values justified
- [ ] Required packages listed
- [ ] Tested with Haiku, Sonnet, Opus

Score Interpretation

| Score Range | Rating | Action |
|-------------|--------|--------|
| 4.5 - 5.0 | Excellent | Ready for publication |
| 4.0 - 4.4 | Good | Minor improvements recommended |
| 3.0 - 3.9 | Acceptable | Several improvements needed |
| 2.0 - 2.9 | Needs Work | Major revision required |
| 1.0 - 1.9 | Poor | Fundamental redesign needed |
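The overall score is the weighted sum of the six dimension scores, using the weights from the report template. A minimal sketch of that arithmetic and of the lookup in the table above follows. Rounding to one decimal before the lookup is an assumption: it matches the template's X.X format and keeps a value such as 4.45 from falling between two rows. The function names are illustrative.

```python
# Dimension weights from the report template (they sum to 100%).
WEIGHTS = {
    "Naming": 0.10,
    "Description": 0.20,
    "Content Quality": 0.30,
    "Structure": 0.25,
    "Degrees of Freedom": 0.10,
    "Anti-Patterns": 0.05,
}

# Lower bound of each score range, highest first, from the table above.
RATINGS = [
    (4.5, "Excellent", "Ready for publication"),
    (4.0, "Good", "Minor improvements recommended"),
    (3.0, "Acceptable", "Several improvements needed"),
    (2.0, "Needs Work", "Major revision required"),
    (1.0, "Poor", "Fundamental redesign needed"),
]


def overall_score(scores):
    """Weighted sum of the 1-5 dimension scores, rounded to one decimal (assumption)."""
    total = sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)
    return round(total, 1)


def interpret(score):
    """Map an overall score to its (rating, action) pair."""
    for floor, rating, action in RATINGS:
        if score >= floor:
            return rating, action
    raise ValueError("score is below 1.0")
```

As a worked example, scores of 4, 5, 4, 3, 4, 5 in template order give 0.40 + 1.00 + 1.20 + 0.75 + 0.40 + 0.25 = 4.0, which the table rates as Good.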

References

- references/evaluation-criteria.md - Detailed evaluation criteria with examples
- references/scoring-rubric.md - Complete scoring rubric and edge cases

Skill Evaluator v1.1 - Enhanced

🔄 Workflow

Source: Google Engineering Practices - Code Review & Anthropic System Prompts

Phase 1: Structural Analysis

- [ ] Compliance: Does the file structure (scripts/, references/) conform to the standard?
- [ ] Metadata: Is the YAML frontmatter (name, description) complete and valid?
- [ ] Modularity: Is the skill too large? Does it need to be split? (Single Responsibility Principle)

Phase 2: Content & Semantic Review

- [ ] Clarity: Are the instructions written clearly and in the imperative mood? Is there any ambiguity?
- [ ] Context Efficiency: Is there any "unnecessary politeness" or "over-explanation"? Token waste should be avoided.
- [ ] Safety: Does the skill suggest a dangerous operation (file deletion, unauthorized access)?

Phase 3: Functionality Verification

- [ ] Script Audit: Is the Python/Bash code in scripts/ safe and in working order?
- [ ] Reference Check: Are the files in references/ really necessary, or should they be embedded in SKILL.md?
- [ ] Usability: Can a user (or agent) read this skill and use it right away?

Checkpoints

| Phase | Verification |
|-------|--------------|
| 1 | Are the skill name and description consistent with each other? |
| 2 | Was an anti-pattern (e.g. a hardcoded path) detected? |
| 3 | Was an objective score (1-5) assigned according to the scoring rubric? |


Evaluates agent skills against Anthropic's best practices. Use when asked to review, evaluate, assess, or audit a skill for quality. Analyzes SKILL.md structure, naming conventions, description quality, content organization, and identifies anti-patterns. Produces actionable improvement recommendations.

42 stars

testing

Updated Apr 22, 2026


npx skillsauth add vuralserhat86/antigravity-agentic-skills skill_evaluator

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 22, 2026, 6:08 AM · 122.8s · 7 files scanned

SKILL.md

name: skill_evaluator
router_kit: FullStackKit
description: Evaluates agent skills against Anthropic's best practices. Use when asked to review, evaluate, assess, or audit a skill for quality. Analyzes SKILL.md structure, naming conventions, description quality, content organization, and identifies anti-patterns. Produces actionable improvement recommendations.
category: skills
tags: [architecture, audit, automation, best practices, clean code, coding, collaboration, compliance, debugging, design patterns, development, documentation, efficiency, git, metrics, optimization, productivity, programming, project management, quality assurance, quality check, refactoring, review, skill evaluator, software engineering, standards, testing, utilities, version control, workflow]


Related Skills

vuralserhat86/zustand_state

tools · Verified · Trusted · Community

Production-tested setup for Zustand state management in React. Includes patterns for persistence, devtools, and TypeScript. Prevents hydration mismatches and render loops.

42 · SKILL.md · Updated Apr 22, 2026

vuralserhat86/xlsx

development · Verified · Trusted · Community

Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. Use when Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc.) for: (1) creating new spreadsheets with formulas and formatting, (2) reading or analyzing data, (3) modifying existing spreadsheets while preserving formulas, (4) data analysis and visualization in spreadsheets, or (5) recalculating formulas.

42 · SKILL.md · Updated Apr 22, 2026

vuralserhat86/skills/websocket_engineer

development · Verified · Trusted · Community

---
name: websocket_engineer
router_kit: FullStackKit
description: WebSocket specialist for real-time communication systems. Invoke for Socket.IO, WebSocket servers, bidirectional messaging, presence systems. Keywords: WebSocket, Socket.IO, real-time, pub/sub, Redis.
triggers:
- WebSocket
- Socket.IO
- real-time communication
- bidirectional messaging
- pub/sub
- server push
- live updates
- chat systems
- presence tracking
role: specialist
scope: implementation
output-format:

42 · SKILL.md · Updated Apr 22, 2026

vuralserhat86/webapp_testing

tools · Verified · Trusted · Community

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

42 · SKILL.md · Updated Apr 22, 2026


Install manually


# Clone the repo
git clone https://github.com/vuralserhat86/antigravity-agentic-skills.git

# Copy into Claude Code skills folder (global), creating the folder if it does not exist yet
mkdir -p ~/.claude/skills
cp -r antigravity-agentic-skills/skills/skill_evaluator ~/.claude/skills/


Repository

vuralserhat86/antigravity-agentic-skills

42 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT