skills/creating-skills/SKILL.md
Guides creation of agent skills following best practices and the open format specification. Covers pattern selection, frontmatter, directory structure, reference files, validation, and iteration. Use when creating a new skill, updating SKILL.md, or asking "how to write a skill"
npx skillsauth add riccardogrin/skills creating-skills
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill walks you through creating a well-structured agent skill from scratch. Follow the workflow below step by step. Use the reference files for detailed guidance on specific topics.
| File | Read When |
|------|-----------|
| references/format-specification.md | Checking format rules, frontmatter constraints, or naming conventions |
| references/skill-patterns.md | Choosing a skill pattern or viewing skeleton templates |
| references/workflow-and-output-patterns.md | Designing workflows, output formats, or feedback loops |
| references/quality-checklist.md | Running the pre-ship quality checklist |
| references/anti-patterns.md | Reviewing common mistakes to avoid |
| references/evaluation-guide.md | Creating evaluations to measure skill quality objectively |
| references/hooks-recipes.md | Setting up hooks in skills or understanding hook patterns |
Before diving into mechanics, internalize these design principles:
Context window is a public good. Every token in a skill competes with the user's code, conversation, and other tools. Challenge each line with three questions: (1) Does Claude already know this? (2) Will the agent use this on every invocation? (3) Can this live in a reference file instead?
Degrees of freedom. Match your specificity to the task's tolerance for variation:
| Level | When | Example |
|-------|------|---------|
| High specificity | Output must be exact (configs, schemas) | "Generate exactly this YAML structure" |
| Medium specificity | Process matters, details vary (workflows) | "Follow these steps, adapt to the project" |
| Low specificity | Agent judgment is the point (reviews, analysis) | "Check for these categories of issues" |
Claude is already smart. Don't teach programming, well-known APIs, or common patterns. Only add context Claude doesn't already have: project conventions, domain rules, non-obvious constraints.
Enforce mechanically, not with checklists. If the skill's purpose is catching problems (review, audit, compliance), it should generate hooks, lint rules, or check scripts that run automatically — not be a list of things the agent must remember to check. A PostToolUse hook runs on every file edit; a skill instruction relies on agent discipline.
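As a rough illustration of mechanical enforcement, here is a minimal sketch of a check script that a hook or CI step could call on each edited file. The rule itself (every TODO must name an owner) and the names `find_violations` and `check_files` are hypothetical. How the script is wired into a PostToolUse hook depends on the host tool and is not shown here.

```python
import re

# Hypothetical rule: a TODO must name an owner, e.g. "TODO(alice): ...".
BARE_TODO = re.compile(r"\bTODO\b(?!\()")


def find_violations(text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs for every TODO without an owner."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if BARE_TODO.search(line)
    ]


def check_files(paths: list[str]) -> int:
    """Check each file and return a shell-style exit code (1 if anything failed)."""
    failed = False
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for number, line in find_violations(handle.read()):
                print(f"{path}:{number}: TODO without owner: {line}")
                failed = True
    return 1 if failed else 0
```

A caller would pass the edited file paths to `check_files` and use the return value as the process exit code, so a violation fails the run without relying on the agent remembering the rule.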
Don't duplicate existing tools. Before building a skill, check what linters, formatters, and CI tools already handle in the target ecosystem. If ESLint catches unused imports, don't create a skill that also checks for them. Skills should add value beyond what standard tooling provides.
Design for discoverability. A skill that never gets invoked is worthless. The description's "Use when" triggers are necessary but not sufficient — the agent scans them at session start and may not connect them to the right moment. Add cross-references in related skills ("Related Skills" section) and in CLAUDE.md for broadly useful skills. Ask: "When would someone need this, and what will they be looking at right before they need it?"
Before writing anything, identify which pattern fits the task:
| Pattern | Best For | Example |
|---------|----------|---------|
| Guided Workflow | Multi-step processes with decisions | creating-skills, deploying-apps |
| Rules-Based Audit | Code review, linting, compliance | reviewing-typescript, auditing-security |
| Scaffolding / Generation | Creating files from templates | generating-apis, initializing-projects |
| Knowledge Reference | Lookup tables, conventions, mappings | mapping-status-codes, converting-units |
Not sure? Ask these questions:
Read references/skill-patterns.md for full skeleton templates and directory structures.
Copy this checklist and work through it step by step:
- [ ] Step 1: Understand the skill
- [ ] Step 2: Choose a pattern
- [ ] Step 3: Initialize the skill
- [ ] Step 4: Write the SKILL.md body
- [ ] Step 5: Add reference files
- [ ] Step 6: Validate
- [ ] Step 7: Create evaluations
- [ ] Step 8: Test with real usage
- [ ] Step 9: Iterate
Before writing any files, have a conversation to clarify:
Synthesize into a draft description using the formula:
[Does what] for/using [domain]. [Checks/covers what]. Use when [triggers]
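The formula can be sketched as a small helper that joins the three parts. This is only an illustration: `build_description` and its parameter names are hypothetical and not part of any skill tooling.

```python
def build_description(does_what: str, covers: str, triggers: list[str]) -> str:
    """Compose "[Does what]. [Covers what]. Use when [triggers]" as one string."""
    if not triggers:
        raise ValueError("at least one trigger is needed for the 'Use when' clause")
    *head, last = triggers
    if len(head) > 1:
        # Three or more triggers: "a, b, or c".
        trigger_text = f"{', '.join(head)}, or {last}"
    else:
        # One or two triggers: "a" or "a or b".
        trigger_text = " or ".join(triggers)
    # Strip trailing periods so the joined sentences are not double-punctuated.
    return f"{does_what.rstrip('.')}. {covers.rstrip('.')}. Use when {trigger_text}"


print(build_description(
    "Guides creation of agent skills following best practices and the open format specification",
    "Covers pattern selection, frontmatter, directory structure, reference files, validation, and iteration",
    ["creating a new skill", "updating SKILL.md", 'asking "how to write a skill"'],
))
```

The call above rebuilds this skill's own description from its three parts.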
Use the decision table above to select a primary pattern. Most skills combine patterns — pick the dominant one and incorporate elements from others.
For example, creating-skills is primarily a Guided Workflow but includes:
Quick start with the official CLI:
npx skills init <skill-name>
Or use the enhanced scaffolding script (includes best-practice guidance in the template):
python scripts/init_skill.py <skill-name> --path skills
This creates:
skills/<skill-name>/
├── SKILL.md (template with TODOs)
└── references/ (empty, ready for use)
If not using the script, create this structure manually. The name must be kebab-case, preferably with a gerund first word (e.g., analyzing-data, not data-analyzer).
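If you create the structure by hand, something like the following sketch could check the name and lay out the directory. It is not the actual `init_skill.py`. The function names are hypothetical, the gerund test is a crude "ends in -ing" heuristic, and the placeholder frontmatter assumes the usual `---`-delimited block with only the `name` and `description` fields mentioned in this guide.

```python
import re
from pathlib import Path

KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def check_skill_name(name: str) -> list[str]:
    """Return a list of problems with a skill name; an empty list means it looks fine."""
    problems = []
    if not KEBAB_CASE.fullmatch(name):
        problems.append("name must be kebab-case (lowercase words joined by hyphens)")
    # Heuristic only: treat a first word ending in "ing" as a gerund.
    elif not name.split("-")[0].endswith("ing"):
        problems.append("first word is preferably a gerund, e.g. analyzing-data")
    return problems


def scaffold_skill(name: str, root: Path) -> Path:
    """Create <root>/<name>/SKILL.md and an empty references/ directory."""
    skill_dir = root / name
    (skill_dir / "references").mkdir(parents=True, exist_ok=True)
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.exists():
        # Placeholder frontmatter (assumed layout); fill in the description by hand.
        skill_md.write_text(
            f"---\nname: {name}\ndescription: TODO\n---\n", encoding="utf-8"
        )
    return skill_dir
```

For example, `check_skill_name("data-analyzer")` reports the gerund preference, while `scaffold_skill("analyzing-data", Path("skills"))` creates the two-entry layout shown above without overwriting an existing SKILL.md.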
Open the generated SKILL.md and fill it in:
Frontmatter: Fill in name and description. See references/format-specification.md for the full spec.
Reference Files table: List every reference file with a "Read When" condition
Main content: Follow the skeleton for your chosen pattern
Workflows: Use copyable checklists for sequential steps. See references/workflow-and-output-patterns.md for patterns.
Create reference files in references/ for detailed content that would bloat the main SKILL.md.
Common reference file types:
Rules for reference files:
Run the validator against your skill:
python scripts/validate_skill.py skills/<skill-name>
The validator checks:
For official spec compliance, also run: skills-ref validate skills/<skill-name>
(install: pip install skills-ref)
Fix all errors (blocking). Review all warnings (advisory — fix or acknowledge).
If errors persist after 3 attempts, review references/anti-patterns.md for common mistakes.
Also run through the manual references/quality-checklist.md for items the automated validator cannot check (content quality, token efficiency, terminology consistency).
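To make the error/warning split concrete, here is a minimal sketch of how a validator might treat the two severities. It is not the real `validate_skill.py`. The two checks are borrowed from the anti-patterns table in this guide, and the choice of which one counts as an error versus a warning is an assumption made for illustration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Finding:
    severity: str  # "error" (blocking) or "warning" (advisory)
    message: str


def check_skill(skill_dir: Path) -> list[Finding]:
    """Run two illustrative checks against a skill directory."""
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.is_file():
        return [Finding("error", "SKILL.md is missing")]
    body = skill_md.read_text(encoding="utf-8")
    findings = []

    # Every Markdown file under references/ should be mentioned in SKILL.md.
    for ref in sorted((skill_dir / "references").glob("*.md")):
        if f"references/{ref.name}" not in body:
            findings.append(Finding("error", f"unlisted reference file: {ref.name}"))

    # Windows-style paths such as references\foo.md should use forward slashes.
    if "references\\" in body:
        findings.append(Finding("warning", "backslash path found; use forward slashes"))
    return findings


def exit_code(findings: list[Finding]) -> int:
    """Errors block (non-zero exit); warnings alone do not."""
    return 1 if any(f.severity == "error" for f in findings) else 0
```

Returning findings as data instead of printing them directly keeps the blocking decision in one place (`exit_code`), so a caller can still list the warnings for review.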
Before testing, define what success looks like.
Write 3–5 evaluation cases that cover the happy path, an edge case, and a failure mode you anticipate.
See references/evaluation-guide.md for the full methodology and JSON format.
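As a loose sketch of what a small evaluation set could look like before it is written out in the official format: every field name below (`name`, `kind`, `prompt`, `expected`) is hypothetical, and the real JSON schema is the one defined in references/evaluation-guide.md, which may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EvalCase:
    # All field names here are hypothetical, not the official evaluation format.
    name: str
    kind: str  # "happy-path", "edge-case", or "failure-mode"
    prompt: str
    expected: list[str] = field(default_factory=list)


def missing_kinds(cases: list[EvalCase]) -> set[str]:
    """Return which of the three recommended case kinds are not yet covered."""
    required = {"happy-path", "edge-case", "failure-mode"}
    return required - {case.kind for case in cases}


cases = [
    EvalCase("basic", "happy-path", "Create a skill for reviewing SQL migrations",
             ["produces SKILL.md with name and description"]),
    EvalCase("bad-name", "edge-case", "Create a skill called Data_Analyzer",
             ["suggests a kebab-case name"]),
]

print(json.dumps([asdict(case) for case in cases], indent=2))
print("Still missing:", sorted(missing_kinds(cases)))
```

Here the set covers a happy path and an edge case, so the helper flags that a failure-mode case still needs to be written.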
Key steps:
Automated validation catches format issues but not usability problems. Test with real tasks:
Install the skill in ~/.claude/skills/ or use npx skills add
What to observe during testing:
Multi-model testing: Test with at least two capability levels:
If the skill fails or produces poor output, go back to Step 4 and revise.
Skills improve through real usage and feedback.
Establish a baseline first: If you haven't already (Step 7), complete the target task without the skill to see what the agent does on its own. This reveals which parts of your skill actually add value vs. what the agent already handles.
Use the Claude A/B pattern:
Gather team feedback: If others use the skill, ask them what worked and what didn't. Different users trigger skills differently — their experience reveals gaps your testing missed.
This external feedback loop catches blind spots that self-review misses.
Avoid these common mistakes (see references/anti-patterns.md for full details):
| Anti-Pattern | Fix |
|-------------|-----|
| Over-explaining (things Claude already knows) | Only include domain-specific or non-obvious info |
| Windows-style backslash paths | Always use forward slashes |
| Deeply nested references | Keep references/ one level deep |
| Too many options without a default | Always recommend a default |
| "When to use" in body instead of description | Put triggers in the frontmatter description |
| Unlisted reference files | List every file in the Reference Files table |
| Time-sensitive claims | Use evergreen phrasing or link to sources |
| Wrong voice in description | Use third-person: "Guides..." not "Guide..." |
| Building enforcement as a checklist | Generate hooks, lint rules, or check scripts instead |
| Duplicating linter/formatter functionality | Check what tools exist first; skills should add value beyond standard tooling |
| Platform-specific scripts without fallbacks | Scripts must work on both Windows and macOS/Linux — branch on platform.system() where needed |
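The last two path and platform rows in the table can be sketched in a few lines. The helper names are hypothetical, and the specific commands (`cmd /c start`, `open`, `xdg-open`) are just one common way to open a file on each platform, not something this guide prescribes.

```python
import platform
from pathlib import PureWindowsPath


def open_command(path: str) -> list[str]:
    """Build the command that opens a file with the platform's default application."""
    system = platform.system()  # "Windows", "Darwin" (macOS), or "Linux"
    if system == "Windows":
        # The empty string is the window title argument that "start" expects.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    return ["xdg-open", path]


def to_forward_slashes(windows_path: str) -> str:
    """Convert a Windows-style path to the forward-slash form used in skill files."""
    return PureWindowsPath(windows_path).as_posix()


print(to_forward_slashes("references\\format-specification.md"))
```

Branching once inside a helper keeps the rest of the script platform-neutral, and normalizing paths before writing them into SKILL.md avoids the backslash anti-pattern.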
When using this skill alongside others: