plugins/core/skills/skill-creator/SKILL.md
Guide for creating or updating a Claude Code skill. Use this skill when defining a new skill, restructuring an existing one, deciding what belongs in SKILL.md vs bundled resources, or improving a skill that under-triggers, over-prescribes, or lacks high-signal guidance.
This skill provides guidance for creating effective skills. Use it to design, restructure, or harden a skill so it triggers correctly, stays lean, and contains the non-obvious guidance that actually changes Claude's behavior.
Treat a skill as a folder-based capability, not just a markdown note. A strong skill gives Claude non-obvious knowledge, reusable tools, and just enough structure to execute well without being railroaded.
Optimize for the delta from Claude's default behavior:
A mediocre skill restates obvious steps. A strong skill changes what Claude can do or how reliably it can do it.
Most strong skills fall primarily into one category. Prefer a clean primary category instead of blending several unrelated ones.
Library and API reference
Product verification
Data fetching and analysis
Business process and team automation
Code scaffolding and templates
Code quality and review
CI/CD and deployment
Runbooks
Infrastructure operations
If a proposed skill does not fit any of these categories, verify that it is still a real repeatable capability instead of a one-off note.
Every skill consists of a required SKILL.md file and optional bundled resources:
skill-name/
├── SKILL.md
├── scripts/ # executable helpers
├── references/ # detailed docs loaded on demand
├── templates/ # reusable output structures
├── assets/ # boilerplate files, sample outputs, images, fixtures
├── config.json # optional user/team-specific configuration
└── logs/ or data/ if truly needed for local state
Use the folder structure deliberately:
When persistent writable data is required across upgrades, prefer ${CLAUDE_PLUGIN_DATA} instead of storing state directly inside the shipped skill folder.
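A helper script might resolve its state directory along the following lines. This is a minimal sketch, assuming `CLAUDE_PLUGIN_DATA` is visible to the script as an environment variable; the function name `resolve_state_dir` and the fallback location are hypothetical and not part of this skill.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_state_dir(skill_name: str, fallback: Optional[Path] = None) -> Path:
    """Return a writable directory for a skill's persistent state.

    Assumption: CLAUDE_PLUGIN_DATA is exposed as an environment variable.
    The fallback location below is hypothetical, used only when it is unset.
    """
    base = os.environ.get("CLAUDE_PLUGIN_DATA")
    if base:
        root = Path(base)
    else:
        root = fallback or Path.home() / ".cache" / "claude-skills"
    state_dir = root / skill_name
    # Create the directory on first use so callers can write immediately.
    state_dir.mkdir(parents=True, exist_ok=True)
    return state_dir
```

Keeping the lookup in one function means the rest of the script never hard-codes a path inside the shipped skill folder.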
Design the skill so Claude reads only what is needed:
This means:
Good pattern:
references/api.md
scripts/fetch_metrics.py
The description field is for the model, not for a human catalog page. It should describe when the skill should trigger.
Write descriptions that mention:
Prefer trigger-oriented wording like:
Avoid vague summaries like:
A gotchas section is often the highest-signal part of the skill.
Include:
If the skill is mostly knowledge, prioritize gotchas over generic explanation. Start small if needed: one excellent gotcha is more valuable than pages of boilerplate.
Describe what tools and approaches are available and when each is useful. Only enforce a rigid sequence when safety or correctness requires it.
If Claude keeps rewriting the same helper logic, ship it as a script instead of describing it. Scripts are better when they provide deterministic reliability, lower token usage, or easier composition. Do not stop at describing what a helper should do when the helper can be shipped directly.
Verification skills compound in value. When relevant, include driving scripts, assertions on expected state, screenshot/video capture, and structured result output. A good verification skill gives Claude a repeatable way to prove the feature works, not just "test the feature".
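One way to combine assertions on expected state with structured result output is sketched below. The class names `CheckResult` and `VerificationReport` and the JSON layout are illustrative assumptions, not a format this skill prescribes.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, List


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


@dataclass
class VerificationReport:
    """Hypothetical container that collects named checks for one feature."""

    feature: str
    checks: List[CheckResult] = field(default_factory=list)

    def check(self, name: str, assertion: Callable[[], bool]) -> None:
        """Run one assertion and record the outcome instead of raising."""
        try:
            passed = bool(assertion())
            detail = "" if passed else "assertion returned a falsy value"
        except Exception as exc:  # record the failure so later checks still run
            passed, detail = False, f"{type(exc).__name__}: {exc}"
        self.checks.append(CheckResult(name, passed, detail))

    @property
    def ok(self) -> bool:
        return all(c.passed for c in self.checks)

    def to_json(self) -> str:
        """Serialize the report so Claude can read the results as data."""
        payload = {
            "feature": self.feature,
            "ok": self.ok,
            "checks": [asdict(c) for c in self.checks],
        }
        return json.dumps(payload, indent=2)
```

A driving script would call `report.check("order is paid", lambda: state["status"] == "paid")` for each expectation and print `report.to_json()` at the end, so a failed check is reported alongside the ones that passed.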
config.json when needed.
${CLAUDE_PLUGIN_DATA} for durable writable state.
Follow this process in order unless there is a clear reason to skip a step.
Understand how the skill should actually be invoked. Gather or propose concrete examples such as:
Ask only a few focused questions at a time. A skill is ready to design once there is a clear picture of the recurring task and its trigger patterns.
Identify the primary category from the list above. Then identify the delta from Claude's default capabilities:
If there is no meaningful delta, the task may not need a skill.
For each representative example, decide what should live in:
${CLAUDE_PLUGIN_DATA}
Use this rule of thumb:
When creating a new skill from scratch, initialize the folder first. From the repository root, run:
uv run python plugins/core/skills/skill-creator/scripts/init_skill.py <skill-name> --path <output-directory>
This script creates:
scripts/, references/, templates/, and assets/ directories
Delete any example files that are not actually useful.
Before writing from scratch, use the bundled starter resources in this skill:
templates/simple-skill.md for single-file skills
templates/router-skill.md for multi-workflow/router-style skills
references/recommended-structure.md for deciding simple vs router structure
references/using-scripts.md for script design and workflow integration
references/using-templates.md for template design and placeholder conventions
scripts/evaluate_skill.py for summarizing repeated trigger and negative-trigger eval runs
Copy the closest template, then edit heavily for your actual domain.
Add evals/skill-evals.yaml early so trigger behavior is tested, not guessed.
When editing SKILL.md, optimize for usefulness to another Claude instance.
Include:
Write instructions in clear, direct, imperative language. Focus on non-obvious procedural knowledge and concrete guidance.
Remove filler, duplicated explanation, and obvious advice.
Move long reference material into references/.
If the skill starts feeling bloated, split detail out instead of expanding SKILL.md indefinitely.
Before packaging, validate the finished skill. From the repository root, run:
uv run python plugins/core/skills/skill-creator/scripts/quick_validate.py <path/to/skill-folder>
Then summarize repeated trigger results using the eval harness:
uv run python plugins/core/skills/skill-creator/scripts/evaluate_skill.py <path/to/skill-folder>/evals/skill-evals.yaml --results <results.json>
The results JSON should capture repeated true/false outcomes for both should_trigger and should_not_trigger prompts.
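To make "repeated true/false outcomes" concrete, the sketch below summarizes one possible results layout. The JSON shape (prompt text mapped to a list of booleans per group) is an assumption for illustration; the format that evaluate_skill.py actually expects may differ, so check the script before relying on it.

```python
import json
from typing import Dict, List

# Hypothetical shape: group -> prompt -> one boolean per run (True = skill triggered).
Results = Dict[str, Dict[str, List[bool]]]

RAW = """
{
  "should_trigger": {"create a new skill for our deploy runbook": [true, true, false]},
  "should_not_trigger": {"rename this variable": [false, false, false]}
}
"""


def summarize(results: Results) -> Dict[str, float]:
    """Return the fraction of runs that behaved correctly for each group."""
    summary: Dict[str, float] = {}
    for group, expect_trigger in (("should_trigger", True), ("should_not_trigger", False)):
        runs = [outcome for outcomes in results.get(group, {}).values() for outcome in outcomes]
        if runs:
            correct = sum(1 for triggered in runs if triggered == expect_trigger)
            summary[group] = correct / len(runs)
    return summary


if __name__ == "__main__":
    print(summarize(json.loads(RAW)))
```

Scoring the two groups separately keeps an over-triggering description from hiding behind a high overall pass rate.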
To package a distributable zip, run:
uv run python plugins/core/skills/skill-creator/scripts/package_skill.py <path/to/skill-folder>
Optional output directory:
uv run python plugins/core/skills/skill-creator/scripts/package_skill.py <path/to/skill-folder> ./dist
The best skills improve through real use. After a skill is used:
evals/skill-evals.yaml with the new trigger and negative-trigger cases
Treat iteration as part of the design, not as cleanup.
When updating an existing skill, check for these failure modes:
evals/skill-evals.yaml
A description like "Use this skill when working with files" will trigger on nearly every request. Descriptions must name the specific domain, task type, or failure mode. Test by asking: would this description match requests that should NOT use this skill?
The init_skill.py scaffold includes TODO markers and example files (scripts/example.py, references/reference_notes.md, templates/example_template.md, assets/example_asset.txt). The validator catches obvious TODO-style markers, but subtler template remnants (for example, generic placeholder instructions) can slip through. Always delete or replace every scaffold file before packaging.
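A pre-packaging check for the remnants described above could look roughly like this. It is a sketch, not the logic of quick_validate.py: the file names come from the scaffold list above, while the function name `find_scaffold_remnants` and the plain `TODO` pattern are assumptions. Like the validator, it will not catch subtler leftovers such as generic placeholder instructions.

```python
import re
from pathlib import Path
from typing import List

# Example files named above as part of the init_skill.py scaffold.
SCAFFOLD_FILES = (
    "scripts/example.py",
    "references/reference_notes.md",
    "templates/example_template.md",
    "assets/example_asset.txt",
)
# Assumption: scaffold markers contain the literal word TODO.
TODO_PATTERN = re.compile(r"\bTODO\b")


def find_scaffold_remnants(skill_dir: Path) -> List[str]:
    """List leftover scaffold files and TODO markers in Markdown files."""
    problems: List[str] = []
    for rel in SCAFFOLD_FILES:
        if (skill_dir / rel).exists():
            problems.append(f"leftover scaffold file: {rel}")
    for path in sorted(skill_dir.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if TODO_PATTERN.search(line):
                rel_path = path.relative_to(skill_dir).as_posix()
                problems.append(f"{rel_path}:{lineno}: TODO marker")
    return problems
```

An empty list means none of the known scaffold files or markers remain; it does not mean the skill's content is finished.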
The most common single mistake. "A powerful skill for managing deployments" tells the model nothing about when to activate. Rewrite as: "Use this skill when deploying to production, rolling back a release, or debugging CI pipeline failures in the deploy workflow."
Skills that start lean accumulate detail over iterations until they consume excessive context on every trigger. After each round of additions, re-evaluate whether new content is guidance (belongs in SKILL.md) or reference material (belongs in references/).
If quick_validate.py and package_skill.py behaviors diverge over time, packaging may succeed while quality checks regress (or vice versa). Keep both scripts aligned when adding new resource types, validation rules, or exclusions.
A gotchas section that says "be careful with edge cases" adds no value. Every gotcha should describe a specific situation where the obvious approach fails, what goes wrong, and what to do instead. If you do not have real failure data yet, start with one concrete gotcha and add more as the skill is used.
After creating or updating a skill, verify it by running the validator from the repository root:
uv run python plugins/core/skills/skill-creator/scripts/quick_validate.py <path/to/skill-folder>
The validator checks:
scripts/, references/, templates/, assets/, config.json) actually exist on disk
evals/skill-evals.yaml
A clean validation pass is necessary but not sufficient. After the skill is used in a real session, check whether it triggered correctly and whether Claude's output was meaningfully better than it would have been without the skill. Prefer repeated trials and negative-trigger checks over one-off spot checks.
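The resource-existence check mentioned above can be approximated as follows. This is an illustrative sketch rather than the validator's real implementation: the regular expression and the function name `missing_resources` are assumptions, and it only scans SKILL.md for paths under the four resource directories.

```python
import re
from pathlib import Path
from typing import List

# Matches relative paths such as scripts/fetch_metrics.py; a bare "scripts/" is ignored.
RESOURCE_REF = re.compile(r"\b(?:scripts|references|templates|assets)/[\w./-]*\w")


def missing_resources(skill_dir: Path) -> List[str]:
    """Return resource paths mentioned in SKILL.md that do not exist on disk."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    referenced = sorted(set(RESOURCE_REF.findall(text)))
    return [ref for ref in referenced if not (skill_dir / ref).exists()]
```

Running it against a skill folder returns the dangling references, which is usually a sign that a file was renamed or a scaffold example was deleted without updating SKILL.md.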
A skill is successful when: