Name: 技能创建助手 (Skill Creation Assistant)
Author: laborany

Skill Creator

Create and iteratively improve skills through evaluation, scoring, and description optimization.

About Skills

Skills are modular, self-contained packages that extend Claude's capabilities with specialized knowledge, workflows, and tools. They transform Claude from a general-purpose agent into a specialized one equipped with procedural knowledge.

What Skills Provide

Specialized workflows — Multi-step procedures for specific domains
Tool integrations — Instructions for working with specific file formats or APIs
Domain expertise — Company-specific knowledge, schemas, business logic
Bundled resources — Scripts, references, and assets for complex and repetitive tasks

Core Principles

Concise is Key

The context window is a public good, shared among the system prompt, conversation history, other skills' metadata, and the user request.

Default assumption: Claude is already very smart. Only add context Claude doesn't already have. Challenge each piece of information: "Does Claude really need this?" and "Does this paragraph justify its token cost?"

Set Appropriate Degrees of Freedom

High freedom (text-based instructions): Multiple approaches valid, decisions depend on context
Medium freedom (pseudocode/scripts with parameters): Preferred pattern exists, some variation acceptable
Low freedom (specific scripts, few parameters): Operations fragile, consistency critical

Anatomy of a Skill

skill-name/
├── SKILL.md (required)
│   ├── YAML frontmatter (name + description required)
│   └── Markdown instructions
└── Bundled Resources (optional)
    ├── scripts/      — Executable code
    ├── references/   — Documentation loaded into context as needed
    └── assets/       — Files used in output (templates, icons, fonts)

SKILL.md (required)

Frontmatter (YAML): name and description fields. These determine when the skill triggers — be clear and comprehensive.
Body (Markdown): Instructions loaded AFTER the skill triggers.
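To make the frontmatter/body split concrete, here is a minimal sketch that separates the two parts of a SKILL.md file and reports which required fields are missing. It is an illustration only: the function names `split_skill_md` and `missing_required` are hypothetical, and the parser handles only flat `key: value` pairs plus indented continuation lines, not full YAML.

```python
from __future__ import annotations


def split_skill_md(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (frontmatter fields, markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter fence")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter is missing its closing '---'") from None

    fields: dict[str, str] = {}
    key = None
    for line in lines[1:end]:
        if line[:1] in (" ", "\t") and key is not None:
            # Indented line: continuation of a block value such as `description: |`
            fields[key] = (fields[key] + " " + line.strip()).strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            value = value.strip()
            # A bare block-scalar marker carries no text of its own
            fields[key] = "" if value in ("|", ">") else value
    body = "\n".join(lines[end + 1:])
    return fields, body


def missing_required(fields: dict[str, str],
                     required: tuple[str, ...] = ("name", "description")) -> list[str]:
    """Return the required frontmatter keys that are absent or empty."""
    return [k for k in required if not fields.get(k)]
```

A real validator would likely use a YAML library; the hand-rolled parser here only keeps the sketch free of third-party dependencies.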

Bundled Resources (optional)

scripts/: Executable code for tasks needing deterministic reliability or repeatedly rewritten code
references/: Documentation loaded as needed to inform Claude's process
assets/: Files used in output, not loaded into context

What Not to Include

Do NOT create extraneous documentation: README.md, INSTALLATION_GUIDE.md, CHANGELOG.md, etc. The skill should only contain information needed for an AI agent to do the job.

Communicating with the User

Many skill users are not technical. When communicating:

Use plain language. Avoid jargon unless the user introduced it first.
Explain what you're doing and why, not just the technical details.
When asking for input, provide concrete examples of what you need.
If something fails, explain what happened in user-friendly terms and what you'll try next.
Celebrate progress — let the user know when milestones are reached.

Skill Creation Process

1. Understand the skill with concrete examples
2. Plan reusable skill contents (scripts, references, assets)
3. Initialize the skill (run init_skill.py)
4. Edit the skill (implement resources and write SKILL.md)
5. Package the skill (run package_skill.py)
6. Run and evaluate test cases
7. Improve the skill description
8. Iterate based on evaluation results

Step 1: Understanding the Skill

Skip this step only when the skill's usage patterns are already clearly understood.

Ask targeted questions:

"What functionality should this skill support?"
"Can you give examples of how it would be used?"
"What would a user say that should trigger this skill?"

Avoid overwhelming users — start with the most important questions.

Step 2: Planning Reusable Contents

Analyze each concrete example:

Consider how to execute from scratch
Identify what scripts, references, and assets would help when executing repeatedly

Step 3: Initializing the Skill

Run init_skill.py to generate a template skill directory:

scripts/init_skill.py <skill-name> --path <output-directory>

Skip if the skill already exists and only needs iteration.

Step 4: Edit the Skill

Remember: the skill is for another Claude instance to use. Include non-obvious procedural knowledge and domain-specific details.

Frontmatter

Write name and description:

description is the primary triggering mechanism
Include both what the skill does AND specific triggers/contexts
All "when to use" info goes here — the body is only loaded after triggering

For LaborAny skills, also include icon and category.

Body

Write instructions for using the skill and its bundled resources. Keep SKILL.md body under 500 lines. Split into reference files when approaching this limit.

Step 5: Packaging

scripts/package_skill.py <path/to/skill-folder>

Validates the skill and creates a distributable .skill file (zip format).
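As a rough illustration of what "validate, then zip" can look like, the sketch below checks for SKILL.md and writes a `.skill` archive with the standard `zipfile` module. This is not the actual `package_skill.py`; the function name, the output location, and the archive layout (a single top-level folder named after the skill) are assumptions.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def package_skill(skill_dir: Path, out_dir: Path) -> Path:
    """Zip a skill folder into <out_dir>/<skill-name>.skill (hypothetical layout)."""
    skill_dir = Path(skill_dir).resolve()
    out_dir = Path(out_dir).resolve()
    if not (skill_dir / "SKILL.md").is_file():
        raise FileNotFoundError(f"{skill_dir} has no SKILL.md")
    if out_dir == skill_dir or skill_dir in out_dir.parents:
        # Writing inside the skill folder would make the archive include itself
        raise ValueError("output directory must be outside the skill folder")

    out_dir.mkdir(parents=True, exist_ok=True)
    archive = out_dir / f"{skill_dir.name}.skill"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(skill_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the parent so the archive has one root folder
                zf.write(path, path.relative_to(skill_dir.parent).as_posix())
    return archive
```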

Step 6: Running and Evaluating Test Cases

This is the core of the evaluation system. The goal is to quantify skill quality and identify areas for improvement.

6.1 Define Test Cases

Create eval/eval_metadata.json in the skill directory. See references/schemas.md for the schema. Each test case has:

A user prompt (what the user would say)
Assertions (expected behaviors/properties of the output)
Optional tags and weights
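The three parts of a test case can be pictured with a small data model. The sketch below is hypothetical: the authoritative schema lives in references/schemas.md, and the field names used here (`id`, `prompt`, `assertions`, `tags`, `weight`) and the top-level `test_cases` key are assumptions chosen for illustration.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class EvalCase:
    """One test case; field names are illustrative, not the real schema."""
    id: str
    prompt: str                 # what the user would say
    assertions: list[str]       # expected behaviors/properties of the output
    tags: list[str] = field(default_factory=list)
    weight: float = 1.0


def dump_cases(cases: list[EvalCase]) -> str:
    """Serialize test cases to JSON after basic sanity checks."""
    ids = [c.id for c in cases]
    if len(set(ids)) != len(ids):
        raise ValueError("test case ids must be unique")
    for case in cases:
        if not case.prompt.strip() or not case.assertions:
            raise ValueError(f"{case.id}: needs a prompt and at least one assertion")
    return json.dumps({"test_cases": [asdict(c) for c in cases]},
                      ensure_ascii=False, indent=2)
```

Check any real eval_metadata.json against references/schemas.md rather than against this sketch.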

6.2 Spawn Evaluation Runs

Use scripts/run_eval.py to execute test cases against the skill:

python -m scripts.run_eval <skill-dir> [--test-case <id>] [--all]

Each run invokes claude -p with the skill loaded and captures the output.

6.3 Grade Results

The grader agent (agents/grader.md) evaluates each run's output against the assertions. It produces:

Pass/fail for each assertion with evidence
Overall score (0.0 to 1.0)
Eval quality critique (are the assertions good enough?)
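One plausible way to turn per-assertion verdicts into a 0.0 to 1.0 score is a weighted pass fraction. The sketch below assumes that scheme; the grader agent may compute its score differently, and the `AssertionResult` type and `overall_score` function are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AssertionResult:
    """A single graded assertion (illustrative structure)."""
    text: str
    passed: bool
    evidence: str
    weight: float = 1.0


def overall_score(results: list[AssertionResult]) -> float:
    """Weighted fraction of passed assertions, in the range 0.0 to 1.0."""
    total = sum(r.weight for r in results)
    if total <= 0:
        raise ValueError("need at least one assertion with positive weight")
    return sum(r.weight for r in results if r.passed) / total
```

For example, passing assertions of weight 1 and 2 while failing one of weight 1 gives 3 / 4 = 0.75.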

6.4 Aggregate and Benchmark

Use scripts/aggregate_benchmark.py to collect scores across runs into eval/benchmark.json. The analyzer agent (agents/analyzer.md) can then surface patterns and regressions.
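The aggregation step can be sketched as grouping run scores by test case and summarizing each group. The record keys (`test_case`, `score`) and the output shape below are assumptions for illustration and do not describe the real eval/benchmark.json format.

```python
from __future__ import annotations

from collections import defaultdict
from statistics import mean


def aggregate(runs: list[dict]) -> dict:
    """Summarize run scores per test case and list cases from weakest to strongest."""
    if not runs:
        raise ValueError("no runs to aggregate")
    by_case: dict[str, list[float]] = defaultdict(list)
    for run in runs:
        by_case[run["test_case"]].append(float(run["score"]))

    cases = {
        case_id: {"runs": len(scores), "mean": mean(scores), "min": min(scores)}
        for case_id, scores in by_case.items()
    }
    return {
        "overall_mean": mean(float(r["score"]) for r in runs),
        "cases": cases,
        # Lowest-scoring cases first, as candidates for the next iteration
        "lowest_first": sorted(cases, key=lambda cid: cases[cid]["mean"]),
    }
```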

6.5 Generate Review

Use eval-viewer/generate_review.py to create an HTML report for visual inspection of results and benchmark trends.

Step 7: Improving the Skill Description

Use scripts/improve_description.py to optimize the skill description based on evaluation results:

python -m scripts.improve_description <skill-dir>

This calls Claude via CLI to analyze eval results and propose a better description. The <new_description> tag in the response is extracted and applied.
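Extracting the tag can be done with a small regular expression. The sketch below is one way to do it, not necessarily how `improve_description.py` does; taking the last tag when several appear and collapsing whitespace are assumptions.

```python
from __future__ import annotations

import re

_TAG = re.compile(r"<new_description>(.*?)</new_description>", re.DOTALL)


def extract_new_description(response: str) -> str | None:
    """Return the text of the last <new_description> tag, or None if absent or empty."""
    matches = _TAG.findall(response)
    if not matches:
        return None
    # Collapse newlines and repeated spaces so the result fits a one-line field
    text = " ".join(matches[-1].split())
    return text or None
```

Returning `None` instead of raising lets the caller keep the existing description when the response contains no usable proposal.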

For the full eval-improve loop:

python -m scripts.run_loop <skill-dir> [--iterations <n>]

This automates: run evals → grade → aggregate → improve description → repeat.
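The shape of that loop can be sketched with the evaluation and improvement steps passed in as functions. Everything here is hypothetical: `run_loop`, `evaluate`, and `improve` are illustrative names, and keeping the best-scoring description seen so far is an assumption rather than documented behavior of `scripts.run_loop`.

```python
from __future__ import annotations

from typing import Callable


def run_loop(
    description: str,
    evaluate: Callable[[str], float],      # run evals, grade, aggregate -> score
    improve: Callable[[str, float], str],  # propose a new description
    iterations: int = 3,
) -> tuple[tuple[str, float], list[tuple[str, float]]]:
    """Alternate evaluation and improvement; return (best, full history)."""
    history: list[tuple[str, float]] = []
    for _ in range(iterations):
        score = evaluate(description)
        history.append((description, score))
        description = improve(description, score)
    # Score the final proposal too, so it can compete with earlier ones
    history.append((description, evaluate(description)))
    best = max(history, key=lambda item: item[1])
    return best, history
```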

Step 8: Iterate

After evaluation, iterate based on results:

Review the HTML report from eval-viewer
Check which test cases score lowest
Use the analyzer agent to find patterns
Update SKILL.md or bundled resources
Re-run evaluations to confirm improvement

Writing Patterns

Sequential Workflows

Break complex tasks into clear steps with an overview:

Processing involves these steps:
1. Analyze input (run analyze.py)
2. Transform data (run transform.py)
3. Validate output (run validate.py)

Conditional Workflows

Guide through decision points:

1. Determine the task type:
   **Creating new?** → Follow "Creation workflow"
   **Editing existing?** → Follow "Editing workflow"

Template Pattern

Provide output templates with appropriate strictness level.

Examples Pattern

Provide input/output pairs when output quality depends on seeing examples.

Writing Style

Use imperative/infinitive form in instructions
Be concise — every sentence should justify its token cost
Prefer examples over explanations
Keep reference files one level deep from SKILL.md
Structure files >100 lines with a table of contents

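Automatic Classification Rules

When creating a new skill, you must add the category and icon fields based on what the skill does:

| Keywords | Category (gloss) | Recommended icon |
|----------|------------------|------------------|
| Documents, Word, PDF, PPT, Excel | 办公 (Office) | 📝📄📊📈 |
| Stocks, finance, investment, earnings reports | 金融 (Finance) | 💹📊 |
| Papers, academic, research | 学术 (Academic) | 📚🎓 |
| Design, UI, frontend, web pages | 设计 (Design) | 🎨🖼️ |
| Data, monitoring, analysis | 数据 (Data) | 📈📉 |
| Reimbursement, expenses, accounting | 财务 (Accounting) | 💰💳 |
| Social media, operations, marketing | 运营 (Operations) | 📱📣 |
| Development, code, programming | 开发 (Development) | 🛠️💻 |
| Other | 工具 (Tools) | 🔧⚙️ |

Frontmatter example:

---
name: Skill name
description: |
  Skill description...
icon: 📝
category: 办公
---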
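The classification rule amounts to a keyword lookup with a fallback. The sketch below encodes the table as data and picks the first matching category; the first-match ordering, the whole-word matching for ASCII keywords, and the English keyword renderings are assumptions, and `classify` is a hypothetical helper rather than part of LaborAny.

```python
from __future__ import annotations

import re

# (category, first recommended icon, keywords). Categories and icons come from
# the table; the English keywords are assumed renderings of the Chinese ones.
RULES = [
    ("办公", "📝", ("文档", "document", "documents", "word", "pdf", "ppt", "excel")),
    ("金融", "💹", ("股票", "金融", "投资", "财报", "stock", "stocks", "finance", "investment")),
    ("学术", "📚", ("论文", "学术", "研究", "paper", "papers", "academic", "research")),
    ("设计", "🎨", ("设计", "前端", "网页", "design", "ui", "frontend", "web")),
    ("数据", "📈", ("数据", "监控", "分析", "data", "monitoring", "analysis")),
    ("财务", "💰", ("报销", "费用", "财务", "reimbursement", "expense", "expenses", "accounting")),
    ("运营", "📱", ("社交", "运营", "营销", "social", "operations", "marketing")),
    ("开发", "🛠️", ("开发", "代码", "编程", "development", "code", "programming")),
]
FALLBACK = ("工具", "🔧")


def classify(description: str) -> tuple[str, str]:
    """Return (category, icon) for a skill description, or the fallback pair."""
    text = description.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    for category, icon, keywords in RULES:
        for kw in keywords:
            # ASCII keywords match whole words only, so "ui" does not hit "build";
            # Chinese keywords match as substrings because the text has no spaces.
            hit = kw in words if kw.isascii() else kw in text
            if hit:
                return category, icon
    return FALLBACK
```

Keyword matching is only a first guess; a description that touches several categories still needs a judgment call about the skill's main purpose.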

LaborAny Skill Install Rules (Mandatory)

When the user asks to install a skill, do not run a free-form manual process. Always follow this deterministic flow:

1. Extract the install source from the user's input. Supported source forms:
   - GitHub repo/tree URL (for example: https://github.com/org/repo/tree/main/skills/agent-browser)
   - GitHub short form (for example: org/repo/skills/agent-browser)
   - Direct downloadable ZIP/TAR URL (for example: https://example.com/agent-browser.zip or https://example.com/agent-browser.tar.gz)
2. Use LaborAny's built-in installation API/flow to install into the user skill directory.
3. Never copy files into builtin skills/ manually.
4. Ensure metadata is valid for LaborAny:
   - icon and category must exist
   - fill missing values according to skill purpose
   - do not override valid existing values
5. After install, clearly report:
   - installed skill ID
   - absolute installed path
   - where to find it in the UI (能力管理 -> 我的能力, roughly "Capability Management -> My Capabilities")
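Recognizing the three supported source forms can be sketched as a small classifier. This is an illustration, not LaborAny's installer: the function name, the returned labels, and the exact patterns accepted (for instance, treating `.zip`, `.tar`, and `.tar.gz` as the archive suffixes) are assumptions.

```python
from __future__ import annotations

import re
from urllib.parse import urlparse

ARCHIVE_SUFFIXES = (".zip", ".tar", ".tar.gz")
# owner/repo with an optional sub-path, e.g. org/repo/skills/agent-browser
_SHORT_FORM = re.compile(r"[\w.-]+/[\w.-]+(/[\w./-]+)?")


def classify_source(source: str) -> str:
    """Return 'archive_url', 'github_url', or 'github_short'; raise if unsupported."""
    source = source.strip()
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        if parsed.path.lower().endswith(ARCHIVE_SUFFIXES):
            return "archive_url"
        segments = [part for part in parsed.path.split("/") if part]
        if parsed.netloc.lower() in ("github.com", "www.github.com") and len(segments) >= 2:
            return "github_url"
        raise ValueError("invalid source URL/path")
    if _SHORT_FORM.fullmatch(source):
        return "github_short"
    raise ValueError("invalid source URL/path")
```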

If install fails, report concrete reason and next action, such as:

invalid source URL/path
archive has no SKILL.md
archive has multiple skill directories and cannot determine target
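The two archive-related failures can be detected by searching an extracted archive for SKILL.md files. The sketch below assumes the archive has already been unpacked into a directory; `find_skill_root` is a hypothetical helper, and the error messages simply reuse the wording above.

```python
from __future__ import annotations

from pathlib import Path


def find_skill_root(extracted: Path) -> Path:
    """Return the single directory that contains SKILL.md, or raise with a reason."""
    roots = sorted(p.parent for p in Path(extracted).rglob("SKILL.md") if p.is_file())
    if not roots:
        raise ValueError("archive has no SKILL.md")
    if len(roots) > 1:
        raise ValueError(
            "archive has multiple skill directories and cannot determine target"
        )
    return roots[0]
```

Reporting the exact reason gives the user a concrete next action, such as pointing at one sub-directory of a multi-skill repository.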

If source structure is not fully compliant with LaborAny skill format, adapt it automatically:

create/repair SKILL.md template
ensure name, description, icon, category are available
keep original files as references/scripts/assets when possible
