
mathews-tom/skill-distiller

Name: skill-distiller
Author: mathews-tom

skills/skill-distiller/SKILL.md

npx skillsauth add mathews-tom/armory skill-distiller


Skill Distiller

Transform skills authored for high-capability models (Opus) into deterministic workflows that execute reliably on lower-cost models (Sonnet, Haiku). The core insight from EvoSkills: skills encode reusable task structure, not model-specific artifacts. A skill evolved on Opus transfers with +35-45pp gains to other models — but only when the instructions are sufficiently deterministic that lower-capability models can follow them without improvising.

Reference Files

| File                                | Contents                                          | Load When |
| ----------------------------------- | ------------------------------------------------- | --------- |
| references/distillation-patterns.md | Pattern catalog for converting reasoning to rules | Always    |

Prerequisites

- The source skill must exist and pass package-evaluator at >= 70%
- Access to both the source model (Opus) and target model (Haiku/Sonnet) for validation
- The surrogate-verifier skill for cross-model assertion checking

Workflow

Phase 1: Complexity Analysis

Score each section of the source SKILL.md for reasoning difficulty:

| Complexity Signal                      | Score | Distillation Action                        |
| -------------------------------------- | ----- | ------------------------------------------ |
| Decision tree with 3+ branches         | HIGH  | Convert to explicit if/then lookup table   |
| "Use judgment" or "consider context"   | HIGH  | Replace with concrete heuristic rules      |
| Multi-step inference chain             | HIGH  | Break into numbered atomic steps           |
| Reference to domain expertise          | MED   | Add explicit reference file with knowledge |
| Clear enumerated steps                 | LOW   | Keep as-is                                 |
| Concrete examples with expected output | LOW   | Keep as-is                                 |

Produce a complexity map: section name -> complexity score -> planned action.
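A minimal sketch of what a complexity map could look like in code, assuming the source SKILL.md has already been split into a section-name to text mapping. The keyword patterns and the `score_section` and `complexity_map` helpers are hypothetical; the signals in the table (decision trees, inference chains) would need a closer reading than keyword matching can give.

```python
import re

# Hypothetical keyword signals loosely based on the table above.
# These are assumptions for illustration, not part of the skill.
HIGH_SIGNALS = [r"use judgment", r"consider (the )?context"]
MED_SIGNALS = [r"best practices", r"domain expertise"]

ACTIONS = {
    "HIGH": "Replace with concrete heuristic rules",
    "MED": "Add explicit reference file with knowledge",
    "LOW": "Keep as-is",
}


def score_section(text: str) -> str:
    """Return HIGH, MED, or LOW for one section of a SKILL.md."""
    lowered = text.lower()
    if any(re.search(pattern, lowered) for pattern in HIGH_SIGNALS):
        return "HIGH"
    if any(re.search(pattern, lowered) for pattern in MED_SIGNALS):
        return "MED"
    return "LOW"


def complexity_map(sections: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Map section name -> (complexity score, planned action)."""
    result = {}
    for name, body in sections.items():
        score = score_section(body)
        result[name] = (score, ACTIONS[score])
    return result
```

Calling `complexity_map({"Review": "Use judgment to decide."})` would mark the section HIGH and attach the matching action from the table.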

Phase 2: Trace Collection

Execute the source skill with Opus on 5 representative tasks:

1. Select tasks from evals/cases.yaml (positive cases) or generate new ones
2. For each task, capture the full execution trace:
   - Tool calls made (which tools, in what order)
   - Intermediate reasoning visible in output
   - Final output structure and content
   - Time taken and token usage
3. Store traces as structured data for pattern extraction
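One possible shape for the stored traces, sketched with dataclasses. The skill does not prescribe a schema, so the `ToolCall` and `ExecutionTrace` types, their field names, and the JSON Lines serialization are all assumptions chosen to cover the four items captured per task.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToolCall:
    tool: str        # hypothetical tool name, e.g. "Read" or "Grep"
    arguments: dict  # arguments passed to the tool


@dataclass
class ExecutionTrace:
    task_id: str
    tool_calls: list[ToolCall] = field(default_factory=list)  # kept in call order
    reasoning: list[str] = field(default_factory=list)        # reasoning visible in output
    final_output: str = ""
    seconds: float = 0.0  # time taken
    tokens: int = 0       # token usage


def serialize(traces: list[ExecutionTrace]) -> str:
    """Serialize traces as JSON Lines, one task per line."""
    return "\n".join(json.dumps(asdict(trace)) for trace in traces)
```

Keeping tool calls as an ordered list preserves the invocation sequence that Phase 3 looks for.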

Phase 3: Pattern Extraction

From the collected traces, extract deterministic patterns:

1. Decision paths: For each HIGH-complexity section, find the actual decisions Opus made across the 5 tasks. If Opus chose the same path in 4/5 cases, that path becomes the default rule.
2. Lookup tables: Where Opus applied domain knowledge, build explicit lookup tables (e.g., "if input contains SQL, use these patterns; if input contains Python, use those").
3. Concrete examples: Extract representative input/output pairs from traces to serve as few-shot examples in the distilled skill.
4. Tool sequences: Identify the common tool invocation pattern and make it explicit ("Step 1: Read the file. Step 2: Grep for pattern X. Step 3: Write output.").
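The 4/5 rule for decision paths can be sketched as a simple majority check. The `default_rule` helper is hypothetical, and returning `None` when no path reaches the threshold is an assumption; the skill does not say what to do in that case.

```python
from collections import Counter
from typing import Optional


def default_rule(decisions: list[str], agree: int = 4, out_of: int = 5) -> Optional[str]:
    """Return the path chosen in at least agree/out_of of the traces, else None.

    `decisions` holds one label per trace for a single decision point.
    """
    if not decisions:
        return None
    path, count = Counter(decisions).most_common(1)[0]
    # Integer comparison avoids floating-point rounding at the threshold.
    if count * out_of >= agree * len(decisions):
        return path
    return None
```

A decision point where Opus picked `"table"` in four of five traces yields `"table"` as the default; a 3/2 split yields `None`, which a caller might treat as a section that needs manual rules.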

Phase 4: Distilled Rewrite

Rewrite the SKILL.md applying all distillation actions from Phase 1:

| Source Pattern                        | Distilled Replacement                                        |
| ------------------------------------- | ------------------------------------------------------------ |
| "Analyze the code and determine..."   | "Check for these 5 specific patterns: [list]"                |
| "Use appropriate formatting"          | "Output as a markdown table with columns: [A, B, C]"         |
| "Consider the context to decide..."   | "If [condition A]: do X. If [condition B]: do Y. Default: Z" |
| "Apply best practices for..."         | Reference file with explicit best practices enumerated       |
| Multi-paragraph reasoning instruction | Numbered step list with single-sentence steps                |

Rules for the rewrite:

- Every instruction must be actionable by a model with no domain expertise
- No step should require inference; each step's input and output must be explicit
- Replace all "consider", "analyze", "determine" verbs with "check", "count", "list", "output"
- Add concrete examples for any step that could be ambiguous
- Keep the SKILL.md under 500 lines (distillation should reduce, not expand)
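Two of the rewrite rules are mechanical enough to check automatically: the banned verbs and the 500-line limit. The sketch below assumes the distilled SKILL.md is available as a string; `lint_skill` and the stem patterns are hypothetical, and the remaining rules (actionability, explicit inputs and outputs) still need a human or model review.

```python
import re

# Verb -> regex stem. Stems are an assumption so that inflected forms
# such as "analyzing" or "determines" are also caught.
BANNED_VERBS = {
    "consider": r"consider",
    "analyze": r"analy[sz]",
    "determine": r"determin",
}
MAX_LINES = 500


def lint_skill(text: str) -> list[str]:
    """Return rule violations found in a distilled SKILL.md."""
    problems = []
    lines = text.splitlines()
    if len(lines) >= MAX_LINES:
        problems.append(f"{len(lines)} lines; must be under {MAX_LINES}")
    for number, line in enumerate(lines, start=1):
        for verb, stem in BANNED_VERBS.items():
            if re.search(rf"\b{stem}\w*", line, flags=re.IGNORECASE):
                problems.append(f"line {number}: replace '{verb}'")
    return problems
```

An empty result means only that these two rules pass, not that the skill is ready for the target model.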

Phase 5: Target Model Validation

Run the distilled skill on the target model (Haiku or Sonnet):

1. Execute the same 5 tasks from Phase 2 with the distilled skill loaded
2. Use the surrogate-verifier to generate assertions for each task output
3. Compare pass rates:

| Metric              | Source (Opus + original) | Target (Haiku + distilled) | Delta |
| ------------------- | ------------------------ | -------------------------- | ----- |
| Assertions passed   | N/M                      | N/M                        | ±     |
| Weighted score      | X.XX                     | X.XX                       | ±     |
| Output completeness | %                        | %                          | ±     |
| Format compliance   | %                        | %                          | ±     |

If target model score < 80% of source model score, iterate:
- Identify which assertions the target model fails
- Add more explicit instructions for those specific failure points
- Re-run validation (max 3 iterations)
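The iteration rule above can be sketched as a bounded loop. The `run_target` and `revise` callables are hypothetical stand-ins for running the target model and for adding more explicit instructions; reading "max 3 iterations" as up to three re-runs after the initial validation is an assumption.

```python
from typing import Callable


def validate(source_score: float,
             run_target: Callable[[str], float],
             revise: Callable[[str], str],
             skill: str,
             max_iterations: int = 3,
             ratio: float = 0.8) -> tuple[str, float, bool]:
    """Revise and re-run until the target reaches ratio * source_score.

    Returns the final skill text, its score, and whether it met the bar.
    """
    score = run_target(skill)
    iterations = 0
    while score < ratio * source_score and iterations < max_iterations:
        skill = revise(skill)      # add explicit instructions for failures
        score = run_target(skill)  # re-run validation on the target model
        iterations += 1
    return skill, score, score >= ratio * source_score
```

If the final flag is still false after the last iteration, the skill has not met the 80% bar within the allowed attempts.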

Phase 6: Cross-Model Report

Produce the final comparison:

# Skill Distillation Report: <skill-name>

## Complexity Reduction
- Sections distilled: N/M (HIGH → LOW)
- Instruction word count: original X → distilled Y (Z% reduction)
- Decision points replaced with lookup tables: N

## Cross-Model Performance
| Model   | Assertions Passed | Weighted Score | Format Compliance |
|---------|-------------------|----------------|-------------------|
| Opus    | 7/7               | 1.00           | 100%              |
| Sonnet  | 6/7               | 0.92           | 100%              |
| Haiku   | 5/7               | 0.85           | 85%               |

## Changes Made
1. [Section] "Analyze complexity" → explicit 5-item checklist
2. [Section] "Apply formatting" → fixed markdown table template
...

## Recommendation
[SHIP | ITERATE | MANUAL_REVIEW_NEEDED]

Error Handling

| Error                              | Resolution                                                    |
| ---------------------------------- | ------------------------------------------------------------- |
| Source skill scores below 70%      | Refuse distillation; recommend evolution via test-engineer    |
| No execution traces available      | Generate synthetic tasks and collect traces before proceeding |
| Target model fails all assertions  | Skill may be too complex for target model; report with detail |
| Distilled skill longer than source | Review distillation; patterns may need consolidation          |
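The four error conditions can be expressed as plain checks. This is a sketch only: the `distillation_errors` helper and its parameters are hypothetical, measuring length in lines is an assumption, and in the actual workflow these checks would fire at different phases rather than all at once.

```python
def distillation_errors(source_score: float,
                        trace_count: int,
                        target_passed: int,
                        source_lines: int,
                        distilled_lines: int) -> list[str]:
    """Return the resolutions from the error table that apply to a run."""
    resolutions = []
    if source_score < 0.70:
        resolutions.append("Refuse distillation; recommend evolution via test-engineer")
    if trace_count == 0:
        resolutions.append("Generate synthetic tasks and collect traces before proceeding")
    if target_passed == 0:
        resolutions.append("Skill may be too complex for target model; report with detail")
    if distilled_lines > source_lines:
        resolutions.append("Review distillation; patterns may need consolidation")
    return resolutions
```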

Limitations

- Cannot distill skills that rely on multi-turn reasoning or on open-ended adaptive reasoning at many decision points
- Visual/interactive skills (HTML generation, browser automation) may not distill well
- Distillation optimizes for determinism, not creativity; skills requiring open-ended generation (writing, brainstorming) are poor candidates
- Trace collection requires actual model execution, incurring API costs


Converts Opus-quality skills into deterministic Haiku-executable workflows via trace-driven distillation and cross-model validation. Triggers on: "distill this skill", "make this skill work on Haiku", "cross-model optimization", "optimize skill for cost". NOT for code simplification, use code-refiner.

221 stars

development

Updated May 4, 2026


npx skillsauth add mathews-tom/armory skill-distiller

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner         | Description                                    | Status  | Reported % |
| --------------- | ---------------------------------------------- | ------- | ---------- |
| Trivy           | Container and dependency vulnerability scanner | Clean   | 95%        |
| Semgrep         | Static code analysis for vulnerabilities       | Clean   | 95%        |
| mcp-scan (Snyk) | Model Context Protocol security validation     | Clean   | 95%        |
| Snyk (dep)      | Open source security scanning                  | Skipped | 50%        |
| Socket.dev      | Supply chain security analysis                 | Skipped | 50%        |
| VirusTotal      | Multi-engine malware detection                 | Skipped | 50%        |
| CrowdStrike     | Advanced threat intelligence                   | Skipped | 50%        |
| OSV-Scanner     | Open Source Vulnerability database check       | Skipped | 50%        |
| OWASP Dep-Check |                                                | Skipped | 50%        |

Last scanned: May 4, 2026, 7:09 AM · 153.6s · 3 files scanned

SKILL.md

name: skill-distiller
description: Converts Opus-quality skills into deterministic Haiku-executable workflows via trace-driven distillation and cross-model validation. Triggers on: "distill this skill", "make this skill work on Haiku", "cross-model optimization", "optimize skill for cost". NOT for code simplification, use code-refiner.
version: 1.0.1
category: development
tags: [distillation, cross-model, optimization, haiku, deterministic]
difficulty: advanced
phase: build


Related Skills

mathews-tom/stacked-prs

testing

Verified · Trusted · Community

Manages dependent branch stacks and stacked pull requests using safe Git topology rules. Triggers on: "create stacked PRs", "publish this stack", "sync my PR stack", "rebase this stack", "merge the stack", "retarget child PRs", "split this branch into stacked PRs", "validate this stack", "cleanup stacked branches". Use when local branches or one source branch need to become a dependency-ordered PR stack with correct parent bases, validation, synchronization, merge order, and cleanup.

242 · SKILL.md · Updated May 23, 2026

mathews-tom/project-context-setup

development

Verified · Trusted · Community

Scaffolds per-repository agent context so coding agents share the same issue tracker rules, triage label vocabulary, domain glossary, ADR layout, and handoff conventions. Triggers on: "set up project context", "configure agent docs", "create CONTEXT.md", "setup agent workflow", "agent issue tracker setup", "triage labels", "domain glossary for agents". Use when a repo needs durable context files before planning, triage, debugging, TDD, architecture review, or multi-agent implementation.

230 · SKILL.md · Updated May 12, 2026

mathews-tom/task-decomposer

testing

Verified · Trusted · Community

Produces phased task boards from feature requests: dependency-mapped work items, parallelization flags, risk flags, edge cases, test matrices. Triggers on: "decompose this feature", "task breakdown with dependencies", "phased implementation plan", "work breakdown structure". NOT for effort estimates, use estimate-calibrator.

230 · SKILL.md · Updated Apr 6, 2026

mathews-tom/debug-investigator

development

Verified · Trusted · Community

Hypothesis-driven debugging with ranked hypotheses, git bisect strategy, instrumentation planning, and minimal reproduction design. Triggers on: "debug this systematically", "root cause analysis", "bisect this bug", "rank hypotheses", "isolate this issue", "minimal reproduction". NOT for general reasoning.

230 · SKILL.md · Updated Apr 6, 2026


Install manually


# Clone the repo
git clone https://github.com/mathews-tom/armory.git

# Copy into Claude Code skills folder (global)
cp -r armory/skills/skill-distiller ~/.claude/skills/


Repository

mathews-tom/armory

221 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT