markus41/model-routing

Name: model-routing
Author: markus41

plugins/claude-code-expert/archive/v7.6.0/skills/model-routing/SKILL.md

npx skillsauth add markus41/claude model-routing

Model Routing Intelligence

Select the right Claude model for each task to optimize the cost/quality tradeoff.

Goal

Eliminate wasted spend by routing tasks to the cheapest model that produces acceptable quality, while ensuring complex tasks get the reasoning depth they need.

Decision Matrix

Task → Model mapping

| Task Type | Recommended Model | Reasoning |
|-----------|-------------------|-----------|
| Architecture decisions | Opus 4.6 | Needs deep multi-step reasoning, hidden coupling detection |
| Complex debugging | Opus 4.6 | Root cause analysis requires holding many hypotheses |
| Security review | Opus 4.6 | Must not miss subtle vulnerabilities |
| Standard implementation | Sonnet 4.6 | Best balance of speed, quality, and cost for code generation |
| Code review | Sonnet 4.6 | Good pattern recognition at reasonable cost |
| Refactoring | Sonnet 4.6 | Mechanical transformations with quality checks |
| Test writing | Sonnet 4.6 | Formulaic but needs understanding of code under test |
| File search / grep | Haiku 4.5 | Simple lookup, no deep reasoning needed |
| Documentation lookup | Haiku 4.5 | Reading and summarizing existing content |
| Commit message generation | Haiku 4.5 | Short, formulaic output |
| Simple Q&A | Haiku 4.5 | Direct answers, no complex analysis |
| Research subagents | Haiku 4.5 | Exploration tasks that return summaries |
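
The mapping above can be read as a plain lookup. The following is a minimal sketch in Python, not part of the skill: the names `DECISION_MATRIX`, `DEFAULT_MODEL`, and `recommend_model` are hypothetical, and the fallback to Sonnet for unlisted task types is an assumption rather than something the table states.

```python
# Hypothetical encoding of the task-to-model table; keys are lowercased task types.
DECISION_MATRIX = {
    "architecture decisions": "Opus 4.6",
    "complex debugging": "Opus 4.6",
    "security review": "Opus 4.6",
    "standard implementation": "Sonnet 4.6",
    "code review": "Sonnet 4.6",
    "refactoring": "Sonnet 4.6",
    "test writing": "Sonnet 4.6",
    "file search / grep": "Haiku 4.5",
    "documentation lookup": "Haiku 4.5",
    "commit message generation": "Haiku 4.5",
    "simple q&a": "Haiku 4.5",
    "research subagents": "Haiku 4.5",
}

# Assumption: unlisted task types fall back to the middle tier.
DEFAULT_MODEL = "Sonnet 4.6"


def recommend_model(task_type: str) -> str:
    """Return the recommended model for a task type, ignoring case and padding."""
    return DECISION_MATRIX.get(task_type.strip().lower(), DEFAULT_MODEL)


if __name__ == "__main__":
    for task in ("Security review", "Commit message generation", "Data migration"):
        print(f"{task}: {recommend_model(task)}")
```

Keeping the table as data makes it easy to adjust a single row when a task type turns out to need a different tier.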

Complexity signals

Use these signals to decide when to escalate from Sonnet to Opus:

- Multiple interacting systems or modules
- Non-obvious failure modes
- "Why does this work?" questions
- Tasks where a wrong answer is expensive to fix
- Cross-cutting concerns (auth, caching, observability)
- Migration or backward-compatibility requirements

Use these signals to downgrade from Sonnet to Haiku:

- Single-file changes
- Mechanical transformations (rename, reformat)
- Reading and summarizing (no generation)
- Answering factual questions about code
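
One way to apply these signals is as boolean flags on a task, starting from Sonnet. This is an illustrative sketch only: the `TaskSignals` class, its field names, and the `route` function are hypothetical, and the rule that an escalation signal wins over a downgrade signal when both are present is an assumption the text does not spell out.

```python
from dataclasses import dataclass


@dataclass
class TaskSignals:
    # Escalation signals (Sonnet -> Opus)
    multiple_systems: bool = False
    non_obvious_failures: bool = False
    asks_why_it_works: bool = False
    expensive_if_wrong: bool = False
    cross_cutting: bool = False
    migration_or_compat: bool = False
    # Downgrade signals (Sonnet -> Haiku)
    single_file: bool = False
    mechanical: bool = False
    read_only_summary: bool = False
    factual_question: bool = False


def route(signals: TaskSignals) -> str:
    """Pick a model tier from the signals, defaulting to Sonnet."""
    escalate = any([
        signals.multiple_systems,
        signals.non_obvious_failures,
        signals.asks_why_it_works,
        signals.expensive_if_wrong,
        signals.cross_cutting,
        signals.migration_or_compat,
    ])
    downgrade = any([
        signals.single_file,
        signals.mechanical,
        signals.read_only_summary,
        signals.factual_question,
    ])
    # Assumption: when signals conflict, prefer the safer (more capable) model.
    if escalate:
        return "Opus"
    if downgrade:
        return "Haiku"
    return "Sonnet"


if __name__ == "__main__":
    print(route(TaskSignals(single_file=True, mechanical=True)))
    print(route(TaskSignals(single_file=True, expensive_if_wrong=True)))
```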

Cost Tables

Per-token pricing (USD per million tokens)

| Model | Input | Output | Cache Write | Cache Read |
|-------|------:|-------:|------------:|-----------:|
| Opus 4.6 | $15.00 | $75.00 | $18.75 | $1.50 |
| Sonnet 4.6 | $3.00 | $15.00 | $3.75 | $0.30 |
| Haiku 4.5 | $0.80 | $4.00 | $1.00 | $0.08 |

Cost multipliers

| Comparison | Input | Output |
|-----------|------:|-------:|
| Opus vs Sonnet | 5x | 5x |
| Sonnet vs Haiku | 3.75x | 3.75x |
| Opus vs Haiku | 18.75x | 18.75x |
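
The multipliers follow directly from dividing one model's per-token price by another's. A short sketch that recomputes them from the prices listed in this document; the `PRICING` dictionary and the `multiplier` helper are hypothetical names used only for illustration.

```python
# Prices in USD per million tokens, copied from the pricing table in this document.
PRICING = {
    "Opus 4.6": {"input": 15.00, "output": 75.00},
    "Sonnet 4.6": {"input": 3.00, "output": 15.00},
    "Haiku 4.5": {"input": 0.80, "output": 4.00},
}


def multiplier(expensive: str, cheap: str, kind: str = "input") -> float:
    """How many times more the expensive model costs per token of the given kind."""
    return round(PRICING[expensive][kind] / PRICING[cheap][kind], 2)


if __name__ == "__main__":
    pairs = [
        ("Opus 4.6", "Sonnet 4.6"),
        ("Sonnet 4.6", "Haiku 4.5"),
        ("Opus 4.6", "Haiku 4.5"),
    ]
    for expensive, cheap in pairs:
        print(
            f"{expensive} vs {cheap}: "
            f"{multiplier(expensive, cheap, 'input')}x input, "
            f"{multiplier(expensive, cheap, 'output')}x output"
        )
```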

Typical session costs

| Task | Model | Est. Tokens (in/out) | Est. Cost |
|------|-------|---------------------:|----------:|
| Simple bug fix | Sonnet | 50k/10k | ~$0.30 |
| Feature implementation | Sonnet | 200k/50k | ~$1.35 |
| Architecture review | Opus | 200k/30k | ~$5.25 |
| Quick lookup | Haiku | 20k/2k | ~$0.02 |
| Research subagent | Haiku | 80k/10k | ~$0.10 |
| Full code review (council) | Mixed | 500k/100k | ~$3-8 |
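
The single-model estimates above are input tokens times the input price plus output tokens times the output price, ignoring cache reads and writes. A minimal sketch that reproduces them from the prices in this document; `PRICE_PER_MTOK` and `session_cost` are hypothetical names.

```python
# (input, output) prices in USD per million tokens, from this document's pricing table.
PRICE_PER_MTOK = {
    "opus": (15.00, 75.00),
    "sonnet": (3.00, 15.00),
    "haiku": (0.80, 4.00),
}


def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated session cost in USD, with no prompt caching taken into account."""
    input_price, output_price = PRICE_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    sessions = [
        ("Simple bug fix", "sonnet", 50_000, 10_000),
        ("Feature implementation", "sonnet", 200_000, 50_000),
        ("Architecture review", "opus", 200_000, 30_000),
        ("Quick lookup", "haiku", 20_000, 2_000),
        ("Research subagent", "haiku", 80_000, 10_000),
    ]
    for name, model, tokens_in, tokens_out in sessions:
        print(f"{name}: ${session_cost(model, tokens_in, tokens_out):.2f}")
```

The mixed-model council row is not reproduced here because its cost depends on how the 500k/100k tokens are split across models, which the table does not specify.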

Subagent Model Assignment

Orchestration patterns

When using cc-orchestrate or spawning subagents, assign models by role:

Research agents       → Haiku (cheap exploration, summary return)
Implementation agents → Sonnet (code generation quality)
Review/audit agents   → Sonnet or Opus (depends on risk)
Architecture agents   → Opus (deep reasoning required)

Example: builder-validator template

builder agent   → Sonnet 4.6 (writes code)
validator agent → Sonnet 4.6 (reviews code)

Example: research-council template

researcher agents (3x) → Haiku 4.5 (parallel exploration)
synthesizer agent      → Sonnet 4.6 (combines findings)
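
The role assignments and the two templates can be written down as data. This is a sketch under assumptions, not the configuration format of cc-orchestrate or any other tool: `ROLE_MODEL`, `TEMPLATES`, `model_for_role`, and the `high_risk` flag are hypothetical names, and the individual researcher agent names are invented for illustration.

```python
from collections import Counter

# Role-to-model defaults from the orchestration patterns above.
ROLE_MODEL = {
    "research": "Haiku",
    "implementation": "Sonnet",
    "review": "Sonnet",  # upgraded to Opus when the change is high risk
    "architecture": "Opus",
}

# The two example templates; researcher names are illustrative placeholders.
TEMPLATES = {
    "builder-validator": {
        "builder": "Sonnet 4.6",
        "validator": "Sonnet 4.6",
    },
    "research-council": {
        "researcher-1": "Haiku 4.5",
        "researcher-2": "Haiku 4.5",
        "researcher-3": "Haiku 4.5",
        "synthesizer": "Sonnet 4.6",
    },
}


def model_for_role(role: str, high_risk: bool = False) -> str:
    """Return the model tier for a role; review/audit escalates to Opus on high risk."""
    if role == "review" and high_risk:
        return "Opus"
    return ROLE_MODEL[role]


if __name__ == "__main__":
    print(model_for_role("review"), model_for_role("review", high_risk=True))
    # Count how many agents of each model a template would spawn.
    print(Counter(TEMPLATES["research-council"].values()))
```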

Budget Planning

Setting a session budget

Before starting a task, estimate cost:

1. Classify the task using the decision matrix above
2. Estimate token volume based on file count and task scope
3. Calculate cost using the pricing table
4. Set the model with /model inside a session, or with claude --model when launching

Token estimation rules of thumb

| Content Type | Tokens per Line |
|-------------|----------------:|
| TypeScript/JavaScript | ~10 |
| Python | ~8 |
| JSON/YAML | ~6 |
| Markdown | ~5 |
| Minified code | ~15 |
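
These rules of thumb turn line counts into a rough token estimate. A small sketch, assuming you already know how many lines of each content type a task will read; `TOKENS_PER_LINE` and `estimate_tokens` are hypothetical names, and the result is only as accurate as the per-line averages above.

```python
# Approximate tokens per line, from the rules-of-thumb table.
TOKENS_PER_LINE = {
    "typescript/javascript": 10,
    "python": 8,
    "json/yaml": 6,
    "markdown": 5,
    "minified": 15,
}


def estimate_tokens(lines_by_type: dict[str, int]) -> int:
    """Sum lines * tokens-per-line over each content type."""
    return sum(TOKENS_PER_LINE[kind] * lines for kind, lines in lines_by_type.items())


if __name__ == "__main__":
    # Example: a task that reads 1,000 lines of Python and 200 lines of Markdown.
    print(estimate_tokens({"python": 1_000, "markdown": 200}))
```

For the example inputs this gives 1,000 × 8 + 200 × 5 = 9,000 tokens, which can then be priced with the per-token table.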

Cost control techniques

- Start with Haiku for research, switch to Sonnet for implementation
- Use subagents to isolate expensive research from main context
- Compact early, at 60-70% context usage, to avoid expensive re-reads
- Limit tool output: avoid cat-ing entire large files; use Grep with limits
- Batch related tasks to benefit from prompt caching (cache read = 10% of input cost)
- Use --max-turns in headless mode to cap automated sessions
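
To see why batching helps, compare the input cost of several requests that share a long prefix with and without prompt caching. This is a simplified sketch using the Sonnet 4.6 prices from this document; the function name, the example token counts, and the assumption that every request after the first hits the cache are illustrative, and real cache behavior (expiry, minimum cacheable length) is not modeled.

```python
# Sonnet 4.6 prices in USD per million tokens, from this document's pricing table.
INPUT = 3.00
CACHE_WRITE = 3.75
CACHE_READ = 0.30


def batch_input_cost(prefix_tokens: int, unique_tokens: int,
                     requests: int, use_cache: bool) -> float:
    """Input-side cost in USD for a batch of requests sharing a common prefix."""
    if requests < 1:
        raise ValueError("requests must be at least 1")
    if not use_cache:
        total = requests * (prefix_tokens + unique_tokens) * INPUT
    else:
        # Assumption: the first request writes the prefix to the cache,
        # and every later request reads it back at the cache-read price.
        total = (
            prefix_tokens * CACHE_WRITE
            + (requests - 1) * prefix_tokens * CACHE_READ
            + requests * unique_tokens * INPUT
        )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical batch: 10 requests, 100k shared prefix, 5k unique tokens each.
    without = batch_input_cost(100_000, 5_000, 10, use_cache=False)
    with_cache = batch_input_cost(100_000, 5_000, 10, use_cache=True)
    print(f"without caching: ${without:.3f}, with caching: ${with_cache:.3f}")
```

Under these assumptions the batch costs about $3.15 without caching and about $0.80 with it. A single request is slightly more expensive with caching, since the cache write is priced above plain input.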

Model switching workflow

# Start with research on Haiku
/model claude-haiku-4-5-20251001
# "Find all files related to auth, summarize the architecture"

# Switch to Sonnet for implementation
/model claude-sonnet-4-6
# "Implement the new auth middleware based on the research above"

# Switch to Opus for the tricky part
/model claude-opus-4-6
# "Review the session handling for race conditions and edge cases"

Environment Variables

CLAUDE_MODEL=claude-sonnet-4-6          # Default model for sessions
ANTHROPIC_MODEL=claude-sonnet-4-6       # Alternative env var

Settings Configuration

{
  "model": "claude-sonnet-4-6",
  "smallFastModel": "claude-haiku-4-5-20251001"
}

The smallFastModel is used for internal operations like skill matching and context compression. Keep it on Haiku for cost efficiency.

Anti-patterns

- Using Opus for everything: 5x the cost of Sonnet with marginal quality improvement on simple tasks
- Using Haiku for complex implementation: saves money but produces lower-quality code that needs more iterations
- Not using subagents: research in main context inflates token count for every subsequent turn
- Re-reading large files: each read costs tokens; anchor important content instead
- Ignoring cache hits: restructure prompts to maximize cache read tokens (10% of input cost)
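
The subagent point can be made concrete with rough arithmetic. The sketch below assumes that whatever sits in the main context is re-sent as input on every later turn and that no prompt caching applies; the 80k research volume matches the research-subagent row in the session cost table, while the 2k summary size and the 20 later turns are hypothetical figures chosen for illustration.

```python
# Sonnet 4.6 input price in USD per million tokens, from this document's pricing table.
SONNET_INPUT = 3.00


def carried_context_cost(extra_tokens: int, later_turns: int,
                         price_per_mtok: float = SONNET_INPUT) -> float:
    """Cost of re-sending extra_tokens as input on each later turn (no caching assumed)."""
    return extra_tokens * later_turns * price_per_mtok / 1_000_000


if __name__ == "__main__":
    # Research done inline: 80k tokens stay in the main context for 20 more turns.
    inline = carried_context_cost(80_000, 20)
    # Research done in a subagent: only a 2k-token summary is carried forward.
    summarized = carried_context_cost(2_000, 20)
    print(f"inline research overhead:   ${inline:.2f}")
    print(f"summary-only overhead:      ${summarized:.2f}")
```

With these numbers the inline research adds about $4.80 of input cost over the rest of the session, against about $0.12 for the summary, plus the one-time cost of running the subagent itself. With prompt caching in effect the gap would be smaller, since cached input is billed at the cache-read rate.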

Intelligent model selection for Claude Code — decision matrices, cost tables, budget planning, and subagent model assignment for optimal cost/quality tradeoffs

10 stars

development

Updated Apr 17, 2026

npx skillsauth add markus41/claude model-routing

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percent |
|---------|-------------|--------|--------:|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 7, 2026, 2:32 AM · 51.0s · 1 file scanned

Install manually

# Clone the repo
git clone https://github.com/markus41/claude.git

# Copy into Claude Code skills folder (global)
cp -r claude/plugins/claude-code-expert/archive/v7.6.0/skills/model-routing ~/.claude/skills/

Repository: markus41/claude (10 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT