plugins/claude-code-expert/archive/v7.6.0/skills/model-routing/SKILL.md
Intelligent model selection for Claude Code — decision matrices, cost tables, budget planning, and subagent model assignment for optimal cost/quality tradeoffs
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add markus41/claude model-routing
Select the right Claude model for each task to optimize the cost/quality tradeoff.
Eliminate wasted spend by routing tasks to the cheapest model that produces acceptable quality, while ensuring complex tasks get the reasoning depth they need.
| Task Type | Recommended Model | Reasoning |
|-----------|-------------------|-----------|
| Architecture decisions | Opus 4.6 | Needs deep multi-step reasoning, hidden coupling detection |
| Complex debugging | Opus 4.6 | Root cause analysis requires holding many hypotheses |
| Security review | Opus 4.6 | Must not miss subtle vulnerabilities |
| Standard implementation | Sonnet 4.6 | Best balance of speed, quality, and cost for code generation |
| Code review | Sonnet 4.6 | Good pattern recognition at reasonable cost |
| Refactoring | Sonnet 4.6 | Mechanical transformations with quality checks |
| Test writing | Sonnet 4.6 | Formulaic but needs understanding of code under test |
| File search / grep | Haiku 4.5 | Simple lookup, no deep reasoning needed |
| Documentation lookup | Haiku 4.5 | Reading and summarizing existing content |
| Commit message generation | Haiku 4.5 | Short, formulaic output |
| Simple Q&A | Haiku 4.5 | Direct answers, no complex analysis |
| Research subagents | Haiku 4.5 | Exploration tasks that return summaries |
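The matrix above can be expressed as a lookup table. This is a minimal sketch, not part of the skill itself: the task-type keys, the `route` function, and the choice of Sonnet as the fallback for unlisted task types are all assumptions made for illustration.

```python
# Hypothetical routing table built from the decision matrix above.
# The key names are illustrative; the skill does not define them.
ROUTING = {
    "architecture": "opus",
    "complex_debugging": "opus",
    "security_review": "opus",
    "implementation": "sonnet",
    "code_review": "sonnet",
    "refactoring": "sonnet",
    "test_writing": "sonnet",
    "file_search": "haiku",
    "docs_lookup": "haiku",
    "commit_message": "haiku",
    "simple_qa": "haiku",
    "research_subagent": "haiku",
}


def route(task_type: str, default: str = "sonnet") -> str:
    """Return the model tier for a task type.

    Unknown task types fall back to `default` (an assumption, chosen
    here because Sonnet is the middle tier in the matrix).
    """
    return ROUTING.get(task_type, default)


if __name__ == "__main__":
    for task in ("security_review", "refactoring", "commit_message", "unlisted_task"):
        print(f"{task} -> {route(task)}")
```

Keeping the table as data rather than branching logic makes it easy to adjust a single row when a task type turns out to need a stronger or cheaper model.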
Use these signals to decide when to escalate from Sonnet to Opus:
Use these signals to downgrade from Sonnet to Haiku:
Prices are in USD per million tokens.

| Model | Input | Output | Cache Write | Cache Read |
|-------|------:|-------:|------------:|-----------:|
| Opus 4.6 | $15.00 | $75.00 | $18.75 | $1.50 |
| Sonnet 4.6 | $3.00 | $15.00 | $3.75 | $0.30 |
| Haiku 4.5 | $0.80 | $4.00 | $1.00 | $0.08 |
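The cache columns matter most when the same prompt prefix is sent repeatedly. The sketch below compares the input-side cost with and without caching, using the prices from the table. It assumes a simplified model in which the first call writes the cache and every later call reads it, with no expiry between calls; the function name and parameters are hypothetical.

```python
# USD per million tokens, copied from the pricing table above.
PRICES = {
    "opus": {"input": 15.00, "cache_write": 18.75, "cache_read": 1.50},
    "sonnet": {"input": 3.00, "cache_write": 3.75, "cache_read": 0.30},
    "haiku": {"input": 0.80, "cache_write": 1.00, "cache_read": 0.08},
}


def repeated_prompt_cost(model: str, prompt_tokens: int, calls: int, cached: bool) -> float:
    """Input-side cost (USD) of sending the same prompt prefix `calls` times.

    Simplifying assumption: the first call writes the cache and each of
    the remaining calls reads it; the cache never expires in between.
    """
    if calls <= 0:
        return 0.0
    price = PRICES[model]
    if not cached:
        return calls * prompt_tokens * price["input"] / 1_000_000
    write_cost = prompt_tokens * price["cache_write"] / 1_000_000
    read_cost = (calls - 1) * prompt_tokens * price["cache_read"] / 1_000_000
    return write_cost + read_cost


if __name__ == "__main__":
    uncached = repeated_prompt_cost("sonnet", 100_000, 10, cached=False)
    cached = repeated_prompt_cost("sonnet", 100_000, 10, cached=True)
    print(f"uncached: ${uncached:.3f}, cached: ${cached:.3f}")
```

Under these assumptions, ten Sonnet calls with a 100k-token prefix cost $3.00 uncached versus $0.375 + 9 × $0.03 = $0.645 cached.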
| Comparison | Input | Output |
|-----------|------:|-------:|
| Opus vs Sonnet | 5x | 5x |
| Sonnet vs Haiku | 3.75x | 3.75x |
| Opus vs Haiku | 18.75x | 18.75x |
| Task | Model | Est. Tokens (in/out) | Est. Cost |
|------|-------|---------------------:|----------:|
| Simple bug fix | Sonnet | 50k/10k | ~$0.30 |
| Feature implementation | Sonnet | 200k/50k | ~$1.35 |
| Architecture review | Opus | 200k/30k | ~$5.25 |
| Quick lookup | Haiku | 20k/2k | ~$0.02 |
| Research subagent | Haiku | 80k/10k | ~$0.10 |
| Full code review (council) | Mixed | 500k/100k | ~$3-8 |
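The estimates above follow from the per-million-token input and output prices. A minimal sketch that reproduces the single-model rows, ignoring caching; `estimate_cost` is a hypothetical helper, not something the skill provides:

```python
# (input, output) prices in USD per million tokens, from the pricing table.
PRICES = {
    "opus": (15.00, 75.00),
    "sonnet": (3.00, 15.00),
    "haiku": (0.80, 4.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one task, without prompt caching."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    rows = [
        ("Simple bug fix", "sonnet", 50_000, 10_000),
        ("Feature implementation", "sonnet", 200_000, 50_000),
        ("Architecture review", "opus", 200_000, 30_000),
        ("Quick lookup", "haiku", 20_000, 2_000),
        ("Research subagent", "haiku", 80_000, 10_000),
    ]
    for task, model, tokens_in, tokens_out in rows:
        print(f"{task}: ${estimate_cost(model, tokens_in, tokens_out):.2f}")
```

For example, the simple bug fix is 50k × $3.00/M + 10k × $15.00/M = $0.15 + $0.15 = $0.30.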
When using cc-orchestrate or spawning subagents, assign models by role:
Research agents → Haiku (cheap exploration, summary return)
Implementation agents → Sonnet (code generation quality)
Review/audit agents → Sonnet or Opus (depends on risk)
Architecture agents → Opus (deep reasoning required)
builder agent → Sonnet 4.6 (writes code)
validator agent → Sonnet 4.6 (reviews code)
researcher agents (3x) → Haiku 4.5 (parallel exploration)
synthesizer agent → Sonnet 4.6 (combines findings)
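The role-based assignment above can be sketched as a small mapping with one escalation rule for reviews. This is illustrative only: the role keys, the `high_risk` flag, and the `model_for_role` function are assumptions, and nothing here reflects how cc-orchestrate is actually configured.

```python
# Default model tier per role category, following the list above.
ROLE_MODEL = {
    "research": "haiku",
    "implementation": "sonnet",
    "review": "sonnet",
    "architecture": "opus",
}


def model_for_role(role: str, high_risk: bool = False) -> str:
    """Pick a model tier for a subagent role.

    Review/audit agents escalate to Opus when the change is high risk
    (hypothetical flag); every other role uses its default tier.
    """
    if role == "review" and high_risk:
        return "opus"
    return ROLE_MODEL[role]


# The example team listed above, expressed with the same mapping.
TEAM = {
    "builder": model_for_role("implementation"),
    "validator": model_for_role("review"),
    "researcher-1": model_for_role("research"),
    "researcher-2": model_for_role("research"),
    "researcher-3": model_for_role("research"),
    # The synthesizer is listed on Sonnet above; it is not one of the
    # four role categories, so it is set directly.
    "synthesizer": "sonnet",
}

if __name__ == "__main__":
    for agent, model in TEAM.items():
        print(f"{agent} -> {model}")
    print("high-risk review ->", model_for_role("review", high_risk=True))
```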
Before starting a task, estimate cost:
/model or claude -m

| Content Type | Tokens per Line |
|-------------|----------------:|
| TypeScript/JavaScript | ~10 |
| Python | ~8 |
| JSON/YAML | ~6 |
| Markdown | ~5 |
| Minified code | ~15 |
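The tokens-per-line figures give a quick way to size a file before reading it into context. A rough sketch, assuming the approximate rates in the table; the content-type keys and the `estimate_tokens` helper are hypothetical names:

```python
# Approximate tokens per line, from the table above.
TOKENS_PER_LINE = {
    "typescript": 10,
    "javascript": 10,
    "python": 8,
    "json": 6,
    "yaml": 6,
    "markdown": 5,
    "minified": 15,
}


def estimate_tokens(line_count: int, content_type: str) -> int:
    """Rough token estimate for a file of `line_count` lines."""
    return line_count * TOKENS_PER_LINE[content_type]


if __name__ == "__main__":
    # A 500-line Python file and a 2,000-line TypeScript file.
    print(estimate_tokens(500, "python"))
    print(estimate_tokens(2_000, "typescript"))
```

At these rates a 500-line Python file is roughly 4,000 tokens and a 2,000-line TypeScript file roughly 20,000, which can then be fed into a cost estimate as input tokens.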
Avoid cat-ing entire large files; use Grep with limits
Use --max-turns in headless mode to cap automated sessions
# Start with research on Haiku
/model claude-haiku-4-5-20251001
# "Find all files related to auth, summarize the architecture"
# Switch to Sonnet for implementation
/model claude-sonnet-4-6
# "Implement the new auth middleware based on the research above"
# Switch to Opus for the tricky part
/model claude-opus-4-6
# "Review the session handling for race conditions and edge cases"
CLAUDE_MODEL=claude-sonnet-4-6 # Default model for sessions
ANTHROPIC_MODEL=claude-sonnet-4-6 # Alternative env var
{
"model": "claude-sonnet-4-6",
"smallFastModel": "claude-haiku-4-5-20251001"
}
The smallFastModel is used for internal operations like skill matching and context compression. Keep it on Haiku for cost efficiency.