plugins/claude-code-expert/skills/model-routing/SKILL.md
--- name: model-routing description: Pick the right Claude model (Opus, Sonnet, Haiku) for a task and manage cost — decision matrix, cost tables, budget planning, cascading strategy. Use this skill whenever choosing a model, setting a token budget, optimizing session cost, or deciding whether to upgrade/downgrade mid-task. Triggers on: "which model", "cost", "budget", "haiku vs sonnet", "opus for this", "save tokens", "model cascading", "/cc-budget". --- # Model Routing Claude model choice is
npx skillsauth add markus41/claude plugins/claude-code-expert/skills/model-routingInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Claude model choice is the biggest cost lever in Claude Code. Match the model to the work.
| Task type | Model | Why | |---|---|---| | Architecture decision | Opus | Multi-step reasoning; hidden-cost detection | | Root-cause debugging (hard) | Opus | Hypothesis trees, multi-source evidence | | Security review | Opus | Risk sensitivity; knowledge of OWASP/CWE | | Feature implementation | Sonnet | Standard generation; good reasoning | | Code review (routine PR) | Sonnet | Fast; catches most issues | | Test writing | Sonnet | Pattern-based | | Research / docs lookup | Haiku | Fast; cheap; sufficient for retrieval | | Bulk file edits (rename, reformat) | Haiku | Mechanical work | | Dependency audit | Haiku | Running commands, parsing output | | Simple Q&A | Haiku | One-shot factual answers |
cc_docs_model_recommend for current)| Model | Alias / ID | Input $/M | Output $/M | Relative |
|---|---|---|---|---|
| Opus 4.8 | opus / claude-opus-4-8 | ~$15 | ~$75 | 5× |
| Sonnet 4.6 | sonnet / claude-sonnet-4-6 | ~$3 | ~$15 | 1× |
| Haiku 4.5 | haiku / claude-haiku-4-5-20251001 | ~$0.80 | ~$4 | 0.3× |
Output tokens are the dominant cost in most Claude Code sessions. Opus is ~5× the cost but ~2× the capability on hard tasks — use it where the capability matters.
Aliases auto-resolve to the latest generation — prefer opus/sonnet/haiku over pinned IDs so a model refresh doesn't strand your config. Use opusplan for Opus-reasoning + Sonnet-execution, or best for "most capable available". Extended 1M-token context: opus[1m] / sonnet[1m].
Fast mode (/fast in-session, --fast at launch) keeps you on Opus (4.6/4.7/4.8) but optimizes for faster output — it does not downgrade to a smaller model. Toggle it when you want Opus-level reasoning without the usual latency.
Effort levels scale reasoning depth independently of model: low · medium · high · xhigh · max (Opus 4.7/4.8 add xhigh). Set via /effort, --effort <level>, or effort: in skill/agent frontmatter — cheaper than jumping a model tier when you just need deeper thinking.
The high-leverage pattern: start with a cheap model for planning, delegate implementation to cheap, reserve Opus for review gates.
| Phase | Model | |---|---| | Plan mode (Shift+Tab) | Opus | | Implementation | Sonnet | | Subagent research | Haiku | | Code review gate | Opus | | Final sign-off | Opus |
Net effect: most tokens are on Sonnet/Haiku; Opus tokens are where they matter most.
For a task estimated at N turns:
Use cc_docs_model_recommend(task, budget) to get a specific recommendation with cost projection.
Downgrade to Haiku when:
Upgrade to Opus when:
Shift+Tab toggles plan mode — uses Opus to think deeper without producing code. Use for:
Don't use plan mode for: known patterns, mechanical work, small tweaks.
| Need | Tool |
|---|---|
| Model recommendation for a task | cc_docs_model_recommend(task, budget?) |
| Compare two model choices | cc_docs_compare(["opus", "sonnet"]) |
| Check cost of an autonomy profile | cc_kb_autonomy_profile(profile) |
/plan on new work → code-first on unfamiliar problems wastes tokens.development
Enhanced plan-authoring skill with Pre-Writing context gathering, task metadata, non-TDD templates, Red Flags, telemetry, and an automated plan linter. Use when you have a spec or requirements for a multi-step task, before touching code.
tools
Documentation intelligence engine with graph-based API docs, algorithm library, and drift detection
tools
Ultraplan cloud planning — kick off a plan in the cloud from your terminal, review and revise in the browser, then execute remotely or send back to CLI
tools
--- name: mcp description: Configure MCP servers for Claude Code — stdio vs HTTP, authentication, Tools/Resources/Prompts distinction, channels (CI webhook, mobile relay, Discord bridge, fakechat), and cost of always-loaded tools. Use this skill whenever adding an MCP server, debugging connection issues, choosing between MCP Tools vs Prompts vs Resources, installing channel servers, or managing .mcp.json. Triggers on: "MCP server", "mcp config", "add Obsidian MCP", "install context7", "channels"