# Claude Code Cost Optimization
Complete guide to managing costs, model routing, token usage, and caching.
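## Cost Tracking

### /cost Command

`/cost`

Shows:

- Input tokens consumed
- Output tokens consumed
- Cache read tokens (cheaper)
- Cache write tokens
- Total estimated cost (USD)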
## Model Selection & Routing

### Available Models

| Model | ID | Best For | Cost |
|-------|-----|---------|------|
| Opus 4.6 | `claude-opus-4-6` | Architecture, complex decisions | Highest |
| Sonnet 4.6 | `claude-sonnet-4-6` | General development, implementation | Medium |
| Haiku 4.5 | `claude-haiku-4-5-20251001` | Quick lookups, simple tasks | Lowest |
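
The routing in the table above can be sketched as a simple lookup. This is a minimal illustration, not part of Claude Code: the `TaskKind` categories and the `pickModel` helper are hypothetical names, and only the model IDs come from the table.

```typescript
// Hypothetical task categories for this sketch; they mirror the "Best For" column.
type TaskKind = "lookup" | "implementation" | "architecture";

// Model IDs copied from the table above.
const MODEL_FOR_TASK: Record<TaskKind, string> = {
  lookup: "claude-haiku-4-5-20251001",
  implementation: "claude-sonnet-4-6",
  architecture: "claude-opus-4-6",
};

// Return the model ID to use for a given kind of task.
function pickModel(kind: TaskKind): string {
  return MODEL_FOR_TASK[kind];
}

console.log(pickModel("lookup"));
```

A script that shells out to the CLI could pass the returned ID as the model argument, defaulting to the cheapest model and escalating only when a task needs it.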
/model claude-haiku-4-5-20251001 # Switch to Haiku for simple tasks
/model claude-sonnet-4-6 # Switch back to Sonnet
/model claude-opus-4-6 # Switch to Opus for complex work
claude --model claude-haiku-4-5-20251001 -p "quick question"
{
"model": "claude-sonnet-4-6",
"smallFastModel": "claude-haiku-4-5-20251001"
}
/compact # Compress full conversation
/compact focus on the API # Compress with specific focus
Compaction replaces the conversation history with a summary, so less context is sent with each subsequent message and per-message input costs drop.
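
To get a feel for the saving, the arithmetic can be sketched as below. This is a rough estimate under stated assumptions: the price per million input tokens is supplied by the caller (no real price is implied), and the sketch ignores prompt caching and the fact that context grows again after compaction. Both function names are hypothetical.

```typescript
// Input cost in USD for one message that sends `contextTokens` tokens of context.
// `usdPerMillionInput` is a caller-supplied price, not a quoted rate.
function inputCostUsd(contextTokens: number, usdPerMillionInput: number): number {
  return (contextTokens / 1_000_000) * usdPerMillionInput;
}

// Estimated saving over the next `remainingMessages` messages if the context
// shrinks from `beforeTokens` to `afterTokens` and then stays constant.
function compactionSavingsUsd(
  beforeTokens: number,
  afterTokens: number,
  remainingMessages: number,
  usdPerMillionInput: number,
): number {
  const savedPerMessage = inputCostUsd(beforeTokens - afterTokens, usdPerMillionInput);
  return savedPerMessage * remainingMessages;
}

// Example with made-up numbers: 120k tokens compacted to 20k, 10 more messages.
console.log(compactionSavingsUsd(120_000, 20_000, 10, 2).toFixed(2));
```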
// Expensive: read entire large file
Read(file_path="large-file.ts") // ~5000 tokens
// Cheap: read specific section
Read(file_path="large-file.ts", offset=100, limit=30) // ~300 tokens
// Cheap: search first
Grep(pattern="function auth", path="src/") // ~100 tokens
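
The difference between a full read and a targeted read can be approximated outside Claude Code. The sketch below is illustrative only: `sliceLines` and `roughTokens` are hypothetical helpers, the 1-based `offset` is an assumption of this sketch, and the four-characters-per-token ratio is a crude heuristic rather than a real tokenizer.

```typescript
// Return `limit` lines starting at line `offset` (treated as 1-based in this sketch).
function sliceLines(text: string, offset: number, limit: number): string {
  const lines = text.split("\n");
  return lines.slice(offset - 1, offset - 1 + limit).join("\n");
}

// Very rough token estimate: about 4 characters per token (heuristic assumption).
function roughTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Build a synthetic 2000-line file to compare the two approaches.
const fullFile = Array.from({ length: 2000 }, (_, i) => `const value${i + 1} = ${i + 1};`).join("\n");
const section = sliceLines(fullFile, 100, 30);

console.log("full read:", roughTokens(fullFile), "tokens (estimated)");
console.log("partial read:", roughTokens(section), "tokens (estimated)");
```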
Sub-agents process information internally and return summaries:
// Main context gets only the summary (~500 tokens)
// Instead of 20 file reads (~50,000 tokens)
Agent(subagent_type="Explore", prompt="Find all database models")
// Don't read every file looking for something
// Search first, then read only matching files
Grep(pattern="TODO|FIXME", type="ts")
// Long tasks don't consume main context while running
Agent(run_in_background=true, ...)
Bash(command="npm test", run_in_background=true)
/clear # Reset context for new topic
--append-system-prompt frequently
const response = await client.messages.create({
model: "claude-sonnet-4-6",
max_tokens: 1024,
system: [
{
type: "text",
text: "Your system prompt here...",
cache_control: { type: "ephemeral" }
}
],
messages: [...]
});
// Usage shows cache info
console.log(response.usage.cache_creation_input_tokens);
console.log(response.usage.cache_read_input_tokens);
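
The usage fields above can be turned into a rough cost comparison. The sketch below assumes cache writes are billed at 1.25x the base input price (the 5-minute cache) and cache reads at 0.1x; check current pricing before relying on these multipliers. The base price is caller-supplied, and the helper names are hypothetical.

```typescript
// Subset of the usage object returned by the Messages API.
interface CacheUsage {
  input_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
}

// Assumed multipliers relative to the base input price (verify against current pricing).
const CACHE_WRITE_MULTIPLIER = 1.25;
const CACHE_READ_MULTIPLIER = 0.1;

// Input cost in USD for one request, accounting for cache writes and reads.
function inputCostWithCache(usage: CacheUsage, usdPerMillionInput: number): number {
  const weightedTokens =
    usage.input_tokens +
    usage.cache_creation_input_tokens * CACHE_WRITE_MULTIPLIER +
    usage.cache_read_input_tokens * CACHE_READ_MULTIPLIER;
  return (weightedTokens / 1_000_000) * usdPerMillionInput;
}

// Input cost in USD if the same tokens had all been billed at the base price.
function inputCostWithoutCache(usage: CacheUsage, usdPerMillionInput: number): number {
  const totalTokens =
    usage.input_tokens + usage.cache_creation_input_tokens + usage.cache_read_input_tokens;
  return (totalTokens / 1_000_000) * usdPerMillionInput;
}

// Example with made-up numbers: a request that mostly hits the cache.
const exampleUsage: CacheUsage = {
  input_tokens: 1000,
  cache_creation_input_tokens: 0,
  cache_read_input_tokens: 9000,
};
console.log(inputCostWithCache(exampleUsage, 1), inputCostWithoutCache(exampleUsage, 1));
```

The first request that writes the cache costs slightly more than an uncached one, so caching pays off only when the same prefix is reused before the cache entry expires.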
Standard pricing, most features.
CLAUDE_CODE_USE_BEDROCK=1 claude
CLAUDE_CODE_USE_VERTEX=1 claude
For non-interactive workloads, use the Message Batches API:
const batch = await client.messages.batches.create({
requests: [
{
custom_id: "review-1",
params: {
model: "claude-sonnet-4-6",
max_tokens: 1024,
messages: [{ role: "user", content: "Review file1.ts" }]
}
},
{
custom_id: "review-2",
params: {
model: "claude-sonnet-4-6",
max_tokens: 1024,
messages: [{ role: "user", content: "Review file2.ts" }]
}
}
]
});
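Batch requests are billed at 50% of the standard price. Results are returned asynchronously, and requests not processed within 24 hours expire, so use batches only for work that can wait.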
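
When the list of files is dynamic, the `requests` array can be generated rather than written by hand. The helper below is a hypothetical sketch: `buildReviewRequests` and the `BatchRequest` type are names introduced here, and the shape mirrors the literal request objects shown above.

```typescript
// Shape of one entry in the `requests` array, mirroring the example above.
interface BatchRequest {
  custom_id: string;
  params: {
    model: string;
    max_tokens: number;
    messages: { role: "user"; content: string }[];
  };
}

// Build one review request per file, with IDs review-1, review-2, ...
function buildReviewRequests(files: string[], model = "claude-sonnet-4-6"): BatchRequest[] {
  return files.map((file, i): BatchRequest => ({
    custom_id: `review-${i + 1}`,
    params: {
      model,
      max_tokens: 1024,
      messages: [{ role: "user", content: `Review ${file}` }],
    },
  }));
}

console.log(JSON.stringify(buildReviewRequests(["file1.ts", "file2.ts"]), null, 2));
```

The resulting array can be passed as `requests` to `client.messages.batches.create`. Keeping `custom_id` values unique matters because they are how results are matched back to inputs.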
| Task | Approximate Cost |
|------|-----------------|
| Simple question | $0.01 - $0.05 |
| Code review (1 file) | $0.05 - $0.15 |
| Feature implementation | $0.20 - $1.00 |
| Complex refactoring | $0.50 - $2.00 |
| Full project analysis | $1.00 - $5.00 |