skills/model-hierarchy-skill/SKILL.md
Cost-optimize AI agent operations by routing tasks to appropriate models based on complexity. Use this skill when: (1) deciding which model to use for a task, (2) spawning sub-agents, (3) considering cost efficiency, (4) the current model feels like overkill for the task. Triggers: "model routing", "cost optimization", "which model", "too expensive", "spawn agent".
npx skillsauth add aaaaqwq/agi-super-team model-hierarchy
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Route tasks to the cheapest model that can handle them. Most agent work is routine.
Roughly 80% of agent tasks are janitorial: file reads, status checks, formatting, simple Q&A. These don't need expensive models. Reserve premium models for problems that actually require deep reasoning.

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Best For |
|-------|---------------------|----------------------|----------|
| DeepSeek V3 | $0.14 | $0.28 | General routine work |
| GPT-4o-mini | $0.15 | $0.60 | Quick responses |
| Claude Haiku | $0.25 | $1.25 | Fast tool use |
| Gemini Flash | $0.075 | $0.30 | High volume |
| GLM 5 (Zhipu) | (OpenRouter Z.AI) | (OpenRouter Z.AI) | Routine + moderate text; 200K context; text-only; do not use for image/vision |
| Kimi K2.5 (Moonshot) | $0.45 | $2.25 | Routine + moderate; 262K context; multimodal (text + image + video) |

Text-only models (e.g. GLM 5): Do not use for any task that requires image input or vision — no photo analysis, screenshots, image-generation tools, or document/chart vision. Route to a vision-capable model (e.g. Kimi K2.5, GPT-4o, Gemini, Claude with vision, GLM-4.5V/4.6V).
Vision-capable Tier 1/2 (e.g. Kimi K2.5): Use for routine or moderate tasks that may involve images — screenshots, photo analysis, docs, image-generation orchestration — without moving to premium vision models.

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Best For |
|-------|---------------------|----------------------|----------|
| Claude Sonnet | $3.00 | $15.00 | Balanced performance |
| GPT-4o | $2.50 | $10.00 | Multimodal tasks |
| Gemini Pro | $1.25 | $5.00 | Long context |


| Model | Input ($/1M tokens) | Output ($/1M tokens) | Best For |
|-------|---------------------|----------------------|----------|
| Claude Opus | $15.00 | $75.00 | Complex reasoning |
| GPT-4.5 | $75.00 | $150.00 | Frontier tasks |
| o1 | $15.00 | $60.00 | Multi-step reasoning |
| o3-mini | $1.10 | $4.40 | Reasoning on budget |

Prices as of Feb 2026. Check provider docs for current rates.
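To make the prices concrete, here is a minimal sketch that estimates the cost of a single task from its token counts. It assumes the table prices are USD per million tokens, which is how providers typically quote them. The `PRICES` dictionary copies three rows from the tables; the helper name `estimate_cost` and the example token counts are illustrative, not part of this skill.

```python
# Prices in USD per million tokens as (input, output), copied from the tables.
# Assumption: the table values are per-million-token rates.
PRICES = {
    "DeepSeek V3": (0.14, 0.28),
    "Claude Sonnet": (3.00, 15.00),
    "Claude Opus": (15.00, 75.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example: a routine task that reads 20K tokens and writes 1K tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 20_000, 1_000):.5f}")
```

For this example task the estimate is about $0.003 on DeepSeek V3, $0.075 on Claude Sonnet, and $0.375 on Claude Opus, which is the gap the hierarchy is meant to exploit.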
Before executing any task, classify it:
Requires image/vision → Do not assign to text-only models (GLM 5, etc.). Use a vision-capable model from Tier 1/2 or 3 (e.g. Kimi K2.5, GPT-4o, Gemini, Claude, GLM-4.5V).
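The vision rule can be sketched as a capability filter applied before any cost consideration. The snippet below is an illustration only: `ModelInfo`, `pick_model`, and the preference order of the candidate list are assumptions introduced here, while the vision flags for GLM 5 and Kimi K2.5 follow the pricing table.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelInfo:
    name: str
    vision: bool  # True if the model accepts image input


# Candidates in preference order. The ordering itself is an assumption.
TIER_1_CANDIDATES = [
    ModelInfo("GLM 5", vision=False),     # text-only per the table
    ModelInfo("Kimi K2.5", vision=True),  # multimodal per the table
]


def pick_model(candidates: Sequence[ModelInfo], needs_vision: bool) -> ModelInfo:
    """Return the first candidate that satisfies the vision requirement."""
    for model in candidates:
        if model.vision or not needs_vision:
            return model
    raise LookupError("no vision-capable model among the candidates")


if __name__ == "__main__":
    print(pick_model(TIER_1_CANDIDATES, needs_vision=False).name)
    print(pick_model(TIER_1_CANDIDATES, needs_vision=True).name)
```

Filtering on capability first means a text-only model is never chosen for an image task just because it is cheaper.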
Characteristics:
Examples:
Characteristics:
Examples:
Characteristics:
Examples:
function selectModel(task):
    # Rule 1: Vision override (Tier 1/2 includes text-only models)
    if task.requiresImageInput or task.requiresVision:
        return VISION_CAPABLE_MODEL  # e.g. Kimi K2.5, GPT-4o, Gemini, Claude; do not use GLM 5 or other text-only

    # Rule 2: Escalation override
    if task.previousAttemptFailed:
        return nextTierUp(task.previousModel)

    # Rule 3: Explicit complexity signals
    if task.hasSignal("debug", "architect", "design", "security"):
        return TIER_3
    if task.hasSignal("write", "code", "summarize", "analyze"):
        return TIER_2

    # Rule 4: Default classification
    complexity = classifyTask(task)
    if complexity == ROUTINE:
        return TIER_1
    elif complexity == MODERATE:
        return TIER_2
    else:
        return TIER_3
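The pseudocode above could be turned into runnable Python along the following lines. This is a sketch under stated assumptions, not the skill's actual implementation: the `Task` dataclass, the string tier labels, and the stand-in `classify_task` (which reuses routine keywords such as "read" and "fetch") are all hypothetical, and signal detection is plain substring matching.

```python
from dataclasses import dataclass
from typing import Optional

TIERS = ["TIER_1", "TIER_2", "TIER_3"]
VISION_CAPABLE_MODEL = "VISION_CAPABLE_MODEL"  # placeholder, e.g. Kimi K2.5

COMPLEX_SIGNALS = ("debug", "architect", "design", "security")
MODERATE_SIGNALS = ("write", "code", "summarize", "analyze")
ROUTINE_SIGNALS = ("read", "fetch", "check", "list", "format", "status")


@dataclass
class Task:
    description: str
    requires_vision: bool = False
    failed_tier: Optional[str] = None  # tier of a previous failed attempt, if any

    def has_signal(self, *signals: str) -> bool:
        text = self.description.lower()
        return any(signal in text for signal in signals)


def next_tier_up(tier: str) -> str:
    """Move one tier up, staying at the top tier once it is reached."""
    return TIERS[min(TIERS.index(tier) + 1, len(TIERS) - 1)]


def classify_task(task: Task) -> str:
    """Stand-in classifier (assumption): routine keywords, else moderate."""
    return "ROUTINE" if task.has_signal(*ROUTINE_SIGNALS) else "MODERATE"


def select_model(task: Task) -> str:
    # Rule 1: vision override
    if task.requires_vision:
        return VISION_CAPABLE_MODEL
    # Rule 2: escalation override
    if task.failed_tier is not None:
        return next_tier_up(task.failed_tier)
    # Rule 3: explicit complexity signals
    if task.has_signal(*COMPLEX_SIGNALS):
        return "TIER_3"
    if task.has_signal(*MODERATE_SIGNALS):
        return "TIER_2"
    # Rule 4: default classification
    complexity = classify_task(task)
    if complexity == "ROUTINE":
        return "TIER_1"
    if complexity == "MODERATE":
        return "TIER_2"
    return "TIER_3"


if __name__ == "__main__":
    print(select_model(Task("Fetch the status page")))
    print(select_model(Task("Debug the flaky login test")))
```

The rule order matters: the vision and escalation overrides run before any keyword check, so a failed routine task moves up a tier even if its description still looks routine.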
When suggesting model changes, use clear language:
Downgrade suggestion:
"This looks like routine file work. Want me to spawn a sub-agent on DeepSeek for this? Same result, fraction of the cost."
Upgrade request:
"I'm hitting the limits of what I can figure out here. This needs Opus-level reasoning. Switching up."
Explaining hierarchy:
"I'm running the heavy analysis on Sonnet while sub-agents fetch the data on DeepSeek. Keeps costs down without sacrificing quality where it matters."
Assuming 100K tokens/day average usage:

| Strategy | Monthly Cost | Notes |
|----------|--------------|-------|
| Pure Opus | ~$225 | Maximum capability, maximum spend |
| Pure Sonnet | ~$45 | Good default for most work |
| Pure DeepSeek | ~$1 | Cheap but limited on hard problems |
| Hierarchy (80/15/5) | ~$19 | Best of all worlds |

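The 80/15/5 split: 80% of tokens go to Tier 1 (routine work), 15% to Tier 2 (moderate work), and 5% to Tier 3 (complex work).
Result: roughly 10x cost reduction vs pure premium (~$225 down to ~$19), with equivalent quality on complex tasks.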
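The Opus, Sonnet, and hierarchy figures can be approximately reproduced with a short calculation. Two assumptions are made here that the text does not state: a 30-day month, and every token billed at the output rate (a deliberately pessimistic simplification). The helper name `monthly_cost` is illustrative.

```python
TOKENS_PER_DAY = 100_000
DAYS_PER_MONTH = 30  # assumption: 30-day month

# Output prices in USD per million tokens, copied from the pricing tables.
# Assumption: all tokens are billed at the output rate.
OUTPUT_PRICE = {
    "DeepSeek V3": 0.28,
    "Claude Sonnet": 15.00,
    "Claude Opus": 75.00,
}


def monthly_cost(split: dict) -> float:
    """Monthly USD cost for a usage split mapping model name -> token share."""
    millions_per_month = TOKENS_PER_DAY * DAYS_PER_MONTH / 1_000_000
    return sum(
        millions_per_month * share * OUTPUT_PRICE[model]
        for model, share in split.items()
    )


if __name__ == "__main__":
    pure_opus = monthly_cost({"Claude Opus": 1.0})
    hierarchy = monthly_cost(
        {"DeepSeek V3": 0.80, "Claude Sonnet": 0.15, "Claude Opus": 0.05}
    )
    print(f"Pure Opus: ${pure_opus:.2f}")
    print(f"Hierarchy: ${hierarchy:.2f}")
    print(f"Reduction: {pure_opus / hierarchy:.1f}x")
```

Under these assumptions pure Opus comes to about $225 per month and the 80/15/5 hierarchy to just under $19, a reduction of roughly 12x. Most of the hierarchy's cost comes from the 5% of tokens sent to Opus, so the Tier 3 share is the number to watch.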
# config.yml - set default model
model: anthropic/claude-sonnet-4
# In session, switch models
/model opus # upgrade for complex task
/model deepseek # downgrade for routine
# Spawn sub-agent on cheap model
sessions_spawn:
  task: "Fetch and parse these 50 URLs"
  model: deepseek
OpenRouter (Tier 1 with vision or text-only):
# Tier 1 with vision — Kimi K2.5 (multimodal)
model: openrouter/moonshotai/kimi-k2.5
# Heartbeats, cron, image-involving tasks: K2.5 handles text and vision.
# Tier 1 text-only — GLM 5 (no vision)
# model: openrouter/z-ai/glm-5 # exact ID TBD on OpenRouter Z.AI
# Routine text-only only; for image tasks use Kimi K2.5 or another vision-capable model.
# In CLAUDE.md or project instructions
When spawning background agents, use claude-3-haiku for:
- File operations
- Simple searches
- Status checks
Reserve claude-sonnet-4 for:
- Code generation
- Analysis tasks
def get_model_for_task(task_description: str) -> str:
    """Pick a model ID from keyword signals in the task description."""
    routine_signals = ['read', 'fetch', 'check', 'list', 'format', 'status']
    complex_signals = ['debug', 'architect', 'design', 'security', 'why']

    desc_lower = task_description.lower()

    # Complex signals are checked first, so "check the security config" escalates.
    # Matching is by substring: 'read' also matches 'thread'.
    if any(signal in desc_lower for signal in complex_signals):
        return "claude-opus-4"
    elif any(signal in desc_lower for signal in routine_signals):
        return "deepseek-v3"
    else:
        return "claude-sonnet-4"
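Because the function above matches substrings, a description like "Refactor the thread pool" is routed to the cheap model through the "read" inside "thread". One possible refinement, shown here as a sketch rather than as part of the skill, anchors each keyword at the start of a word using Python's standard `re` module. The name `get_model_for_task_strict` is hypothetical; the signal lists and model IDs are the ones used above.

```python
import re

ROUTINE_SIGNALS = ["read", "fetch", "check", "list", "format", "status"]
COMPLEX_SIGNALS = ["debug", "architect", "design", "security", "why"]


def _compile(signals):
    # \b anchors the match at the start of a word, so "read" matches
    # "read" and "reading" but not "thread".
    pattern = r"\b(?:" + "|".join(re.escape(s) for s in signals) + r")"
    return re.compile(pattern, re.IGNORECASE)


ROUTINE_RE = _compile(ROUTINE_SIGNALS)
COMPLEX_RE = _compile(COMPLEX_SIGNALS)


def get_model_for_task_strict(task_description: str) -> str:
    """Keyword routing with word-start matching instead of substring matching."""
    if COMPLEX_RE.search(task_description):
        return "claude-opus-4"
    if ROUTINE_RE.search(task_description):
        return "deepseek-v3"
    return "claude-sonnet-4"


if __name__ == "__main__":
    print(get_model_for_task_strict("Refactor the thread pool"))
    print(get_model_for_task_strict("Debugging the auth flow"))
```

Anchoring only the start of the word keeps inflected forms such as "debugging" and "designing" working while dropping accidental hits in the middle of unrelated words.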
DON'T:
DO:
To customize for your use case: