Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

alinaqi/skills/model-routing

Name: skills/model-routing
Author: alinaqi

skills/model-routing/SKILL.md

npx skillsauth add alinaqi/claude-bootstrap skills/model-routing


Model Routing System

How Routing Decisions Are Made

Every user prompt goes through a 9-tier classification pipeline before any AI model processes it. The system answers three questions:

1. Which model should handle this? 9-tier cost/complexity classification.
2. Is the classifier itself working? Cascading fallback (qwen3 → kimi → deepseek → cache).
3. Can we verify the result? Tool-level fallback plus auto-evaluation.

The Pipeline

User types prompt
    ↓
UserPromptSubmit hook fires (~/.claude/hooks/route-task-hook)
    ↓
Classifier: qwen3 (local, free) classifies into tier
    ↓  (fails?)
Classifier: kimi (local, free) retries
    ↓  (fails?)
Classifier: deepseek-flash (~$0.0001) retries
    ↓  (fails?)
Classifier: cached tier from last success
    ↓
Hook injects routing decision into Claude's context
    ↓
Claude delegates to the right model or handles directly
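The hook step in the pipeline above can be sketched as follows. This is a minimal illustration, not the skill's actual route-task-hook: it assumes the hook receives a JSON payload with a `prompt` field on stdin and that whatever it prints is added to Claude's context. The `ROUTING:` line format and the `keyword_classifier` stand-in are hypothetical.

```python
import json


def build_context(hook_input: str, classify) -> str:
    """Turn the hook's JSON payload into a one-line routing hint."""
    payload = json.loads(hook_input)
    prompt = payload.get("prompt", "")
    tier = classify(prompt)  # assumed interface: prompt -> int tier in 0..8
    if tier == 8:
        return "ROUTING: tier 8, handle directly"
    return f"ROUTING: tier {tier}, delegate"


def keyword_classifier(prompt: str) -> int:
    """Toy stand-in for the real classifier; not the skill's logic."""
    return 0 if prompt.lstrip().startswith(("grep", "find", "ls")) else 8


# A real hook would read sys.stdin and print the result, for example:
#   print(build_context(sys.stdin.read(), keyword_classifier))
```

Keeping the classifier as a parameter means the same wrapper works whichever level of the fallback chain ends up answering.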

9-Tier Routing Table

| Tier | Model | Input (per 1M tokens) | Output (per 1M tokens) | Handles |
|------|-------|-----------------------|------------------------|---------|
| 0 | Qwen3 (local) | $0 | $0 | grep, find, shell, syntax, log reading |
| 1 | Gemini 2.5 Flash-Lite | $0.10 | $0.40 | Bulk extraction, classification, CIG pipelines |
| 2 | DeepSeek V4 Flash | $0.14 | $0.28 | Simple code, CRUD, test writing, small fixes |
| 3 | DeepSeek V4 Pro | $0.44 | $0.87 | Multi-file features, refactors, debugging (~80% of work) |
| 4 | Gemini 2.5 Flash | $0.15 | $0.60 | Multimodal (images, video, audio), brand analysis |
| 5 | Kimi K2.6 | $0.60 | $2.50 | Code review, commit messages, diff summaries |
| 6 | Gemini 3.1 Pro + Search | $1.25 | $10.00 | Deep research, Google grounding, 2M context |
| 7 | Codex | varies | varies | Bulk generation, code review |
| 8 | Claude Sonnet/Opus | $3-5 | $15-25 | Architecture, security, quality-critical |
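To make the price gap between tiers concrete, here is a small worked example using the per-million-token prices from the table. The token counts are made up for illustration, tier 7 is left out because its price is listed as "varies", and tier 8 uses the low end of its listed range, which is a simplification.

```python
# (input, output) prices in USD per million tokens, copied from the table.
PRICES = {
    0: (0.00, 0.00),
    1: (0.10, 0.40),
    2: (0.14, 0.28),
    3: (0.44, 0.87),
    4: (0.15, 0.60),
    5: (0.60, 2.50),
    6: (1.25, 10.00),
    8: (3.00, 15.00),  # low end of the $3-5 / $15-25 range
}


def estimate_cost(tier: int, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the given tier."""
    input_price, output_price = PRICES[tier]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 200k input tokens and 50k output tokens.
tier3 = estimate_cost(3, 200_000, 50_000)
tier8 = estimate_cost(8, 200_000, 50_000)
print(f"tier 3: ${tier3:.4f}, tier 8: ${tier8:.4f}")
```

For this workload the tier 3 estimate comes to roughly $0.13 against roughly $1.35 at tier 8, about a tenfold difference.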

Delegation Commands

When the hook says "delegate to X", run the matching command and return its output:

# Tier 0 — Qwen3
~/bin/qwen3 "prompt"

# Tier 1 — Gemini Flash-Lite
~/bin/gemini --flash-lite "prompt"

# Tier 2 — DeepSeek Flash
~/bin/deepseek --flash "prompt"

# Tier 3 — DeepSeek Pro
~/bin/deepseek --pro "prompt"

# Tier 4 — Gemini Flash
~/bin/gemini --flash "prompt"

# Tier 5 — Kimi
~/bin/kimi --quiet -p "prompt"

# Tier 6 — Gemini Pro Search
~/bin/gemini --pro-search "prompt"

# Tier 7 — Codex
codex exec "prompt"

# Tier 8 — Claude
# Handle directly (no delegation)
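The tier-to-command mapping above could also be driven from code. The sketch below is one possible wrapper, not part of the skill: `DELEGATES`, `build_command`, and `delegate` are hypothetical names, and the 120-second timeout is an arbitrary assumption.

```python
import os
import subprocess

# Tier -> argv prefix, transcribed from the commands above. Tier 8 is absent
# because Claude handles those prompts directly.
DELEGATES = {
    0: ["~/bin/qwen3"],
    1: ["~/bin/gemini", "--flash-lite"],
    2: ["~/bin/deepseek", "--flash"],
    3: ["~/bin/deepseek", "--pro"],
    4: ["~/bin/gemini", "--flash"],
    5: ["~/bin/kimi", "--quiet", "-p"],
    6: ["~/bin/gemini", "--pro-search"],
    7: ["codex", "exec"],
}


def build_command(tier, prompt):
    """Return the argv list for a tier, or None when there is no delegate."""
    prefix = DELEGATES.get(tier)
    if prefix is None:
        return None
    return [os.path.expanduser(prefix[0]), *prefix[1:], prompt]


def delegate(tier, prompt, timeout=120):
    """Run the delegate and return its stdout; None means handle directly."""
    argv = build_command(tier, prompt)
    if argv is None:
        return None
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return result.stdout
```

Passing the prompt as a separate argv element avoids shell quoting problems when the prompt contains quotes or newlines.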

Delegation Script Contract

Every ~/bin/ script follows the same pattern:

- Accepts prompt as argument: script "what is 2+2"
- Model flags: --flash, --pro, --flash-lite, --pro-search
- Quiet mode: --quiet (where applicable)
- Output: writes response to stdout, errors to stderr
- Exit codes: 0 on success, non-zero on failure
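A script that honors this contract might look like the skeleton below. It is a sketch under stated assumptions: `call_backend` is a hypothetical placeholder for the real API call, only two of the model flags are wired up, and treating `--quiet` as "suppress the diagnostic line on stderr" is a guess at its meaning.

```python
import argparse
import sys


def call_backend(model: str, prompt: str) -> str:
    """Hypothetical placeholder; a real script would call its API here."""
    return f"[{model}] {prompt}"


def main(argv, backend=call_backend) -> int:
    parser = argparse.ArgumentParser(prog="delegate")
    parser.add_argument("prompt")
    models = parser.add_mutually_exclusive_group()
    models.add_argument("--flash", dest="model", action="store_const", const="flash")
    models.add_argument("--pro", dest="model", action="store_const", const="pro")
    parser.add_argument("--quiet", action="store_true")
    parser.set_defaults(model="flash")
    args = parser.parse_args(argv)

    try:
        answer = backend(args.model, args.prompt)
    except Exception as exc:
        # Contract: errors go to stderr and the exit code is non-zero.
        print(f"error: {exc}", file=sys.stderr)
        return 1

    if not args.quiet:
        print(f"model: {args.model}", file=sys.stderr)
    print(answer)  # Contract: the response goes to stdout.
    return 0


# As a script entry point: sys.exit(main(sys.argv[1:]))
```

Keeping diagnostics on stderr means a caller can capture stdout and get the model's response without extra filtering.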

Available Scripts

~/bin/
├── qwen3       # Shell: curl to local Ollama API
├── kimi        # Shell: execs Kimi CLI binary
├── deepseek    # Python: httpx to DeepSeek Anthropic-compat API
├── gemini      # Python: httpx to Gemini OpenAI-compat API
├── research    # Python: multi-backend research with auto-evaluation
└── route-task  # Shell: qwen3-powered task classification

Classifier Fallback Chain

The classifier itself can fail. When it does, cascading fallback kicks in:

| Level | Classifier | Cost | Timeout / notes |
|-------|------------|------|-----------------|
| 1 | qwen3 (Ollama) | $0 | 2s connect, 8s classify |
| 2 | kimi CLI | $0 | Local process |
| 3 | deepseek-flash | ~$0.0001 | API call |
| 4 | Cached tier | $0 | From ~/.claude/routing-cache.json |

The cache (~/.claude/routing-cache.json) stores the last successful tier and its timestamp. After context compaction, when Ollama may be briefly unreachable, the cached tier keeps routing working instead of defaulting every prompt to Claude.
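The cascade and the cache can be sketched together. This is an illustration rather than the route-task implementation: the `(name, callable)` classifier interface and the cache's JSON field names (`tier`, `timestamp`) are assumptions.

```python
import json
import time
from pathlib import Path


def classify_with_fallback(prompt, classifiers, cache_path):
    """Return (tier, source).

    `classifiers` is an ordered list of (name, callable) pairs; each
    callable returns an int tier or raises on failure.
    """
    cache_path = Path(cache_path)
    for name, classify in classifiers:
        try:
            tier = int(classify(prompt))
        except Exception:
            continue  # this level failed; fall through to the next one
        # Remember the last success so a later outage can reuse it.
        cache_path.write_text(json.dumps({"tier": tier, "timestamp": time.time()}))
        return tier, name

    try:
        cached = json.loads(cache_path.read_text())
        return int(cached["tier"]), "cache"
    except (OSError, ValueError, KeyError, TypeError):
        return None, "none"  # nothing usable; the caller decides what to do
```

Returning the source alongside the tier lets a routing log record which level actually answered.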

Tool Fallback Protocol

When Claude's built-in tools fail, external backends take over:

| Failed Tool | Fallback 1 | Fallback 2 |
|-------------|------------|------------|
| WebSearch / WebFetch | ~/bin/research "query" | ~/bin/deepseek --pro "query" |
| Read / file access | cat via Bash | (none) |
| Grep | grep -r via Bash | (none) |

Research Tool (`~/bin/research`)

Multi-backend research with auto-evaluation:

- Tries deepseek-flash → deepseek-pro in sequence
- Scores results 0-10 on content quality, structure, length
- Auto-adjusts preferred backend based on evaluation scores
- View stats: ~/bin/research --eval
- Score log: ~/.claude/research-eval.jsonl
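One way the "auto-adjusts preferred backend" step could work is to pick the backend with the highest mean score in the eval log. The sketch below assumes each JSONL record carries `backend` and `score` fields; the source does not show the actual schema of research-eval.jsonl or the real selection rule.

```python
import json
from collections import defaultdict


def preferred_backend(jsonl_lines, default="deepseek-flash"):
    """Return the backend with the highest mean score, or `default`."""
    totals = defaultdict(lambda: [0.0, 0])  # backend -> [score sum, count]
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        entry = totals[record["backend"]]
        entry[0] += float(record["score"])
        entry[1] += 1
    if not totals:
        return default
    return max(totals, key=lambda name: totals[name][0] / totals[name][1])


sample = [
    '{"backend": "deepseek-flash", "score": 4}',
    '{"backend": "deepseek-pro", "score": 8}',
    '{"backend": "deepseek-flash", "score": 6}',
]
print(preferred_backend(sample))
```

In the sample, deepseek-flash averages 5 and deepseek-pro averages 8, so deepseek-pro is preferred.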

Maggy Integration

Maggy's model_router.py mirrors the same 9-tier structure in DEFAULT_TIERS. The PiAdapter uses the same delegation scripts for execution. Task type overrides in routing_rules_defaults.py ensure:

- research, competitor → Gemini Pro Search (Google grounding)
- bulk → Gemini Flash-Lite (cheapest)
- security, architecture, planning → Claude (quality-critical)
- docs, tests → DeepSeek Pro (cost-efficient)
- review → Claude (security + architecture depth)
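The overrides listed above amount to a lookup that takes precedence over the classifier's choice. The following sketch shows that idea only; `TASK_OVERRIDES` and `resolve_model` are hypothetical names and do not reproduce the contents of routing_rules_defaults.py.

```python
# Task type -> model, transcribed from the list above.
TASK_OVERRIDES = {
    "research": "Gemini Pro Search",
    "competitor": "Gemini Pro Search",
    "bulk": "Gemini Flash-Lite",
    "security": "Claude",
    "architecture": "Claude",
    "planning": "Claude",
    "docs": "DeepSeek Pro",
    "tests": "DeepSeek Pro",
    "review": "Claude",
}


def resolve_model(task_type: str, classified_model: str) -> str:
    """Apply a task-type override if one exists, else keep the classifier's pick."""
    return TASK_OVERRIDES.get(task_type.lower(), classified_model)
```

For example, a task typed `security` resolves to Claude even if the classifier suggested a cheaper tier, while an unlisted task type keeps the classifier's choice.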

Environment

# Required for delegation scripts (in ~/.zshrc)
export DEEPSEEK_API_KEY="sk-..."
export GEMINI_API_KEY="..."       # For gemini delegator
export OPENAI_API_KEY="sk-..."    # For codex CLI

# Ollama must be running locally for qwen3
ollama serve  # or launch at startup
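A quick preflight check can catch a missing key or a stopped Ollama before routing starts. This is an optional sketch, not part of the skill: the function names are hypothetical, the port is Ollama's default (11434), and the 2-second timeout simply mirrors the connect timeout in the fallback table.

```python
import os
import socket

REQUIRED_KEYS = ("DEEPSEEK_API_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def ollama_reachable(host="127.0.0.1", port=11434, timeout=2.0):
    """True if something accepts TCP connections on Ollama's default port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A TCP connect only shows that the port is open; it does not confirm that the qwen3 model is pulled and loaded.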

Observability

- Routing log: ~/.claude/routing-log.jsonl (every classification with tier, classifier used, tokens saved)
- Routing cache: ~/.claude/routing-cache.json (last tier for post-compact recovery)
- Research eval: ~/.claude/research-eval.jsonl (per-query backend scoring)
- Maggy routing heatmap: Dashboard → Models tab → per-model reward scores
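Because the routing log is JSONL, a short script is enough to see how prompts are distributed across tiers and classifiers. The sketch assumes each record has `tier` and `classifier` fields; the exact field names in routing-log.jsonl are not shown in the source.

```python
import json
from collections import Counter


def summarize_routing_log(jsonl_lines):
    """Count log records per tier and per classifier."""
    by_tier, by_classifier = Counter(), Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written lines
        by_tier[record.get("tier")] += 1
        by_classifier[record.get("classifier")] += 1
    return by_tier, by_classifier


# Typical use (path from the list above):
#   with open(os.path.expanduser("~/.claude/routing-log.jsonl")) as fh:
#       tiers, classifiers = summarize_routing_log(fh)
```

A rising share of records attributed to the cache or to deepseek-flash would suggest the local classifiers are failing more often than expected.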


631 stars

tools

Updated May 17, 2026

$ install --global

skillsauth

npx skillsauth add alinaqi/claude-bootstrap skills/model-routing

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage (as displayed) |
|---------|-------------|--------|---------------------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 17, 2026, 3:55 AM (56.1s, 1 file scanned)


Related Skills

alinaqi/council-review
testing · Verified · Trusted · Community
Multi-model validation council: auto-validate plans, architecture changes, and PRs via validate-plan/review before executing.
693 · SKILL.md · Updated Jun 5, 2026

alinaqi/mnemos
development · Verified · Trusted · Community
Task-scoped memory lifecycle: typed MnemoGraph prevents lossy context compaction by treating facts/decisions/code-refs/handoffs as distinct node types with per-type eviction policies.
690 · SKILL.md · Updated Apr 11, 2026

alinaqi/code-review
development · Verified · Trusted · Community
Mandatory code reviews via /code-review before commits and deploys.
641 · SKILL.md · Updated Mar 20, 2026

alinaqi/skills/visual-validation
development · Verified · Trusted · Community
Visual Validation: Autonomous Screenshot Verification. Every UI change should be visually verified before it ships. Peekaboo captures pixel-accurate screenshots. The system compares before/after and flags visual regressions. No manual "looks good to me": the machine verifies what the machine built. Autonomous flow: static/* files modified (detected by auto-review-hook or E2E testkit) ↓ peekaboo image --mode screen → ~/.maggy/visual-verify/after-{ts}.png ↓ Compa…
635 · SKILL.md · Updated May 19, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


# Clone the repo
git clone https://github.com/alinaqi/claude-bootstrap.git

# Copy into Claude Code skills folder (global)
cp -r claude-bootstrap/skills/model-routing ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

alinaqi/claude-bootstrap

631 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
