
curiositech/llm-router

Name: llm-router
Author: curiositech

skills/llm-router/SKILL.md

npx skillsauth add curiositech/windags-skills llm-router


LLM Router

Selects the best-fit LLM for each task. Routing is the single biggest cost lever in multi-agent systems: intelligent routing saves 45-85% of spend while maintaining 95%+ of top-model quality.

When to Use

✅ Use for:

Deciding which model to call for a specific task
Assigning models to DAG nodes in agent workflows
Optimizing LLM API costs across a system
Building cascading try-cheap-first patterns

❌ NOT for:

Prompt engineering (use prompt-engineer)
Model fine-tuning or training
Comparing model architectures (academic research)

Routing Decision Tree

flowchart TD
  A{Task type?} -->|Classify / validate / format / extract| T1["Tier 1: Haiku, GPT-4o-mini (~$0.001)"]
  A -->|Write / implement / review / synthesize| T2["Tier 2: Sonnet, GPT-4o (~$0.01)"]
  A -->|Reason / architect / judge / decompose| T3["Tier 3: Opus, o1 (~$0.10)"]
  
  T1 --> Q1{Quality sufficient?}
  Q1 -->|Yes| Done1[Use cheap model]
  Q1 -->|No| T2
  
  T2 --> Q2{Quality sufficient?}
  Q2 -->|Yes| Done2[Use balanced model]
  Q2 -->|No| T3

Tier Assignment Table

| Task Type | Tier | Models | Cost/Call | Why This Tier |
|-----------|------|--------|-----------|---------------|
| Classify input type | 1 | Haiku, GPT-4o-mini | ~$0.001 | Deterministic categorization |
| Validate schema/format | 1 | Haiku, GPT-4o-mini | ~$0.001 | Mechanical checking |
| Format output / template | 1 | Haiku, GPT-4o-mini | ~$0.001 | Structured transformation |
| Extract structured data | 1 | Haiku, GPT-4o-mini | ~$0.001 | Pattern matching |
| Summarize text | 1-2 | Haiku → Sonnet | ~$0.001-0.01 | Short summaries: Haiku; nuanced: Sonnet |
| Write content/docs | 2 | Sonnet, GPT-4o | ~$0.01 | Creative quality matters |
| Implement code | 2 | Sonnet, GPT-4o | ~$0.01 | Correctness + style |
| Review code/diffs | 2 | Sonnet, GPT-4o | ~$0.01 | Needs judgment, not just pattern matching |
| Research synthesis | 2 | Sonnet, GPT-4o | ~$0.01 | Multi-source reasoning |
| Decompose ambiguous problem | 3 | Opus, o1 | ~$0.10 | Requires deep understanding |
| Design architecture | 3 | Opus, o1 | ~$0.10 | Complex system reasoning |
| Judge output quality | 3 | Opus, o1 | ~$0.10 | Meta-reasoning about quality |
| Plan multi-step strategy | 3 | Opus, o1 | ~$0.10 | Long-horizon planning |

Three Routing Strategies

Strategy 1: Static Tier Assignment (Start Here)

Assign model by task type at DAG design time. No runtime logic. Gets 60-70% of possible savings.

nodes:
  - id: classify
    model: claude-haiku-4-5     # Tier 1: $0.001
  - id: implement
    model: claude-sonnet-4-5    # Tier 2: $0.01  
  - id: evaluate
    model: claude-opus-4-5      # Tier 3: $0.10
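
If the node list is generated in code rather than written by hand, the same static assignment can be expressed as a lookup. This is a minimal sketch, not part of the skill itself: the task-type keys, the `task_type` field, and the fallback to Tier 2 for unknown types are all assumptions made for illustration; the model IDs are the ones used in the YAML above.

```python
# Hypothetical task-type -> tier mapping, following the tier assignment table.
TASK_TIER = {
    "classify": 1, "validate": 1, "format": 1, "extract": 1,
    "write": 2, "implement": 2, "review": 2, "synthesize": 2,
    "decompose": 3, "architect": 3, "judge": 3, "plan": 3,
}

# Model IDs copied from the YAML example; swap in whatever you deploy.
TIER_MODEL = {
    1: "claude-haiku-4-5",
    2: "claude-sonnet-4-5",
    3: "claude-opus-4-5",
}


def assign_model(task_type: str, default_tier: int = 2) -> str:
    """Return the model for a task type; unknown types use default_tier (an assumption)."""
    tier = TASK_TIER.get(task_type, default_tier)
    return TIER_MODEL[tier]


def assign_models(nodes: list[dict]) -> list[dict]:
    """Return copies of DAG nodes with a 'model' key derived from 'task_type'."""
    return [{**node, "model": assign_model(node["task_type"])} for node in nodes]


nodes = assign_models([
    {"id": "classify", "task_type": "classify"},
    {"id": "implement", "task_type": "implement"},
    {"id": "evaluate", "task_type": "judge"},
])
```

Because the mapping is resolved once at design time, there is still no runtime routing logic.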

Strategy 2: Cascading (Try Cheap First)

Try the cheap model first; if its output scores below the quality threshold, escalate. This adds ~1s of latency but saves 50-80% on nodes where the cheap model succeeds.

1. Execute with Tier 1 model
2. Quick quality check (also Tier 1 — costs ~$0.001)
3. If quality ≥ threshold → done
4. If quality < threshold → re-execute with Tier 2

Best for nodes where you're genuinely unsure which tier is needed.
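
The four steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the skill's implementation: `run_model` and `score` are hypothetical callables you would supply (the first calls a model of a given tier, the second is the cheap quality check), and the 0.8 threshold is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadeResult:
    output: str
    tier: int      # tier that produced the returned output
    attempts: int  # number of model executions, excluding quality checks


def cascade(
    task: str,
    run_model: Callable[[int, str], str],  # hypothetical: (tier, task) -> output
    score: Callable[[str, str], float],    # hypothetical: (task, output) -> quality in [0, 1]
    threshold: float = 0.8,                # placeholder value
    tiers: tuple[int, ...] = (1, 2),
) -> CascadeResult:
    """Try tiers in order; return the first output whose score meets the threshold.

    The last tier's output is returned without a quality check, since there is
    nowhere left to escalate.
    """
    for attempt, tier in enumerate(tiers, start=1):
        output = run_model(tier, task)
        is_last = attempt == len(tiers)
        if is_last or score(task, output) >= threshold:
            return CascadeResult(output, tier, attempt)
    raise ValueError("tiers must not be empty")
```

Passing `tiers=(1, 2, 3)` extends the same loop to the full escalation path in the decision tree.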

Strategy 3: Adaptive (Learn from History)

Record success/failure per task type per model. Over time, the router learns:

"Classification nodes always succeed on Haiku" → stay cheap
"Code review nodes fail on Haiku 40% of the time" → upgrade to Sonnet
"Architecture nodes succeed on Sonnet 90% of the time" → don't need Opus

Reaches 75-85% savings once ~100 executions have been recorded.
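
One way to record outcomes and act on them is a per-(task type, tier) success counter. The sketch below is an assumption about how such a router might look, not the skill's own code: the class name, the 90% target, the 10-sample minimum, and the fallback rules are all illustrative choices.

```python
from collections import defaultdict


class AdaptiveRouter:
    """Picks the cheapest tier whose observed success rate clears a target."""

    def __init__(self, tiers=(1, 2, 3), target=0.9, min_samples=10):
        self.tiers = tiers              # ordered cheapest first
        self.target = target            # required success rate (placeholder)
        self.min_samples = min_samples  # observations needed before trusting a rate
        # (task_type, tier) -> [successes, total]
        self.stats = defaultdict(lambda: [0, 0])

    def record(self, task_type: str, tier: int, success: bool) -> None:
        entry = self.stats[(task_type, tier)]
        entry[0] += int(success)
        entry[1] += 1

    def choose(self, task_type: str, default_tier: int = 2) -> int:
        unproven = []
        for tier in self.tiers:
            successes, total = self.stats.get((task_type, tier), (0, 0))
            if total < self.min_samples:
                unproven.append(tier)        # not enough data yet
            elif successes / total >= self.target:
                return tier                  # cheapest tier with a proven record
        # No proven tier: use the static default, or the next untested tier above it.
        for tier in unproven:
            if tier >= default_tier:
                return tier
        return self.tiers[-1]


router = AdaptiveRouter()
for i in range(10):
    router.record("review", 1, success=i < 6)  # Haiku-class fails 40% of the time
chosen = router.choose("review")                # escalates to tier 2
```

Note that this sketch only learns about tiers that have actually been tried; data for cheaper tiers would have to come from somewhere, for example from cascading runs.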

Provider Selection

Once model tier is chosen, select the provider:

| Model Class | Provider Options | Selection Criteria |
|------------|-----------------|-------------------|
| Haiku-class | Anthropic, AWS Bedrock | Latency, regional availability |
| Sonnet-class | Anthropic, AWS Bedrock, GCP Vertex | Cost, rate limits |
| Opus-class | Anthropic | Only provider |
| GPT-4o-class | OpenAI, Azure OpenAI | Rate limits, compliance |
| Open-source | Ollama (local), Together.ai, Fireworks | Cost ($0), latency, GPU availability |

Cost Impact Example

10-node DAG, "refactor a codebase":

| Strategy | Mix | Cost | Savings |
|----------|-----|------|---------|
| All Opus | 10× $0.10 | $1.00 | — |
| All Sonnet | 10× $0.01 | $0.10 | 90% |
| Static tiers | 4× Haiku + 4× Sonnet + 2× Opus | $0.24 | 76% |
| Cascading | 6× Haiku + 3× Sonnet + 1× Opus | $0.14 | 86% |
| Adaptive (trained) | Dynamic | ~$0.08 | 92% |
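
The fixed-mix rows can be reproduced from the approximate per-call prices in the tier table. The sketch below is only that arithmetic; it appears to match how the table was computed, which means the cascading figure does not include the extra quality-check calls or any discarded cheap attempts. The function names are illustrative.

```python
# Approximate per-call prices from the tier assignment table.
PRICE = {"haiku": 0.001, "sonnet": 0.01, "opus": 0.10}


def dag_cost(mix: dict[str, int]) -> float:
    """Total cost of a DAG given how many nodes run on each model."""
    return sum(PRICE[model] * count for model, count in mix.items())


def savings_pct(cost: float, baseline: float) -> int:
    """Savings relative to the baseline, rounded to a whole percent."""
    return round(100 * (1 - cost / baseline))


baseline = dag_cost({"opus": 10})                            # $1.00
all_sonnet = dag_cost({"sonnet": 10})                        # $0.10
static = dag_cost({"haiku": 4, "sonnet": 4, "opus": 2})      # $0.244, shown as $0.24
cascading = dag_cost({"haiku": 6, "sonnet": 3, "opus": 1})   # $0.136, shown as $0.14

report = {
    "all_sonnet": savings_pct(all_sonnet, baseline),  # 90
    "static": savings_pct(static, baseline),          # 76
    "cascading": savings_pct(cascading, baseline),    # 86
}
```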

Anti-Patterns

Always Use the Best Model

Wrong: Route everything to Opus/o1 "for quality."
Reality: 60%+ of typical DAG nodes are classification, validation, or formatting, tasks where Haiku performs as well as Opus. Routing them to Opus burns money for no quality gain.

Always Use the Cheapest Model

Wrong: Route everything to Haiku "for cost."
Reality: Complex reasoning, architecture design, and quality judgment genuinely need stronger models. Haiku will produce plausible-looking but subtly wrong output on hard tasks.

Ignoring Latency

Wrong: Optimizing only for cost while ignoring that Opus takes 5-10x longer than Haiku.
Reality: In a 10-node DAG, model choice affects total execution time as much as it affects cost. Route time-critical paths to faster models.
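
To see which nodes are time-critical, it helps to compute the DAG's critical path: the longest chain of dependent nodes by latency. The sketch below uses Python's standard `graphlib` module. The four-node DAG and the relative latencies (Haiku-class = 1 unit, Opus-class = 8 units, picked from within the 5-10x range above) are made-up illustration values, not measurements.

```python
from graphlib import TopologicalSorter


def critical_path_latency(deps: dict[str, set[str]], latency: dict[str, float]) -> float:
    """Longest-path latency through a DAG; deps maps node -> its prerequisites."""
    finish: dict[str, float] = {}
    for node in TopologicalSorter(deps).static_order():
        # A node starts once its slowest prerequisite has finished.
        start = max((finish[d] for d in deps.get(node, ())), default=0.0)
        finish[node] = start + latency[node]
    return max(finish.values(), default=0.0)


# Hypothetical DAG: classify -> implement -> review, plus classify -> format.
deps = {"implement": {"classify"}, "review": {"implement"}, "format": {"classify"}}

# Relative latencies (illustrative): 1.0 = Haiku-class, 8.0 = Opus-class.
all_slow = {"classify": 1.0, "implement": 8.0, "review": 8.0, "format": 1.0}
faster_review = {**all_slow, "review": 1.0}

before = critical_path_latency(deps, all_slow)      # 17.0
after = critical_path_latency(deps, faster_review)  # 10.0
```

Swapping the model on `format` would not change the total at all, because that node is not on the critical path.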

No Feedback Loop

Wrong: Setting model tiers once and never adjusting them.
Reality: Models improve with every generation, so a task that needed Sonnet last month may work on Haiku today. Record outcomes and adapt.

Output Contract

This skill produces:

Model integration code with inference pipeline and input/output typing
Data preprocessing pipeline with validation, normalization, and feature extraction
Evaluation metrics with benchmarks and acceptance thresholds
Deployment configuration with model serving setup and resource requirements
Monitoring hooks for model performance tracking and drift detection


Selects the optimal LLM model and provider for each task based on complexity, cost budget, and capability requirements. Routes cheap tasks to Haiku/GPT-4o-mini and complex tasks to Sonnet/Opus/o1. Use when deciding which model to call, optimizing LLM costs, or building multi-model agent systems. Activate on "which model", "model selection", "route to model", "LLM cost", "model routing", "cheap vs expensive model". NOT for prompt engineering (use prompt-engineer), model fine-tuning, or training custom models.

development

Updated Apr 4, 2026


npx skillsauth add curiositech/windags-skills llm-router

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 4, 2026, 2:15 PM · 4.3s · 2 files scanned

SKILL.md

name: llm-router
license: Apache-2.0
description: Selects the optimal LLM model and provider for each task based on complexity, cost budget, and capability requirements. Routes cheap tasks to Haiku/GPT-4o-mini and complex tasks to Sonnet/Opus/o1. Use when deciding which model to call, optimizing LLM costs, or building multi-model agent systems. Activate on "which model", "model selection", "route to model", "LLM cost", "model routing", "cheap vs expensive model". NOT for prompt engineering (use prompt-engineer), model fine-tuning, or training custom models.
allowed-tools: Read
argument-hint: [task-description] [budget: low|medium|high]
category: AI & Machine Learning
- skill: prompt-engineer
  reason: Prompt complexity analysis determines which model tier the router should select




Install manually


# Clone the repo
git clone https://github.com/curiositech/windags-skills.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into it
cp -r windags-skills/skills/llm-router ~/.claude/skills/


Repository: curiositech/windags-skills

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT