Use when selecting AI models, configuring API parameters, or implementing LLM calls. Covers OpenAI (GPT-5.2, GPT-5.1, GPT-4.1, o3), Anthropic (Claude 4.5), Google (Gemini 2.5/3), DeepSeek (V3.2, R1), and embedding models with specs, gotchas, and code templates.
npx skillsauth add jaymay549/ai-model-selector ai-model-selector
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Comprehensive guide to selecting and implementing AI models. Updated January 2026.
What's your primary need?
│
├─► CODING/AGENTIC TASKS
│ ├─► Best quality → Claude Sonnet 4.5 or GPT-5.2-Codex
│ ├─► Complex reasoning → Claude Opus 4.5 (with effort param)
│ └─► Budget → DeepSeek-chat ($0.28/1M input)
│
├─► REASONING/MATH/SCIENCE
│ ├─► Maximum intelligence → GPT-5.2 Pro or Claude Opus 4.5
│ ├─► Good balance → GPT-5.2 (xhigh effort) or Gemini 2.5 Pro
│ └─► Budget → DeepSeek-reasoner (visible CoT)
│
├─► LONG DOCUMENTS (>200K tokens)
│ ├─► Up to 1M tokens → Claude Sonnet 4.5 (beta) or Gemini 2.5 Pro
│ ├─► Up to 400K → GPT-5.2
│ └─► Budget → DeepSeek-chat (128K)
│
├─► HIGH-VOLUME/LOW-LATENCY
│ ├─► Best speed → Claude Haiku 4.5
│ ├─► Cheapest → Gemini 2.5 Flash-Lite ($0.10/$0.40)
│ └─► Free tier → Gemini via AI Studio
│
├─► EMBEDDINGS/RAG
│ ├─► Best quality → Voyage 3.5 or voyage-3-large
│ ├─► Code-specific → voyage-code-3
│ ├─► Budget → text-embedding-3-small ($0.02/1M)
│ └─► Free → gemini-embedding-001
│
└─► MULTIMODAL (images/audio/video)
  ├─► Images → GPT-4o, Gemini 2.5 Pro/Flash, Claude 4.5
  ├─► Image generation → GPT Image 1, Imagen 4.0
  └─► Video generation → Veo 3.1
| Model | Context | Max Output | Input/Output $/1M | Best For |
|-------|---------|------------|-------------------|----------|
| GPT-5.2 | 400K | 128K | $1.75/$14 | Complex reasoning, coding |
| GPT-5.2 Pro | 400K | 128K | $21/$168 | Hardest problems |
| Claude Opus 4.5 | 200K | 64K | $5/$25 | Deep reasoning, agents |
| Claude Sonnet 4.5 | 200K (1M beta) | 64K | $3/$15 | Coding, balanced |
| Gemini 2.5 Pro | 1M | 64K | $1.25/$10 | Long context |
| Gemini 3 Pro | 1M | 64K | $2/$12 | Latest Google (preview) |
| Model | Context | Input/Output $/1M | Best For |
|-------|---------|-------------------|----------|
| Claude Haiku 4.5 | 200K | $1/$5 | Fast, high-volume |
| Gemini 2.5 Flash | 1M | $0.30/$2.50 | Large-scale processing |
| Gemini 2.5 Flash-Lite | 1M | $0.10/$0.40 | Cheapest cloud option |
| DeepSeek-chat | 128K | $0.28/$0.42 | 10x cheaper than GPT |
| GPT-4o-mini | 128K | $0.15/$0.60 | Simple tasks |
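To compare the tables above for a concrete workload, a small cost estimator can help. The sketch below hard-codes a few rows from the tables (USD per 1M tokens). `PRICES` and `estimateCostUSD` are hypothetical names, the map keys are plain labels rather than API model IDs, and the prices should be rechecked against each provider's pricing page before relying on them.

```javascript
// Prices in USD per 1M tokens, copied from the tables above.
// Keys are display labels for this sketch, not official API model IDs.
const PRICES = {
  "gpt-5.2": { input: 1.75, output: 14 },
  "claude-sonnet-4.5": { input: 3, output: 15 },
  "claude-haiku-4.5": { input: 1, output: 5 },
  "gemini-2.5-flash-lite": { input: 0.1, output: 0.4 },
  "deepseek-chat": { input: 0.28, output: 0.42 },
};

// Estimate the cost of one request (or a whole workload) in USD.
function estimateCostUSD(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  if (!price) throw new Error(`No price entry for model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}
```

For example, `estimateCostUSD("claude-haiku-4.5", 1_000_000, 1_000_000)` adds $1 of input to $5 of output. The estimate ignores caching discounts, batch discounts, and long-context surcharges.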
// WRONG - will error on GPT-5.2, o3, o4-mini
{
  temperature: 0.7, // ❌ Not supported
  top_p: 0.9, // ❌ Not supported
  max_tokens: 4096, // ❌ Use max_completion_tokens
}
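// CORRECT (Responses API, /v1/responses)
{
  reasoning: { effort: "high" }, // none, low, medium, high, xhigh
  text: { verbosity: "medium" }, // low, medium, high
  max_output_tokens: 4096 // Chat Completions instead uses max_completion_tokens and reasoning_effort
}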
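When existing Chat Completions code is pointed at a reasoning model, one option is to clean the parameters in a single place. The following is a minimal sketch, assuming Chat Completions-style request objects. `sanitizeChatParams` and `REASONING_MODEL_PREFIXES` are hypothetical names, and the prefix list simply mirrors the models named in the comment above.

```javascript
// Model ID prefixes treated as reasoning models in this sketch (assumption,
// taken from the models named above; extend as needed).
const REASONING_MODEL_PREFIXES = ["gpt-5.2", "o3", "o4-mini"];

function isReasoningModel(model) {
  return REASONING_MODEL_PREFIXES.some((prefix) => model.startsWith(prefix));
}

// Return a copy of Chat Completions params that is safe for reasoning models:
// drops temperature and top_p, and renames max_tokens to max_completion_tokens.
function sanitizeChatParams(params) {
  if (!isReasoningModel(params.model)) return { ...params };
  const { temperature, top_p, max_tokens, ...rest } = params;
  if (max_tokens !== undefined && rest.max_completion_tokens === undefined) {
    rest.max_completion_tokens = max_tokens;
  }
  return rest;
}
```

Non-reasoning models pass through unchanged, so the helper can wrap every call without branching at the call site.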
Claude Sonnet 4.5 requests over 200K input tokens (1M context beta): $6/$22.50 per 1M (applied automatically)
All major providers offer batch processing for non-urgent tasks:
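As one illustration, OpenAI's Batch API takes a JSONL file in which each line describes a single request. The sketch below only builds that file content. `buildBatchJsonl` is a hypothetical helper, the `request-N` IDs are an arbitrary naming choice, and uploading the file and creating the batch job are left out. Other providers use different batch formats.

```javascript
// Build JSONL content for a batch of single-turn chat requests.
// Each line carries a custom_id so results can be matched back to prompts.
function buildBatchJsonl(model, prompts) {
  return prompts
    .map((prompt, i) =>
      JSON.stringify({
        custom_id: `request-${i + 1}`,
        method: "POST",
        url: "/v1/chat/completions",
        body: { model, messages: [{ role: "user", content: prompt }] },
      })
    )
    .join("\n");
}
```

Results come back asynchronously and may arrive in a different order, which is why each line needs its own `custom_id`.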
Route simple queries to cheap models, complex to expensive:
Simple question → Haiku 4.5 ($1/$5)
Complex task → Sonnet 4.5 ($3/$15)
Hardest problems → Opus 4.5 ($5/$25)
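The tiering above can be sketched as a small routing function. The heuristic here (word count plus a few keywords) is only a placeholder for whatever classifier fits your traffic, and `pickModel` and `MODELS` are hypothetical names. The Sonnet ID is the one used elsewhere in this guide; the Haiku and Opus IDs are assumed aliases and should be checked against Anthropic's model list.

```javascript
// Model IDs per tier. Haiku and Opus IDs are assumptions for this sketch.
const MODELS = {
  simple: "claude-haiku-4-5",
  complex: "claude-sonnet-4-5-20250929",
  hardest: "claude-opus-4-5",
};

// Pick a model tier for a query. Callers can force the top tier with { hard: true }.
function pickModel(query, { hard = false } = {}) {
  if (hard) return MODELS.hardest;
  const wordCount = query.trim().split(/\s+/).length;
  // Placeholder heuristic: long prompts or engineering verbs count as complex.
  const looksComplex =
    wordCount > 50 || /\b(refactor|debug|analyze|design)\b/i.test(query);
  return looksComplex ? MODELS.complex : MODELS.simple;
}
```

A common refinement is to start on the cheap tier and escalate only when the answer fails a validation check, which keeps most traffic on the $1/$5 model.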
// Call the Responses API directly with fetch (any runtime with a global fetch)
const response = await fetch("https://api.openai.com/v1/responses", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "gpt-5.2",
    input: [{ role: "user", content: "Hello" }],
    reasoning: { effort: "medium" }
  })
});
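The call above returns a raw fetch response, so the generated text still has to be pulled out of the JSON body. A minimal sketch, assuming the body's `output` array contains `message` items whose `content` parts of type `output_text` carry the text; `extractOutputText` is a hypothetical helper, not part of any SDK.

```javascript
// Collect the generated text from a parsed Responses API body.
// Non-message items (for example reasoning items) are skipped.
function extractOutputText(body) {
  return (body.output ?? [])
    .filter((item) => item.type === "message")
    .flatMap((item) => item.content ?? [])
    .filter((part) => part.type === "output_text")
    .map((part) => part.text)
    .join("");
}
```

Typical usage would be `const text = extractOutputText(await response.json());`, ideally after checking `response.ok` so that error bodies are not parsed as results.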
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic();
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-5-20250929",
  max_tokens: 4096,
  messages: [{ role: "user", content: "Hello" }]
});
import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-2.5-flash" });
const result = await model.generateContent("Hello");
import OpenAI from 'openai';
const client = new OpenAI({
  baseURL: 'https://api.deepseek.com',
  apiKey: process.env.DEEPSEEK_API_KEY
});
const response = await client.chat.completions.create({
  model: 'deepseek-chat',
  messages: [{ role: 'user', content: 'Hello' }]
});
Last updated: January 28, 2026
Sources: Official documentation from OpenAI, Anthropic, Google, DeepSeek