.claude/skills/claude-opus-4-5-guide/SKILL.md
Comprehensive guide to Claude Opus 4.5, Anthropic's most intelligent model, with an effort parameter for reasoning control. Covers model capabilities, benchmarks, effort levels (high/medium/low), hybrid reasoning, and model selection. Use when working with Opus 4.5, optimizing reasoning depth, choosing models, or understanding effort parameter trade-offs.
Claude Opus 4.5, released November 24, 2025, is Anthropic's most capable and intelligent model. It pairs state-of-the-art reasoning with an "effort parameter" that lets you trade reasoning depth against token consumption on each request.
Key Positioning: Anthropic positions Opus 4.5 as the best model in the world for complex coding, autonomous agents, computer use automation, and advanced reasoning tasks. It is intended to complete tasks where previous models required manual intervention, and to cover work that previously demanded multiple specialized models.
What Makes Opus 4.5 Different:
Opus 4.5 is more than an incremental improvement: it shifts work down a model tier. Tasks that previously required Opus 4.1 can now run on Sonnet, with Opus 4.5 reserved for complex work. Budget-conscious teams can adjust Opus 4.5's effort parameter instead of maintaining multiple model versions.
Use claude-opus-4-5-guide when you need to:
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-5-20251101",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Explain how quantum computing could solve the traveling salesman problem"
        }
    ]
)
print(response.content[0].text)
The effort parameter is an exclusive Opus 4.5 feature that controls reasoning thoroughness:
import anthropic

client = anthropic.Anthropic()

# High effort (default): Maximum capability for complex tasks
response = client.beta.messages.create(
    model="claude-opus-4-5-20251101",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Design a distributed cache system"}],
    output_config={"effort": "high"},
    betas=["effort-2025-11-24"]
)

# Medium effort: Balanced efficiency for typical tasks
response = client.beta.messages.create(
    model="claude-opus-4-5-20251101",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this meeting transcript"}],
    output_config={"effort": "medium"},
    betas=["effort-2025-11-24"]
)

# Low effort: Quick responses for simple tasks
response = client.beta.messages.create(
    model="claude-opus-4-5-20251101",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Is this email spam?"}],
    output_config={"effort": "low"},
    betas=["effort-2025-11-24"]
)
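The three calls above differ only in the effort value, so the shared arguments can be factored into a small helper. The following is a minimal sketch, not part of the Anthropic SDK: `build_effort_request` and `VALID_EFFORTS` are hypothetical names, while the model ID, the `output_config` shape, and the beta flag are copied from the examples above.

```python
# Hypothetical helper: builds the keyword arguments used in the examples above.
VALID_EFFORTS = ("high", "medium", "low")


def build_effort_request(prompt, effort="high", max_tokens=1024):
    """Return kwargs for client.beta.messages.create with the given effort."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return {
        "model": "claude-opus-4-5-20251101",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
        "output_config": {"effort": effort},
        "betas": ["effort-2025-11-24"],
    }
```

The returned dictionary can be unpacked into the same call, for example `client.beta.messages.create(**build_effort_request("Is this email spam?", effort="low"))`. Validating the effort value locally catches typos before a request is sent.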
| Specification | Opus 4.5 | Opus 4.1 | Sonnet 4.5 | Haiku 4.5 |
|---|---|---|---|---|
| Model ID | claude-opus-4-5-20251101 | claude-opus-4-1-20250805 | claude-sonnet-4-5-20250929 | claude-haiku-4-5-20251001 |
| Intelligence | ★★★★★ | ★★★★☆ | ★★★★☆ | ★★★☆☆ |
| Speed | Fast/Instant | Fast | Instant | Instant |
| Input Cost (per 1M tokens) | $5 | $15 | $3 | $1 |
| Output Cost (per 1M tokens) | $25 | $75 | $15 | $5 |
| Context Window | 200K | 200K | 200K | 200K |
| Effort Parameter | ✓ | ✗ | ✗ | ✗ |
| Knowledge Cutoff | March 2025 | January 2025 | November 2024 | August 2024 |
| Best For | Complex reasoning, coding, agents | Maximum capability (expensive) | Fast, capable tasks | Speed-critical, simple tasks |
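To make the pricing rows concrete, the sketch below estimates the cost of a single request from its token counts. It assumes the per-million-token prices listed in the table for Opus 4.5 and Sonnet 4.5. `PRICES_PER_MTOK` and `estimate_cost_usd` are illustrative names, and actual billing may differ.

```python
# Assumed prices in USD per 1M tokens (input, output), taken from the table above.
PRICES_PER_MTOK = {
    "claude-opus-4-5-20251101": (5.00, 25.00),
    "claude-sonnet-4-5-20250929": (3.00, 15.00),
}


def estimate_cost_usd(model, input_tokens, output_tokens):
    """Estimate the cost of one request from its input and output token counts."""
    input_price, output_price = PRICES_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: 200K input tokens and 50K output tokens on each model.
opus_cost = estimate_cost_usd("claude-opus-4-5-20251101", 200_000, 50_000)
sonnet_cost = estimate_cost_usd("claude-sonnet-4-5-20250929", 200_000, 50_000)
```

With these assumed prices the example request comes to about $2.25 on Opus 4.5 and about $1.35 on Sonnet 4.5. Output tokens dominate the total, so lowering output volume is the most direct way to cut cost.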
Opus 4.5 achieves state-of-the-art results across benchmark suites:
| Benchmark | Score | Category | Significance |
|---|---|---|---|
| SWE-bench | 80.9% | Code | Autonomous coding: solves real GitHub issues |
| MMLU | 92.3% | Knowledge | Broad knowledge across domains |
| GPQA Diamond | 83.1% | Reasoning | Graduate-level science questions |
| HumanEval | 95%+ | Code | Python function implementation |
| Coding Contests | Top 10% | Code | Real programming competitions |

These scores demonstrate Opus 4.5 can handle autonomous coding tasks, complex reasoning, and knowledge-intensive applications previously requiring human expertise.
| Effort | Token Impact | Use Case | Example |
|---|---|---|---|
| High | Baseline (default) | Maximum quality needed, cost secondary | Complex system design, difficult debugging |
| Medium | ~20-40% reduction | Typical production use, balanced efficiency | General coding, research, most tasks |
| Low | ~50-70% reduction | Speed/cost priority, simple tasks | Classification, summarization, lookups |
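As a rough planning aid, the reduction ranges in the table can be turned into an expected token range for a given high-effort baseline. This is a sketch that assumes the table's approximate percentages apply to your workload, which should be confirmed by measurement. `EFFORT_REDUCTION_PCT` and `estimated_token_range` are hypothetical names.

```python
# Approximate token reduction per effort level, as (min %, max %) from the table.
EFFORT_REDUCTION_PCT = {
    "high": (0, 0),      # baseline
    "medium": (20, 40),
    "low": (50, 70),
}


def estimated_token_range(baseline_tokens, effort):
    """Return (fewest, most) expected tokens relative to a high-effort baseline."""
    min_cut, max_cut = EFFORT_REDUCTION_PCT[effort]
    fewest = baseline_tokens * (100 - max_cut) // 100
    most = baseline_tokens * (100 - min_cut) // 100
    return fewest, most
```

For a task that uses 10,000 tokens at high effort, `estimated_token_range(10_000, "medium")` returns `(6000, 8000)` and `estimated_token_range(10_000, "low")` returns `(3000, 5000)`.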
Best Practice: Default to high effort, then measure and decrease strategically for known use cases.
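One way to apply this practice is to record quality and token usage at each effort level on a representative sample, then choose the lowest level that still meets your quality bar. The sketch below assumes you already have such measurements. The `Measurement` type, the `pick_effort` function, the 0-to-1 quality scale, and the sample numbers are all hypothetical.

```python
from typing import Dict, NamedTuple


class Measurement(NamedTuple):
    avg_output_tokens: int  # average tokens observed at this effort level
    quality: float          # score from your own evaluation, 0.0 to 1.0


# Cheapest level first, so the first acceptable level is the lowest effort.
EFFORT_ORDER = ("low", "medium", "high")


def pick_effort(measurements: Dict[str, Measurement], min_quality: float) -> str:
    """Return the lowest effort level whose measured quality meets the bar."""
    for effort in EFFORT_ORDER:
        result = measurements.get(effort)
        if result is not None and result.quality >= min_quality:
            return effort
    return "high"  # fall back to the default when nothing qualifies


# Hypothetical measurements for one use case.
sample = {
    "high": Measurement(avg_output_tokens=4000, quality=0.95),
    "medium": Measurement(avg_output_tokens=2800, quality=0.93),
    "low": Measurement(avg_output_tokens=1500, quality=0.81),
}
```

With this sample data, `pick_effort(sample, 0.90)` returns `"medium"`, while a stricter bar of `0.99` falls back to `"high"`.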
Opus 4.5 is available on:
For deeper dives into specific topics:
For complete effort parameter documentation, API syntax details, and code examples in both Python and TypeScript, see references/effort-parameter-guide.md.
For model selection decision matrix and detailed capability comparisons, see references/model-selection-guide.md.
For full benchmark results and performance demonstrations, see references/model-capabilities.md.