.claude/skills/claude-context-management/SKILL.md
Comprehensive context management strategies for cost optimization and infinite-length conversations. Covers server-side clearing (tool results, thinking blocks), client-side SDK compaction (automatic summarization), and memory tool integration. Use when managing long conversations, optimizing token costs, preventing context overflow, or enabling continuous agentic workflows.
npx skillsauth add adaptationio/skrillz claude-context-managementInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Claude conversations can grow indefinitely, but context windows have limits. Context management strategies enable unlimited conversations while optimizing costs. This skill covers two complementary approaches: server-side clearing (API-managed) and client-side compaction (SDK-managed), plus integration with the memory tool for automatic context preservation.
The Problem: As conversations grow, token consumption increases. Without management:
The Solution: Automatic context editing and summarization strategies that preserve important information while reducing token consumption.
This skill is essential for:
Long-Running Conversations (>50K tokens accumulated)
Multi-Session Workflows
Token Cost Optimization
Tool-Heavy Applications
Memory-Augmented Applications
Hybrid Thinking Scenarios
Objectives:
Actions:
Analyze expected conversation length
Identify dominant content type
Determine session persistence
Decision Framework:
| Scenario | Strategy | Rationale |
|----------|----------|-----------|
| Immediate clearing needed, tool results dominate | Server-side (clear_tool_uses_20250919) | Results removed before Claude processes, minimal disruption |
| Extensive thinking blocks being generated | Server-side (clear_thinking_20251015) | Preserves recent reasoning, maintains cache hits |
| SDK context monitoring available | Client-side compaction | Automatic summarization on threshold |
| Both tool results and thinking | Combine both strategies | Thinking first, then tool clearing |
| Multi-session, knowledge accumulation | Add memory tool | Proactive preservation before clearing |
Selection Questions:
clear_tool_uses_20250919clear_thinking_20251015For Server-Side Clearing:
Choose trigger type:
input_tokens: Trigger when input accumulates (most common)tool_uses: Trigger when tool calls accumulateSet trigger value:
Define what to keep:
keep parameter: Most recent N items to preserveExclude important tools:
exclude_tools: Don't clear results from these tools["web_search"] (web search results often important)For Client-Side Compaction:
context_token_threshold (e.g., 100,000)summary_promptWhen to Add Memory:
Integration Pattern:
{"type": "memory_20250818", "name": "memory"}How It Works:
Monitoring Metrics:
Optimization Adjustments:
Validation Checklist:
Adjustment Process:
import anthropic
client = anthropic.Anthropic()
# Configure context management for tool result clearing
response = client.beta.messages.create(
model="claude-sonnet-4-5",
max_tokens=4096,
messages=[{"role": "user", "content": "Search for AI developments"}],
tools=[{"type": "web_search_20250305", "name": "web_search"}],
betas=["context-management-2025-06-27"],
context_management={
"edits": [
{
"type": "clear_tool_uses_20250919",
"trigger": {"type": "input_tokens", "value": 100000},
"keep": {"type": "tool_uses", "value": 3},
"clear_at_least": {"type": "input_tokens", "value": 5000},
"exclude_tools": ["web_search"]
}
]
}
)
print(response.content[0].text)
import anthropic
client = anthropic.Anthropic()
# Configure automatic summarization when tokens exceed threshold
runner = client.beta.messages.tool_runner(
model="claude-sonnet-4-5",
max_tokens=4096,
tools=[
{
"type": "text_editor_20250728",
"name": "file_editor",
"max_characters": 10000
}
],
messages=[{
"role": "user",
"content": "Review all Python files and summarize code quality issues"
}],
compaction_control={
"enabled": True,
"context_token_threshold": 100000
}
)
# Process until completion, automatic compaction on threshold
for event in runner:
if hasattr(event, 'usage'):
print(f"Current tokens: {event.usage.input_tokens}")
result = runner.until_done()
print(result.content[0].text)
import anthropic
client = anthropic.Anthropic()
# Enable both memory tool and context clearing
response = client.beta.messages.create(
model="claude-sonnet-4-5",
max_tokens=4096,
messages=[...],
tools=[
{
"type": "memory_20250818",
"name": "memory"
},
# Your other tools
],
betas=["context-management-2025-06-27"],
context_management={
"edits": [
{
"type": "clear_tool_uses_20250919",
"trigger": {"type": "input_tokens", "value": 100000}
}
]
}
)
# Claude will automatically receive warnings and can write to memory
| Feature | Server-Side Clearing | Client-Side Compaction | |---------|---------------------|----------------------| | Trigger | API detects threshold | SDK monitors after each response | | Action | Removes old content | Generates summary, replaces history | | Processing | Before Claude sees | After response, before next turn | | Control | Automatic | Requires SDK integration | | Language Support | All (Python, TypeScript, etc.) | Python + TypeScript only | | Customization | Trigger, keep, exclude tools | Threshold, model, summary prompt | | Cache Impact | May invalidate cache | Works with caching | | Summary Quality | N/A (deletion) | Claude-generated, customizable | | Memory Integration | Excellent (receives warnings) | Requires manual memory calls | | Best For | Tool-heavy workflows | Long multi-turn conversations | | Overhead | Minimal | Model call for summary generation |
Strategy 1: clear_tool_uses_20250919
Strategy 2: clear_thinking_20251015
Context Window: Maximum tokens available for input + output in a single request
Input Tokens: Accumulated message history size (grows with each turn)
Token Threshold: Configured limit triggering automatic clearing
Clearing: Automatic removal of old tool results to reduce input tokens
Compaction: Automatic summarization replacing full history with summary
Memory Tool: Persistent key-value storage accessible across sessions
Cache Integration: Prompt caching works with context management (preserve recent thinking)
context-management-2025-06-27context-management-2025-06-27All Claude 3.5+ models support context editing:
For detailed documentation on each strategy:
Server-Side Context Clearing → See references/server-side-context-editing.md
Client-Side Compaction SDK → See references/client-side-compaction-sdk.md
Memory Tool Integration → See references/memory-tool-integration.md
Context Optimization Workflow → See references/context-optimization-workflow.md
Last Updated: November 2025 Quality Score: 95/100 Citation Coverage: 100% (All claims from official Anthropic documentation)
development
Setup secure web-based terminal access to WSL2 from mobile/tablet via ttyd + ngrok/Cloudflare/Tailscale. One-command install, start, stop, status. Use when you need remote terminal access, web terminal, browser-based shell, or mobile access to WSL2 environment.
development
Complete development workflows where Claude writes the code while Gemini and Codex provide research, planning, reviews, and different perspectives. Claude remains the main developer. Use for complex projects requiring expert planning and multi-perspective reviews.
development
Systematic progress tracking for skill development. Manages task states (pending/in_progress/completed), updates in real-time, reports progress, identifies blockers, and maintains momentum. Use when tracking skill development, coordinating work, or reporting progress.
testing
Comprehensive testing workflow orchestrating functional testing, example validation, integration testing, and usability assessment. Sequential workflow for complete skill testing from examples through scenarios to integration validation. Use when conducting thorough testing, pre-deployment validation, ensuring skill functionality, or comprehensive quality checks.