skills/context-management/SKILL.md
Learn how to manage conversation context in AMCP to avoid LLM API errors from exceeding context windows. This skill covers SmartCompactor strategies, token estimation, configuration, and best practices.
npx skillsauth add tao12345666333/amcp context-managementInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
This skill teaches you how to proactively manage your conversation context in AMCP to avoid LLM API errors caused by exceeding context window limits. Context management is critical when:
grep results, file reads)Different LLM models have different context window sizes:
| Model Family | Context Window | |--------------|----------------| | GPT-4 Turbo / GPT-4o | 128,000 tokens | | GPT-4.1 | 1,000,000 tokens | | Claude 3.5 Sonnet | 200,000 tokens | | DeepSeek V3 | 64,000 tokens | | Gemini 2.0 Flash | 1,000,000 tokens | | Qwen 2.5 | 128,000 tokens |
AMCP automatically detects most model context windows. For unknown models, it uses 32,000 tokens as the default.
When managing context, track:
AMCP's SmartCompactor automatically compresses context when it exceeds the threshold:
# In agent.py (line 500-505):
compactor = SmartCompactor(client, model)
if compactor.should_compact(history_to_add):
history_to_add, _ = compactor.compact(history_to_add)
This happens automatically during conversation! You don't need to trigger it manually.
AMCP supports four strategies (configurable via CompactionConfig):
SUMMARY (default): Uses LLM to create intelligent summary of old messages
TRUNCATE: Simple removal of old messages, keeping first and last few
SLIDING_WINDOW: Keeps only the most recent messages that fit in target
HYBRID: Combines sliding window with summary of removed content
Instead of reading entire large files:
# BAD: Reads entire file
read_file(path="src/large_module.py")
# GOOD: Read specific sections
read_file(path="src/large_module.py", mode="indentation", offset=100, limit=50)
# GOOD: Read specific line ranges
read_file(path="src/large_module.py", mode="slice", ranges=["1-50", "200-250"])
Use read_file in indentation mode - it intelligently captures code blocks around your target, providing context without excessive content.
# BAD: Returns all matches with full context
grep(pattern="function", paths=["src/"])
# GOOD: Limited context
grep(pattern="function", paths=["src/"], context=2)
When processing multiple files:
# Instead of processing all files at once:
for file in files:
# Process one file at a time
result = process_file(file)
# Save intermediate results
After completing a complex task, consider suggesting:
"会话历史较长。如果开始新的无关任务,建议清除历史或创建新会话以减少上下文。"
grep to find what you need before reading filesYou can check context usage programmatically:
from amcp import SmartCompactor
# Create compactor
compactor = SmartCompactor(client, model="gpt-4-turbo")
# Get detailed usage info
usage = compactor.get_token_usage(messages)
print(f"Current: {usage['current_tokens']:,} tokens")
print(f"Usage: {usage['usage_ratio']:.1%} of context")
print(f"Headroom: {usage['headroom_tokens']:,} tokens")
print(f"Should compact: {usage['should_compact']}")
Context compaction is configured via CompactionConfig:
from amcp import SmartCompactor, CompactionConfig, CompactionStrategy
config = CompactionConfig(
strategy=CompactionStrategy.SUMMARY,
threshold_ratio=0.7, # Compact at 70% usage
target_ratio=0.3, # Aim for 30% after compaction
preserve_last=6, # Keep last 6 user/assistant messages
preserve_tool_results=True, # Preserve recent tool results
max_tool_results=10, # Max tool results to preserve
min_tokens_to_compact=5000, # Don't compact tiny contexts
safety_margin=0.1, # 10% margin for responses
)
You can configure compaction in ~/.config/amcp/config.toml:
[chat]
model = "deepseek-chat"
[compaction]
strategy = "summary" # summary, truncate, sliding_window, hybrid
threshold_ratio = 0.7
target_ratio = 0.3
When using SUMMARY or HYBRID strategies, the summary follows this structure:
<current_task>
What we're working on now - be specific about files and goals
</current_task>
<completed>
- Task 1: Brief outcome + key changes made
- Task 2: Brief outcome + key changes made
</completed>
<code_state>
Key files and their current state - signatures + key logic only
Include file paths that were modified
</code_state>
<important>
Any crucial context: errors, decisions made, constraints, blockers
</important>
AMCP provides accurate token estimation:
from amcp import estimate_tokens
tokens = estimate_tokens(messages)
tiktoken library when available (recommended)Symptom: LLM API returns error about context window being exceeded.
Causes:
grep or find results)Solutions:
grep --limit, read_file with ranges)Symptom: After compaction, agent forgets critical details.
Causes:
Solutions:
preserve_last to keep more recent messagesSymptom: Agent forgets specific code changes after compaction.
Solutions:
preserve_tool_results=True (default) to keep recent tool outputsmax_tool_results (default: 10)AMCP uses progressive disclosure to manage skill instructions:
# In skills.py (line 297):
skills_summary = skill_manager.build_skills_summary()
This means:
AMCP separates:
.amcp/memory/ (unlimited)Use the memory system to store important information that should persist long-term:
memory(action="write", content="# Project Notes\n- Uses PostgreSQL database")
This information is NOT in the conversation context - it's stored separately and retrieved when relevant.
For complex tasks with large context:
Start: Use grep or find to locate relevant files
grep(pattern="class User", paths=["src/"])
Explore: Read specific sections using indentation mode
read_file(path="src/models/user.py", mode="indentation", offset=1)
Process: Work incrementally, one file at a time
Monitor: If context gets large, compaction happens automatically
Persist: Save important findings to memory if needed for future sessions
memory(action="append", content="Found authentication bug in src/auth.py:45")
AMCP emits events when compaction occurs:
from amcp import get_event_bus, EventType
@get_event_bus().on(EventType.CONTEXT_COMPACTED)
async def on_compaction(event):
data = event.data
print(f"Context compacted: {data['original_tokens']} -> {data['compacted_tokens']}")
CompactionConfig for your use caseIf you notice:
Then suggest:
"会话历史较长。如果开始新的无关任务,建议清除历史或创建新会话以减少上下文。"
This is proactive context management!
tools
Send and edit Telegram messages via Bot API. Use when AMCP needs to send a message, reply to a specific message, edit an existing message, or push proactive notifications (cron results, heartbeat alerts, task status). Requires AMCP_TELEGRAM_BOT_TOKEN env var.
tools
Create or update AMCP skills. Use when designing, structuring, or packaging skills with scripts, references, and assets. This skill should be used when users want to create a new skill (or update an existing skill) that extends AMCP's capabilities with specialized knowledge, workflows, or tool integrations.
tools
Backup old AMCP sessions by renaming with execution date, then clean and compact sessions and memory.
testing
Periodic heartbeat check that reads HEARTBEAT.md from the workspace and executes any tasks listed there. Use for autonomous background monitoring, periodic maintenance, and proactive task execution. Triggered by a cron schedule.