skills/skillxiv-v0.0.2-claude-opus-4.6/acon-context-compression-long-horizon/SKILL.md
Compress agent interaction histories and environment observations through natural language guideline optimization, reducing token usage by 26-54% while preserving 95%+ accuracy. Use for cost/latency reduction in multi-step agent tasks.
npx skillsauth add ADu2021/skillXiv acon-context-compression-long-horizon
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
ACON addresses unbounded context growth in multi-step agents through compression guideline optimization in natural language space. Rather than parameter fine-tuning, the approach optimizes natural language instructions that guide what to compress, enabling closed-source API compatibility and rapid iteration.
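To make the idea of optimizing instructions rather than weights concrete, here is a minimal sketch of how a list of guidelines might be turned into a prompt for a compressor model. The function name `build_compression_prompt`, the prompt wording, and the `role`/`content` history format are all assumptions for illustration, not part of ACON itself.

```python
def build_compression_prompt(guidelines, history):
    """Assemble a compressor prompt from natural language guidelines.

    Hypothetical format: the guidelines are the only part that an
    optimizer would rewrite between iterations; no model weights change.
    """
    numbered = "\n".join(f"{i}. {g}" for i, g in enumerate(guidelines, start=1))
    transcript = "\n".join(f"[{turn['role']}] {turn['content']}" for turn in history)
    return (
        "Compress the interaction history below, following these guidelines:\n"
        f"{numbered}\n\n"
        "History:\n"
        f"{transcript}\n\n"
        "Compressed history:"
    )


guidelines = [
    "Remove intermediate steps that don't affect final decisions",
    "Abbreviate repetitive tool outputs",
]
history = [
    {"role": "agent", "content": "list_files('/docs')"},
    {"role": "tool", "content": "report.txt, notes.txt, draft.txt"},
]
prompt = build_compression_prompt(guidelines, history)
print(prompt)
```

Because the guidelines are plain text, the resulting prompt can be sent to any chat-completion API, which is what makes the approach usable with closed-source models.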
Set up the ACON compression optimizer:
# Initialize ACON for agent context compression
from acon import CompressionOptimizer, CompressionGuidelineManager

# Create compression guideline manager
guideline_manager = CompressionGuidelineManager(
    initial_guidelines=[
        "Remove intermediate steps that don't affect final decisions",
        "Abbreviate repetitive tool outputs",
        "Summarize multi-turn conversation threads",
    ],
    optimization_strategy="contrastive",
)

# Initialize optimizer
optimizer = CompressionOptimizer(
    agent_model="gpt-4",
    compression_model="gpt-4",  # can differ from agent
    guideline_manager=guideline_manager,
    max_iterations=10,
)
Execute contrastive guideline optimization:
# Optimization loop for compression guidelines.
# load_benchmark_tasks, evaluate, count_tokens, optimize_guidelines_with_lm, and
# GUIDELINE_OPTIMIZATION_PROMPT are assumed to be defined elsewhere.
import numpy as np
from acon import AgentExecution

num_optimization_iterations = 10  # assumed here to match max_iterations above

# Task set to optimize over (loaded once, reused every iteration)
test_tasks = load_benchmark_tasks()  # e.g., AppWorld, OfficeBench

for iteration in range(num_optimization_iterations):
    scores_by_guideline = {}  # keyed by task id
    test_executions = []      # full runs, kept for the guideline optimizer

    for task in test_tasks:
        # Execute agent with full uncompressed context
        full_execution = AgentExecution(
            agent_model="gpt-4",
            task=task,
        )
        full_result = full_execution.run()
        full_accuracy = evaluate(full_result, task.ground_truth)
        full_tokens = count_tokens(full_execution.trajectory)
        test_executions.append(full_execution)

        # Execute agent with current compression guidelines
        compressed_execution = AgentExecution(
            agent_model="gpt-4",
            task=task,
            compression_guidelines=guideline_manager.current_guidelines,
        )

        # Compressor applies guidelines to interaction history
        compressed_trajectory = compressed_execution.compress_context(
            full_trajectory=full_execution.trajectory,
            guidelines=guideline_manager.current_guidelines,
            compression_ratio_target=0.5,  # 50% of original
        )

        # Continue execution with compressed context
        compressed_result = compressed_execution.run(
            compressed_trajectory=compressed_trajectory
        )
        compressed_accuracy = evaluate(compressed_result, task.ground_truth)
        compressed_tokens = count_tokens(compressed_trajectory)

        # Evaluation metrics (guard against division by zero when the
        # uncompressed run itself scores 0)
        accuracy_retained = (
            compressed_accuracy / full_accuracy if full_accuracy > 0 else 1.0
        )
        token_reduction = 1 - (compressed_tokens / full_tokens)

        # Record score (reward for efficiency + penalty for accuracy loss)
        score = 0.6 * token_reduction - 0.4 * max(0, 1 - accuracy_retained)
        scores_by_guideline[task.id] = score

    # Optimize guidelines based on scores
    avg_score = np.mean(list(scores_by_guideline.values()))
    print(f"Iteration {iteration}: Avg score = {avg_score:.3f}")

    # Generate improved guidelines
    improved_guidelines = optimize_guidelines_with_lm(
        current_guidelines=guideline_manager.current_guidelines,
        evaluation_scores=scores_by_guideline,
        full_trajectories=[e.trajectory for e in test_executions],
        optimization_prompt=GUIDELINE_OPTIMIZATION_PROMPT,
    )
    guideline_manager.update_guidelines(improved_guidelines)
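The per-task score in the loop above can be isolated into a small standalone function to see how the efficiency reward and the accuracy penalty trade off. This is a sketch only: the function name `compression_score` and the example numbers are assumptions, while the 0.6 and 0.4 weights are taken from the loop.

```python
def compression_score(full_tokens, compressed_tokens,
                      full_accuracy, compressed_accuracy,
                      efficiency_weight=0.6, accuracy_weight=0.4):
    """Score one task: reward token savings, penalize lost accuracy."""
    token_reduction = 1 - compressed_tokens / full_tokens
    # If the uncompressed run scored 0, there is no accuracy to lose.
    if full_accuracy > 0:
        accuracy_retained = compressed_accuracy / full_accuracy
    else:
        accuracy_retained = 1.0
    # Gains in accuracy are not rewarded; only losses are penalized.
    accuracy_loss = max(0.0, 1 - accuracy_retained)
    return efficiency_weight * token_reduction - accuracy_weight * accuracy_loss


# Illustrative numbers: 40% fewer tokens, 95% of the accuracy retained.
score = compression_score(
    full_tokens=10_000, compressed_tokens=6_000,
    full_accuracy=0.80, compressed_accuracy=0.76,
)
print(round(score, 3))  # approximately 0.6 * 0.40 - 0.4 * 0.05 = 0.22
```

With these weights, a guideline set that saves tokens but costs accuracy still scores positively unless the accuracy loss is large, so the weights may need tuning for accuracy-critical tasks.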
When to use ACON:
When NOT to use:
Benchmark domains:
Hyperparameters:
Guidelines focus on:
Token reduction: 26-54% depending on domain
Accuracy retention: 95%+ maintained
Compressed guideline knowledge transfers to smaller models:
Gradient-free design enables:
The approach requires running each agent twice per task (once with full context, once with compressed context) during the optimization loop, which creates upfront compute overhead. That cost amortizes over many deployments, but it matters for one-shot tasks.
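A rough way to reason about the amortization is a break-even estimate: how many deployed tasks it takes for the tokens saved to cover the tokens spent on optimization. The sketch below uses entirely illustrative numbers (a 2,000,000-token optimization budget and 20,000 tokens per task are assumptions), and it ignores the compressor's own per-task token cost.

```python
import math


def break_even_tasks(optimization_tokens, tokens_per_task, token_reduction):
    """Number of deployed tasks needed to recoup the optimization cost."""
    saved_per_task = tokens_per_task * token_reduction
    if saved_per_task <= 0:
        raise ValueError("token_reduction must yield positive savings")
    return math.ceil(optimization_tokens / saved_per_task)


# Evaluate at both ends of the reported 26-54% reduction range.
low = break_even_tasks(2_000_000, 20_000, 0.26)
high = break_even_tasks(2_000_000, 20_000, 0.54)
print(low, high)
```

Under these assumed numbers the break-even point is a few hundred tasks, which illustrates why the overhead is hard to justify for a single run.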
Builds on prompt optimization, context compression, and agent efficiency literature.