skills/skillxiv-v0.0.2-claude-opus-4.6/experience-guided-reasoning-adaptation/SKILL.md
Dynamically adapt LLM reasoning strategies at inference time by curating episodic memory of past problem solutions—generate task-specific prompts, tool configs, and control logic for up to 111× cost reduction and 14% accuracy gains.
npx skillsauth add ADu2021/skillXiv experience-guided-reasoning-adaptation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Most inference-time reasoning optimization fixes prompts and parameters before deployment. Experience-Guided Reasoner (EGuR) treats reasoning strategy as dynamic: it maintains structured memory of past solutions and generates new strategies tailored to each problem's characteristics. Rather than modifying text inputs, EGuR produces complete computational procedures with custom prompts, sampling configs, tool selections, and control flow.
This approach achieves simultaneous improvements in accuracy (up to 14%) and efficiency (up to 111× cost reduction) by matching strategy intensity to problem difficulty—hard problems get more reasoning steps, easy ones get direct inference.
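To make the idea of a strategy as a complete procedure concrete, here is a minimal sketch of a strategy record and a hand-written difficulty rule. The `Strategy` class, its field names, the `pick_strategy` function, and the thresholds are all assumptions for illustration; EGuR generates strategies with a model rather than from fixed thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """One complete reasoning procedure (field names are illustrative)."""
    prompt_template: str
    temperature: float = 0.0
    top_k: int = 1
    tools: tuple = ()
    control_flow: str = "direct"  # e.g. "direct", "chain_of_thought", "tree_search"
    max_tokens: int = 256


def pick_strategy(difficulty: float) -> Strategy:
    """Toy rule: difficulty in [0, 1]; harder problems get a larger budget."""
    if difficulty < 0.3:
        return Strategy(prompt_template="Answer directly: {problem}")
    if difficulty < 0.7:
        return Strategy(
            prompt_template="Think step by step, then answer: {problem}",
            control_flow="chain_of_thought",
            max_tokens=1024,
        )
    return Strategy(
        prompt_template="Plan, use tools as needed, verify, then answer: {problem}",
        temperature=0.7,
        top_k=40,
        tools=("calculate", "search"),
        control_flow="tree_search",
        max_tokens=4096,
    )


easy = pick_strategy(0.1)
hard = pick_strategy(0.9)
print(easy.control_flow, easy.max_tokens)
print(hard.control_flow, hard.max_tokens)
```

The point of the sketch is only that prompt, sampling settings, tools, and control flow travel together as one unit, so an easy problem can skip the expensive parts entirely.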
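Reasoning strategies span multiple dimensions: textual prompts, sampling parameters (temperature, top-k), tool choices, and control structures (few-shot examples, chain-of-thought depth). Most systems fix all dimensions at deployment time. EGuR instead generates strategies adaptively using two components: a Guide, which generates candidate strategies conditioned on the problem and on retrieved experiences, and a consolidation step, which updates the strategy library and general notes from execution feedback.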
The key insight is that strategy generation is itself learnable and cacheable—successful strategies for similar problems can be retrieved, reducing synthesis cost on the critical path.
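The caching idea can be sketched in a few lines, assuming problem signatures are hashable. The names `get_strategy` and `fake_guide` are hypothetical, and the stand-in guide simply records how often it is called instead of querying a model.

```python
from typing import Callable, Dict, Hashable


def get_strategy(
    problem_sig: Hashable,
    cache: Dict[Hashable, dict],
    generate: Callable[[Hashable], dict],
) -> dict:
    """Return a cached strategy if one exists, else generate and cache it."""
    if problem_sig in cache:
        return cache[problem_sig]      # hit: no guide call needed
    strategy = generate(problem_sig)   # miss: pay the synthesis cost once
    cache[problem_sig] = strategy
    return strategy


calls = []


def fake_guide(sig):
    """Stand-in for an expensive guide-LLM call; records each invocation."""
    calls.append(sig)
    return {"control_flow": "chain_of_thought", "max_tokens": 512}


cache = {}
sig = ("math", "hard")
first = get_strategy(sig, cache, fake_guide)
second = get_strategy(sig, cache, fake_guide)
print(len(calls))  # the guide ran only once for two requests
```

An exact-match cache like this only helps when signatures repeat; how coarse the signature should be is a design choice this sketch does not settle.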
Step 1: Memory Structures. Initialize strategy library and notes for experience curation.
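
    import time


    class ExperienceMemory:
        def __init__(self, max_strategies=10000):
            self.strategy_library = {}  # {problem_sig: [entries]}
            self.general_notes = []     # High-level insights
            self.max_strategies = max_strategies
            self.access_count = {}      # Track usage frequency per signature

        def add_strategy(self, problem_sig, strategy, success, cost, accuracy):
            """
            Store a strategy and its outcome under the problem signature.
            problem_sig: hashable signature of the problem (domain, difficulty, type)
            strategy: {prompt_template, temperature, tools, max_tokens, control_flow}
            """
            key = problem_sig
            if key not in self.strategy_library:
                self.strategy_library[key] = []
            entry = {
                'strategy': strategy,
                'success': success,
                'cost': cost,
                'accuracy': accuracy,
                'timestamp': time.time()
            }
            self.strategy_library[key].append(entry)
            self.access_count[key] = self.access_count.get(key, 0) + 1
            # Evict oldest low-frequency entries if over capacity
            if sum(len(v) for v in self.strategy_library.values()) > self.max_strategies:
                self._evict_least_useful()

        def retrieve_strategies(self, problem_sig, k=3):
            """Retrieve the top-k stored strategies for a problem signature."""
            # Exact match only; a fallback to similar signatures is not implemented here
            if problem_sig in self.strategy_library:
                strategies = self.strategy_library[problem_sig]
                # Sort by success and efficiency (assumes cost is on a 0-100 scale)
                ranked = sorted(
                    strategies,
                    key=lambda x: x['success'] * (1 - x['cost'] / 100),
                    reverse=True
                )
                return ranked[:k]
            return []

        def _evict_least_useful(self):
            # Assumed policy (not specified in the original): drop the oldest
            # entry of the least-used signature until back under capacity.
            while sum(len(v) for v in self.strategy_library.values()) > self.max_strategies:
                key = min(self.strategy_library, key=lambda s: self.access_count.get(s, 0))
                self.strategy_library[key].pop(0)
                if not self.strategy_library[key]:
                    del self.strategy_library[key]
                    self.access_count.pop(key, None)

        def add_insight(self, note):
            # Assumed helper (called in Step 4): record a general note once.
            if note not in self.general_notes:
                self.general_notes.append(note)

        def cleanup(self):
            # Assumed helper (called in Step 4): keep the first entry per distinct strategy.
            for key, entries in self.strategy_library.items():
                seen, unique = [], []
                for e in entries:
                    if e['strategy'] not in seen:
                        seen.append(e['strategy'])
                        unique.append(e)
                self.strategy_library[key] = unique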
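The retrieval code mentions falling back to similar signatures but only handles exact matches. One possible fallback is sketched below, assuming signatures are equal-length tuples such as `(domain, difficulty, type)` and that similarity is the fraction of matching fields. Both functions and the `min_similarity` threshold are hypothetical, not part of the original skill.

```python
def signature_similarity(a, b):
    """Fraction of positions where two equal-length signature tuples agree."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def retrieve_similar(library, problem_sig, k=3, min_similarity=0.5):
    """Exact match first; otherwise pool entries from sufficiently similar keys."""
    if problem_sig in library:
        scored = [(1.0, entry) for entry in library[problem_sig]]
    else:
        scored = []
        for key, entries in library.items():
            sim = signature_similarity(key, problem_sig)
            if sim >= min_similarity:
                scored.extend((sim, entry) for entry in entries)
    # Prefer closer signatures, then higher recorded accuracy.
    scored.sort(key=lambda pair: (pair[0], pair[1]["accuracy"]), reverse=True)
    return [entry for _, entry in scored[:k]]


library = {
    ("math", "hard", "proof"): [{"strategy": "tree_search", "accuracy": 0.9}],
    ("math", "easy", "arithmetic"): [{"strategy": "direct", "accuracy": 0.95}],
    ("code", "hard", "debug"): [{"strategy": "cot_with_tools", "accuracy": 0.8}],
}

hits = retrieve_similar(library, ("math", "hard", "arithmetic"))
print([entry["strategy"] for entry in hits])
```

Here the query has no exact match, so the two signatures that share two of three fields are pooled and the `code` entry is filtered out. An embedding-based similarity would replace `signature_similarity` without changing the rest.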
Step 2: Strategy Generation. Guide module generates candidate strategies conditioned on problem and retrieved experiences.

    def generate_strategies(problem, memory, guide_llm, k=5):
        """
        Generate k candidate strategies for the given problem.
        extract_problem_signature, format_strategies, and parse_json_list
        are helpers assumed to be defined elsewhere.
        """
        # Extract problem signature (domain, complexity, type)
        problem_sig = extract_problem_signature(problem)

        # Retrieve relevant past strategies
        past_strategies = memory.retrieve_strategies(problem_sig, k=3)

        # Construct context for the guide
        guide_prompt = f"""
    Problem: {problem}
    Related successful strategies from past:
    {format_strategies(past_strategies)}
    Generate {k} diverse strategies. Each strategy specifies:
    - prompt_template: the reasoning prompt
    - temperature: sampling temperature (0.0-2.0)
    - tools: list of tools to use (search, calculate, etc.)
    - max_tokens: reasoning budget
    - control_flow: chain_of_thought, direct, tree_search, etc.
    Format each as JSON.
    """

        # Call the guide LLM and parse its response into a list of strategy dicts
        response = guide_llm(guide_prompt)
        strategies = parse_json_list(response)
        return strategies
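`parse_json_list` is not defined in the skill. A minimal sketch of what it might do follows, assuming the guide returns a single JSON array somewhere in its response text. The required-key set mirrors the fields listed in the guide prompt; the validation and temperature clamping are assumptions added for robustness, not behavior taken from the original.

```python
import json

REQUIRED_KEYS = {"prompt_template", "temperature", "tools", "max_tokens", "control_flow"}


def parse_json_list(response: str) -> list:
    """Extract a JSON array of strategy objects from raw model text.

    Keeps only dict items that contain every required key, and clamps
    temperature to the 0.0-2.0 range requested in the guide prompt.
    """
    start, end = response.find("["), response.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(response[start:end + 1])
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    strategies = []
    for item in items:
        if not (isinstance(item, dict) and REQUIRED_KEYS.issubset(item)):
            continue
        try:
            item["temperature"] = min(2.0, max(0.0, float(item["temperature"])))
        except (TypeError, ValueError):
            continue  # skip entries whose temperature is not numeric
        strategies.append(item)
    return strategies


raw = (
    'Here are the strategies:\n'
    '[{"prompt_template": "Solve: {problem}", "temperature": 3.5, "tools": [], '
    '"max_tokens": 256, "control_flow": "direct"}, {"temperature": 0.2}]\n'
    'Done.'
)
parsed = parse_json_list(raw)
print(len(parsed), parsed[0]["temperature"])
```

Returning an empty list on malformed output lets the caller decide whether to retry the guide or fall back to a default strategy.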
Step 3: Strategy Execution and Feedback. Execute generated strategies and collect results for consolidation.

    def execute_and_rank_strategies(problem, strategies, main_llm, verifier, k=5):
        """
        Execute up to k strategies one after another and rank them by efficiency.
        Returns the best solution, its strategy, and the full result record
        for the memory update. execute_strategy and estimate_cost are helpers
        assumed to be defined elsewhere.
        """
        results = []
        for strat in strategies[:k]:
            # Execute strategy
            solution = execute_strategy(
                problem,
                prompt_template=strat['prompt_template'],
                temperature=strat['temperature'],
                tools=strat['tools'],
                max_tokens=strat['max_tokens'],
                control_flow=strat['control_flow'],
                llm=main_llm
            )
            # Verify and score
            score = verifier.score(problem, solution)
            cost = estimate_cost(strat)
            results.append({
                'strategy': strat,
                'solution': solution,
                'accuracy': score,
                'cost': cost,
                'efficiency': score / (cost + 1e-8)  # accuracy per unit cost
            })

        if not results:
            raise ValueError("no strategies to execute")

        # Rank by efficiency and return the best
        best = max(results, key=lambda x: x['efficiency'])
        return best['solution'], best['strategy'], best
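Ranking purely by accuracy per unit cost can favor a cheap strategy that is mostly wrong over a costlier one that is right. The sketch below shows one possible variation, not the original method: apply an accuracy floor first, then rank the survivors by efficiency. The function `pick_best` is hypothetical, and its default floor of 0.7 is borrowed from the success threshold used in Step 4.

```python
def pick_best(results, min_accuracy=0.7):
    """Choose the most efficient result among those meeting an accuracy floor.

    Falls back to the most accurate result if none reaches the floor.
    """
    if not results:
        raise ValueError("results must not be empty")
    qualified = [r for r in results if r["accuracy"] >= min_accuracy]
    if qualified:
        return max(qualified, key=lambda r: r["accuracy"] / (r["cost"] + 1e-8))
    return max(results, key=lambda r: r["accuracy"])


results = [
    {"name": "direct", "accuracy": 0.3, "cost": 1.0},
    {"name": "chain_of_thought", "accuracy": 0.9, "cost": 10.0},
    {"name": "tree_search", "accuracy": 0.95, "cost": 40.0},
]

# Pure efficiency ranking, as in the step above
by_ratio = max(results, key=lambda r: r["accuracy"] / (r["cost"] + 1e-8))
# Efficiency ranking restricted to results above the accuracy floor
by_floor = pick_best(results)
print(by_ratio["name"])
print(by_floor["name"])
```

With these toy numbers the pure ratio selects `direct` (0.3 per unit cost) while the floored version selects `chain_of_thought`, the cheapest strategy that is accurate enough.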
Step 4: Memory Consolidation. Update strategy library and extract general insights.

    def consolidate_memory(memory, execution_results, problem_sig):
        """
        Update memory: store the best strategy with its outcome, extract patterns.
        execution_results is the result record returned for the best strategy
        in Step 3 (keys: strategy, solution, accuracy, cost, efficiency).
        """
        best_strat = execution_results['strategy']
        accuracy = execution_results['accuracy']
        cost = execution_results['cost']

        # Add to library; the success flag records whether accuracy cleared 0.7
        memory.add_strategy(
            problem_sig,
            best_strat,
            success=(accuracy > 0.7),
            cost=cost,
            accuracy=accuracy
        )

        # Extract and update general notes if a pattern is found,
        # e.g., "complex math problems benefit from max_tokens >= 500"
        if accuracy > 0.9 and cost < 50:
            pattern = f"Low-cost high-accuracy solution for {problem_sig}: {best_strat}"
            memory.add_insight(pattern)

        # Cleanup: remove redundant entries
        memory.cleanup()
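The four steps can be wired into one loop. The sketch below is a toy version that runs without any model: memory is a plain dict, and `propose`, `run`, and `score` are hypothetical stand-ins for the guide, the executor, and the verifier. It keeps the efficiency ranking from Step 3 and the 0.7 success threshold from Step 4, but unlike Step 4 it stores only successful results, so that the toy `propose` can safely reuse the last entry.

```python
def solve_with_experience(problem, sig, memory, propose, run, score):
    """One pass of retrieve -> generate -> execute -> consolidate.

    memory:  dict mapping signature -> list of stored result records
    propose: callable(problem, past_entries) -> list of strategy dicts
    run:     callable(problem, strategy) -> (solution, cost)
    score:   callable(problem, solution) -> accuracy in [0, 1]
    """
    past = memory.get(sig, [])
    best = None
    for strategy in propose(problem, past):
        solution, cost = run(problem, strategy)
        accuracy = score(problem, solution)
        efficiency = accuracy / (cost + 1e-8)
        if best is None or efficiency > best["efficiency"]:
            best = {"strategy": strategy, "solution": solution,
                    "accuracy": accuracy, "cost": cost, "efficiency": efficiency}
    if best is not None and best["accuracy"] > 0.7:  # store successes only
        memory.setdefault(sig, []).append(best)
    return best


runs = []  # records every executed strategy so reuse is visible


def propose(problem, past):
    if past:  # reuse what worked before instead of exploring again
        return [past[-1]["strategy"]]
    return [{"control_flow": "direct"}, {"control_flow": "chain_of_thought"}]


def run(problem, strategy):
    runs.append(strategy["control_flow"])
    a, b = problem
    if strategy["control_flow"] == "direct":
        return a + b + 1, 1.0  # cheap but deliberately wrong
    return a + b, 5.0          # costlier and correct


def score(problem, solution):
    return 1.0 if solution == sum(problem) else 0.0


memory = {}
first = solve_with_experience((2, 3), "addition", memory, propose, run, score)
second = solve_with_experience((4, 5), "addition", memory, propose, run, score)
print(runs)
```

The first problem explores both candidate strategies; the second reuses the stored one and executes a single strategy, which is where the cost savings come from.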
When to Use: Complex reasoning tasks (math, coding, multi-step QA) where problem difficulty varies; inference-time budgets matter (cost-accuracy tradeoff).
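Hyperparameters (defaults used in the code above): max_strategies=10000 (library capacity); k=3 (strategies retrieved per problem); k=5 (candidate strategies generated and executed); success threshold accuracy > 0.7; insight threshold accuracy > 0.9 with cost < 50.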
Pitfalls:
When NOT to Use: Fixed-task inference where problem distribution is static and small; simpler deterministic problems not benefiting from strategy adaptation.
Integration: Works with any LLM; compatible with tool-use frameworks (RAG, APIs). Pairs well with outcome verification for feedback signal.
Reference: https://arxiv.org/abs/2511.11519