skills/skillxiv-v0.0.2-claude-opus-4.6/divide-and-conquer-reasoning/SKILL.md
Train models to decompose complex problems into subproblems via divide-and-conquer reasoning. Structured approach enables systematic solution assembly and improved long-horizon reasoning compared to end-to-end generation.
npx skillsauth add ADu2021/skillXiv divide-and-conquer-reasoning
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Large language models often attempt complex problems in a single end-to-end pass, and solution quality degrades as problem complexity grows. They lack a systematic decomposition strategy.
Structured problem solving with explicit subproblem identification and assembly enables better performance than flat, unstructured approaches.
The method trains models to explicitly: (1) recognize decomposable structure in problems, (2) identify independent subproblems, (3) solve subproblems with recursive calls, and (4) synthesize subproblem solutions into final answers.
This mirrors human problem-solving strategies and enables scaling to larger problems through compositional reasoning.
Create training examples showing divide-and-conquer reasoning.
def create_decomposition_trace(problem, solution, model):
    """Generate a divide-and-conquer reasoning trace for training."""
    # solve_subproblem and format_subsolutions are helpers the caller must supply.
    trace = {
        'original_problem': problem,
        'decomposition': None,
        'subproblems': [],
        'subsolutions': [],
        'final_assembly': None,
        'final_solution': solution
    }
    # Step 1: Identify decomposition structure
    decomposition_prompt = f"""Analyze this problem and identify how to decompose it:
Problem: {problem}
Explain:
1. Is this decomposable? Yes/No
2. What are the independent subproblems?
3. What is the structure of dependencies?"""
    decomposition = model.generate(decomposition_prompt)
    trace['decomposition'] = decomposition
    # Step 2: Extract subproblems
    subproblems = extract_subproblems(decomposition)
    trace['subproblems'] = subproblems
    # Step 3: Solve subproblems
    for subproblem in subproblems:
        subproblem_solution = solve_subproblem(subproblem, model)
        trace['subsolutions'].append(subproblem_solution)
    # Step 4: Assembly
    assembly_prompt = f"""Given these subproblem solutions, assemble the final answer:
Original problem: {problem}
Subproblems and solutions:
{format_subsolutions(subproblems, trace['subsolutions'])}
Final answer:"""
    assembly_explanation = model.generate(assembly_prompt)
    trace['final_assembly'] = assembly_explanation
    return trace
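def extract_subproblems(decomposition):
    """Parse decomposition to extract subproblems."""
    # Assumes the model lists each subproblem as a '-' or '*' bullet.
    # In practice, use regex or NLP to extract subproblem descriptions.
    subproblems = []
    for line in decomposition.split('\n'):
        stripped = line.strip()  # tolerate indented bullets
        if stripped.startswith(('-', '*')):
            # Drop the bullet marker so only the description is kept
            description = stripped.lstrip('-* ').strip()
            if description:
                subproblems.append(description)
    return subproblems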
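The trace builder above calls two helpers, `format_subsolutions` and `solve_subproblem`, that this skill never defines. A minimal sketch of what they might look like follows. Both bodies are assumptions made for illustration, as is the only requirement placed on `model`: a `generate(prompt)` method that returns a string.

```python
def format_subsolutions(subproblems, subsolutions):
    """Pair each subproblem with its solution as a numbered text block.

    Hypothetical helper: the exact layout is an assumption, not the skill's format.
    """
    blocks = []
    for index, (subproblem, subsolution) in enumerate(zip(subproblems, subsolutions), start=1):
        blocks.append(f"{index}. Subproblem: {subproblem}\n   Solution: {subsolution}")
    return "\n".join(blocks)


def solve_subproblem(subproblem, model):
    """Solve one subproblem with a single direct model call (hypothetical helper)."""
    prompt = f"Solve this subproblem directly:\n\n{subproblem}"
    return model.generate(prompt)
```

Numbering the pairs keeps each solution visibly attached to its subproblem in the assembly prompt; any layout that preserves that pairing should work equally well.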
Train model to recognize and execute decomposition.
import torch

def train_decomposition_policy(model, decomposition_traces, num_epochs=3):
    """Train model to decompose problems effectively."""
    # compute_similarity_loss, compute_classification_loss, and compute_assembly_loss
    # are assumed helpers; each must return a differentiable scalar tensor.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for epoch in range(num_epochs):
        for trace in decomposition_traces:
            optimizer.zero_grad()
            # Supervised learning on decomposition step
            problem = trace['original_problem']
            target_decomposition = trace['decomposition']
            prompt = f"Decompose this problem:\n\n{problem}\n\nDecomposition:"
            # Generate and compute loss.
            # Note: a loss over decoded text carries no gradient; in practice this is
            # usually a token-level cross-entropy on the target under teacher forcing.
            generated = model.generate(prompt)
            decomposition_loss = compute_similarity_loss(generated, target_decomposition)
            # Supervised learning on subproblem identification
            target_subproblems = trace['subproblems']
            subproblem_loss = compute_classification_loss(model, problem, target_subproblems)
            # Supervised learning on solution assembly
            subsolution_inputs = format_subsolutions(trace['subproblems'], trace['subsolutions'])
            target_assembly = trace['final_assembly']
            assembly_loss = compute_assembly_loss(model, subsolution_inputs, target_assembly)
            # Weighted combination
            total_loss = 0.4 * decomposition_loss + 0.3 * subproblem_loss + 0.3 * assembly_loss
            total_loss.backward()
            optimizer.step()
    return model
Execute divide-and-conquer reasoning at inference time.
def divide_and_conquer_inference(problem, model, max_depth=3, base_case_threshold=100):
    """Solve problem via divide-and-conquer reasoning."""
    def solve_recursive(current_problem, depth):
        # Base case: problem small enough to solve directly.
        # Note: len() counts characters here, not tokens.
        if len(current_problem) < base_case_threshold or depth >= max_depth:
            prompt = f"Solve this problem directly:\n\n{current_problem}"
            return model.generate(prompt)
        # Recursive case: decompose
        decomposition_prompt = f"Decompose into subproblems:\n\n{current_problem}"
        decomposition = model.generate(decomposition_prompt)
        subproblems = extract_subproblems(decomposition)
        # Fall back to a direct solve if no subproblems could be parsed
        if not subproblems:
            return model.generate(f"Solve this problem directly:\n\n{current_problem}")
        # Solve subproblems recursively
        subsolutions = [solve_recursive(subproblem, depth + 1) for subproblem in subproblems]
        # Assemble solutions
        assembly_prompt = f"""Combine these solutions:
Problem: {current_problem}
Subproblems and solutions:
{format_subsolutions(subproblems, subsolutions)}
Final answer:"""
        return model.generate(assembly_prompt)
    return solve_recursive(problem, depth=0)
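The base case above compares `len(current_problem)`, a character count, against `base_case_threshold`, while the parameter is described in tokens. One way to make the check token-based is sketched below. `is_base_case` and the whitespace-splitting `count_whitespace_tokens` default are hypothetical names; a real setup would presumably pass in the model's own tokenizer through `count_tokens`.

```python
def count_whitespace_tokens(text):
    """Crude token estimate: number of whitespace-separated pieces (an assumption)."""
    return len(text.split())


def is_base_case(problem, depth, max_depth=3, base_case_threshold=100,
                 count_tokens=count_whitespace_tokens):
    """Return True when the problem should be solved directly instead of decomposed."""
    if depth >= max_depth:
        return True  # recursion budget exhausted
    return count_tokens(problem) < base_case_threshold
```

Keeping the counter injectable means the threshold can stay in whatever unit the chosen tokenizer reports, without touching the recursion itself.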
Use RL to improve decomposition strategy.
import random

import torch

def reinforce_decomposition(model, problems, reward_function, num_rl_steps=1000):
    """Improve decomposition via reinforcement learning."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
    for step in range(num_rl_steps):
        problem = random.choice(problems)
        # Generate decomposition; generate_with_logprobs is assumed to return a dict
        # with 'text' and a differentiable 'log_probs' tensor
        decomposition_prompt = f"Decompose:\n\n{problem}"
        decomposition = model.generate_with_logprobs(decomposition_prompt)
        # Extract subproblems (not used further in this sketch)
        subproblems = extract_subproblems(decomposition['text'])
        # Solve and get final answer.
        # Note: divide_and_conquer_inference samples its own decomposition, so the
        # reward is only a proxy for the decomposition scored here.
        final_answer = divide_and_conquer_inference(problem, model)
        # Evaluate solution quality
        reward = reward_function(problem, final_answer)
        # Policy gradient (REINFORCE)
        log_probs = decomposition['log_probs']
        policy_loss = -reward * log_probs.sum()
        optimizer.zero_grad()
        policy_loss.backward()
        optimizer.step()
    return model
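`reward_function` is left unspecified above; it only needs to map `(problem, final_answer)` to a number. For tasks with known reference answers, one simple option is an exact-match reward. The factory below is a hypothetical sketch: `make_exact_match_reward`, the `reference_answers` dictionary, and the normalization rule are all assumptions, not part of the skill.

```python
def normalize_answer(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def make_exact_match_reward(reference_answers):
    """Build a reward_function(problem, final_answer) from a problem -> answer mapping."""
    def reward_function(problem, final_answer):
        reference = reference_answers.get(problem)
        if reference is None:
            return 0.0  # no reference available, so give no reward
        matches = normalize_answer(final_answer) == normalize_answer(reference)
        return 1.0 if matches else 0.0
    return reward_function


# Example: a one-problem reference set
reward = make_exact_match_reward({"What is 2 + 3?": "5"})
```

With a 0/1 reward, the policy loss above is exactly zero on failed samples, so those steps contribute no gradient; a graded reward or a subtracted baseline would be needed to learn from failures.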
| Parameter | Value | Notes |
|-----------|-------|-------|
| Max recursion depth | 3-5 | Prevent infinite recursion |
| Base case threshold | 100-200 tokens | When to solve directly |
| Decomposition weight | 0.4 | Importance in training loss |
| RL reward discount | 0.99 | Multi-step credit assignment |
| Learning rate (SFT) | 1e-4 | Standard fine-tuning |
| Learning rate (RL) | 1e-5 | Conservative RL tuning |
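The maximum recursion depth also bounds inference cost. As a rough worked estimate, assume every decomposition yields exactly `branching` subproblems and recursion always runs to `max_depth` (both simplifications; the length-based base case will usually stop earlier). Each internal node then costs two model calls, one to decompose and one to assemble, and each leaf costs one. `count_model_calls` is a hypothetical helper written only for this estimate.

```python
def count_model_calls(branching, max_depth):
    """Count model.generate calls for a full decomposition tree.

    Assumes every non-leaf problem splits into `branching` subproblems and
    that recursion stops only when depth reaches `max_depth`.
    """
    leaves = branching ** max_depth
    internal = sum(branching ** level for level in range(max_depth))
    return 2 * internal + leaves


for depth in (3, 4, 5):
    print(depth, count_model_calls(branching=3, max_depth=depth))
# prints:
# 3 53
# 4 161
# 5 485
```

Under these assumptions, moving from depth 3 to depth 5 with three subproblems per split raises the worst case from 53 to 485 calls, which is one reason to keep the depth limit at the low end of the range unless problems clearly need more.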
Training LLMs for Divide-and-Conquer Reasoning https://arxiv.org/abs/2602.02477