skills/skillxiv-v0.0.2-claude-opus-4.6/cove-tool-use-training/SKILL.md
CoVe synthesizes high-quality tool-use training data using explicit task constraints as both generation guidance and verification validators, enabling effective agent training without manual curation.
npx skillsauth add ADu2021/skillXiv cove-tool-use-trainingInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Training agents to use tools (APIs, functions, domain-specific commands) is notoriously difficult. The challenge: creating diverse, realistic interaction trajectories where agents navigate complex, ambiguous user requests through deterministic actions is expensive and error-prone. Manual curation scales poorly, and unconstrained data synthesis produces trajectories that violate domain logic or are unrealistic.
CoVe solves this by embedding explicit task constraints (business rules, domain requirements) directly into the data synthesis process. Constraints serve dual purposes: (1) guiding generation of sophisticated, realistic trajectories, and (2) providing deterministic verification that outputs are correct. This eliminates manual annotation while ensuring data quality.
The core insight: constraints are semantic specifications that can guide both generation and validation. Rather than generating trajectories unconstrained and hoping they're correct, define constraints that the agent must satisfy (e.g., "booking a flight must include selecting date, passengers, and payment"). Use these constraints to:
This creates a virtuous cycle: better constraints yield better data, which trains better agents.
CoVe involves defining constraints, generating trajectories, and training agents. Here's how to implement it:
Define explicit constraints that specify valid task trajectories. Constraints encode domain logic:
from dataclasses import dataclass
from typing import List, Dict, Any
@dataclass
class Constraint:
"""Base class for task constraints."""
name: str
description: str
def verify(self, trajectory: List[Dict]) -> bool:
"""Check if trajectory satisfies this constraint."""
raise NotImplementedError
class AirlineBookingConstraints:
"""Constraints for airline booking tasks."""
class HasFlight(Constraint):
def __init__(self):
super().__init__(
"has_flight",
"Trajectory must include flight selection action"
)
def verify(self, trajectory: List[Dict]) -> bool:
actions = [step.get('action') for step in trajectory]
return 'select_flight' in actions or 'confirm_flight' in actions
class HasPassengers(Constraint):
def __init__(self):
super().__init__(
"has_passengers",
"Must specify passenger details"
)
def verify(self, trajectory: List[Dict]) -> bool:
for step in trajectory:
if step.get('action') == 'enter_passenger_info':
return 'passengers' in step and len(step['passengers']) > 0
return False
class ValidPayment(Constraint):
def __init__(self):
super().__init__(
"valid_payment",
"Payment must be processed"
)
def verify(self, trajectory: List[Dict]) -> bool:
for step in trajectory:
if step.get('action') == 'process_payment':
payment = step.get('payment_method')
return payment in ['credit_card', 'debit_card', 'wallet']
return False
def __init__(self):
self.constraints = [
self.HasFlight(),
self.HasPassengers(),
self.ValidPayment(),
]
def verify_trajectory(self, trajectory: List[Dict]) -> bool:
"""Check if trajectory satisfies all constraints."""
return all(c.verify(trajectory) for c in self.constraints)
def get_constraint_prompt(self) -> str:
"""Generate prompt guidance for trajectory generation."""
prompt = "Generate a valid airline booking trajectory that:\n"
for i, constraint in enumerate(self.constraints, 1):
prompt += f"{i}. {constraint.description}\n"
return prompt
Generate trajectories using an LLM, guided by constraints:
def generate_tool_trajectories(
model,
constraints,
num_trajectories=100,
temperature=0.8,
):
"""
Generate synthetic tool-use trajectories respecting constraints.
"""
constraint_guidance = constraints.get_constraint_prompt()
prompt_template = f"""
{constraint_guidance}
Generate a realistic user request and corresponding agent interaction trajectory.
The trajectory should show the agent using tools to complete the task.
Format:
USER_REQUEST: [User's initial request]
TRAJECTORY:
[Step 1]: action=..., parameters={{...}}
[Step 2]: action=..., parameters={{...}}
...
CONFIRMATION: [Final booking/result]
"""
trajectories = []
verified_count = 0
for _ in range(num_trajectories):
# Generate trajectory
generation = model.generate(
prompt_template,
max_length=500,
temperature=temperature,
num_return_sequences=1,
)[0]
# Parse trajectory
parsed = parse_trajectory_output(generation)
# Verify constraints
if constraints.verify_trajectory(parsed['steps']):
trajectories.append(parsed)
verified_count += 1
print(f"Generated {verified_count}/{num_trajectories} valid trajectories")
return trajectories
def parse_trajectory_output(text: str) -> Dict:
"""Parse model output into structured trajectory."""
lines = text.split('\n')
trajectory = {
'request': '',
'steps': [],
'confirmation': ''
}
current_section = None
for line in lines:
if 'USER_REQUEST:' in line:
trajectory['request'] = line.split('USER_REQUEST:')[1].strip()
current_section = 'request'
elif 'TRAJECTORY:' in line:
current_section = 'steps'
elif 'CONFIRMATION:' in line:
trajectory['confirmation'] = line.split('CONFIRMATION:')[1].strip()
current_section = 'confirmation'
elif current_section == 'steps' and line.strip():
# Parse step: [Step N]: action=..., parameters={...}
step_data = parse_step_line(line)
trajectory['steps'].append(step_data)
return trajectory
def parse_step_line(line: str) -> Dict:
"""Parse individual trajectory step."""
import re
# Example: [Step 1]: action=select_flight, parameters={'flight_id': 'AA123'}
match = re.search(r'action=(\w+),\s*parameters=(\{.*\})', line)
if match:
return {
'action': match.group(1),
'parameters': eval(match.group(2)) # In practice, use safer parsing
}
return {}
Create training data from verified trajectories:
def create_training_data(trajectories, constraints):
"""
Convert verified trajectories into supervised fine-tuning examples.
"""
training_examples = []
for trajectory in trajectories:
# Verify one more time before adding to training
if not constraints.verify_trajectory(trajectory['steps']):
continue
# Create prompt-response pairs
user_request = trajectory['request']
# Multi-turn conversation: user request + agent actions
conversation = [
{'role': 'user', 'content': user_request}
]
for step in trajectory['steps']:
action_str = f"Action: {step['action']}\n"
params_str = f"Parameters: {step['parameters']}"
conversation.append({
'role': 'assistant',
'content': f"{action_str}{params_str}"
})
training_examples.append({
'conversation': conversation,
'trajectory': trajectory,
'verified': True
})
return training_examples
# Usage
constraints = AirlineBookingConstraints()
trajectories = generate_tool_trajectories(
model,
constraints,
num_trajectories=1000
)
training_data = create_training_data(trajectories, constraints)
Train agent on verified trajectories:
def train_tool_use_agent(
model,
training_data,
num_epochs=3,
learning_rate=1e-5,
):
"""
Fine-tune model on constraint-verified tool-use data.
"""
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
for epoch in range(num_epochs):
total_loss = 0.0
for example in training_data:
# Format conversation for training
conversation_text = format_conversation(example['conversation'])
# Standard language modeling loss
outputs = model(conversation_text)
loss = outputs.loss
# Backward pass
optimizer.zero_grad()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
optimizer.step()
total_loss += loss.item()
avg_loss = total_loss / len(training_data)
print(f"Epoch {epoch+1}: Loss = {avg_loss:.4f}")
return model
def format_conversation(conversation):
"""Format multi-turn conversation for LLM fine-tuning."""
formatted = ""
for turn in conversation:
role = turn['role'].upper()
content = turn['content']
formatted += f"{role}: {content}\n"
return formatted
When to Use:
When NOT to Use:
Constraint Design:
Generation and Filtering:
Training:
Results:
Reference: CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification
testing
Uses flow maps as look-ahead operators to enable principled reward-guided diffusion by predicting trajectory endpoints at any denoising step. Deploy when applying rewards or preferences to diffusion trajectories with meaningful gradients throughout generation.
testing
Train language models where each expert learns independently on closed datasets, enabling flexible inference with selective data inclusion or exclusion. 41% performance improvement while allowing users to opt out of specific data sources without retraining.
data-ai
Understand how token generation flexibility in diffusion LMs paradoxically constrains reasoning, as models exploit ordering flexibility to avoid uncertain tokens, and apply simplified approaches that preserve parallel decoding benefits. Use when optimizing diffusion-based language models for reasoning tasks.
devops
Enable LLM agents to improve continuously during deployment by constructing structured experience libraries through self-reflection on successes and failures—achieving 23% improvement on reasoning without gradient-based parameter updates or external training.