skills/skillxiv-v0.0.2-claude-opus-4.6/docdancer-document-agent/SKILL.md
Build open-source agents for document question-answering by modeling DocQA as information-seeking with explicit tool utilization. DocDancer uses an exploration-then-synthesis pipeline to generate high-quality training data, addressing the scarcity that limits agent-based document understanding systems.
npx skillsauth add ADu2021/skillXiv docdancer-document-agent
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Existing document question-answering (DocQA) agents suffer from two critical limitations: (1) they lack effective tool utilization, relying on implicit understanding instead of explicit document exploration, and (2) they depend heavily on closed-source models, limiting accessibility and adaptability. The fundamental barrier is scarcity of high-quality training data for DocQA agents—annotation is expensive and difficult at scale.
Model DocQA as information-seeking with explicit tool integration, then generate synthetic training data through an exploration-then-synthesis pipeline.
# NOTE: parse_tool_action and format_exploration_steps are helper functions that
# the source references but does not define; supply them before running this class.

class DocDancerAgent:
    def __init__(self, base_llm, document, max_exploration_steps=8):
        self.llm = base_llm  # any object exposing generate(prompt) -> str
        self.document = document
        # The source uses max_exploration_steps without defining it;
        # the default of 8 is an assumption, not a value from the source.
        self.max_exploration_steps = max_exploration_steps
        self.interaction_history = []

    def answer_document_question(self, question):
        """Tool-driven exploration followed by answer synthesis"""
        # Phase 1: Exploration
        exploration_steps = self.explore_document(question)
        # exploration_steps = [
        #     {"action": "highlight", "text": "...", "rationale": "..."},
        #     {"action": "extract", "content": "...", "rationale": "..."},
        #     {"action": "reason", "inference": "...", "rationale": "..."}
        # ]

        # Phase 2: Synthesis
        answer = self.synthesize_answer(question, exploration_steps)

        self.interaction_history.append({
            "question": question,
            "exploration": exploration_steps,
            "answer": answer
        })
        return answer

    def explore_document(self, question):
        """Sequential tool invocation for information gathering"""
        steps = []
        # Only the first 2000 characters of the document are passed to the tools
        context = f"Question: {question}\nDocument: {self.document[:2000]}..."

        for _ in range(self.max_exploration_steps):
            # Decide which tool to use next
            tool_decision = self.llm.generate(f"""
            Current exploration state:
            {format_exploration_steps(steps)}
            Question: {question}
            What's the next exploration action?
            Options:
            - highlight: Mark important text regions
            - extract: Pull out specific information
            - reason: Make inference from gathered info
            - stop: Sufficient information gathered
            """)
            action = parse_tool_action(tool_decision)

            if action == "stop":
                break

            # Execute chosen tool
            if action == "highlight":
                highlighted_text = self.identify_relevant_sections(question, context)
                steps.append({
                    "action": "highlight",
                    "text": highlighted_text,
                    "rationale": tool_decision
                })
            elif action == "extract":
                extracted_content = self.extract_key_information(question, context)
                steps.append({
                    "action": "extract",
                    "content": extracted_content,
                    "rationale": tool_decision
                })
            elif action == "reason":
                inference = self.llm.generate(f"""
                Based on gathered evidence:
                {format_exploration_steps(steps)}
                Make an inference relevant to: {question}
                """)
                steps.append({
                    "action": "reason",
                    "inference": inference,
                    "rationale": tool_decision
                })
        return steps

    def identify_relevant_sections(self, question, context):
        """Highlight tool. Not shown in the source; implement for your documents."""
        raise NotImplementedError

    def extract_key_information(self, question, context):
        """Extract tool. Not shown in the source; implement for your documents."""
        raise NotImplementedError

    def synthesize_answer(self, question, exploration_steps):
        """Combine exploration traces into final answer"""
        synthesis_prompt = f"""
        Question: {question}
        Exploration process:
        {format_exploration_steps(exploration_steps)}
        Based on this exploration, provide the final answer.
        """
        return self.llm.generate(synthesis_prompt)
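
The agent code above relies on `parse_tool_action` and `format_exploration_steps`, neither of which is defined in the source. A minimal sketch of both follows, assuming the model names its chosen action as a plain word somewhere in its reply and that each step dict uses the keys shown in `answer_document_question`. The `VALID_ACTIONS` and `PAYLOAD_KEYS` names and both function bodies are illustrative guesses, not the released DocDancer implementation.

```python
import re

# Hypothetical helpers: the source references these names but does not define them.
VALID_ACTIONS = ("highlight", "extract", "reason", "stop")

# Which key holds the payload for each action, following the step dicts in the source.
PAYLOAD_KEYS = {"highlight": "text", "extract": "content", "reason": "inference"}


def parse_tool_action(tool_decision: str) -> str:
    """Return the first valid action named in the model's reply, else "stop"."""
    # Assumption: the action appears as a whole word. The first match wins, so a
    # reply such as "the reason is ..." would be read as the "reason" action.
    for word in re.findall(r"[a-z]+", tool_decision.lower()):
        if word in VALID_ACTIONS:
            return word
    # Unparseable replies end exploration instead of burning further turns.
    return "stop"


def format_exploration_steps(steps: list[dict]) -> str:
    """Render exploration steps as a numbered summary, one line per step."""
    if not steps:
        return "(no exploration steps yet)"
    lines = []
    for i, step in enumerate(steps, start=1):
        key = PAYLOAD_KEYS.get(step["action"], "")
        lines.append(f"{i}. [{step['action']}] {step.get(key, '')}")
    return "\n".join(lines)
```

Word matching is fragile against chatty replies. Asking the model to answer with a single action word, or with a small JSON object, would likely make parsing more reliable.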
Tool-Driven Architecture:
Exploration-Then-Synthesis Pipeline: Generates high-quality synthetic training data:
# NOTE: generate_question_from_answer, simulate_exploration, and
# select_best_trajectory are referenced by the source but not defined in it.

def generate_synthetic_training_data(document, gold_answer, num_samples=100):
    """Generate diverse question-exploration-answer triplets"""
    synthetic_data = []
    for _ in range(num_samples):
        # Generate question variants that require document exploration
        question = generate_question_from_answer(gold_answer, document)

        # Simulate diverse exploration strategies
        exploration_trajectories = []
        for strategy in ["sequential", "selective", "comprehensive"]:
            trajectory = simulate_exploration(
                question, document, gold_answer, strategy
            )
            exploration_trajectories.append(trajectory)

        # Create training examples from best exploration
        best_trajectory = select_best_trajectory(
            exploration_trajectories, gold_answer
        )
        synthetic_data.append({
            "question": question,
            "document": document,
            "exploration": best_trajectory,
            "answer": gold_answer
        })
    return synthetic_data
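
The source does not say how `select_best_trajectory` ranks candidates. One plausible sketch, offered purely as an assumption, scores each trajectory by the fraction of gold-answer tokens that appear in the evidence it gathered and breaks ties in favor of shorter trajectories. The helpers `trajectory_evidence` and `score_trajectory` are hypothetical names introduced here for illustration.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def trajectory_evidence(trajectory: list[dict]) -> str:
    """Concatenate the payload of every step (text, content, or inference)."""
    parts = []
    for step in trajectory:
        for key in ("text", "content", "inference"):
            if key in step:
                parts.append(str(step[key]))
    return " ".join(parts)


def score_trajectory(trajectory: list[dict], gold_answer: str) -> float:
    """Fraction of gold-answer tokens covered by the trajectory's evidence."""
    gold = _tokens(gold_answer)
    if not gold:
        return 0.0
    covered = gold & _tokens(trajectory_evidence(trajectory))
    return len(covered) / len(gold)


def select_best_trajectory(trajectories: list[list[dict]], gold_answer: str) -> list[dict]:
    """Pick the trajectory with the highest coverage; prefer fewer steps on ties."""
    if not trajectories:
        raise ValueError("at least one trajectory is required")
    return max(
        trajectories,
        key=lambda t: (score_trajectory(t, gold_answer), -len(t)),
    )
```

Token overlap is a crude proxy for evidence quality. A learned verifier or an LLM judge could stand in for `score_trajectory` without changing the selection logic.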
Data Scarcity Problem:
Solution: Synthetic Generation
Benchmarks:
Comparison:
Full codebase released with: