skills/dspy-react/SKILL.md
Use when the task requires calling external tools or APIs to gather information — multi-step tool use with reasoning, like searching databases, calling APIs, or combining multiple data sources. Common scenarios - building agents that search the web and synthesize results, multi-step information gathering from APIs, chatbots that look up data before answering, question answering that requires external knowledge, or any task needing interleaved reasoning and action. Related - ai-taking-actions, ai-searching-docs, dspy-codeact, dspy-tools. Also used for dspy.ReAct, ReAct agent pattern, reasoning and acting loop, tool-using agent in DSPy, search then answer pattern, agent with tools, multi-step tool use, interleave thinking and acting, API-calling agent, agent that reasons about tool outputs, when to use ReAct vs CodeAct, build intelligent agent with DSPy.
Guide the user through building agents that reason step-by-step and call tools to accomplish tasks. dspy.ReAct implements the Reasoning-Action-Observation loop -- the agent thinks about what to do, calls a tool, observes the result, and repeats until it has an answer.
dspy.ReAct implements the Reasoning-Action-Observation loop as an optimizable module. The agent reasons about what to do, calls a tool, observes the result, and repeats until it has enough information to answer. DSPy handles the loop mechanics and prompt construction.
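The loop described above can be sketched in plain Python. This is a toy illustration, not DSPy's implementation: `scripted_policy` is a hypothetical stand-in for the language model, `run_loop` is an invented helper, and returning the last observation when the budget runs out is a simplification chosen for this sketch.

```python
from typing import Callable


def search(query: str) -> str:
    """Toy search tool backed by a hard-coded dictionary."""
    facts = {"capital of France": "Paris"}
    return facts.get(query, "no results")


def scripted_policy(question: str, trajectory: list) -> dict:
    """Stand-in for the LM: pick the next step from what has been observed so far."""
    if not trajectory:
        return {"thought": "I should look this up.", "tool": "search", "args": {"query": question}}
    last_observation = trajectory[-1]["observation"]
    return {"thought": "I have what I need.", "tool": "finish", "args": {"answer": last_observation}}


def run_loop(question: str, tools: dict, policy: Callable, max_iters: int = 5):
    """Run Thought-Action-Observation cycles until the policy finishes or the budget runs out."""
    trajectory = []
    for _ in range(max_iters):
        step = policy(question, trajectory)          # Thought + chosen Action
        if step["tool"] == "finish":
            return step["args"]["answer"], trajectory
        observation = tools[step["tool"]](**step["args"])  # call the tool by name
        trajectory.append({**step, "observation": observation})
    # Budget exhausted: fall back to the most recent observation, if any
    fallback = trajectory[-1]["observation"] if trajectory else ""
    return fallback, trajectory


answer, trace = run_loop("capital of France", {"search": search}, scripted_policy)
print(answer)
print(len(trace), "tool call(s)")
```

In dspy.ReAct the policy role is played by the language model, and DSPy builds the prompts that present the tools and the trajectory to it.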
| Use ReAct when... | Use something else when... |
|--------------------|----------------------------|
| The agent needs to call external tools (search, APIs, databases) | You just need input -> output with no tools (dspy.ChainOfThought) |
| Multi-step reasoning with real-world data | The task is purely computational / code-heavy (dspy.CodeAct) |
| You want the agent to decide which tools to call and in what order | You have a fixed pipeline of steps (dspy.Module with sub-modules) |
| You need an interpretable trace of reasoning + actions | You need agents coordinating with each other (see /ai-coordinating-agents) |
Tools are Python functions with type hints and docstrings. DSPy uses the function signature and docstring to tell the agent what each tool does and how to call it.
```python
def search(query: str) -> str:
    """Search the web for information about a topic."""
    # Your search implementation here
    return "search results..."


def calculate(expression: str) -> float:
    """Evaluate a mathematical expression and return the result."""
    return eval(expression)  # use a safe evaluator in production


def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # Your weather API call here
    return f"72°F and sunny in {city}"
```
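The calculate tool above uses eval, which will run arbitrary code. One possible safer evaluator, sketched here under the assumption that only basic arithmetic is needed, walks the parsed syntax tree and allows only numbers and a small set of operators. The name `safe_calculate` and the operator whitelist are illustrative choices, not part of DSPy.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def _eval_node(node: ast.AST) -> float:
    """Recursively evaluate a node, allowing only numbers and whitelisted operators."""
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def safe_calculate(expression: str) -> float:
    """Evaluate a basic arithmetic expression and return the result."""
    tree = ast.parse(expression, mode="eval")
    return float(_eval_node(tree.body))


print(safe_calculate("2 + 3 * 4"))
```

Names, function calls, and attribute access all raise ValueError, and malformed input raises SyntaxError from ast.parse, so a tool built on this would still want to turn those exceptions into error strings for the agent.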
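Tool requirements:
- Type hints on every parameter, so the agent knows what arguments to pass.
- A docstring that says what the tool does.
- Keep tools focused on one thing. A search tool should search, not search-and-summarize.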
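To see why type hints and docstrings matter, here is a rough sketch of how a function's metadata could be collected into a tool description using only the standard library. `describe_tool` is a hypothetical helper written for this illustration; DSPy's own wrapping logic may differ.

```python
import inspect
from typing import get_type_hints


def describe_tool(func) -> dict:
    """Collect the name, docstring, and typed arguments an agent prompt could be built from."""
    hints = get_type_hints(func)
    returns = hints.pop("return", None)
    args = {}
    for name in inspect.signature(func).parameters:
        annotation = hints.get(name)
        # Parameters without a type hint are reported as "unknown"
        args[name] = getattr(annotation, "__name__", "unknown")
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "args": args,
        "returns": getattr(returns, "__name__", "unknown"),
    }


def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"sunny in {city}"


info = describe_tool(get_weather)
print(info)
```

A tool with no docstring or no annotations yields an empty description and "unknown" argument types, which leaves the agent guessing about when and how to call it.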
```python
import dspy

lm = dspy.LM("openai/gpt-4o-mini")  # or "anthropic/claude-sonnet-4-5-20250929", etc.
dspy.configure(lm=lm)


def search(query: str) -> str:
    """Search for information about a topic."""
    return "DSPy is a framework for programming language models."


agent = dspy.ReAct("question -> answer", tools=[search])
result = agent(question="What is DSPy?")
print(result.answer)
```
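That's it. The agent will reason about what information it needs, call search, observe the result, and repeat until it has enough information to answer.

Constructor signature:

```python
dspy.ReAct(
    signature,     # str | Signature -- required, defines inputs/outputs
    tools,         # list[Callable | dspy.Tool] -- required, available tools
    max_iters=20,  # int -- max reasoning-action cycles
)
```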
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| signature | str \| type[Signature] | required | Input/output contract (inline or class-based) |
| tools | list[Callable \| dspy.Tool] | required | Functions the agent can call. DSPy wraps plain functions automatically |
| max_iters | int | 20 | Max Thought-Action-Observation cycles before forcing an answer |
max_iters controls how many Thought-Action-Observation cycles the agent can take before it must produce an answer:
```python
# Simple lookup -- 1-2 tool calls usually enough
agent = dspy.ReAct("question -> answer", tools=[search], max_iters=3)

# Complex research -- may need many tool calls
agent = dspy.ReAct("question -> answer", tools=[search, lookup], max_iters=8)
```
Guidelines:
- If the agent reaches max_iters without finishing, it returns its best answer so far.

Give the agent multiple tools and it decides which to use and when:
```python
import dspy


def search(query: str) -> str:
    """Search the web for general information."""
    return "search results..."


def lookup_user(email: str) -> str:
    """Look up a user account by email address."""
    return '{"name": "Alice", "plan": "pro", "status": "active"}'


def check_order(order_id: str) -> str:
    """Check the status of an order by its ID."""
    return '{"order_id": "12345", "status": "shipped", "eta": "March 20"}'


agent = dspy.ReAct(
    "question -> answer",
    tools=[search, lookup_user, check_order],
    max_iters=5,
)

# The agent picks the right tool based on the question
result = agent(question="What's the status of order 12345?")
print(result.answer)  # Uses check_order

result = agent(question="What plan is alice@example.com on?")  # placeholder address
print(result.answer)  # Uses lookup_user
```
The agent can also chain tools -- call lookup_user first, then use the result to call check_order.
For production use, wrap dspy.ReAct inside a dspy.Module to add pre-processing, context, or post-processing:
```python
# Assumes dspy is imported and search, lookup_user, check_order are defined as above
class SupportAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        self.agent = dspy.ReAct(
            "question, context -> answer",
            tools=[search, lookup_user, check_order],
            max_iters=6,
        )

    def forward(self, question):
        # Static instructions passed to the agent as an extra input field
        context = (
            "You are a customer support agent. "
            "Use lookup_user for account questions, "
            "check_order for order questions, "
            "and search for general questions."
        )
        return self.agent(question=question, context=context)


def support_reward(args, pred):
    if len(pred.answer) > 20:
        return 1.0
    return 0.0  # Response too short: not detailed enough


# Retry up to N times until the reward reaches the threshold
validated_support = dspy.Refine(
    module=SupportAgent(),
    N=3,
    reward_fn=support_reward,
    threshold=1.0,
)
```
This pattern lets you add pre-processing, inject context, post-process the result, and validate answers with dspy.Refine.

For agents with typed inputs and outputs, use a class-based signature:
```python
from typing import Literal

import dspy


class ResearchTask(dspy.Signature):
    """Research a topic and provide a comprehensive answer with sources."""

    question: str = dspy.InputField(desc="The research question")
    answer: str = dspy.OutputField(desc="A thorough answer to the question")
    confidence: Literal["high", "medium", "low"] = dspy.OutputField(
        desc="Confidence level based on the sources found"
    )


agent = dspy.ReAct(ResearchTask, tools=[search], max_iters=5)
result = agent(question="What are the main features of DSPy?")
print(result.answer)
print(result.confidence)
```
Both are agent modules, but they act differently:
| | ReAct | CodeAct |
|---|-------|---------|
| How it acts | Calls tools by name with arguments | Writes and executes Python code |
| Best for | API calls, database lookups, search | Data manipulation, calculations, file I/O |
| Interpretability | Clear tool call trace | Full code trace |
| Tool style | Function calls | Python expressions |
| Use when | You have specific tools to call | The task is better solved by writing code |
```python
# ReAct -- calls tools
agent = dspy.ReAct("question -> answer", tools=[search, calculate])

# CodeAct -- writes code
agent = dspy.CodeAct("question -> answer", tools=[search, calculate])
```
If you're unsure, start with ReAct. Switch to CodeAct if the agent needs to do math, string manipulation, or data transformations between tool calls.
Tools can fail. Handle errors inside your tools so the agent gets a useful message instead of a crash:
```python
import requests


def search(query: str) -> str:
    """Search the web for information."""
    try:
        response = requests.get(
            "https://api.example.com/search",
            params={"q": query},
            timeout=5,
        )
        response.raise_for_status()
        # Convert to str so the return value matches the type hint
        return str(response.json()["results"])
    except requests.Timeout:
        return "Error: Search timed out. Try a simpler query."
    except requests.HTTPError as e:
        return f"Error: Search failed with status {e.response.status_code}."
    except Exception as e:
        return f"Error: {e}"
```
When a tool returns an error string, the agent sees it as an Observation and can decide to retry with different arguments, try a different tool, or give a partial answer.
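If several tools need the same treatment, the try/except can be factored into a decorator. The sketch below is one way to do it; `tool_errors_as_text` and `divide` are hypothetical names, and it assumes that metadata copied by functools.wraps (name, docstring, annotations) is enough for the tool to be described correctly, which is worth verifying against your DSPy version.

```python
import functools
import inspect


def tool_errors_as_text(func):
    """Wrap a tool so any exception becomes an error string the agent can read."""

    @functools.wraps(func)  # keeps the name, docstring, and annotations of the original
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            return f"Error: {type(exc).__name__}: {exc}"

    return wrapper


@tool_errors_as_text
def divide(numerator: float, denominator: float) -> str:
    """Divide one number by another."""
    return str(numerator / denominator)


print(divide(6, 3))
print(divide(1, 0))
print(inspect.signature(divide))
```

The wrapped function still reports its original signature through inspect.signature, so the type hints remain visible to anything that introspects the tool.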
For module-level error handling, wrap the agent call:
```python
class SafeAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        self.agent = dspy.ReAct("question -> answer", tools=[search], max_iters=5)
        self.fallback = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        try:
            return self.agent(question=question)
        except Exception:
            # Fall back to answering without tools
            return self.fallback(question=question)
```
ReAct agents are optimizable like any DSPy module. The optimizer tunes the reasoning prompts so the agent makes better tool-calling decisions:
```python
# Assumes `agent` is a dspy.ReAct instance and `trainset` is a list of dspy.Example
def answer_metric(example, prediction, trace=None):
    return prediction.answer.strip().lower() == example.answer.strip().lower()


# BootstrapFewShot for quick optimization
optimizer = dspy.BootstrapFewShot(metric=answer_metric, max_bootstrapped_demos=4)
optimized_agent = optimizer.compile(agent, trainset=trainset)

# MIPROv2 for better prompt optimization
optimizer = dspy.MIPROv2(metric=answer_metric, auto="medium")
optimized_agent = optimizer.compile(agent, trainset=trainset)

# Save, then load into a freshly built agent with the same signature and tools
optimized_agent.save("optimized_agent.json")
loaded_agent = dspy.ReAct("question -> answer", tools=[search])
loaded_agent.load("optimized_agent.json")
```
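Exact string comparison can be too strict for free-form agent answers. As one possible refinement, the sketch below normalizes case, punctuation, and English articles before comparing. `normalize` and `lenient_answer_metric` are illustrative names, and SimpleNamespace objects stand in for the example and prediction so the snippet runs without an LM.

```python
import re
import string
from types import SimpleNamespace


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def lenient_answer_metric(example, prediction, trace=None) -> bool:
    """Same argument order as answer_metric, but compares normalized answers."""
    return normalize(prediction.answer) == normalize(example.answer)


# Stand-ins for a dataset example and an agent prediction
example = SimpleNamespace(answer="Paris")
close_prediction = SimpleNamespace(answer="  The Paris. ")
wrong_prediction = SimpleNamespace(answer="Lyon")

print(lenient_answer_metric(example, close_prediction))
print(lenient_answer_metric(example, wrong_prediction))
```

Whether this leniency is appropriate depends on the task; for answers where punctuation or articles carry meaning, the stricter comparison is the safer choice.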
Inspect what the agent is doing:
```python
# See the last few LM calls (thoughts, tool calls, observations)
dspy.inspect_history(n=5)

# Print the module structure
print(agent)
```
inspect_history shows you the full Thought-Action-Observation trace, which is invaluable for understanding why the agent called certain tools or gave a wrong answer.
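The prediction returned by ReAct also carries a trajectory dict that can be logged directly. The helper below is a sketch for printing it step by step. It assumes the keys follow the pattern thought_N, tool_name_N, tool_args_N, and observation_N, which should be checked against your DSPy version; `format_trajectory` and the sample data are made up for illustration.

```python
def format_trajectory(trajectory: dict) -> str:
    """Render a trajectory dict as numbered Thought / Action / Observation lines."""
    lines = []
    step = 0
    # Assumed key layout: thought_0, tool_name_0, tool_args_0, observation_0, thought_1, ...
    while f"thought_{step}" in trajectory:
        thought = trajectory[f"thought_{step}"]
        tool_name = trajectory.get(f"tool_name_{step}")
        tool_args = trajectory.get(f"tool_args_{step}")
        observation = trajectory.get(f"observation_{step}")
        lines.append(f"[{step}] Thought: {thought}")
        lines.append(f"[{step}] Action: {tool_name}({tool_args})")
        lines.append(f"[{step}] Observation: {observation}")
        step += 1
    return "\n".join(lines)


# Hand-written sample; a real one would come from result.trajectory
sample = {
    "thought_0": "I should search.",
    "tool_name_0": "search",
    "tool_args_0": {"query": "DSPy"},
    "observation_0": "DSPy is a framework.",
    "thought_1": "I can answer now.",
    "tool_name_1": "finish",
    "tool_args_1": {},
    "observation_1": "Completed.",
}

report = format_trajectory(sample)
print(report)
```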
- Passing max_iters=5 when the default is 20. Claude habitually passes max_iters=5, which cuts off complex multi-step tasks too early. Only lower it when you want to constrain simple tasks (2-3 for lookups). For complex research, the default of 20 is appropriate.
- Use CodeAct for computation-heavy tasks where the agent can do work in code between tool calls.
- Ignoring trajectory in the return value. ReAct returns a dspy.Prediction with a .trajectory dict containing the full Thought-Action-Observation trace. Access result.trajectory to log or debug the agent's reasoning path. Claude often discards this and only uses result.answer.

Install any skill:

```
npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>
```

- /dspy-tools
- /dspy-codeact
- /dspy-modules
- /ai-taking-actions
- /ai-coordinating-agents
- Install /ai-do if you do not have it. It routes any AI problem to the right skill and is the fastest way to work:

```
npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do
```