skills/dspy-tools/SKILL.md
Use when you need to give DSPy agents tool-calling abilities — wrapping Python functions as tools, building tool-using pipelines, or setting up code execution environments. Common scenarios - wrapping a Python function as a tool for DSPy agents, building tool-using pipelines, setting up a calculator or search tool, giving agents access to databases or APIs, or configuring code execution environments for agents. Related - ai-taking-actions, dspy-react, dspy-codeact. Also used for dspy.Tool, wrap function as DSPy tool, give agent tools, tool calling in DSPy, function calling for agents, build custom tools for DSPy agent, calculator tool for LLM, search tool for agent, database tool for AI, Python function to agent tool, MCP tools with DSPy, tool registry, how to define tools in DSPy, agent tool configuration, executable tools for AI agents.
npx skillsauth add lebsral/dspy-programming-not-prompting-lms-skills dspy-tools
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Guide the user through wrapping functions as DSPy tools, using dspy.PythonInterpreter for sandboxed code execution, and wiring tools into agents with dspy.ReAct and dspy.CodeAct.
Ask the user before diving in:
PythonInterpreter. Otherwise, plain tool functions are simpler.
Then jump to the relevant section below.
dspy.Tool wraps a Python function so DSPy agents can call it. It automatically extracts the function's name, docstring, parameter types, and descriptions to build the tool schema that the LM sees.
You can pass plain functions directly to dspy.ReAct or dspy.CodeAct and DSPy wraps them for you. Use dspy.Tool explicitly when you need to override the inferred metadata, convert tools from LangChain or MCP, or inspect the generated schema.
import dspy
# Implicit -- pass a function directly (DSPy wraps it automatically)
agent = dspy.ReAct("question -> answer", tools=[my_search_function])
# Explicit -- wrap it yourself for control over name, description, etc.
tool = dspy.Tool(my_search_function, name="search", desc="Search the knowledge base")
agent = dspy.ReAct("question -> answer", tools=[tool])
dspy.Tool(
    func,            # Callable -- the function to wrap
    name=None,       # str | None -- tool name (inferred from func.__name__ if omitted)
    desc=None,       # str | None -- description (inferred from docstring if omitted)
    args=None,       # dict | None -- argument JSON schemas (inferred from type hints)
    arg_types=None,  # dict | None -- argument type mappings (inferred from type hints)
    arg_desc=None,   # dict | None -- per-argument descriptions
)
All parameters except func are optional. DSPy infers them from the function signature and docstring. Override them when the inferred values are wrong or when you want a different name or description.
def foo(x: int, y: str = "hello"):
    """Combine a number and a string."""
    return str(x) + y
tool = dspy.Tool(foo)
print(tool.name) # "foo"
print(tool.desc) # "Combine a number and a string."
print(tool.args) # {'x': {'type': 'integer'}, 'y': {'type': 'string', 'default': 'hello'}}
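To make the inference step concrete, here is a minimal standard-library sketch of how a name, description, and argument schema can be read off a function. It is an illustration only, not DSPy's implementation: `sketch_tool_schema` and the `JSON_TYPES` mapping are hypothetical names, and the mapping covers just four primitive types.

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python types to JSON Schema type names (illustrative only).
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def sketch_tool_schema(func):
    """Collect name, description, and argument schema from a function's metadata."""
    hints = get_type_hints(func)
    args = {}
    for name, param in inspect.signature(func).parameters.items():
        schema = {}
        if hints.get(name) in JSON_TYPES:
            schema["type"] = JSON_TYPES[hints[name]]
        if param.default is not inspect.Parameter.empty:
            schema["default"] = param.default
        args[name] = schema
    return {"name": func.__name__, "desc": inspect.getdoc(func) or "", "args": args}


def foo(x: int, y: str = "hello"):
    """Combine a number and a string."""
    return str(x) + y


schema = sketch_tool_schema(foo)
print(schema)
```

An unannotated parameter ends up with an empty schema in this sketch, which is one way to see why missing type hints leave the LM with nothing to go on.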
The quality of your tools depends on three things: type hints, docstrings, and focused scope.
DSPy reads type hints to build the JSON schema the LM uses for tool calling. Always annotate every parameter and the return type.
# Good -- fully typed
def search(query: str, max_results: int = 5) -> str:
    """Search the knowledge base for documents matching the query."""
    ...
# Bad -- no type hints, agent won't know what to pass
def search(query, max_results=5):
    """Search the knowledge base."""
    ...
The docstring becomes the tool description. Write it from the perspective of someone deciding whether to call this tool.
# Good -- explains what the tool does and when to use it
def lookup_user(email: str) -> str:
    """Look up a user account by email address. Returns name, plan, and join date."""
    ...
# Bad -- vague, doesn't help the agent decide
def lookup_user(email: str) -> str:
    """Get user info."""
    ...
Keep tools focused. A tool that searches and summarizes is harder for the agent to use than two separate tools.
# Good -- single responsibility
def search(query: str) -> str:
    """Search for documents matching the query."""
    ...
def summarize(text: str) -> str:
    """Summarize a long piece of text into key points."""
    ...
# Bad -- does two things
def search_and_summarize(query: str) -> str:
    """Search for documents and summarize the results."""
    ...
Tool return values become the Observation the agent sees. Return a string (or something that converts to string cleanly).
import json
def check_order(order_id: str) -> str:
    """Check the status of an order by its ID."""
    order = db.get_order(order_id)  # db is a placeholder for your own data-access object
    if order:
        return json.dumps(order)
    return f"No order found with ID {order_id}."
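If several tools need the same treatment, the conversion to text can live in one place. The helper below is a small sketch under that assumption; `to_observation` is a hypothetical name, not part of DSPy.

```python
import json


def to_observation(value) -> str:
    """Render a tool result as text that reads cleanly as an Observation."""
    if isinstance(value, str):
        return value
    if isinstance(value, (dict, list, tuple)):
        # default=str keeps dates, Decimals, and similar values from raising TypeError
        return json.dumps(value, default=str, sort_keys=True)
    return str(value)


print(to_observation({"status": "shipped", "id": "A17"}))
print(to_observation(3.5))
```

Sorting keys keeps the output stable between calls, which makes agent traces easier to compare.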
For complex tools, add per-argument descriptions using arg_desc:
tool = dspy.Tool(
    search,
    arg_desc={
        "query": "The search query -- use keywords, not full sentences",
        "max_results": "Maximum number of results to return (1-20)",
    },
)
dspy.PythonInterpreter runs Python code in a sandboxed Deno + Pyodide environment. By default, the sandbox has no filesystem, network, or environment access. You selectively enable what you need.
dspy.PythonInterpreter(
    deno_command=None,           # list[str] | None -- custom Deno launch command
    enable_read_paths=None,      # list[str] | None -- paths the sandbox can read
    enable_write_paths=None,     # list[str] | None -- paths the sandbox can write
    enable_env_vars=None,        # list[str] | None -- environment variables to expose
    enable_network_access=None,  # list[str] | None -- allowed network domains
    sync_files=True,             # bool -- sync file changes back to host
    tools=None,                  # dict[str, Callable] | None -- host-side tool functions
    output_fields=None,          # list[dict] | None -- output field definitions
)
Prerequisites: Deno must be installed. See https://docs.deno.com/runtime/getting_started/installation/
from dspy import PythonInterpreter
with PythonInterpreter() as interp:
    result = interp("print(1 + 2)")  # Returns "3"
Tools passed to PythonInterpreter run in your normal Python process (not the sandbox). The sandbox calls them via JSON-RPC. This lets tools access databases, APIs, and libraries that aren't available inside the sandbox.
import requests
def fetch_price(ticker: str) -> str:
    """Fetch the current stock price for a ticker symbol."""
    resp = requests.get(f"https://api.example.com/price/{ticker}", timeout=10)
    resp.raise_for_status()
    return str(resp.json()["price"])  # convert to str to match the annotation
with PythonInterpreter(tools={"fetch_price": fetch_price}) as interp:
    result = interp("price = fetch_price(ticker='AAPL')\nprint(f'Price: {price}')")
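The host-side pattern can be pictured as a small dispatcher: the sandbox sends a request naming a tool and its arguments, and the host process looks the function up and replies. The sketch below uses generic JSON-RPC 2.0 field names and error codes; it is an assumption for illustration and does not reproduce the messages DSPy actually exchanges. `handle_request` is a hypothetical name, and this `fetch_price` returns a made-up value so the example needs no network.

```python
import json


def handle_request(raw: str, tools: dict) -> str:
    """Dispatch one JSON-RPC 2.0 style request to a host-side function."""
    request = json.loads(raw)
    reply = {"jsonrpc": "2.0", "id": request.get("id")}
    func = tools.get(request.get("method"))
    if func is None:
        reply["error"] = {"code": -32601, "message": "Method not found"}
    else:
        try:
            reply["result"] = func(**request.get("params", {}))
        except Exception as exc:
            reply["error"] = {"code": -32000, "message": str(exc)}
    return json.dumps(reply)


def fetch_price(ticker: str) -> str:
    """Stand-in tool with a made-up price table (no network access)."""
    return {"AAPL": "189.50"}.get(ticker, "unknown")


request = json.dumps(
    {"jsonrpc": "2.0", "id": 1, "method": "fetch_price", "params": {"ticker": "AAPL"}}
)
response = json.loads(handle_request(request, {"fetch_price": fetch_price}))
print(response)
```

Because the function runs on the host, it can use any library installed there, while the sandboxed code only ever sees the serialized result.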
# Allow reading from a data directory and accessing one API
interp = PythonInterpreter(
    enable_read_paths=["./data"],
    enable_network_access=["api.example.com"],
)
dspy.CodeAct creates a PythonInterpreter automatically if you don't pass one. Pass your own when you need custom permissions:
import dspy
interp = dspy.PythonInterpreter(
    enable_read_paths=["./data"],
    enable_network_access=["api.example.com"],
)
agent = dspy.CodeAct(
    "question -> answer",
    tools=[search, calculate],
    interpreter=interp,
    max_iters=5,
)
dspy.ToolCalls is a structured type representing tool-calling information -- tool names and their arguments in JSON format. Use it in signatures when you want the LM to output tool calls directly (without the ReAct loop).
import dspy
class PlanActions(dspy.Signature):
    """Given a user request, plan which tools to call."""
    request: str = dspy.InputField()
    actions: dspy.ToolCalls = dspy.OutputField()
planner = dspy.Predict(PlanActions)
result = planner(request="Look up the weather in Paris and convert to Celsius")
print(result.actions)  # ToolCalls with name and args for each tool call
from dspy import ToolCalls
tool_calls = ToolCalls.from_dict_list([
    {"name": "search", "args": {"query": "weather in Paris"}},
    {"name": "convert_temp", "args": {"value": 72, "from_unit": "F", "to_unit": "C"}},
])
When the configured LM supports native tool calling (most modern LMs do), ToolCalls automatically adapts to use the LM's native function-calling API rather than generating JSON as text. This improves reliability.
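Outside the ReAct loop, your own code has to run whatever calls the LM planned. A minimal sketch of that step follows. It works on plain dicts in the same name/args shape shown above rather than on a `dspy.ToolCalls` object, and `run_tool_calls` and `convert_temp` are hypothetical helpers.

```python
def run_tool_calls(calls, registry):
    """Execute each {"name", "args"} entry and collect text observations."""
    observations = []
    for call in calls:
        func = registry.get(call["name"])
        if func is None:
            # Report unknown tools as text instead of raising
            observations.append(f"Error: unknown tool {call['name']!r}.")
            continue
        observations.append(str(func(**call.get("args", {}))))
    return observations


def convert_temp(value: float, from_unit: str, to_unit: str) -> str:
    """Convert between Fahrenheit and Celsius."""
    if (from_unit, to_unit) == ("F", "C"):
        return f"{(value - 32) * 5 / 9:.1f} C"
    if (from_unit, to_unit) == ("C", "F"):
        return f"{value * 9 / 5 + 32:.1f} F"
    return f"Error: unsupported conversion {from_unit} -> {to_unit}."


calls = [
    {"name": "convert_temp", "args": {"value": 72, "from_unit": "F", "to_unit": "C"}},
    {"name": "search", "args": {"query": "weather in Paris"}},
]
results = run_tool_calls(calls, {"convert_temp": convert_temp})
print(results)
```

Only `convert_temp` is registered here, so the second call shows how an unregistered tool name surfaces as a readable error.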
dspy.ReAct is the standard choice for tool-using agents. Pass tools as a list of functions or dspy.Tool objects:
import dspy
def search(query: str) -> str:
    """Search for information about a topic."""
    return "DSPy is a framework for programming language models."
def calculate(expression: str) -> float:
    """Evaluate a Python math expression (use ** for powers) and return the result."""
    # eval() is for demonstration only; never pass untrusted input to it
    return eval(expression)
agent = dspy.ReAct(
    "question -> answer",
    tools=[search, calculate],
    max_iters=5,
)
result = agent(question="What is 2^10 plus the year DSPy was released?")
print(result.answer)
The agent decides which tools to call, in what order, and when to stop. See /dspy-react for the full guide.
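A calculator tool receives whatever string the LM writes, so evaluating it with `eval` executes arbitrary code. One possible alternative, sketched here with the standard `ast` module, walks the parsed expression and allows only a whitelist of arithmetic operators. The operator set is an assumption chosen for this example, and the sketch does not bound the size of results such as very large powers.

```python
import ast
import operator

# Whitelisted binary operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def calculate(expression: str) -> str:
    """Evaluate basic arithmetic (+, -, *, /, **) and return the result as text."""
    try:
        return str(_evaluate(ast.parse(expression, mode="eval").body))
    except (SyntaxError, ValueError, ZeroDivisionError, OverflowError) as exc:
        return f"Error: {exc}"


print(calculate("2**10 + 5"))
print(calculate("2^10"))
```

In Python `^` is bitwise XOR rather than exponentiation, so this version rejects it and returns an error string the agent can react to.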
dspy.CodeAct agents write Python code that calls your tools. Tools must be pure functions (not callable objects). The agent can chain calls, use loops, and manipulate data in code:
import dspy
def factorial(n: int) -> int:
    """Calculate the factorial of n."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)
agent = dspy.CodeAct(
    "question -> answer",
    tools=[factorial],
    max_iters=5,
)
result = agent(question="What is factorial(10) + factorial(5)?")
print(result.answer)
CodeAct tools have stricter requirements than ReAct tools: they must be plain functions, cannot import external libraries, and cannot reference global state. See /dspy-codeact for the full guide.
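A quick pre-flight check can catch the "plain function" requirement before an agent is built. The sketch below uses `inspect.isfunction` from the standard library. `is_plain_function` is a hypothetical helper, and excluding lambdas is an extra assumption of this sketch rather than a documented DSPy rule.

```python
import functools
import inspect


def is_plain_function(obj) -> bool:
    """True for functions created with `def`; False for callable objects, partials, lambdas."""
    return inspect.isfunction(obj) and obj.__name__ != "<lambda>"


def factorial(n: int) -> int:
    """Calculate the factorial of n."""
    return 1 if n <= 1 else n * factorial(n - 1)


class Doubler:
    """A callable object, which is not a plain function."""

    def __call__(self, n: int) -> int:
        return 2 * n


candidates = {
    "factorial": factorial,
    "Doubler()": Doubler(),
    "partial": functools.partial(factorial, 5),
    "lambda": lambda n: n + 1,
}
report = {label: is_plain_function(obj) for label, obj in candidates.items()}
print(report)
```

This check does not cover the other two requirements (no external imports, no global state), which still need a manual review of the function body.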
dspy.Tool.from_langchain() converts any LangChain tool to a DSPy tool:
import dspy
from langchain_community.tools import DuckDuckGoSearchRun, WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
search = dspy.Tool.from_langchain(DuckDuckGoSearchRun())
wikipedia = dspy.Tool.from_langchain(
    WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
)
agent = dspy.ReAct("question -> answer", tools=[search, wikipedia])
Install LangChain tools with pip install langchain-community.
dspy.Tool.from_mcp_tool() converts Model Context Protocol tools into DSPy tools. It preserves the tool's name, description, and input schema, and creates an async callable that invokes the tool through the MCP session.
Install the MCP extra:
pip install -U "dspy[mcp]"
import dspy
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client
# Run this inside an async function (or anywhere top-level await is supported)
async with streamablehttp_client("https://mcp.example.com/sse") as (read, write, _):
    async with ClientSession(read, write) as session:
        await session.initialize()
        mcp_tools = await session.list_tools()
        tools = [dspy.Tool.from_mcp_tool(session, t) for t in mcp_tools.tools]
        agent = dspy.ReAct("question -> answer", tools=tools)
        result = await agent.aforward(question="What files are in the repo?")
import dspy
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
server_params = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-filesystem", "./data"],
)
# Run this inside an async function (or anywhere top-level await is supported)
async with stdio_client(server_params) as (read, write):
    async with ClientSession(read, write) as session:
        await session.initialize()
        mcp_tools = await session.list_tools()
        tools = [dspy.Tool.from_mcp_tool(session, t) for t in mcp_tools.tools]
        agent = dspy.ReAct("question -> answer", tools=tools)
        result = await agent.aforward(question="List the files")
ClientSession yourself using the mcp library.
from_mcp_tool creates an async callable, so use await agent.aforward() or run inside an async context.
Good tools make good agents. Before passing a tool to an agent, check:
| Check | Why it matters |
|-------|---------------|
| All parameters have type hints | DSPy generates the JSON schema from them |
| Return type is annotated | Helps the agent know what to expect |
| Docstring explains what the tool does | The agent reads this to decide when to call it |
| Docstring mentions required input format | e.g., "Pass repo as 'owner/name'" |
| Parameters have sensible defaults | Reduces the number of decisions the agent makes |
| Errors return useful strings, not exceptions | The agent sees the error as an Observation and can retry |
# A well-documented tool
import requests
def get_github_repo(repo: str) -> str:
    """Get information about a GitHub repository.
    Pass the full repository name like 'stanfordnlp/dspy'.
    Returns name, description, stars, and language.
    """
    try:
        response = requests.get(f"https://api.github.com/repos/{repo}", timeout=10)
        response.raise_for_status()
        data = response.json()
        # Return every field the docstring promises
        return (
            f"Name: {data['full_name']}, Description: {data['description']}, "
            f"Stars: {data['stargazers_count']}, Language: {data['language']}"
        )
    except requests.RequestException as e:
        return f"Error: {str(e)}"
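Several rows of the checklist can be verified mechanically. The sketch below covers three of them (parameter type hints, return annotation, docstring) using only `inspect`. `check_tool` is a hypothetical helper, and the wording of its messages is arbitrary.

```python
import inspect


def check_tool(func) -> list:
    """Return a list of checklist problems found on a candidate tool function."""
    problems = []
    signature = inspect.signature(func)
    for name, param in signature.parameters.items():
        if param.annotation is inspect.Parameter.empty:
            problems.append(f"parameter '{name}' has no type hint")
    if signature.return_annotation is inspect.Signature.empty:
        problems.append("return type is not annotated")
    if not inspect.getdoc(func):
        problems.append("docstring is missing")
    return problems


def good(query: str, max_results: int = 5) -> str:
    """Search the knowledge base for documents matching the query."""
    return ""


def bad(query, max_results=5):
    return ""


print(check_tool(good))
print(check_tool(bad))
```

Whether a docstring is actually helpful, or whether defaults are sensible, still needs a human read; the check only flags what is missing outright.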
Tools should catch exceptions and return error strings. When a tool returns an error string, the agent sees it as an Observation and can retry with different arguments or try a different tool.
import json
import requests
def search(query: str) -> str:
    """Search for information."""
    try:
        response = requests.get("https://api.example.com/search", params={"q": query}, timeout=5)
        response.raise_for_status()
        # Serialize the results so the tool returns a string, as annotated
        return json.dumps(response.json()["results"])
    except requests.Timeout:
        return "Error: Search timed out. Try a shorter or simpler query."
    except requests.HTTPError as e:
        return f"Error: Search failed with status {e.response.status_code}."
    except Exception as e:
        return f"Error: {str(e)}"
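When many tools share the same catch-and-report pattern, a decorator can apply it uniformly. This is one possible sketch rather than a DSPy feature. `errors_as_observations` is a hypothetical name, and `functools.wraps` is used so the wrapped function keeps its name, docstring, and signature for anything that inspects them.

```python
import functools


def errors_as_observations(func):
    """Wrap a tool so any exception comes back as an 'Error: ...' string."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            return f"Error: {type(exc).__name__}: {exc}"

    return wrapper


@errors_as_observations
def divide(a: float, b: float) -> str:
    """Divide a by b and return the quotient as text."""
    return str(a / b)


print(divide(6, 3))
print(divide(1, 0))
```

A blanket handler like this trades specificity for coverage: tool-specific messages such as "try a shorter query" still have to be written by hand where they matter.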
dspy.Tool only when you need to override the inferred name, description, or arg schema.
from_langchain() or from_mcp_tool().
repr() which is often unhelpful. Always return a formatted string or json.dumps() output.
import pandas inside the tool function body. This works but adds latency on every call. Move imports to the top of the file; they run once, not per tool invocation.
await agent.aforward() with MCP tools. MCP tools are async, so the agent must be called with aforward(). Claude defaults to agent() which blocks or fails silently in async contexts.
Install any skill:
npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>
/dspy-react
/dspy-codeact
/ai-taking-actions
/dspy-signatures
/dspy-modules
/ai-do, if you do not have it. It routes any AI problem to the right skill and is the fastest way to work:
npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do