skills/dspy-modules/SKILL.md
Use when you need to compose multiple DSPy calls into a pipeline — structuring multi-step programs as reusable, optimizable components with forward() logic. Common scenarios - building a multi-step pipeline as a class, composing Predict and ChainOfThought calls in sequence, creating reusable AI components, structuring a RAG pipeline as a module, or building nested programs where one module calls another. Related - ai-building-pipelines, dspy-predict, dspy-chain-of-thought. Also used for dspy.Module, forward() method, custom DSPy module, compose DSPy calls, multi-step DSPy program, pipeline as a class, reusable AI components, nested DSPy modules, module design patterns, how to structure a DSPy program, class-based DSPy pipeline, self.predict in forward, modular AI pipeline, build complex DSPy programs, combine multiple DSPy calls into one module.
Guide the user through structuring DSPy programs as reusable, composable modules. A dspy.Module is the building block for all DSPy programs -- like PyTorch's nn.Module but for language model pipelines.
dspy.Module is the building block for multi-step DSPy programs. Declare sub-modules in __init__ as self. attributes, then wire them together with Python logic in forward(). DSPy optimizers automatically discover and tune all sub-modules in the tree.
Modules are composable. A module can use other custom modules as sub-modules:
import dspy


class Summarizer(dspy.Module):
    def __init__(self):
        self.summarize = dspy.ChainOfThought("text -> summary")

    def forward(self, text):
        return self.summarize(text=text)


class AnalyzeAndSummarize(dspy.Module):
    def __init__(self):
        self.classify = dspy.Predict("text -> category")
        self.summarizer = Summarizer()  # nested custom module
        self.respond = dspy.ChainOfThought("category, summary -> response")

    def forward(self, text):
        category = self.classify(text=text).category
        summary = self.summarizer(text=text).summary
        return self.respond(category=category, summary=summary)
DSPy optimizers traverse the full module tree. When you optimize AnalyzeAndSummarize, the inner Summarizer's prompts get optimized too.
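The traversal idea can be illustrated without DSPy or an LM. The sketch below is a toy stand-in, not DSPy's actual implementation: `ToyModule`, `ToyPredictor`, and `find_predictors` are hypothetical names, and the walk simply recurses through instance attributes.

```python
class ToyPredictor:
    """Stand-in for a leaf such as dspy.Predict (hypothetical)."""

    def __init__(self, signature):
        self.signature = signature


class ToyModule:
    """Stand-in for dspy.Module (hypothetical)."""


def find_predictors(module, prefix=""):
    """Recursively collect (dotted_name, predictor) pairs from instance attributes."""
    found = []
    for name, value in vars(module).items():
        path = f"{prefix}{name}"
        if isinstance(value, ToyPredictor):
            found.append((path, value))
        elif isinstance(value, ToyModule):
            found.extend(find_predictors(value, prefix=f"{path}."))
    return found


class Summarizer(ToyModule):
    def __init__(self):
        self.summarize = ToyPredictor("text -> summary")


class AnalyzeAndSummarize(ToyModule):
    def __init__(self):
        self.classify = ToyPredictor("text -> category")
        self.summarizer = Summarizer()  # nested module, reached by recursion
        self.respond = ToyPredictor("category, summary -> response")
        local_only = ToyPredictor("text -> unused")  # local variable: never found


names = [name for name, _ in find_predictors(AnalyzeAndSummarize())]
print(names)  # ['classify', 'summarizer.summarize', 'respond']
```

The nested `summarizer.summarize` entry shows up because the walk descends into the inner module; the predictor held only in a local variable does not.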
Use print() to inspect all sub-modules and their signatures:
pipeline = AnalyzeAndSummarize()
print(pipeline)
Output shows the module tree:
AnalyzeAndSummarize(
  classify = Predict(text -> category)
  summarizer = Summarizer(
    summarize = ChainOfThought(text -> summary)
  )
  respond = ChainOfThought(category, summary -> response)
)
This is useful for verifying your module hierarchy and debugging which sub-modules exist.
After optimization, save the learned state (few-shot demos, instructions) and reload it later:
# Save after optimization
optimized_program = optimizer.compile(my_program, trainset=trainset)
optimized_program.save("my_program.json")
# Load into a fresh instance
loaded = MyProgram()
loaded.load("my_program.json")
# Use the loaded program -- it has the optimized prompts
result = loaded(question="What is DSPy?")
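To make the split between saved state and code concrete, here is a toy stand-in that runs without DSPy. `ToyProgram` and its JSON layout are hypothetical and do not reflect DSPy's real file format; the point is only that the file holds learned data while the logic comes from the class you instantiate.

```python
import json
import os
import tempfile


class ToyProgram:
    """Hypothetical stand-in: learned state is data, forward() is code."""

    def __init__(self):
        self.instructions = "Answer the question."
        self.demos = []

    def forward(self, question):
        # Logic lives in the class definition, not in the saved file.
        return f"{self.instructions} ({len(self.demos)} demos) {question}"

    def save(self, path):
        with open(path, "w") as f:
            json.dump({"instructions": self.instructions, "demos": self.demos}, f)

    def load(self, path):
        with open(path) as f:
            state = json.load(f)
        self.instructions = state["instructions"]
        self.demos = state["demos"]


# Pretend an optimizer produced this state.
optimized = ToyProgram()
optimized.instructions = "Answer concisely and cite the source."
optimized.demos = [{"question": "What is DSPy?", "answer": "A framework."}]

path = os.path.join(tempfile.mkdtemp(), "toy_program.json")
optimized.save(path)

loaded = ToyProgram()  # fresh instance: same code, default state
loaded.load(path)      # learned state restored from disk
```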
What gets saved:
- The learned state that Predict modules track (few-shot demos and instructions)

What does not get saved:
- The logic in forward() -- that's your code
- Model weights (see BootstrapFinetune)
- LM configuration: call dspy.configure() before loading

Use dspy.Refine to enforce quality constraints on outputs through a reward function. This replaces the older dspy.Assert/dspy.Suggest pattern:
class SafeQA(dspy.Module):
    def __init__(self):
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.generate(question=question)


def answer_reward(args, pred):
    """Score answer quality. Returns float between 0.0 and 1.0."""
    score = 0.0
    # Hard requirement -- must provide a substantive answer
    if pred.answer.strip() and pred.answer != "I don't know":
        score += 0.6
    # Quality preference -- at least 10 words
    if len(pred.answer.split()) >= 10:
        score += 0.4
    return score


# Wrap with Refine to retry until quality threshold is met
validated_qa = dspy.Refine(
    module=SafeQA(),
    N=3,
    reward_fn=answer_reward,
    threshold=0.6,  # must at least pass the hard requirement
)
- dspy.Refine -- wraps a module, scores each attempt with a reward function, and retries until the threshold is met (up to N attempts). Use for requirements that must be met.
- dspy.BestOfN -- similar to Refine but without cross-attempt feedback; use when attempts are independent.

For detailed Refine patterns and examples, see /dspy-refine and /dspy-best-of-n.
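The retry-until-threshold behavior described above can be sketched in plain Python. This is not dspy.Refine's implementation, and it leaves out the feedback that Refine passes between attempts; `retry_until` and `fake_module` are hypothetical names, and the stand-in module returns canned answers so that no LM is needed.

```python
from types import SimpleNamespace


def answer_reward(args, pred):
    """Same scoring rule as above: 0.6 for a substantive answer, +0.4 for >= 10 words."""
    score = 0.0
    if pred.answer.strip() and pred.answer != "I don't know":
        score += 0.6
    if len(pred.answer.split()) >= 10:
        score += 0.4
    return score


def retry_until(module, reward_fn, threshold, n, **kwargs):
    """Hypothetical helper: call module up to n times and keep the best-scoring attempt."""
    best_pred, best_score = None, float("-inf")
    attempts = 0
    for _ in range(n):
        attempts += 1
        pred = module(**kwargs)
        score = reward_fn(kwargs, pred)
        if score > best_score:
            best_pred, best_score = pred, score
        if score >= threshold:
            break  # good enough, stop early
    return best_pred, best_score, attempts


# Stand-in module: the first call fails the hard requirement, the second passes it.
canned = iter(["I don't know", "DSPy is a framework for programming language models."])


def fake_module(question):
    return SimpleNamespace(answer=next(canned))


pred, score, attempts = retry_until(
    fake_module, answer_reward, threshold=0.6, n=3, question="What is DSPy?"
)
```

Here the second attempt reaches the threshold, so the third is never made. Its answer has fewer than 10 words, so it earns the 0.6 for the hard requirement but not the 0.4 bonus.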
Route to different sub-modules based on intermediate results:
class ConditionalPipeline(dspy.Module):
    def __init__(self):
        self.classify = dspy.Predict("text -> category")
        self.simple_handler = dspy.Predict("text -> response")
        self.complex_handler = dspy.ChainOfThought("text -> response")

    def forward(self, text):
        category = self.classify(text=text).category
        if category in ("simple", "faq"):
            return self.simple_handler(text=text)
        else:
            return self.complex_handler(text=text)
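One practical wrinkle, offered as an assumption rather than something the source covers: a free-text category field may come back with different casing or stray whitespace, in which case the membership check above would miss it. A minimal sketch of normalizing before routing, using a hypothetical `pick_handler` helper:

```python
SIMPLE_CATEGORIES = {"simple", "faq"}


def pick_handler(category: str) -> str:
    """Return the name of the handler attribute to use for a raw category string."""
    normalized = category.strip().lower()
    return "simple_handler" if normalized in SIMPLE_CATEGORIES else "complex_handler"


print(pick_handler(" FAQ "))            # simple_handler
print(pick_handler("billing dispute"))  # complex_handler
```

Inside forward(), the returned name could be resolved with `getattr(self, pick_handler(category))(text=text)`, which keeps both handlers as self. attributes.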
Process a list of items or iterate until a condition is met:
class BatchProcessor(dspy.Module):
    def __init__(self):
        self.process_item = dspy.ChainOfThought("item -> result")

    def forward(self, items: list[str]):
        results = []
        for item in items:
            result = self.process_item(item=item)
            results.append(result.result)
        return dspy.Prediction(results=results)
Keep improving until quality is sufficient:
class Refiner(dspy.Module):
    def __init__(self, max_rounds=3):
        self.draft = dspy.ChainOfThought("task -> output")
        self.critique = dspy.ChainOfThought("task, output -> feedback, is_good: bool")
        self.revise = dspy.ChainOfThought("task, output, feedback -> output")
        self.max_rounds = max_rounds

    def forward(self, task):
        result = self.draft(task=task)
        for _ in range(self.max_rounds):
            check = self.critique(task=task, output=result.output)
            if check.is_good:
                break
            result = self.revise(
                task=task,
                output=result.output,
                feedback=check.feedback,
            )
        return result
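The loop has two exits: the critic accepts the output, or max_rounds runs out. The sketch below exercises both with stand-in callables in place of real DSPy modules, so it runs without an LM. The `refine` function and the four stub functions are hypothetical.

```python
from types import SimpleNamespace


def refine(task, draft, critique, revise, max_rounds=3):
    """Same control flow as Refiner.forward(), with the three steps passed in as callables."""
    result = draft(task=task)
    rounds_used = 0
    for _ in range(max_rounds):
        check = critique(task=task, output=result.output)
        if check.is_good:
            break
        result = revise(task=task, output=result.output, feedback=check.feedback)
        rounds_used += 1
    return result, rounds_used


# Stand-ins: this critic accepts any output that ends with a period.
def draft(task):
    return SimpleNamespace(output="first draft")


def critique(task, output):
    return SimpleNamespace(is_good=output.endswith("."), feedback="add a period")


def revise(task, output, feedback):
    return SimpleNamespace(output=output + ".")


def never_fixes(task, output, feedback):
    return SimpleNamespace(output=output)  # a revision step that does not help


accepted, rounds_ok = refine("write", draft, critique, revise)
capped, rounds_capped = refine("write", draft, critique, never_fixes, max_rounds=3)
```

In the capped case the function returns the last revision without critiquing it again, so a caller that needs a guarantee would have to run one more check after the loop.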
Wrap sub-module calls to handle failures gracefully:
class ResilientModule(dspy.Module):
    def __init__(self):
        self.primary = dspy.ChainOfThought("question -> answer")
        self.fallback = dspy.Predict("question -> answer")

    def forward(self, question):
        try:
            return self.primary(question=question)
        except Exception:
            return self.fallback(question=question)
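A bare `except Exception` hides why the primary path failed. As one possible refinement, which the source does not prescribe, the sketch below logs the exception and reports which path produced the result. `call_with_fallback` and the two stub functions are hypothetical stand-ins for the sub-modules.

```python
import logging

logger = logging.getLogger("resilient")


def call_with_fallback(primary, fallback, **kwargs):
    """Try primary; on any exception, log it and use fallback instead."""
    try:
        return primary(**kwargs), "primary"
    except Exception as exc:
        logger.warning("primary failed (%s); using fallback", exc)
        return fallback(**kwargs), "fallback"


def flaky(question):
    raise ValueError("could not parse model output")  # simulated failure


def simple(question):
    return f"fallback answer to: {question}"


answer, source = call_with_fallback(flaky, simple, question="What is DSPy?")
```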
Use dspy.Prediction to return structured results from forward():
class MultiOutput(dspy.Module):
    def __init__(self):
        self.analyze = dspy.ChainOfThought("text -> sentiment, topics: list[str]")
        self.summarize = dspy.ChainOfThought("text -> summary")

    def forward(self, text):
        analysis = self.analyze(text=text)
        summary = self.summarize(text=text)
        return dspy.Prediction(
            sentiment=analysis.sentiment,
            topics=analysis.topics,
            summary=summary.summary,
        )
Assign cheaper models to simpler steps:
expensive_lm = dspy.LM("openai/gpt-4o") # or "anthropic/claude-sonnet-4-5-20250929", etc.
cheap_lm = dspy.LM("openai/gpt-4o-mini") # or any smaller model
pipeline = MyProgram()
pipeline.classify.set_lm(cheap_lm)
pipeline.generate.set_lm(expensive_lm)
Use batch() to process multiple examples in parallel:
pipeline = MyProgram()
examples = [dspy.Example(question=q).with_inputs("question") for q in questions]
results = pipeline.batch(examples, num_threads=4, timeout=120)
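As a rough mental model only, threaded batching resembles mapping a callable over inputs on a thread pool. The sketch below uses the standard library and is not how batch() is implemented; `run_batch` and `fake_pipeline` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(fn, inputs, num_threads=4):
    """Apply fn to each input dict on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(lambda kwargs: fn(**kwargs), inputs))


def fake_pipeline(question):
    return question.upper()  # stand-in for an LM-backed module call


results = run_batch(
    fake_pipeline,
    [{"question": "a"}, {"question": "b"}, {"question": "c"}],
)
print(results)  # ['A', 'B', 'C']
```

Threads suit this kind of workload because each call spends most of its time waiting on a network response rather than computing.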
- Sub-modules must be assigned as self. attributes. Optimizers discover sub-modules by traversing the self. attributes set in __init__. A Predict stored in a local variable or a plain list is invisible to optimization. Use a dict assigned to self. instead; DSPy traverses dicts for parameters.
- Do not call dspy.configure() inside forward(). Configure once at startup. Calling it on every forward pass adds overhead and causes unexpected behavior during optimization.
- Do not name forward() args differently from training example fields. When an optimizer traces your module, it passes inputs from training examples to forward(). Mismatched argument names cause silent failures. Use the same field names as your dspy.Example inputs.
- Do not omit the forward() method. Every dspy.Module subclass must implement forward(). Without it, calling the module raises an error.

Install any skill:
npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>
- /dspy-signatures
- /dspy-predict
- /dspy-chain-of-thought
- /ai-building-pipelines
- /ai-improving-accuracy
- Install /ai-do if you do not have it. It routes any AI problem to the right skill and is the fastest way to work: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do