skills/dspy-assertions/SKILL.md
REMOVED IN DSPy 3.x -- use dspy.Refine or dspy.BestOfN instead (see /dspy-refine, /dspy-best-of-n). Legacy documentation for dspy.Assert and dspy.Suggest kept for existing codebases only. For new code, use dspy.Refine (iterative improvement with feedback) or dspy.BestOfN (sampling, pick best). Also used for dspy.Assert, dspy.Suggest, runtime validation for LLM output, retry on bad output, backtracking on constraint violation, guard rails in DSPy.
REMOVED IN DSPy 3.x.
dspy.Assert and dspy.Suggest have been removed from the DSPy codebase (no assertions.py, no imports in __init__.py, retry.py commented out, no docs page). Use dspy.Refine or dspy.BestOfN instead; see /dspy-refine and /dspy-best-of-n. This skill documents the legacy API for maintaining existing codebases only.

Migration guide:

| Old pattern | New equivalent |
|-------------|---------------|
| dspy.Assert(condition, msg) (hard rule, retry) | dspy.Refine(module, N=3, reward_fn=..., threshold=0.8) |
| dspy.Suggest(condition, msg) (soft rule, continue) | Lower weight in reward function (penalize but don't block) |
| max_backtrack_attempts=2 | N=3 in Refine/BestOfN |
| DSPyAssertionError on exhaustion | fail_count parameter in Refine/BestOfN |
| Error message as feedback | Refine auto-generates feedback from reward scores |
Guide the user through adding runtime constraints to DSPy programs. Assertions let you declare what valid output looks like — DSPy handles retrying, backtracking, and feeding error messages back to the LM automatically.
| | dspy.Assert | dspy.Suggest |
|---|---|---|
| Severity | Hard — must pass | Soft — should pass |
| On failure | Retries with feedback, then raises error | Logs a warning, continues execution |
| Use for | Format requirements, safety checks, non-negotiable rules | Style preferences, quality nudges, nice-to-haves |
import dspy


class QA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.answer = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        result = self.answer(question=question)

        # Hard constraint: retries if violated
        dspy.Assert(
            len(result.answer) > 0,
            "Answer must not be empty",
        )

        # Soft constraint: logs a warning but continues
        dspy.Suggest(
            len(result.answer.split()) >= 10,
            "Answer should be at least 10 words for completeness",
        )

        return result
Call dspy.Assert with a boolean condition and a message. When the condition is False, DSPy:
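1. Retries the call with the assertion message fed back to the LM, up to max_backtrack_attempts times (default: 2).
2. Raises DSPyAssertionError if the condition still fails after the last retry.

dspy.Assert(
    result.answer != "I don't know",
    "You must provide a substantive answer based on the context",
)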
Write specific messages. The message is injected back into the prompt on retry, so "Answer was 350 words, must be under 200" is far more useful than "too long."
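One way to keep messages specific is to measure the value once and build the condition and the message from it together. The helper below is a hypothetical sketch (word_limit_check is not part of DSPy); the pair it returns could be passed as the two positional arguments of dspy.Assert in legacy code.

```python
def word_limit_check(text: str, limit: int) -> tuple[bool, str]:
    """Return (condition, message) for a word-count limit.

    Hypothetical helper for illustration; not a DSPy API.
    """
    count = len(text.split())
    # The message carries the measured value and the target, so the retry
    # prompt tells the LM exactly how far off it was.
    message = f"Answer was {count} words, must be {limit} or fewer. Be concise."
    return count <= limit, message


ok, msg = word_limit_check("one two three four five", limit=3)
print(ok, msg)
```

In a legacy forward() this would be used as `dspy.Assert(*word_limit_check(result.answer, 200))`.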
Same signature as Assert, but non-blocking. When the condition is False, DSPy logs a warning and continues execution:
dspy.Suggest(
"however" not in result.answer.lower(),
"Avoid hedging language like 'however' — be direct",
)
Use Suggest when the constraint improves quality but isn't a hard requirement.
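The non-blocking behavior can be pictured with a few lines of plain Python. This is only an illustrative stand-in, assuming a logger named soft_checks; soft_check is a hypothetical name and is not how DSPy implements Suggest internally.

```python
import logging

logger = logging.getLogger("soft_checks")


def soft_check(condition: bool, message: str) -> bool:
    """Log a warning when the condition fails, then let execution continue.

    Hypothetical stand-in for the Suggest behavior described above.
    """
    if not condition:
        logger.warning("Suggestion not met: %s", message)
    return condition


answer = "It works; however, results vary."
passed = soft_check("however" not in answer.lower(), "Avoid hedging language")
print(passed)  # execution reaches this line whether or not the check passed
```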
When dspy.Assert fails inside a module's forward(), DSPy doesn't just retry the same call. It modifies the signature by injecting the error message as additional context, so the LM has feedback about what went wrong:
# Original prompt (simplified)
Question: What is DSPy?
Answer: [LM generates here]
# After assertion failure, retry prompt becomes:
Question: What is DSPy?
Previous attempt failed: "Answer was 350 words, must be under 200. Be concise."
Answer: [LM generates here with feedback]
This is why assertion messages should be actionable instructions, not just error descriptions.
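The retry-with-feedback loop described above can be sketched without DSPy. Everything here is an assumption made for illustration: retry_with_feedback, fake_lm, and under_eight_words are hypothetical names, and the real backtracking machinery rewrites the module's signature rather than passing a feedback string around.

```python
from typing import Callable, Optional, Tuple


def retry_with_feedback(
    generate: Callable[[str, Optional[str]], str],
    check: Callable[[str], Tuple[bool, str]],
    question: str,
    max_retries: int = 2,
) -> str:
    """Call generate, validate the output, and retry with the failure message."""
    feedback: Optional[str] = None
    for _ in range(max_retries + 1):  # one initial attempt plus the retries
        answer = generate(question, feedback)
        ok, message = check(answer)
        if ok:
            return answer
        feedback = message  # the next attempt sees what went wrong
    raise ValueError(f"Validation failed after {max_retries} retries: {feedback}")


def fake_lm(question: str, feedback: Optional[str]) -> str:
    # Stand-in for an LM call: it only becomes concise once it sees feedback.
    if feedback is None:
        return "word " * 10
    return "DSPy is a framework for programming LMs."


def under_eight_words(answer: str) -> Tuple[bool, str]:
    count = len(answer.split())
    return count <= 8, f"Answer was {count} words, must be 8 or fewer. Be concise."


print(retry_with_feedback(fake_lm, under_eight_words, "What is DSPy?"))
```

The first attempt fails the check, its message becomes the feedback, and the second attempt passes, which is the behavior an actionable message is meant to enable.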
By default, DSPy backtracks to the most recent LM call. Use the target_module parameter to retry a specific module instead:

dspy.Assert(
    is_valid_json(result.output),
    "Output must be valid JSON. Check for missing braces or trailing commas.",
    target_module=self.generate,  # retry this specific module
)
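The example above calls is_valid_json without defining it. A minimal version, assuming the output is a string that should parse with the standard json module, might look like this:

```python
import json


def is_valid_json(text) -> bool:
    """Return True if text parses as JSON, False otherwise.

    Illustrative helper; the name comes from the example above,
    the body is an assumption.
    """
    try:
        json.loads(text)
    except (TypeError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return False
    return True


print(is_valid_json('{"a": 1}'))
print(is_valid_json('{"a": 1,}'))
```

Returning a plain bool keeps the helper usable directly as an assertion condition.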
# Length limit: put the measured word count in the message
dspy.Assert(
    len(result.summary.split()) <= 50,
    f"Summary is {len(result.summary.split())} words, must be 50 or fewer",
)

# Format check: pass an explicit bool rather than a match object
import re

dspy.Assert(
    re.match(r"^\d{4}-\d{2}-\d{2}$", result.date or "") is not None,
    "Date must be in YYYY-MM-DD format",
)

# Banned phrases
dspy.Assert(
    not any(phrase in result.answer.lower() for phrase in ["as an ai", "i cannot"]),
    "Do not include AI self-references in the answer",
)

# Tag validation: at least one tag, all from the allowed set
dspy.Assert(
    len(result.tags) >= 1,
    "Must assign at least one tag",
)
dspy.Assert(
    all(tag in VALID_TAGS for tag in result.tags),
    f"All tags must be from the valid set: {VALID_TAGS}",
)

# Grounding: the answer must reuse at least three key terms from the context
context_terms = set(word.lower() for p in context for word in p.split() if len(word) > 5)
answer_terms = set(word.lower() for word in result.answer.split())
overlap = context_terms & answer_terms
dspy.Assert(
    len(overlap) >= 3,
    "Answer must reference specific terms from the source passages",
)
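The grounding check above splits on whitespace only, so a term followed by punctuation ("prompts.") will not match the same term elsewhere. A possible variation, assuming English text and the same more-than-five-characters cutoff, strips punctuation first. key_terms and term_overlap are hypothetical helper names.

```python
import string


def key_terms(text: str, min_len: int = 6) -> set[str]:
    """Lowercased words of at least min_len characters, punctuation stripped."""
    words = (w.strip(string.punctuation).lower() for w in text.split())
    return {w for w in words if len(w) >= min_len}


def term_overlap(passages: list[str], answer: str) -> set[str]:
    """Key terms from the passages that also appear in the answer."""
    context_terms: set[str] = set()
    for passage in passages:
        context_terms |= key_terms(passage)
    answer_terms = {w.strip(string.punctuation).lower() for w in answer.split()}
    return context_terms & answer_terms


passages = ["DSPy compiles declarative modules into optimized prompts."]
answer = "It compiles modules into prompts, using optimizers."
print(sorted(term_overlap(passages, answer)))
```

The size of the returned set can then be compared against the minimum of three used in the original check.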
Assertions work with all DSPy optimizers. During optimization:
- dspy.Assert failures cause the training example to be retried. If the program can't satisfy the constraint after retries, that example is skipped.
- dspy.Suggest failures are tracked as soft signals. Optimizers like BootstrapFewShotWithRandomSearch and MIPROv2 prefer demo sets where suggestions are satisfied.

This means the optimizer learns prompts and demos that satisfy your constraints on the first try, reducing retries in production:

program = QA()
optimizer = dspy.BootstrapFewShotWithRandomSearch(
    metric=my_metric,
    max_bootstrapped_demos=4,
    num_candidate_programs=10,
)
optimized = optimizer.compile(program, trainset=trainset)
After optimization, the program will have few-shot demos that naturally produce outputs satisfying your assertions.
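The optimizer call above passes my_metric without showing it. DSPy metrics conventionally take (example, pred, trace=None); the body below is only a sketch that assumes both objects expose an answer field and that a case-insensitive substring match is good enough for the task.

```python
from types import SimpleNamespace


def my_metric(example, pred, trace=None) -> bool:
    """True when the expected answer appears in the predicted answer.

    Illustrative metric; the `answer` field name is an assumption.
    """
    expected = example.answer.strip().lower()
    actual = pred.answer.strip().lower()
    return expected in actual


# SimpleNamespace stands in for dspy.Example / dspy.Prediction objects here.
gold = SimpleNamespace(answer="Paris")
guess = SimpleNamespace(answer="The capital of France is Paris.")
print(my_metric(gold, guess))
```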
When all retries are exhausted, dspy.Assert raises DSPyAssertionError. Handle it at the call site:
from dspy.primitives.assertions import DSPyAssertionError

try:
    result = program(question="...")
except DSPyAssertionError as e:
    # Log the failure, return a fallback, etc.
    print(f"Output failed validation: {e}")
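If several call sites need the same handling, the try/except can be wrapped once. The sketch below is deliberately generic so it runs without DSPy installed: call_with_fallback is a hypothetical helper, and in a legacy codebase error_type would be DSPyAssertionError.

```python
import logging

logger = logging.getLogger("validation")


def call_with_fallback(program, fallback, error_type=Exception, **inputs):
    """Run program(**inputs); return fallback if error_type is raised.

    Hypothetical wrapper for illustration; not part of DSPy.
    """
    try:
        return program(**inputs)
    except error_type as exc:
        logger.warning("Output failed validation: %s", exc)
        return fallback


def flaky_program(question: str) -> str:
    # Stand-in for a DSPy program whose assertions were exhausted.
    raise ValueError("Answer must not be empty")


result = call_with_fallback(
    flaky_program,
    fallback="Sorry, no answer is available right now.",
    error_type=ValueError,
    question="What is DSPy?",
)
print(result)
```

Passing a narrow error_type keeps unrelated exceptions visible instead of silently converting them into the fallback.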
| Scenario | Use |
|----------|-----|
| Output must be valid JSON | Assert |
| Answer should be concise | Suggest |
| No PII in output | Assert |
| Prefer active voice | Suggest |
| Must cite sources | Assert |
| Avoid hedging language | Suggest |
| Output matches expected schema | Assert |
| Include a confidence score | Suggest |
Rule of thumb: If a bad output reaching users would be a bug, use Assert. If it would just be suboptimal, use Suggest.
Assert/Suggest have been removed from DSPy 3.x. All constraint enforcement should use dspy.Refine (iterative with feedback) or dspy.BestOfN (independent sampling).
The key shift is from inline boolean checks to reward functions that score the full output:
# OLD (removed in DSPy 3.x)
dspy.Assert(len(result.answer.split()) <= 50, "Too long")
dspy.Suggest("however" not in result.answer, "Avoid hedging")

# NEW: reward function + Refine
def quality_reward(args, pred):
    score = 1.0
    if len(pred.answer.split()) > 50:  # hard rule
        score -= 0.4
    if "however" in pred.answer.lower():  # soft rule
        score -= 0.1
    return max(score, 0.0)

refined = dspy.Refine(module=my_module, N=3, reward_fn=quality_reward, threshold=0.8)
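The penalty sizes matter because of the threshold. With threshold=0.8, the 0.4 penalty alone pushes the score below the bar (so the attempt is not accepted), while the 0.1 penalty alone does not. The check below reruns the same reward function on stand-in predictions; SimpleNamespace replaces a real prediction object, and the accept/retry labels are a simplified reading of how a threshold is applied.

```python
from types import SimpleNamespace

THRESHOLD = 0.8


def quality_reward(args, pred):
    score = 1.0
    if len(pred.answer.split()) > 50:  # hard rule
        score -= 0.4
    if "however" in pred.answer.lower():  # soft rule
        score -= 0.1
    return max(score, 0.0)


samples = {
    "clean": "DSPy compiles programs into prompts.",
    "hedged": "It works; however, results vary.",
    "too_long": "word " * 60,
}

for name, text in samples.items():
    score = quality_reward({}, SimpleNamespace(answer=text))
    verdict = "accept" if score >= THRESHOLD else "retry"
    print(f"{name}: {score:.1f} -> {verdict}")
```

When adding more soft rules, it is worth checking that their combined penalties still behave as intended relative to the threshold.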
For full migration patterns, see /dspy-refine and /dspy-best-of-n.
- dspy.Assert and dspy.Suggest only work inside a dspy.Module.forward() method because DSPy needs the module context for backtracking. Calling them at the top level or in a standalone function silently skips the retry mechanism.
- Do not use Assert for style preferences. Hard assertions that fail after all retries raise DSPyAssertionError and crash the program. Use dspy.Suggest for subjective quality preferences (tone, style, verbosity) and reserve Assert for objective constraints (format validity, safety, schema compliance).
- Catch DSPyAssertionError at the call site. When all retry attempts are exhausted, Assert raises DSPyAssertionError. In production code, always wrap the program call in a try/except to handle validation failures gracefully with a fallback response.

Install any skill:
npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>
- /dspy-refine
- /ai-checking-outputs
- /ai-stopping-hallucinations
- /ai-following-rules
- /dspy-bootstrap-rs, /dspy-miprov2
- /ai-do, if you do not have it; it routes any AI problem to the right skill and is the fastest way to work: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do