skills/ai-fixing-errors/SKILL.md
Fix broken AI features. Use when your AI is throwing errors, producing wrong outputs, crashing, returning garbage, not responding, or behaving unexpectedly. Also use when you get Could not parse LLM output errors, DSPy program crashes, LLM timeout or rate limit errors, API key not working with DSPy, JSON parse error from LLM, model returns empty response, AI works sometimes but fails other times, intermittent LLM failures, debug DSPy pipeline, context window exceeded, token limit error, AI feature stopped working overnight, production AI errors.
npx skillsauth add lebsral/dspy-programming-not-prompting-lms-skills ai-fixing-errors
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Systematic approach to diagnosing and fixing AI features that aren't working.
Before debugging, ask the user:
import dspy
# Check current config
print(dspy.settings.lm) # Should show your LM, not None
# If None, configure it:
lm = dspy.LM("openai/gpt-4o-mini") # or "anthropic/claude-sonnet-4-5-20250929", etc.
dspy.configure(lm=lm)
Common issues:
Forgetting to call dspy.configure(lm=lm)
Wrong model name format (it should be provider/model-name)
# Test the AI provider directly
lm = dspy.LM("openai/gpt-4o-mini") # or "anthropic/claude-sonnet-4-5-20250929", etc.
response = lm("Hello, respond with just 'OK'")
print(response)
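Before spending an API call, the two common issues above can be checked locally. The following is a minimal sketch, not part of DSPy: `check_model_string` and `PROVIDER_KEY_VARS` are hypothetical names, and the environment variable names are assumptions about a typical setup.

```python
import os

# Hypothetical mapping from provider prefix to the environment variable that
# usually holds its API key. Extend it for the providers you actually use.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def check_model_string(model: str, env=None) -> list[str]:
    """Return a list of problems; an empty list means nothing obvious is wrong."""
    env = os.environ if env is None else env
    problems = []
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        problems.append(f"{model!r} is not in 'provider/model-name' form")
        return problems
    key_var = PROVIDER_KEY_VARS.get(provider)
    if key_var is None:
        problems.append(f"unknown provider {provider!r}; add it to PROVIDER_KEY_VARS")
    elif not env.get(key_var):
        problems.append(f"{key_var} is not set or is empty")
    return problems


print(check_model_string("openai/gpt-4o-mini"))
```

An empty list only means the string and key look plausible; the direct provider call above is still the real test.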
# Check your signature defines the right fields
class MySignature(dspy.Signature):
    """Clear task description here."""
    input_field: str = dspy.InputField(desc="what this contains")
    output_field: str = dspy.OutputField(desc="what to produce")
# Verify by inspecting
print(MySignature.fields)
Common issues:
Missing dspy.InputField() / dspy.OutputField() annotations
Unsupported field types (use str, list[str], Literal[...], Pydantic models)
# Check that input field names match
result = my_program(question="test") # field name must match signature
# Wrong:
result = my_program(q="test") # 'q' doesn't match 'question'
result = my_program("test") # positional args don't work
result = my_program(question="test")
print(result) # see all fields
print(result.answer) # access specific field
print(type(result.answer)) # check type
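When a typed output looks wrong, it can help to compare the returned value against the allowed options explicitly. This is a small standard-library sketch; `Sentiment` and `literal_mismatch` are hypothetical names used only for illustration.

```python
from typing import Literal, get_args

# Hypothetical output type; substitute the Literal used in your own signature.
Sentiment = Literal["positive", "negative", "neutral"]


def literal_mismatch(value, literal_type):
    """Return None if value is an allowed option, else a message describing the mismatch."""
    options = get_args(literal_type)
    if value in options:
        return None
    return f"{value!r} is not one of {options}"


# A capitalised value is a typical near miss.
print(literal_mismatch("Positive", Sentiment))
```

A mismatch like this usually points at the prompt or field description rather than at the parsing code.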
Common issues with typed outputs:
Literal type doesn't match any of the provided options
The most powerful debugging tool shows exactly what prompts were sent and what came back:
# Show the last 3 AI calls
dspy.inspect_history(n=3)
This shows:
What to look for:
AttributeError: 'NoneType' has no attribute ...
Cause: AI provider not configured.
Fix: Call dspy.configure(lm=lm) before using any module.
ValueError: Could not parse output
Cause: AI output doesn't match expected format. Fix:
Run dspy.inspect_history() to see what the AI returned
Use dspy.ChainOfThought instead of dspy.Predict (reasoning helps formatting)
TypeError: forward() got an unexpected keyword argument
Cause: Input field name mismatch.
Fix: Make sure you're passing keyword arguments that match your signature's InputField names.
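A name mismatch is easy to spot mechanically. The sketch below is a hypothetical helper, not a DSPy API: it takes the input field names you expect (copied from your own signature) and the keyword arguments you are about to pass, and reports anything that does not line up.

```python
import difflib


def check_call_kwargs(expected_fields, kwargs):
    """Compare call keywords with the expected input field names."""
    expected = list(expected_fields)
    report = []
    for name in kwargs:
        if name not in expected:
            # cutoff=0.0 always returns the closest expected name, if any exist.
            close = difflib.get_close_matches(name, expected, n=1, cutoff=0.0)
            hint = f" (did you mean {close[0]!r}?)" if close else ""
            report.append(f"unexpected argument {name!r}{hint}")
    for name in expected:
        if name not in kwargs:
            report.append(f"missing argument {name!r}")
    return report


for line in check_call_kwargs(["question"], {"q": "test"}):
    print(line)
```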
Cause: Retriever not configured or wrong endpoint. Fix:
# Test retriever directly
rm = dspy.ColBERTv2(url="http://...")
results = rm("test query", k=3)
print(results)
# Or if using a custom retriever function, call it directly to verify
Cause: Bad metric, too little data, or overfitting. Fix:
Adjust max_bootstrapped_demos
dspy.Refine not meeting threshold / exhausting attempts
Cause: Reward function threshold is too strict, or the module cannot produce outputs that score high enough. Fix:
Lower the threshold (e.g. 0.8 rather than 1.0 for multi-criteria scoring)
Increase N to give more retry attempts, or use dspy.BestOfN for independent sampling
dspy.configure(lm=lm, trace=[])
# Now run your program — trace will be populated
result = my_program(question="test")
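The dspy.Refine threshold advice earlier in this section is easier to see with numbers. This is a sketch under assumptions: the five criteria are invented for illustration, the `(args, pred)` parameter order is assumed to match the reward function you pass to dspy.Refine, and `SimpleNamespace` stands in for a real prediction.

```python
from types import SimpleNamespace


def multi_criteria_reward(args, pred) -> float:
    """Score a prediction as the fraction of criteria it satisfies (0.0 to 1.0)."""
    answer = getattr(pred, "answer", "") or ""
    criteria = [
        bool(answer.strip()),                        # not empty
        len(answer.split()) <= 50,                   # reasonably short
        answer.strip().endswith("."),                # ends like a sentence
        not answer.lower().startswith("as an ai"),   # no boilerplate opener
        "\n" not in answer,                          # single paragraph
    ]
    return sum(criteria) / len(criteria)


# Stand-in for a prediction that misses exactly one criterion (no final period).
pred = SimpleNamespace(answer="Paris is the capital of France")
score = multi_criteria_reward({}, pred)
print(score, score >= 0.8, score >= 1.0)
```

With a threshold of 1.0 this otherwise acceptable answer would exhaust every attempt; at 0.8 it passes.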
# Print the module tree
print(my_program)
# See all named predictors
for name, predictor in my_program.named_predictors():
    print(f"{name}: {predictor}")
Break your pipeline into pieces and test each one:
class MyPipeline(dspy.Module):
    def __init__(self):
        super().__init__()  # initialize the base module before adding steps
        self.step1 = dspy.ChainOfThought("question -> search_query")
        self.step2 = dspy.Retrieve(k=3)
        self.step3 = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        query = self.step1(question=question)
        print(f"Step 1 output: {query.search_query}")  # Debug
        context = self.step2(query.search_query)
        print(f"Step 2 retrieved: {len(context.passages)} passages")  # Debug
        answer = self.step3(context=context.passages, question=question)
        print(f"Step 3 output: {answer.answer}")  # Debug
        return answer
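The same step-by-step idea can be sketched without any LM calls, which makes it easy to see where a pipeline first goes wrong. Everything here is hypothetical: `run_stages` and the lambda stages are stand-ins for your real steps, each paired with a simple check on its output.

```python
def run_stages(stages, value):
    """Run (name, func, check) stages in order; stop at the first failed check."""
    trace = []
    for name, func, check in stages:
        value = func(value)
        ok = check(value)
        trace.append((name, value, ok))
        if not ok:
            break  # later stages would only obscure the first failure
    return trace


# Hypothetical stand-ins for real pipeline steps; the retriever returns nothing.
stages = [
    ("make_query", lambda q: q.strip().lower(), lambda out: bool(out)),
    ("retrieve", lambda query: [], lambda passages: len(passages) > 0),
    ("answer", lambda passages: " ".join(passages), lambda out: bool(out)),
]

for name, output, ok in run_stages(stages, "  What is DSPy? "):
    print(f"{name}: {'ok' if ok else 'FAILED'} -> {output!r}")
```

Stopping at the first failed check keeps attention on the step that actually broke instead of on downstream symptoms.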
# Before optimization
baseline = MyProgram()
baseline(question="test")
print("=== BASELINE PROMPT ===")
dspy.inspect_history(n=1)
# After optimization
optimized = MyProgram()
optimized.load("optimized.json")
optimized(question="test")
print("=== OPTIMIZED PROMPT ===")
dspy.inspect_history(n=1)
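Comparing two printed prompts by eye is error-prone. Assuming you have captured the baseline and optimized prompts as plain strings (the two strings below are made up), a standard-library diff shows exactly what optimization changed. `prompt_diff` is a hypothetical helper name.

```python
import difflib


def prompt_diff(baseline: str, optimized: str) -> str:
    """Return a unified diff of two prompt texts, or an empty string if identical."""
    lines = difflib.unified_diff(
        baseline.splitlines(),
        optimized.splitlines(),
        fromfile="baseline",
        tofile="optimized",
        lineterm="",
    )
    return "\n".join(lines)


# Made-up prompt texts for illustration only.
baseline_prompt = "Answer the question.\nQuestion: test"
optimized_prompt = "Answer the question concisely.\nQuestion: test"
print(prompt_diff(baseline_prompt, optimized_prompt))
```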
Skipping dspy.inspect_history(). Claude tends to guess at fixes based on the error message alone. Always inspect the actual prompt and response first; the root cause is usually visible in the raw LM output (wrong format, truncated response, misunderstood instruction).
Switch from Predict to ChainOfThought so the model has space to reason before producing structured output.
Adding try/except around DSPy calls to swallow errors. This hides the real problem. DSPy errors (especially ValueError from parsing) are diagnostic: they tell you exactly what the LM returned vs what was expected. Fix the root cause instead of catching and retrying.
.load() restores old few-shot demos that no longer match the current signature. Re-optimize or clear the saved state after signature changes.
For quality problems, use /ai-improving-accuracy instead. This skill fixes crashes and parse failures, not quality problems.
If you have no code yet, use /ai-do to get routed to the right building skill. This skill assumes you already have code that is broken.
For cost or consistency problems, use /ai-cutting-costs or /ai-making-consistent depending on the problem.
Install any skill:
npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>
/ai-improving-accuracy
/ai-tracing-requests
/ai-monitoring
/dspy-modules
/dspy-refine
/dspy-best-of-n
Install /ai-do if you do not have it. It routes any AI problem to the right skill and is the fastest way to work:
npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do