lebsral/dspy-utils

Name: dspy-utils
Author: lebsral

skills/dspy-utils/SKILL.md

npx skillsauth add lebsral/dspy-programming-not-prompting-lms-skills dspy-utils

DSPy Utilities: Caching, Debugging, Save/Load, and Validation

Guide the user through DSPy's utility functions -- controlling caching, debugging calls, persisting optimized programs, and enforcing runtime constraints with reward functions.

Looking for streaming, async, or MCP? These have dedicated skills now:

Streaming tokens to a UI -- see /dspy-streaming

Async execution and FastAPI -- see /dspy-async

MCP server integration -- see /dspy-mcp

1. Which utility do you need?

Ask the user before diving in:

What are you trying to do? Debug a failing program, save/load an optimized program, control caching, or validate outputs with reward functions?
Is this for development or production? Development needs (debugging, cache control) differ from production needs (save/load, validation).

Then jump to the relevant section below.

2. configure_cache -- controlling cache behavior

DSPy caches LM responses by default to reduce costs and speed up development. Use dspy.configure_cache to control this globally.

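import dspy

# Disable caching entirely (both the in-memory and on-disk layers)
dspy.configure_cache(enable_disk_cache=False, enable_memory_cache=False)

# Re-enable caching
dspy.configure_cache(enable_disk_cache=True, enable_memory_cache=True)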

Per-LM cache control

You can also control caching per LM instance:

# This LM never caches
lm_no_cache = dspy.LM("openai/gpt-4o-mini", cache=False)

# This LM caches (default)
lm_cached = dspy.LM("openai/gpt-4o-mini", cache=True)

When to disable caching

Generating diverse outputs -- when you need different responses for the same prompt (e.g., data generation)
Testing real latency -- cache hits are instant, which skews benchmarks
Streaming -- caching may interfere with streaming behavior in some configurations

By default, responses are cached in memory and on local disk. Identical calls (same prompt, parameters, and model) return the cached result without making an API call.
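To make "identical calls" concrete, here is a minimal sketch of request-keyed caching in plain Python. It is an illustration only, not DSPy's implementation: cache_key, cached_call, and fake_lm are hypothetical names, and the exact set of fields DSPy hashes is an assumption.

```python
import hashlib
import json

_cache = {}  # toy in-memory store; stands in for a real cache backend


def cache_key(model, prompt, **params):
    """Build a deterministic key from everything that defines the request."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "params": params}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_call(call_fn, model, prompt, **params):
    """Return the stored response for an identical request, else call and store."""
    key = cache_key(model, prompt, **params)
    if key not in _cache:
        _cache[key] = call_fn(model, prompt, **params)
    return _cache[key]


calls = []  # records every "real" call so cache hits are visible


def fake_lm(model, prompt, **params):
    """Hypothetical stand-in for an LM API call."""
    calls.append(prompt)
    return f"response #{len(calls)}"


first = cached_call(fake_lm, "openai/gpt-4o-mini", "Hi", temperature=0.0)
second = cached_call(fake_lm, "openai/gpt-4o-mini", "Hi", temperature=0.0)
third = cached_call(fake_lm, "openai/gpt-4o-mini", "Hi", temperature=0.7)
```

In this sketch the second call never reaches fake_lm, while changing a single parameter (temperature) produces a new key and a fresh call.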

When NOT to disable caching: During optimization runs -- optimizers rely heavily on cache to avoid redundant LM calls. Disabling cache globally during optimization dramatically increases cost and time.

3. inspect_history -- debugging LM calls

dspy.inspect_history shows the raw prompts and responses from recent LM calls. This is the single most useful debugging tool in DSPy.

import dspy

lm = dspy.LM("openai/gpt-4o-mini")  # or "anthropic/claude-sonnet-4-5-20250929", etc.
dspy.configure(lm=lm)

classify = dspy.Predict("text -> label")
classify(text="Great product!")

# See what was actually sent to and received from the LM
dspy.inspect_history(n=1)  # Show last 1 call
dspy.inspect_history(n=3)  # Show last 3 calls

What inspect_history shows

The full prompt sent to the LM (including system message, few-shot demos, instructions)
The raw LM response
Which adapter formatted the prompt (ChatAdapter, JSONAdapter, etc.)

Debugging workflow

Run your program on a failing input
Call dspy.inspect_history(n=1) to see the last LM call
Check if the prompt makes sense -- are the instructions clear? Are few-shot demos relevant?
Check the raw response -- did the LM follow the format? Did it hallucinate?
Adjust your signature, module, or optimization strategy based on what you see
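The "did the LM follow the format?" check in the workflow above can be partly automated. The sketch below assumes the raw response uses ChatAdapter-style section markers such as [[ ## label ## ]]; missing_fields is a hypothetical helper, and other adapters (for example JSONAdapter) would need a different check.

```python
import re

# Assumption: the response uses ChatAdapter-style section markers,
# e.g. "[[ ## label ## ]]". Other adapters format output differently.
FIELD_MARKER = re.compile(r"\[\[ ## (\w+) ## \]\]")


def missing_fields(raw_response, expected_fields):
    """Return the expected output fields that have no marker in the response."""
    found = set(FIELD_MARKER.findall(raw_response))
    return [name for name in expected_fields if name not in found]


good = "[[ ## label ## ]]\npositive\n\n[[ ## completed ## ]]"
bad = "The sentiment is positive."

good_missing = missing_fields(good, ["label"])
bad_missing = missing_fields(bad, ["label"])
```

An empty list means every expected field marker was present; a non-empty list names the fields the LM skipped, which usually points back at the signature or instructions.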

Verbose logging

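For more detailed tracing, configure DSPy with an empty trace list. DSPy then appends each predictor call (the predictor, its inputs, and its prediction) to dspy.settings.trace as the program runs: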

dspy.configure(lm=lm, trace=[])

You can also print a module to see its structure:

print(my_program)  # Shows module tree with all sub-modules and signatures

4. save/load -- persisting optimized programs

After optimizing a DSPy program, save its learned state (few-shot demos, instructions) for production use.

Save

# After optimization
optimized = optimizer.compile(my_program, trainset=trainset)
optimized.save("optimized_program.json")

Load

# In production -- create a fresh instance, then load state
program = MyProgram()
program.load("optimized_program.json")

# Use it
result = program(question="What is DSPy?")

What gets saved

Few-shot demonstrations discovered by optimizers
Optimized instructions (from MIPROv2, GEPA, etc.)
Any state tracked by dspy.Predict modules

What does NOT get saved

Python logic in forward() -- that's your code, it must exist at load time
Model weights (unless you used BootstrapFinetune)
LM configuration -- you must call dspy.configure() before loading
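The split between saved state and unsaved code can be sketched with a toy class. This is plain Python, not DSPy: ToyProgram and its JSON layout are hypothetical and only illustrate why the class definition must exist wherever the file is loaded.

```python
import json
import os
import tempfile


class ToyProgram:
    """Hypothetical stand-in for a module: learned state plus Python logic."""

    def __init__(self):
        self.demos = []         # learned state, written to the file
        self.instructions = ""  # learned state, written to the file

    def forward(self, text):
        # Python logic lives in the class definition, never in the file.
        return f"{self.instructions} | {len(self.demos)} demos | {text}"

    def save(self, path):
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"demos": self.demos, "instructions": self.instructions}, f)

    def load(self, path):
        with open(path, encoding="utf-8") as f:
            state = json.load(f)
        self.demos = state["demos"]
        self.instructions = state["instructions"]


optimized = ToyProgram()
optimized.demos = [{"text": "Great product!", "label": "positive"}]
optimized.instructions = "Classify the sentiment."

path = os.path.join(tempfile.mkdtemp(), "toy_state.json")
optimized.save(path)

with open(path, encoding="utf-8") as f:
    saved_keys = sorted(json.load(f))  # only state keys, no code

restored = ToyProgram()  # the class must be importable at load time
restored.load(path)
output = restored.forward("hello")
```

The file holds only the two state keys; forward() works after loading solely because ToyProgram is defined in the loading process.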

Production deployment pattern

import dspy

class MyPipeline(dspy.Module):
    def __init__(self):
        super().__init__()  # initialize dspy.Module bookkeeping before adding sub-modules
        self.classify = dspy.Predict("text -> category")
        self.respond = dspy.ChainOfThought("text, category -> response")

    def forward(self, text):
        cat = self.classify(text=text)
        return self.respond(text=text, category=cat.category)

# --- Optimization (run once) ---
# optimizer = dspy.MIPROv2(metric=metric, auto="medium")
# optimized = optimizer.compile(MyPipeline(), trainset=trainset)
# optimized.save("pipeline_v1.json")

# --- Production (run on every request) ---
lm = dspy.LM("openai/gpt-4o-mini")  # or "anthropic/claude-sonnet-4-5-20250929", etc.
dspy.configure(lm=lm)

pipeline = MyPipeline()
pipeline.load("pipeline_v1.json")

result = pipeline(text="How do I reset my password?")
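Only the final call needs to run per request; constructing the pipeline and reading the saved file can happen once per process. The sketch below shows one way to do that with functools.lru_cache. FakePipeline is a hypothetical stand-in for MyPipeline so the example runs without an LM, and whether a shared instance suits your server's concurrency model is left as an assumption to verify.

```python
from functools import lru_cache


class FakePipeline:
    """Hypothetical stand-in for MyPipeline; counts how often state is loaded."""

    loads = 0

    def load(self, path):
        FakePipeline.loads += 1
        self.path = path

    def __call__(self, text):
        return f"handled: {text}"


@lru_cache(maxsize=None)
def get_pipeline(path):
    """Build and load the pipeline on first use, then reuse the same instance."""
    pipeline = FakePipeline()
    pipeline.load(path)
    return pipeline


def handle_request(text):
    # Each request reuses the cached instance instead of re-reading the file.
    return get_pipeline("pipeline_v1.json")(text)


r1 = handle_request("How do I reset my password?")
r2 = handle_request("Where is my invoice?")
```

Two requests trigger a single load, so the saved state is read once rather than on every call.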

5. dspy.Refine and dspy.BestOfN -- reward-based output validation

Use dspy.Refine to wrap any module and retry it until a reward function returns a score that meets a threshold. dspy.Refine and dspy.BestOfN replaced dspy.Assert/dspy.Suggest starting with DSPy 2.6 and remain the supported approach in 3.x:

import dspy

qa = dspy.ChainOfThought("question -> answer")

def answer_reward(args, pred):
    """Score answer quality. Returns 0.0-1.0."""
    if not pred.answer.strip():
        return 0.0
    if len(pred.answer.split()) < 5:
        return 0.5  # soft penalty for short answers
    return 1.0

validated_qa = dspy.Refine(
    module=qa,
    N=3,
    reward_fn=answer_reward,
    threshold=1.0,
)

result = validated_qa(question="What is DSPy?")

dspy.Refine -- retries with feedback from the reward function until threshold is met or N attempts exhausted. Use when later attempts can improve based on earlier failures.
dspy.BestOfN -- runs N independent attempts and returns the best-scoring one. Use when attempts are independent and cross-attempt feedback would not help.

For detailed patterns and examples, see /dspy-refine and /dspy-best-of-n.
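Because a reward function is a plain callable taking (args, pred), it can be unit-tested without any LM call. The sketch below reuses the reward function from this section and pairs it with a toy selection loop. The loop is a simplified illustration of independent sampling with an early stop, not DSPy's implementation, and it omits the feedback step that distinguishes dspy.Refine; best_of_n, fake_attempt, and the canned answers are hypothetical.

```python
from types import SimpleNamespace


def answer_reward(args, pred):
    """Same shape as the reward function above: (inputs, prediction) -> float."""
    if not pred.answer.strip():
        return 0.0
    if len(pred.answer.split()) < 5:
        return 0.5
    return 1.0


def best_of_n(attempt_fn, args, reward_fn, n, threshold):
    """Toy selection loop: try up to n times, stop early once threshold is met."""
    best_pred, best_score = None, float("-inf")
    for i in range(n):
        pred = attempt_fn(i, **args)
        score = reward_fn(args, pred)
        if score > best_score:
            best_pred, best_score = pred, score
        if score >= threshold:
            break
    return best_pred, best_score


# Hypothetical canned attempts standing in for LM outputs.
canned = ["", "A framework.", "DSPy is a framework for programming language models."]
attempts_made = []


def fake_attempt(i, question):
    attempts_made.append(i)
    return SimpleNamespace(answer=canned[i])


pred, score = best_of_n(
    fake_attempt, {"question": "What is DSPy?"}, answer_reward, n=3, threshold=1.0
)
```

With these canned answers the first two attempts score 0.0 and 0.5, so the loop continues until the third attempt reaches the threshold.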

Gotchas

save() does not persist forward() logic -- only learned state (demos, instructions) is saved. The class definition must exist in your production code at load time.
Must dspy.configure() before load() -- loading a saved program before configuring the LM causes silent failures where the program runs but uses no LM (or the wrong one).
inspect_history shows cached calls too -- after a cache hit, inspect_history still shows the call, but the prompt may look different from what was originally sent. Disable cache if you need exact prompt inspection.
Disabling caching during optimization -- Claude may reach for this, but do NOT disable the cache globally during optimizer runs. Optimizers rely heavily on the cache to avoid redundant LM calls, so disabling it dramatically increases cost and time.

Additional resources

DSPy saving/loading guide
For API details, see reference.md
For worked examples, see examples.md

Cross-references

Install any skill: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>

Streaming tokens to a UI -- see /dspy-streaming
Async execution and FastAPI -- see /dspy-async
MCP server integration -- see /dspy-mcp
/dspy-lm -- Configure language models, per-LM caching, inspect_history on LM instances
/dspy-modules -- Build composable programs with dspy.Module, save/load patterns
/ai-tracing-requests -- Production observability and tracing for DSPy programs
/dspy-refine -- Refine patterns, reward functions, and iterative improvement
/dspy-best-of-n -- BestOfN for independent sampling without cross-attempt feedback
/ai-serving-apis -- Serve DSPy programs as web APIs
Install /ai-do if you do not have it -- it routes any AI problem to the right skill and is the fastest way to work: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do

Use when you need DSPy infrastructure - caching control, debugging with inspect_history, saving/loading optimized programs, or runtime validation with Refine/BestOfN. Common scenarios - controlling the cache to avoid stale results, debugging with inspect_history to see raw prompts, saving and loading optimized programs, or validating outputs with reward functions. For streaming see /dspy-streaming, for async see /dspy-async, for MCP see /dspy-mcp. Related - ai-tracing-requests, ai-serving-apis, ai-monitoring, dspy-streaming, dspy-async, dspy-mcp. Also used for dspy.inspect_history, dspy.settings.configure, cache control in DSPy, save and load DSPy program, debug DSPy prompts, see what DSPy sent to the model, DSPy program serialization, production DSPy utilities, clear DSPy cache, view prompt history.

5 stars

tools

Updated May 7, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add lebsral/dspy-programming-not-prompting-lms-skills dspy-utils

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy -- container and dependency vulnerability scanner: Clean (95%)
Semgrep -- static code analysis for vulnerabilities: Clean (95%)
mcp-scan (Snyk) -- Model Context Protocol security validation: Clean (95%)
Snyk (dep) -- open source security scanning: Skipped (50%)
Socket.dev -- supply chain security analysis: Skipped (50%)
VirusTotal -- multi-engine malware detection: Skipped (50%)
CrowdStrike -- advanced threat intelligence: Skipped (50%)
OSV-Scanner -- Open Source Vulnerability database check: Skipped (50%)
OWASP Dep-Check: Skipped (50%)

Last scanned: May 7, 2026, 7:04 AM (221.6s, 4 files scanned)

Install manually

# Clone the repo
git clone https://github.com/lebsral/dspy-programming-not-prompting-lms-skills.git

# Copy into Claude Code skills folder (global)
cp -r dspy-programming-not-prompting-lms-skills/skills/dspy-utils ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

lebsral/dspy-programming-not-prompting-lms-skills

5 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT