Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

lebsral/dspy-data

Name: dspy-data
Author: lebsral

skills/dspy-data/SKILL.md

npx skillsauth add lebsral/dspy-programming-not-prompting-lms-skills dspy-data

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Work with DSPy Data: Examples, Predictions, and Datasets

Guide the user through creating, loading, and managing data for DSPy programs. Data is the fuel for DSPy optimizers — getting it right is the difference between a program that works and one that doesn't.

What are Examples

dspy.Example is DSPy's data container. Think of it as a dictionary with one extra feature: you can mark which fields are inputs and which are outputs. This distinction is critical because optimizers need to know what to feed into your program (inputs) and what to compare against (outputs).

import dspy

# An Example holds named fields — like a dict
example = dspy.Example(question="What is DSPy?", answer="A framework for programming LMs")

# Access fields with dot notation or bracket notation
print(example.question)       # "What is DSPy?"
print(example["answer"])      # "A framework for programming LMs"

Every DSPy optimizer, evaluator, and metric function expects data as a list of dspy.Example objects.

Creating Examples

Create examples with keyword arguments. Every keyword becomes a field.

# Simple question-answer pair
ex = dspy.Example(question="What color is the sky?", answer="Blue")

# Classification example with more fields
ex = dspy.Example(
    text="The product broke after one day",
    label="negative",
    category="quality",
)

# Fields can be any Python type
ex = dspy.Example(
    query="hiking trails near Portland",
    results=["Forest Park", "Eagle Creek"],
    count=2,
)

You can also create an Example from a dictionary:

data = {"question": "What is Python?", "answer": "A programming language"}
ex = dspy.Example(**data)

with_inputs() — marking input fields

with_inputs() tells DSPy which fields are inputs (what the program receives) and which are outputs (what the program should produce). This is required for optimizers and evaluation.

# Mark "question" as input — "answer" becomes the expected output
ex = dspy.Example(question="What is DSPy?", answer="A framework").with_inputs("question")

# Multiple input fields
ex = dspy.Example(
    context="DSPy is a Python framework...",
    question="What is DSPy?",
    answer="A framework for programming LMs",
).with_inputs("context", "question")

What happens without with_inputs()

If you skip with_inputs(), optimizers won't know which fields to pass to your program and which to hold back for scoring. You'll get errors or wrong results. Always call it.

How it works

After with_inputs("question"):

example.inputs() returns an Example with only {"question": "What is DSPy?"}
example.labels() returns an Example with only {"answer": "A framework"}

DSPy uses .inputs() to feed data into your module and .labels() to check the output against expected values.

ex = dspy.Example(question="What is DSPy?", answer="A framework").with_inputs("question")

print(ex.inputs())   # Example(question="What is DSPy?")
print(ex.labels())   # Example(answer="A framework")

Prediction — what modules return

When you call a DSPy module, it returns a dspy.Prediction, which extends Example. A Prediction has the same dot-access and dict-like behavior.

classify = dspy.ChainOfThought("text -> label")
result = classify(text="Great product!")

# result is a Prediction
print(result.label)       # "positive"
print(result.reasoning)   # chain-of-thought reasoning (added by ChainOfThought)

# Predictions work like Examples
print(result.keys())      # dict_keys(['reasoning', 'label'])

In metric functions, prediction is always a Prediction and example is always an Example:

def metric(example, prediction, trace=None):
    # example has the gold fields you defined
    # prediction has the fields your module produced
    return prediction.label == example.label

Building datasets

A dataset in DSPy is just a Python list of Example objects. Use list comprehensions to build them.

# From parallel lists
questions = ["What is Python?", "What is DSPy?", "What is an LM?"]
answers = ["A programming language", "A framework for LMs", "A language model"]

trainset = [
    dspy.Example(question=q, answer=a).with_inputs("question")
    for q, a in zip(questions, answers)
]

# From a list of dicts
raw_data = [
    {"text": "Love it!", "label": "positive"},
    {"text": "Terrible.", "label": "negative"},
    {"text": "It's okay.", "label": "neutral"},
]

trainset = [
    dspy.Example(**row).with_inputs("text")
    for row in raw_data
]

Loading from CSV

import csv

def load_csv_as_examples(filepath, input_fields):
    """Load a CSV file into a list of dspy.Example objects."""
    examples = []
    with open(filepath, "r") as f:
        reader = csv.DictReader(f)
        for row in reader:
            ex = dspy.Example(**row).with_inputs(*input_fields)
            examples.append(ex)
    return examples

# Usage
trainset = load_csv_as_examples("tickets.csv", input_fields=["message"])

With pandas (if you prefer):

import pandas as pd

df = pd.read_csv("tickets.csv")
trainset = [
    dspy.Example(**row.to_dict()).with_inputs("message")
    for _, row in df.iterrows()
]

Handling CSV quirks

# Skip rows with missing values
trainset = [
    dspy.Example(**row).with_inputs("text")
    for row in csv.DictReader(open("data.csv"))
    if row["text"] and row["label"]  # skip blanks
]

# Rename columns to match your signature
trainset = [
    dspy.Example(
        text=row["customer_message"],
        label=row["assigned_category"],
    ).with_inputs("text")
    for row in csv.DictReader(open("data.csv"))
]

Loading from JSON

import json

def load_json_as_examples(filepath, input_fields):
    """Load a JSON array file into dspy.Example objects."""
    with open(filepath, "r") as f:
        data = json.load(f)
    return [
        dspy.Example(**item).with_inputs(*input_fields)
        for item in data
    ]

# Usage — file contains [{"question": "...", "answer": "..."}, ...]
trainset = load_json_as_examples("qa_pairs.json", input_fields=["question"])

For JSON Lines (one JSON object per line):

def load_jsonl_as_examples(filepath, input_fields):
    """Load a JSONL file into dspy.Example objects."""
    examples = []
    with open(filepath, "r") as f:
        for line in f:
            if line.strip():
                item = json.loads(line)
                examples.append(dspy.Example(**item).with_inputs(*input_fields))
    return examples

Loading from HuggingFace

The HuggingFace datasets library gives you access to thousands of ready-to-use datasets.

pip install datasets

from datasets import load_dataset

# Load a dataset
dataset = load_dataset("hotpotqa", "fullwiki")

# Convert to DSPy Examples
trainset = [
    dspy.Example(
        question=x["question"],
        answer=x["answer"],
    ).with_inputs("question")
    for x in dataset["train"]
]

Common HuggingFace patterns

# Limit the number of examples (large datasets)
trainset = [
    dspy.Example(question=x["question"], answer=x["answer"]).with_inputs("question")
    for x in list(dataset["train"])[:500]
]

# Rename fields to match your signature
trainset = [
    dspy.Example(
        text=x["sentence"],
        label="positive" if x["label"] == 1 else "negative",
    ).with_inputs("text")
    for x in dataset["train"]
]

# Filter rows
trainset = [
    dspy.Example(question=x["question"], answer=x["answer"]).with_inputs("question")
    for x in dataset["train"]
    if len(x["answer"]) > 0  # skip empty answers
]

Built-in datasets

DSPy ships with a few standard datasets for prototyping and benchmarking. These return pre-built dspy.Example objects — no HuggingFace dependency needed.

from dspy.datasets import HotPotQA

# Multi-hop question answering
dataset = HotPotQA(train_seed=1, train_size=200, dev_size=50, test_size=0)
trainset = dataset.train
devset = dataset.dev

| Dataset | Import | Task | |---------|--------|------| | HotPotQA | from dspy.datasets import HotPotQA | Multi-hop QA over Wikipedia | | GSM8k | from dspy.datasets import GSM8k | Grade-school math word problems | | Colors | from dspy.datasets import Colors | Simple color identification |

Constructor parameters (all optional): train_seed, train_size, dev_size, test_size.

Note: You still need .with_inputs() if you're passing these to an optimizer:

trainset = [ex.with_inputs("question") for ex in dataset.train]

Train/dev splits

Optimizers train on trainset and you evaluate on devset. Keep them separate to measure real performance.

Random split

import random

def train_dev_split(examples, train_ratio=0.8, seed=42):
    """Split a list of examples into train and dev sets."""
    random.seed(seed)
    shuffled = list(examples)
    random.shuffle(shuffled)
    split_idx = int(len(shuffled) * train_ratio)
    return shuffled[:split_idx], shuffled[split_idx:]

# Usage
all_examples = load_csv_as_examples("data.csv", input_fields=["text"])
trainset, devset = train_dev_split(all_examples)
print(f"Train: {len(trainset)}, Dev: {len(devset)}")

Stratified split (preserves label distribution)

Use this when your categories are imbalanced (e.g., 90% "general", 10% "urgent").

from collections import defaultdict

def stratified_split(examples, label_field, train_ratio=0.8, seed=42):
    """Split examples while preserving the distribution of a label field."""
    random.seed(seed)
    buckets = defaultdict(list)
    for ex in examples:
        buckets[ex[label_field]].append(ex)

    trainset, devset = [], []
    for label, items in buckets.items():
        random.shuffle(items)
        split_idx = int(len(items) * train_ratio)
        trainset.extend(items[:split_idx])
        devset.extend(items[split_idx:])

    random.shuffle(trainset)
    random.shuffle(devset)
    return trainset, devset

# Usage
trainset, devset = stratified_split(all_examples, label_field="category")

Using HuggingFace's built-in splits

Many HuggingFace datasets come pre-split:

dataset = load_dataset("hotpotqa", "fullwiki")
trainset = [dspy.Example(**x).with_inputs("question") for x in dataset["train"][:500]]
devset = [dspy.Example(**x).with_inputs("question") for x in dataset["validation"][:200]]

Common patterns

Accessing fields

ex = dspy.Example(question="What?", answer="That", source="wiki")

# Dot access
ex.question

# Dict-style access
ex["question"]

# Get all field names
ex.keys()  # dict_keys(['question', 'answer', 'source'])

# Check if a field exists
"question" in ex  # True

Converting to/from dicts

# Example to dict
d = dict(ex)  # {"question": "What?", "answer": "That", "source": "wiki"}

# Dict to Example
ex = dspy.Example(**d).with_inputs("question")

Filtering examples

# Keep only examples where the answer is short
short_answers = [ex for ex in trainset if len(ex.answer.split()) < 20]

# Keep only a specific category
urgent_only = [ex for ex in trainset if ex.category == "urgent"]

# Remove duplicates (by a field)
seen = set()
unique = []
for ex in trainset:
    if ex.question not in seen:
        seen.add(ex.question)
        unique.append(ex)

Inspecting your dataset

# Quick summary
print(f"Total examples: {len(trainset)}")
print(f"Fields: {trainset[0].keys()}")
print(f"First example: {trainset[0]}")

# Label distribution
from collections import Counter
labels = Counter(ex.label for ex in trainset)
print(f"Label distribution: {labels}")

Gotchas

Claude forgets with_inputs() on every Example. Without it, optimizers cannot distinguish inputs from expected outputs. Every example passed to an optimizer or evaluator must have with_inputs() called. Claude often creates examples and only calls with_inputs() on the first one or skips it entirely when building lists inline.
Claude calls with_inputs() with output field names. Mark only the fields your module receives as input — not the fields it should produce. If your signature is question -> answer, call .with_inputs("question"), not .with_inputs("question", "answer"). Including output fields means the optimizer has nothing to score against.
Claude uses Literal[list] instead of Literal[tuple(list)] for dynamic categories. When building categories from data (CATEGORIES = list(set(...))), the type annotation must be Literal[tuple(CATEGORIES)], not Literal[CATEGORIES]. The latter silently fails to constrain the output.
Claude passes raw dicts to optimizers instead of dspy.Example objects. DSPy optimizers and evaluators require dspy.Example objects, not plain Python dicts. Always convert with dspy.Example(**row).with_inputs(...).
Claude creates train/dev splits without shuffling first. If data is sorted by label or date, taking the first 80% as train and last 20% as dev creates a biased split. Always shuffle with a fixed seed before splitting.

Additional resources

dspy.Example API docs
dspy.Prediction API docs
reference.md — constructor signatures, method tables, Prediction details
examples.md — worked examples with CSV, HuggingFace, and manual data

Cross-references

Install any skill: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>

/dspy-evaluate — evaluate your program on a devset with metrics
/ai-generating-data — generate synthetic training data when you have none
/ai-improving-accuracy — use optimizers that consume your trainset to boost quality
Install /ai-do if you do not have it — it routes any AI problem to the right skill and is the fastest way to work: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do

lebsral/dspy-data

skills/dspy-data/SKILL.md

Use when you need to prepare training/dev data for DSPy optimizers — loading from CSV/JSON/HuggingFace, creating Examples, setting input keys, or building train/dev splits. Common scenarios - loading a CSV of labeled examples for optimization, converting HuggingFace datasets to DSPy format, creating train/dev/test splits, building Examples with proper input keys, converting JSON data for DSPy, or preparing evaluation datasets. Related - ai-generating-data, dspy-evaluate. Also used for dspy.Example, dspy.Dataset, load training data for DSPy, CSV to DSPy examples, HuggingFace dataset in DSPy, prepare data for optimization, input_keys in DSPy, train dev split for DSPy, how to format data for DSPy optimizer, labeled examples format, create examples from JSON, what format does DSPy expect, dataset preparation for DSPy, with_inputs in DSPy Example, build evaluation dataset.

5 stars

development

Updated May 7, 2026

$ install --global

skillsauth

npx skillsauth add lebsral/dspy-programming-not-prompting-lms-skills dspy-data

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: May 7, 2026, 6:58 AM190.4s4 files scanned

SKILL.md

name:: dspy-data
description:: Use when you need to prepare training/dev data for DSPy optimizers — loading from CSV/JSON/HuggingFace, creating Examples, setting input keys, or building train/dev splits. Common scenarios - loading a CSV of labeled examples for optimization, converting HuggingFace datasets to DSPy format, creating train/dev/test splits, building Examples with proper input keys, converting JSON data for DSPy, or preparing evaluation datasets. Related - ai-generating-data, dspy-evaluate. Also used for dspy.Example, dspy.Dataset, load training data for DSPy, CSV to DSPy examples, HuggingFace dataset in DSPy, prepare data for optimization, input_keys in DSPy, train dev split for DSPy, how to format data for DSPy optimizer, labeled examples format, create examples from JSON, what format does DSPy expect, dataset preparation for DSPy, with_inputs in DSPy Example, build evaluation dataset.

Work with DSPy Data: Examples, Predictions, and Datasets

What are Examples

import dspy

# An Example holds named fields — like a dict
example = dspy.Example(question="What is DSPy?", answer="A framework for programming LMs")

# Access fields with dot notation or bracket notation
print(example.question)       # "What is DSPy?"
print(example["answer"])      # "A framework for programming LMs"

Every DSPy optimizer, evaluator, and metric function expects data as a list of dspy.Example objects.

Creating Examples

Create examples with keyword arguments. Every keyword becomes a field.

# Simple question-answer pair
ex = dspy.Example(question="What color is the sky?", answer="Blue")

# Classification example with more fields
ex = dspy.Example(
    text="The product broke after one day",
    label="negative",
    category="quality",
)

# Fields can be any Python type
ex = dspy.Example(
    query="hiking trails near Portland",
    results=["Forest Park", "Eagle Creek"],
    count=2,
)

You can also create an Example from a dictionary:

data = {"question": "What is Python?", "answer": "A programming language"}
ex = dspy.Example(**data)

with_inputs() — marking input fields

with_inputs() tells DSPy which fields are inputs (what the program receives) and which are outputs (what the program should produce). This is required for optimizers and evaluation.

# Mark "question" as input — "answer" becomes the expected output
ex = dspy.Example(question="What is DSPy?", answer="A framework").with_inputs("question")

# Multiple input fields
ex = dspy.Example(
    context="DSPy is a Python framework...",
    question="What is DSPy?",
    answer="A framework for programming LMs",
).with_inputs("context", "question")

What happens without with_inputs()

If you skip with_inputs(), optimizers won't know which fields to pass to your program and which to hold back for scoring. You'll get errors or wrong results. Always call it.

How it works

After with_inputs("question"):

example.inputs() returns an Example with only {"question": "What is DSPy?"}
example.labels() returns an Example with only {"answer": "A framework"}

DSPy uses .inputs() to feed data into your module and .labels() to check the output against expected values.

ex = dspy.Example(question="What is DSPy?", answer="A framework").with_inputs("question")

print(ex.inputs())   # Example(question="What is DSPy?")
print(ex.labels())   # Example(answer="A framework")

Prediction — what modules return

When you call a DSPy module, it returns a dspy.Prediction, which extends Example. A Prediction has the same dot-access and dict-like behavior.

classify = dspy.ChainOfThought("text -> label")
result = classify(text="Great product!")

# result is a Prediction
print(result.label)       # "positive"
print(result.reasoning)   # chain-of-thought reasoning (added by ChainOfThought)

# Predictions work like Examples
print(result.keys())      # dict_keys(['reasoning', 'label'])

In metric functions, prediction is always a Prediction and example is always an Example:

def metric(example, prediction, trace=None):
    # example has the gold fields you defined
    # prediction has the fields your module produced
    return prediction.label == example.label

Building datasets

A dataset in DSPy is just a Python list of Example objects. Use list comprehensions to build them.

# From parallel lists
questions = ["What is Python?", "What is DSPy?", "What is an LM?"]
answers = ["A programming language", "A framework for LMs", "A language model"]

trainset = [
    dspy.Example(question=q, answer=a).with_inputs("question")
    for q, a in zip(questions, answers)
]

# From a list of dicts
raw_data = [
    {"text": "Love it!", "label": "positive"},
    {"text": "Terrible.", "label": "negative"},
    {"text": "It's okay.", "label": "neutral"},
]

trainset = [
    dspy.Example(**row).with_inputs("text")
    for row in raw_data
]

Loading from CSV

import csv

def load_csv_as_examples(filepath, input_fields):
    """Load a CSV file into a list of dspy.Example objects."""
    examples = []
    with open(filepath, "r") as f:
        reader = csv.DictReader(f)
        for row in reader:
            ex = dspy.Example(**row).with_inputs(*input_fields)
            examples.append(ex)
    return examples

# Usage
trainset = load_csv_as_examples("tickets.csv", input_fields=["message"])

With pandas (if you prefer):

import pandas as pd

df = pd.read_csv("tickets.csv")
trainset = [
    dspy.Example(**row.to_dict()).with_inputs("message")
    for _, row in df.iterrows()
]

Handling CSV quirks

# Skip rows with missing values
trainset = [
    dspy.Example(**row).with_inputs("text")
    for row in csv.DictReader(open("data.csv"))
    if row["text"] and row["label"]  # skip blanks
]

# Rename columns to match your signature
trainset = [
    dspy.Example(
        text=row["customer_message"],
        label=row["assigned_category"],
    ).with_inputs("text")
    for row in csv.DictReader(open("data.csv"))
]

Loading from JSON

import json

def load_json_as_examples(filepath, input_fields):
    """Load a JSON array file into dspy.Example objects."""
    with open(filepath, "r") as f:
        data = json.load(f)
    return [
        dspy.Example(**item).with_inputs(*input_fields)
        for item in data
    ]

# Usage — file contains [{"question": "...", "answer": "..."}, ...]
trainset = load_json_as_examples("qa_pairs.json", input_fields=["question"])

For JSON Lines (one JSON object per line):

def load_jsonl_as_examples(filepath, input_fields):
    """Load a JSONL file into dspy.Example objects."""
    examples = []
    with open(filepath, "r") as f:
        for line in f:
            if line.strip():
                item = json.loads(line)
                examples.append(dspy.Example(**item).with_inputs(*input_fields))
    return examples

Loading from HuggingFace

The HuggingFace datasets library gives you access to thousands of ready-to-use datasets.

pip install datasets

from datasets import load_dataset

# Load a dataset
dataset = load_dataset("hotpotqa", "fullwiki")

# Convert to DSPy Examples
trainset = [
    dspy.Example(
        question=x["question"],
        answer=x["answer"],
    ).with_inputs("question")
    for x in dataset["train"]
]

Common HuggingFace patterns

# Limit the number of examples (large datasets)
trainset = [
    dspy.Example(question=x["question"], answer=x["answer"]).with_inputs("question")
    for x in list(dataset["train"])[:500]
]

# Rename fields to match your signature
trainset = [
    dspy.Example(
        text=x["sentence"],
        label="positive" if x["label"] == 1 else "negative",
    ).with_inputs("text")
    for x in dataset["train"]
]

# Filter rows
trainset = [
    dspy.Example(question=x["question"], answer=x["answer"]).with_inputs("question")
    for x in dataset["train"]
    if len(x["answer"]) > 0  # skip empty answers
]

Built-in datasets

DSPy ships with a few standard datasets for prototyping and benchmarking. These return pre-built dspy.Example objects — no HuggingFace dependency needed.

from dspy.datasets import HotPotQA

# Multi-hop question answering
dataset = HotPotQA(train_seed=1, train_size=200, dev_size=50, test_size=0)
trainset = dataset.train
devset = dataset.dev

Constructor parameters (all optional): train_seed, train_size, dev_size, test_size.

Note: You still need .with_inputs() if you're passing these to an optimizer:

trainset = [ex.with_inputs("question") for ex in dataset.train]

Train/dev splits

Optimizers train on trainset and you evaluate on devset. Keep them separate to measure real performance.

Random split

import random

def train_dev_split(examples, train_ratio=0.8, seed=42):
    """Split a list of examples into train and dev sets."""
    random.seed(seed)
    shuffled = list(examples)
    random.shuffle(shuffled)
    split_idx = int(len(shuffled) * train_ratio)
    return shuffled[:split_idx], shuffled[split_idx:]

# Usage
all_examples = load_csv_as_examples("data.csv", input_fields=["text"])
trainset, devset = train_dev_split(all_examples)
print(f"Train: {len(trainset)}, Dev: {len(devset)}")

Stratified split (preserves label distribution)

Use this when your categories are imbalanced (e.g., 90% "general", 10% "urgent").

from collections import defaultdict

def stratified_split(examples, label_field, train_ratio=0.8, seed=42):
    """Split examples while preserving the distribution of a label field."""
    random.seed(seed)
    buckets = defaultdict(list)
    for ex in examples:
        buckets[ex[label_field]].append(ex)

    trainset, devset = [], []
    for label, items in buckets.items():
        random.shuffle(items)
        split_idx = int(len(items) * train_ratio)
        trainset.extend(items[:split_idx])
        devset.extend(items[split_idx:])

    random.shuffle(trainset)
    random.shuffle(devset)
    return trainset, devset

# Usage
trainset, devset = stratified_split(all_examples, label_field="category")

Using HuggingFace's built-in splits

Many HuggingFace datasets come pre-split:

dataset = load_dataset("hotpotqa", "fullwiki")
trainset = [dspy.Example(**x).with_inputs("question") for x in dataset["train"][:500]]
devset = [dspy.Example(**x).with_inputs("question") for x in dataset["validation"][:200]]

Common patterns

Accessing fields

ex = dspy.Example(question="What?", answer="That", source="wiki")

# Dot access
ex.question

# Dict-style access
ex["question"]

# Get all field names
ex.keys()  # dict_keys(['question', 'answer', 'source'])

# Check if a field exists
"question" in ex  # True

Converting to/from dicts

# Example to dict
d = dict(ex)  # {"question": "What?", "answer": "That", "source": "wiki"}

# Dict to Example
ex = dspy.Example(**d).with_inputs("question")

Filtering examples

# Keep only examples where the answer is short
short_answers = [ex for ex in trainset if len(ex.answer.split()) < 20]

# Keep only a specific category
urgent_only = [ex for ex in trainset if ex.category == "urgent"]

# Remove duplicates (by a field)
seen = set()
unique = []
for ex in trainset:
    if ex.question not in seen:
        seen.add(ex.question)
        unique.append(ex)

Inspecting your dataset

# Quick summary
print(f"Total examples: {len(trainset)}")
print(f"Fields: {trainset[0].keys()}")
print(f"First example: {trainset[0]}")

# Label distribution
from collections import Counter
labels = Counter(ex.label for ex in trainset)
print(f"Label distribution: {labels}")

Gotchas

Claude forgets with_inputs() on every Example. Without it, optimizers cannot distinguish inputs from expected outputs. Every example passed to an optimizer or evaluator must have with_inputs() called. Claude often creates examples and only calls with_inputs() on the first one or skips it entirely when building lists inline.
Claude calls with_inputs() with output field names. Mark only the fields your module receives as input — not the fields it should produce. If your signature is question -> answer, call .with_inputs("question"), not .with_inputs("question", "answer"). Including output fields means the optimizer has nothing to score against.
Claude uses Literal[list] instead of Literal[tuple(list)] for dynamic categories. When building categories from data (CATEGORIES = list(set(...))), the type annotation must be Literal[tuple(CATEGORIES)], not Literal[CATEGORIES]. The latter silently fails to constrain the output.
Claude passes raw dicts to optimizers instead of dspy.Example objects. DSPy optimizers and evaluators require dspy.Example objects, not plain Python dicts. Always convert with dspy.Example(**row).with_inputs(...).
Claude creates train/dev splits without shuffling first. If data is sorted by label or date, taking the first 80% as train and last 20% as dev creates a biased split. Always shuffle with a fixed seed before splitting.

Additional resources

dspy.Example API docs
dspy.Prediction API docs
reference.md — constructor signatures, method tables, Prediction details
examples.md — worked examples with CSV, HuggingFace, and manual data

Cross-references

Install any skill: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>

/dspy-evaluate — evaluate your program on a devset with metrics
/ai-generating-data — generate synthetic training data when you have none
/ai-improving-accuracy — use optimizers that consume your trainset to boost quality
Install /ai-do if you do not have it — it routes any AI problem to the right skill and is the fastest way to work: npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do

Related Skills

lebsral/ai-watching-optimization

tools

VerifiedTrustedCommunity

See what is happening during optimizer.compile() instead of waiting blind. Use when you want to watch optimization progress, see scores as they come in, know if your optimizer is working, check if optimization is stuck, understand why optimization is taking too long, get live progress during compile, monitor convergence, detect overfitting during optimization, interpret optimization results, or pick the right tool for watching optimization. Also used for optimizer progress bar, is my optimizer doing anything, optimization seems stuck, how long will optimization take, watch GEPA run, watch MIPROv2 run, live optimization dashboard, optimizer not improving, scores not going up, optimization taking forever, see what optimizer is doing, debug slow optimization, optimization visibility, optimizer metrics, track compile progress, optimization observability.

6SKILL.mdUpdated May 31, 2026

lebsral/ai-watching-optimization

lebsral/dspy-miprov2

testing

VerifiedTrustedCommunity

Use when you want the highest-quality prompt optimization DSPy offers — jointly optimizes instructions and few-shot demos, with auto=light/medium/heavy presets. Common scenarios - you want the best possible accuracy from prompt optimization, jointly tuning instructions and few-shot demonstrations, using auto presets for different compute budgets, or when COPRO or BootstrapFewShot alone are not reaching your accuracy target. Related - ai-improving-accuracy, dspy-copro, dspy-bootstrap-few-shot. Also used for dspy.MIPROv2, best DSPy optimizer, highest quality optimization, auto=light medium heavy, joint instruction and demo optimization, most powerful prompt optimizer, MIPROv2 vs COPRO vs BootstrapFewShot, which optimizer should I use, state of the art prompt optimization, when to use MIPROv2, optimize both instructions and examples, heavy optimization for production, best optimizer for accuracy.

6SKILL.mdUpdated Apr 27, 2026

lebsral/dspy-langwatch

testing

VerifiedTrustedCommunity

Use LangWatch for DSPy auto-tracing and real-time optimizer progress. Use when you want to set up LangWatch, langwatch.dspy.init, auto-tracing DSPy, real-time optimization dashboard, optimizer progress tracking, app.langwatch.ai, or DSPy optimizer dashboard. Also used for langwatch setup, pip install langwatch, langwatch trace, optimizer progress, real-time optimization, watch optimizer run, LangWatch self-hosted, langwatch docker, langwatch vs langtrace, langwatch autotrack_dspy.

6SKILL.mdUpdated Apr 27, 2026

lebsral/dspy-langwatch

lebsral/dspy-gepa

data-ai

VerifiedTrustedCommunity

Use when you want to optimize instructions without few-shot examples — a lightweight alternative to COPRO when you do not have or do not want to use demonstrations. Common scenarios - optimizing instructions when you do not have or do not want to use few-shot demonstrations, lightweight instruction search as a first step, tasks where examples in the prompt confuse the model, or when you want fast instruction optimization without the cost of COPRO. Related - ai-improving-accuracy, dspy-copro, dspy-miprov2. Also used for dspy.GEPA, instruction optimization without demos, lightweight prompt optimization, optimize instructions only, no few-shot examples needed, GEPA vs COPRO, quick instruction search, when demonstrations hurt performance, zero-shot optimization, instruction-only optimizer, simplest instruction tuner, fast prompt optimization, skip few-shot and just tune instructions, optimize Pydantic field descriptions, GEPA structured output, GEPA does not optimize field desc.

6SKILL.mdUpdated Apr 27, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/lebsral/dspy-programming-not-prompting-lms-skills.git

# Copy into Claude Code skills folder (global)
cp -r dspy-programming-not-prompting-lms-skills/skills/dspy-data ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

lebsral/dspy-programming-not-prompting-lms-skills

5 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT