skills/dspy-knn-few-shot/SKILL.md
Use when you want few-shot demos that are dynamically selected per input based on similarity — better than fixed demos when inputs vary widely. Common scenarios - inputs vary widely and fixed examples do not cover enough cases, dynamically selecting the most relevant demos per input, building a retrieval-augmented prompt with similar examples, or when static few-shot examples work for some inputs but fail on others. Related - ai-improving-accuracy, dspy-labeled-few-shot, dspy-bootstrap-few-shot. Also used for dspy.KNNFewShot, dynamic few-shot selection, similar examples per input, retrieval-augmented few-shot, adaptive demonstrations, nearest neighbor example selection, dynamic prompt construction, different examples for different inputs, embedding-based demo retrieval, when fixed examples do not generalize, per-input demo selection, contextual few-shot examples, smart example selection.
Guide the user through using DSPy's KNN-based retrieval to dynamically select the most relevant few-shot demonstrations for each input at inference time, rather than using the same static examples for every query.
dspy.KNN is an in-memory nearest-neighbor retriever. Given a training set and an embedding function, it converts every training example into a vector. At query time, it embeds the new input, computes dot-product similarity against all stored vectors, and returns the k most similar training examples.
dspy.KNNFewShot is an optimizer (teleprompter) that wraps KNN and BootstrapFewShot together. It compiles a student program so that every forward call first retrieves the k nearest training examples, then uses them as the few-shot demonstrations for the underlying module. The demonstrations change per input -- each query gets the examples most relevant to it.
New input ──> Embed ──> Find k nearest training examples ──> Use as demos ──> Run module
Do not use KNNFewShot when:
import dspy
from sentence_transformers import SentenceTransformer
lm = dspy.LM("openai/gpt-4o-mini") # or any LiteLLM-supported provider
dspy.configure(lm=lm)
# 1. Prepare training data
trainset = [
dspy.Example(question="What causes rain?", answer="Condensation of water vapor in clouds").with_inputs("question"),
dspy.Example(question="What is photosynthesis?", answer="The process plants use to convert sunlight into energy").with_inputs("question"),
# ... more examples
]
# 2. Set up an embedding function
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embedder = dspy.Embedder(encoder.encode)
# 3. Create the optimizer
knn_optimizer = dspy.KNNFewShot(
k=3,
trainset=trainset,
vectorizer=embedder,
)
# 4. Compile your module
qa = dspy.ChainOfThought("question -> answer")
optimized_qa = knn_optimizer.compile(qa)
# 5. Use it -- each call retrieves relevant demos automatically
result = optimized_qa(question="How do volcanoes form?")
print(result.answer)
Each call to optimized_qa now dynamically selects the 3 training examples most similar to the input question and includes them as few-shot demonstrations in the prompt.
If you only need the retrieval step (without the BootstrapFewShot compilation), use dspy.KNN on its own:
import dspy
from sentence_transformers import SentenceTransformer
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embedder = dspy.Embedder(encoder.encode)
trainset = [
dspy.Example(question="What is gravity?", answer="A fundamental force of attraction between masses").with_inputs("question"),
dspy.Example(question="What is friction?", answer="A force that opposes the relative motion of surfaces").with_inputs("question"),
# ... more examples
]
knn = dspy.KNN(
k=3,
trainset=trainset,
vectorizer=embedder,
)
# Retrieve the 3 most similar examples to a new query
similar = knn(question="What is inertia?")
# similar is a list of dspy.Example objects, ranked by similarity
This is useful when you want to plug KNN retrieval into a custom module or pipeline.
KNNFewShot and KNN require a dspy.Embedder wrapping any function that takes text (or a list of texts) and returns vectors.
from sentence_transformers import SentenceTransformer
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embedder = dspy.Embedder(encoder.encode)
all-MiniLM-L6-v2 is fast, small (~80MB), and works well for general-purpose similarity. For domain-specific tasks, consider models from the MTEB leaderboard.
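import dspy
import openai

client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment

def openai_embed(texts):
    # texts may be a single string or a list of strings
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    # One embedding vector per input text, in input order
    return [item.embedding for item in response.data]

embedder = dspy.Embedder(openai_embed)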
Any function with the signature (str | list[str]) -> list[list[float]] works:
embedder = dspy.Embedder(my_custom_embed_function)
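As a rough illustration of that signature, the sketch below defines a toy hashing embedder in pure Python. The name `toy_hash_embed` and the dimension of 16 are made up for this example. It is not a substitute for a real embedding model, only a stand-in that satisfies the shape contract of one float vector per input text.

```python
import hashlib
import math


def toy_hash_embed(texts, dim=16):
    """Hypothetical stand-in embedder: str | list[str] -> list[list[float]]."""
    # Accept a single string as well as a list of strings.
    if isinstance(texts, str):
        texts = [texts]
    vectors = []
    for text in texts:
        vec = [0.0] * dim
        # Bucket each lowercase token by a stable hash (bag-of-words hashing).
        for token in text.lower().split():
            digest = hashlib.md5(token.encode("utf-8")).digest()
            vec[digest[0] % dim] += 1.0
        # L2-normalize so that dot product behaves like cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec))
        vectors.append([x / norm for x in vec] if norm else vec)
    return vectors


vectors = toy_hash_embed(["What causes rain?", "What is gravity?"])
print(len(vectors), len(vectors[0]))  # 2 16
```

A function shaped like this could presumably be wrapped the same way as the custom function above, assuming the wrapper hands it a list of strings.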
Indexing (at init time): KNN concatenates all input fields of each training example into a single string, then calls the embedder to produce a vector per example. These vectors are stored in memory as a matrix.
Querying (at call time): The new input's fields are concatenated and embedded the same way. KNN computes dot-product similarity between the query vector and all stored vectors, then returns the k examples with the highest scores.
Demo injection (KNNFewShot only): The retrieved examples are set as the demos on each Predict module inside the compiled student program. This happens on every forward call, so demonstrations adapt per query.
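The indexing and querying steps above can be sketched without DSPy at all. Everything here is illustrative: `input_text`, `toy_embed`, and `nearest` are hypothetical names, the vocabulary-count embedder is a toy, and the `key: value | key: value` join format is an assumption about how fields are concatenated, not a documented guarantee.

```python
def input_text(example, input_keys):
    """Join only the input fields into one string (join format is an assumption)."""
    return " | ".join(f"{k}: {example[k]}" for k in input_keys)


def toy_embed(text):
    """Hypothetical embedder: counts of a tiny fixed vocabulary."""
    vocab = ["rain", "cloud", "plant", "sunlight", "force", "mass"]
    lowered = text.lower()
    return [float(lowered.count(word)) for word in vocab]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


trainset = [
    {"question": "What causes rain in a cloud?", "answer": "Condensation"},
    {"question": "How does a plant use sunlight?", "answer": "Photosynthesis"},
    {"question": "What force acts between any mass and another mass?", "answer": "Gravity"},
]

# Indexing: one vector per training example, built from input fields only.
index = [toy_embed(input_text(ex, ["question"])) for ex in trainset]


def nearest(query, k):
    """Querying: embed the query the same way, score by dot product, keep the top k."""
    q = toy_embed(input_text(query, ["question"]))
    scores = [dot(q, vec) for vec in index]
    ranked = sorted(range(len(trainset)), key=lambda i: scores[i], reverse=True)
    return [trainset[i] for i in ranked[:k]]


top = nearest({"question": "Why does rain fall from a cloud?"}, k=2)
print(top[0]["answer"])  # Condensation
```

Because the answer field never enters `input_text`, the label text cannot influence which examples are retrieved, which is why marking the input fields correctly matters.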
The dot-product similarity means vectors should ideally be normalized (most sentence-transformer models do this by default). If your embedding function does not normalize, cosine similarity and dot-product may diverge.
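To see why normalization matters, here is a small worked example with hand-picked 2-D vectors (the vectors and variable names are invented for illustration). A longer but poorly aligned vector wins under raw dot product, while cosine similarity, and dot product after L2 normalization, both prefer the better-aligned one.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


query = [1.0, 0.0]
same_direction = [2.0, 0.0]  # points the same way as the query
long_but_off = [3.0, 3.0]    # 45 degrees away, but with a larger norm

# Raw dot product prefers the longer vector (3.0 vs 2.0).
raw_prefers_off = dot(query, long_but_off) > dot(query, same_direction)
# Cosine similarity prefers the better-aligned vector (1.0 vs about 0.707).
cos_prefers_same = cosine(query, same_direction) > cosine(query, long_but_off)
# After L2 normalization, dot product agrees with cosine.
norm_prefers_same = dot(query, normalize(same_direction)) > dot(query, normalize(long_but_off))

print(raw_prefers_off, cos_prefers_same, norm_prefers_same)  # True True True
```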
dspy.KNN(
k, # int -- number of nearest neighbors to retrieve
trainset, # list[dspy.Example] -- examples to search through
vectorizer, # dspy.Embedder -- embedding function wrapper
)
dspy.KNNFewShot(
k, # int -- number of nearest neighbors to retrieve
trainset, # list[dspy.Example] -- examples to search through
vectorizer, # dspy.Embedder -- embedding function wrapper
**few_shot_bootstrap_args # passed to BootstrapFewShot (e.g., metric, max_bootstrapped_demos)
)
| Parameter | Type | Description |
|-----------|------|-------------|
| k | int | Number of nearest neighbors to retrieve per query |
| trainset | list[dspy.Example] | Training examples to index and search |
| vectorizer | dspy.Embedder | Wraps any embedding function for vectorization |
| **few_shot_bootstrap_args | dict | Forwarded to BootstrapFewShot (e.g., metric, max_bootstrapped_demos, max_labeled_demos) |
compile(student, *, teacher=None): Returns a copy of the student program whose forward method retrieves k nearest demos per call. Accepts an optional teacher program (passed through to BootstrapFewShot).
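The "copy whose forward method retrieves demos per call" pattern can be illustrated with plain Python. This is a simplified sketch and not DSPy's implementation: `ToyProgram`, `toy_compile`, and `toy_retrieve` are hypothetical, and the bootstrapping step is left out entirely.

```python
import copy
import types


class ToyProgram:
    """Hypothetical stand-in for a module; reports which demos it was given."""

    def __init__(self):
        self.demos = []

    def forward(self, **kwargs):
        return {"question": kwargs["question"], "demos_used": list(self.demos)}


def toy_retrieve(inputs, k):
    """Placeholder retrieval: rank demos by words shared with the question."""
    pool = ["rain demo", "plant demo", "gravity demo"]
    words = set(inputs["question"].lower().split())
    return sorted(pool, key=lambda d: -len(words & set(d.split())))[:k]


def toy_compile(student, retrieve, k):
    """Return a copy of `student` whose forward picks k demos on every call."""
    compiled = copy.deepcopy(student)    # leave the original untouched
    original_forward = compiled.forward  # bound method of the copy

    def forward_with_retrieval(self, **kwargs):
        self.demos = retrieve(kwargs, k)  # fresh demos for this input
        return original_forward(**kwargs)

    # Shadow the class-level forward on this one instance only.
    compiled.forward = types.MethodType(forward_with_retrieval, compiled)
    return compiled


student = ToyProgram()
compiled = toy_compile(student, toy_retrieve, k=1)
out = compiled.forward(question="why does rain fall")
print(out["demos_used"], student.demos)  # ['rain demo'] []
```

The original `student` keeps its empty demo list, which mirrors the idea that compilation returns a new program rather than mutating the one passed in.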
| k | Trade-off |
|---|-----------|
| 1-2 | Minimal prompt overhead. Works when examples are very similar to queries. |
| 3-5 | Good default range. Enough diversity without bloating the prompt. |
| 7-10 | Use with short examples or large context windows. Diminishing returns beyond this. |
Keep in mind that each retrieved demo adds to the prompt length. If your examples are long (multi-paragraph), use a smaller k to stay within context limits.
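One way to turn that advice into a number is to budget tokens for demos and pick the largest k that fits in the worst case. The helper below is a hypothetical sketch: `max_k_for_budget`, the token counts, and the 1600-token budget are all invented for illustration, and real token counts should come from your model's tokenizer.

```python
def max_k_for_budget(example_token_counts, demo_token_budget, k_cap=10):
    """Largest k whose worst-case demos fit in the budget (hypothetical helper)."""
    # Worst case: the k longest examples are the ones that get retrieved.
    longest_first = sorted(example_token_counts, reverse=True)
    used = 0
    k = 0
    for count in longest_first[:k_cap]:
        if used + count > demo_token_budget:
            break
        used += count
        k += 1
    return k


# Rough per-example token counts (assumed values, not measurements).
counts = [120, 450, 300, 800, 200]
k = max_k_for_budget(counts, demo_token_budget=1600)
print(k)  # 3  (800 + 450 + 300 = 1550 fits; adding 200 would exceed 1600)
```

Sizing against the longest examples is deliberately conservative; sizing against the average would allow a larger k at the risk of occasional overflow.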
| | Static few-shot (BootstrapFewShot / LabeledFewShot) | Dynamic few-shot (KNNFewShot) |
|---|---|---|
| Demo selection | Same demos for every input | Per-input demos based on similarity |
| Best when | Inputs are homogeneous, few examples available | Inputs are diverse, many examples available |
| Setup cost | Lower -- no embedding model needed | Higher -- requires an embedder and more training data |
| Prompt relevance | May include irrelevant demos for some inputs | Demos are the closest available matches to the current input |
| Latency | No retrieval overhead | Small overhead for embedding + similarity search |
| Scales with data | More data doesn't help (fixed demo slots) | More data improves retrieval quality |
Since KNNFewShot wraps BootstrapFewShot internally, you can pass any BootstrapFewShot parameter via **few_shot_bootstrap_args:
knn_optimizer = dspy.KNNFewShot(
k=5,
trainset=trainset,
vectorizer=embedder,
metric=my_metric,
max_bootstrapped_demos=2,
max_labeled_demos=3,
)
optimized = knn_optimizer.compile(my_program, teacher=teacher_program)
This retrieves 5 nearest neighbors per query and then applies BootstrapFewShot logic (with the given metric and demo limits) over those neighbors.
- If your embeddings are not normalized, pass normalize_embeddings=True to SentenceTransformer.encode().
- Call .with_inputs() on training examples. KNN concatenates only the input fields (not output fields) to create the query embedding. If .with_inputs() is missing, KNN does not know which fields are inputs and may embed the wrong thing or fail silently.
- Consider a small local model (e.g., all-MiniLM-L6-v2) for cost-sensitive workloads.

Install any skill:
npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill <name>

- /dspy-bootstrap-few-shot
- /dspy-labeled-few-shot
- /ai-improving-accuracy
- /ai-do, if you do not have it: it routes any AI problem to the right skill and is the fastest way to work. Install it with npx skills add lebsral/DSPy-Programming-not-prompting-LMs-skills --skill ai-do