skills/dspy-optimize-anything/SKILL.md
Universal text artifact optimizer using GEPA's optimize_anything API for code, prompts, agent architectures, configs, and more
Optimize any artifact representable as text — code, prompts, agent architectures, vector graphics, configurations — using a single declarative API powered by GEPA's reflective evolutionary search.
| Input | Type | Description |
|-------|------|-------------|
| seed_candidate | str \| dict[str, str] \| None | Starting artifact text, or None for seedless mode |
| evaluator | Callable | Returns score (higher=better), optionally with ASI dict |
| dataset | list \| None | Training examples (for multi-task and generalization modes) |
| valset | list \| None | Validation set (for generalization mode) |
| objective | str \| None | Natural language description of what to optimize for |
| background | str \| None | Domain knowledge and constraints |
| config | GEPAConfig \| None | Engine, reflection, and tracking settings |

| Output | Type | Description |
|--------|------|-------------|
| result.best_candidate | str \| dict | Best optimized artifact |
pip install -U "gepa>=0.1.1,<0.2"
The evaluator scores a candidate and returns Actionable Side Information (ASI) — diagnostic feedback that guides the LLM proposer during reflection.
Simple evaluator (score only):
import gepa.optimize_anything as oa
from gepa.optimize_anything import EngineConfig, GEPAConfig
config = GEPAConfig(engine=EngineConfig(max_metric_calls=100))
def evaluate(candidate: str) -> float:
    # run_my_system is a placeholder for your own scoring code
    score, diagnostic = run_my_system(candidate)
    oa.log(f"Error: {diagnostic}")  # captured as ASI
    return score
Rich evaluator (score + structured ASI):
def evaluate(candidate: str) -> tuple[float, dict]:
    result = execute_code(candidate)
    return result.score, {
        "Error": result.stderr,
        "Output": result.stdout,
        "Runtime": f"{result.time_ms:.1f}ms",
    }
ASI can include open-ended text, structured data, multi-objectives (via scores), or images (via gepa.Image) for vision-capable LLMs.
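The rich evaluator above calls `execute_code`, which this document does not define. The following is a minimal sketch of one possible stand-in, not part of the GEPA API. `ExecutionResult`, the `timeout_s` parameter, and the scoring rule (1.0 for a clean exit, 0.0 otherwise) are all assumptions made for illustration.

```python
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    """Hypothetical result type with the fields the rich evaluator reads."""
    score: float
    stdout: str
    stderr: str
    time_ms: float


def execute_code(candidate: str, timeout_s: float = 10.0) -> ExecutionResult:
    """Run candidate source in a fresh interpreter and record what happened.

    Assumption: a clean exit scores 1.0, anything else scores 0.0.
    """
    start = time.perf_counter()
    try:
        proc = subprocess.run(
            [sys.executable, "-c", candidate],
            capture_output=True, text=True, timeout=timeout_s,
        )
        stdout, stderr = proc.stdout, proc.stderr
        score = 1.0 if proc.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        stdout, stderr, score = "", f"Timed out after {timeout_s}s", 0.0
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return ExecutionResult(score, stdout, stderr, elapsed_ms)


def evaluate(candidate: str) -> tuple[float, dict]:
    result = execute_code(candidate)
    return result.score, {
        "Error": result.stderr,
        "Output": result.stdout,
        "Runtime": f"{result.time_ms:.1f}ms",
    }
```

Running the candidate in a subprocess keeps a crashing or hanging candidate from taking the optimizer down with it; a real task would replace the pass/fail score with a domain-specific metric.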
Mode 1 — Single-Task Search: Solve one hard problem. No dataset needed.
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    config=config,
)
Mode 2 — Multi-Task Search: Solve a batch of related problems with cross-transfer.
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=tasks,
    config=config,
)
Mode 3 — Generalization: Build a skill/prompt/policy that transfers to unseen problems.
result = oa.optimize_anything(
    seed_candidate="<your initial artifact>",
    evaluator=evaluate,
    dataset=train,
    valset=val,
    config=config,
)
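When a dataset is passed, the evaluators later in this document take the form `evaluate(candidate, example)` and score one candidate on one example at a time. The sketch below shows what `train`, `val`, and such an evaluator might look like for a hypothetical task (evolving a regex that accepts ISO dates). The task, the `text` and `is_match` keys, and the seed pattern are illustrative assumptions; the code runs without GEPA installed.

```python
import re

# Hypothetical task: evolve a regex that accepts ISO dates (YYYY-MM-DD).
train = [
    {"text": "2024-01-31", "is_match": True},
    {"text": "31/01/2024", "is_match": False},
    {"text": "2024-1-31", "is_match": False},
]
val = [
    {"text": "1999-12-01", "is_match": True},
    {"text": "1999-12-01T00:00", "is_match": False},
]


def evaluate(candidate: str, example: dict) -> tuple[float, dict]:
    """Score one regex candidate on one example; return (score, ASI)."""
    try:
        matched = re.fullmatch(candidate, example["text"]) is not None
    except re.error as exc:
        return 0.0, {"Error": f"Invalid regex: {exc}"}
    if matched == example["is_match"]:
        return 1.0, {"Feedback": "Correct"}
    verdict = "matched" if matched else "rejected"
    feedback = (
        f"Pattern {verdict} {example['text']!r}, "
        f"expected is_match={example['is_match']}"
    )
    return 0.0, {"Feedback": feedback}


# Sanity-check a deliberately loose seed before spending optimizer budget.
seed = r"\d+-\d+-\d+"
train_scores = [evaluate(seed, ex)[0] for ex in train]
print(train_scores)
```

Checking the seed against the training examples by hand confirms that the evaluator and the dataset agree on their format before any metric calls are spent.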
Seedless mode: Describe what you need instead of providing a seed.
result = oa.optimize_anything(
    evaluator=evaluate,
    objective="Generate a Python function `reverse()` that reverses a string.",
    config=config,
)
print(result.best_candidate)
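The seedless call above needs an evaluator that can judge a generated `reverse()` function, and this document does not show one. Here is a minimal sketch under stated assumptions: the test cases and the "fraction of cases passed" score are hypothetical, and the candidate is assumed to be plain Python source text.

```python
def evaluate(candidate: str) -> tuple[float, dict]:
    """Execute candidate source and score its `reverse()` on a few cases."""
    # Hypothetical test cases, chosen only for illustration.
    cases = [("abc", "cba"), ("", ""), ("racecar", "racecar"), ("ab cd", "dc ba")]
    namespace: dict = {}
    try:
        exec(candidate, namespace)  # Caution: runs untrusted code in-process
    except Exception as exc:
        return 0.0, {"Error": f"Candidate failed to load: {exc!r}"}
    reverse = namespace.get("reverse")
    if not callable(reverse):
        return 0.0, {"Error": "No callable `reverse` defined"}
    failures = []
    for text, expected in cases:
        try:
            got = reverse(text)
        except Exception as exc:
            failures.append(f"reverse({text!r}) raised {exc!r}")
            continue
        if got != expected:
            failures.append(
                f"reverse({text!r}) returned {got!r}, expected {expected!r}"
            )
    # Score is the fraction of cases passed; failures become the ASI.
    score = 1.0 - len(failures) / len(cases)
    return score, {"Failures": failures or "none"}
```

Listing each failing call in the ASI gives the proposer something concrete to fix, which a bare score would not. For untrusted candidates, running the code in a subprocess or sandbox is safer than `exec`.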
import gepa.optimize_anything as oa
from gepa import Image
from gepa.optimize_anything import EngineConfig, GEPAConfig
import logging
logger = logging.getLogger(__name__)
# ---------- SVG optimization with VLM feedback ----------
GOAL = "a pelican riding a bicycle"
VLM = "vertex_ai/gemini-3-flash-preview"
VISUAL_ASPECTS = [
    {"id": "overall", "criteria": f"Rate overall quality of this SVG ({GOAL}). SCORE: X/10"},
    {"id": "anatomy", "criteria": "Rate pelican accuracy: beak, pouch, plumage. SCORE: X/10"},
    {"id": "bicycle", "criteria": "Rate bicycle: wheels, frame, handlebars, pedals. SCORE: X/10"},
    {"id": "composition", "criteria": "Rate how convincingly the pelican rides the bicycle. SCORE: X/10"},
]
def evaluate(candidate, example):
    """Render SVG, score with a VLM, return (score, ASI)."""
    image = render_image(candidate["svg_code"])  # via cairosvg
    score, feedback = get_vlm_score_feedback(VLM, image, example["criteria"])
    return score, {
        "RenderedSVG": Image(base64_data=image, media_type="image/png"),
        "Feedback": feedback,
    }
result = oa.optimize_anything(
    seed_candidate={"svg_code": "<svg>...</svg>"},
    evaluator=evaluate,
    dataset=VISUAL_ASPECTS,
    background=f"Optimize SVG source code depicting '{GOAL}'. "
               "Improve anatomy, composition, and visual quality.",
    config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
logger.info(f"Best SVG:\n{result.best_candidate['svg_code']}")
# ---------- Code optimization (single-task) ----------
def evaluate_solver(candidate: str) -> tuple[float, dict]:
    """Evaluate a Python solver for a mathematical optimization problem."""
    import subprocess, json
    try:
        proc = subprocess.run(
            ["python", "-c", candidate],
            capture_output=True, text=True, timeout=30,
        )
    except subprocess.TimeoutExpired:
        # Without this, one slow candidate would abort the whole optimization run
        oa.log("Timeout: solver exceeded 30s")
        return 0.0, {"Error": "Timed out after 30s"}
    if proc.returncode != 0:
        oa.log(f"Runtime error: {proc.stderr}")
        return 0.0, {"Error": proc.stderr}
    try:
        output = json.loads(proc.stdout)
        return output["score"], {
            "Output": output.get("solution"),
            "Runtime": f"{output.get('time_ms', 0):.1f}ms",
        }
    except (json.JSONDecodeError, KeyError, TypeError) as e:
        # TypeError covers valid JSON that is not an object (e.g. a bare list)
        oa.log(f"Parse error: {e}")
        return 0.0, {"Error": str(e), "Stdout": proc.stdout}
result = oa.optimize_anything(
    evaluator=evaluate_solver,
    objective="Write a Python solver for the bin packing problem that "
              "minimizes the number of bins. Output JSON with 'score' and 'solution'.",
    background="Use first-fit-decreasing as a starting heuristic. "
               "Higher score = fewer bins used.",
    config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
print(result.best_candidate)
# ---------- Agent architecture generalization ----------
def evaluate_agent(candidate: str, example: dict) -> tuple[float, dict]:
    """Run an agent architecture on a task and score it."""
    exec_globals = {}
    exec(candidate, exec_globals)
    agent_fn = exec_globals.get("solve")
    if agent_fn is None:
        return 0.0, {"Error": "No `solve` function defined"}
    try:
        prediction = agent_fn(example["input"])
        correct = prediction == example["expected"]
        score = 1.0 if correct else 0.0
        feedback = "Correct" if correct else (
            f"Expected '{example['expected']}', got '{prediction}'"
        )
        return score, {"Prediction": prediction, "Feedback": feedback}
    except Exception as e:
        return 0.0, {"Error": str(e)}
result = oa.optimize_anything(
    seed_candidate="def solve(input):\n return input",
    evaluator=evaluate_agent,
    dataset=train_tasks,
    valset=val_tasks,
    background="Discover a Python agent function `solve(input)` that "
               "generalizes across unseen reasoning tasks.",
    config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
print(result.best_candidate)
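The SVG example above asks the VLM to end its reply with `SCORE: X/10` and relies on an undefined helper, `get_vlm_score_feedback`, to turn that reply into a number. The parsing half of such a helper could look roughly like the sketch below. `parse_vlm_score` is a hypothetical name, and normalizing to the range 0 to 1 and returning 0.0 when no score is found are assumptions, not behavior documented by GEPA.

```python
import re

# Matches "SCORE: 7/10" or "score: 8.5 / 10" anywhere in the reply.
_SCORE_RE = re.compile(r"SCORE:\s*(\d+(?:\.\d+)?)\s*/\s*10", re.IGNORECASE)


def parse_vlm_score(reply: str) -> tuple[float, str]:
    """Return (score in [0, 1], feedback text) from a 'SCORE: X/10' reply."""
    match = _SCORE_RE.search(reply)
    if match is None:
        # Assumption: an unparseable reply scores zero and says why.
        return 0.0, f"No 'SCORE: X/10' found in reply: {reply!r}"
    raw = float(match.group(1))
    score = min(max(raw, 0.0), 10.0) / 10.0  # clamp, then normalize
    return score, reply.strip()
```

Returning the full reply as feedback keeps the VLM's written critique available as ASI, so the proposer sees the reasoning behind the number.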
optimize_anything complements DSPy's built-in optimizers. Use DSPy optimizers (GEPA, MIPROv2, BootstrapFewShot) for DSPy programs, and optimize_anything for arbitrary text artifacts outside DSPy:
import dspy
import gepa.optimize_anything as oa
from gepa.optimize_anything import EngineConfig, GEPAConfig
# DSPy program optimization (use dspy.GEPA)
optimizer = dspy.GEPA(
    metric=gepa_metric,
    reflection_lm=dspy.LM("openai/gpt-4o"),
    auto="medium",
)
compiled = optimizer.compile(agent, trainset=trainset)
# Non-DSPy artifact optimization (use optimize_anything)
result = oa.optimize_anything(
    seed_candidate=my_config_yaml,
    evaluator=eval_config,
    background="Optimize Kubernetes scheduling policy for cost.",
    config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
- `oa.log()`: route prints to the proposer as ASI instead of stdout
- `(score, dict)` tuples for multi-faceted diagnostics
- `objective=` when the solution space is large and unfamiliar
- `background=` to constrain the search
- `valset` when the artifact must transfer to unseen inputs
- `gepa.Image` to pass rendered outputs to vision-capable LLMs
- `GEPAConfig(engine=EngineConfig(max_metric_calls=...))`
- `gepa` package (`pip install -U "gepa>=0.1.1,<0.2"`)
- `valset` for transfer