skills/llm/prompt-variant-ensemble/SKILL.md
Generate multiple LLM responses using diverse system prompt variants to increase reasoning diversity for self-consistency voting
npx skillsauth add wenmin-wu/ds-skills llm-prompt-variant-ensemble
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
In self-consistency (majority voting over multiple LLM outputs), responses generated from the same prompt tend to follow similar reasoning paths, which reduces the effective diversity of the sample set. Rotating through several distinct system prompts that differ in tone, instruction style, or emphasis pushes the model toward more varied reasoning strategies, so the votes being aggregated are less correlated and the majority vote is more reliable.
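# Every variant keeps the same output contract (\boxed{}) but changes tone or emphasis.
PROMPT_VARIANTS = [
    "You are a math expert. Solve step by step. Put your answer in \\boxed{}.",
    "Please reflect and verify while reasoning. Put the answer in \\boxed{}.",
    "Solve using concise, clear reasoning. Place the answer in \\boxed{}.",
    "Think carefully and double-check your work. Answer in \\boxed{}.",
    "You are good at reverse thinking to recheck answers. Answer in \\boxed{}.",
]


def create_messages(question, variant_idx):
    """Build one chat conversation using the system prompt at variant_idx (wraps around)."""
    return [
        {"role": "system", "content": PROMPT_VARIANTS[variant_idx % len(PROMPT_VARIANTS)]},
        {"role": "user", "content": question},
    ]


# Generate with rotating prompts: sample i uses variant i % len(PROMPT_VARIANTS).
# `question`, `num_samples`, `llm`, and `sampling_params` are assumed to be defined by the caller.
all_messages = [create_messages(question, i) for i in range(num_samples)]
# Note: if `llm` is a vLLM `LLM` instance, message lists are normally passed to `llm.chat`
# rather than `llm.generate`; adjust this call to match your client.
responses = llm.generate(all_messages, sampling_params)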
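
The snippet above stops at generation. As a minimal sketch of the voting step, the code below extracts the last `\boxed{}` answer from each response and returns the most common one. It assumes the responses have already been converted to plain strings (how to do that depends on the client library). The helper names `extract_boxed` and `majority_vote` are hypothetical and not part of the original skill.

```python
from collections import Counter


def extract_boxed(text):
    """Return the content of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: everything collected is the answer.
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never balanced, treat as no answer


def majority_vote(texts):
    """Return the most frequent boxed answer across texts, or None if none parse."""
    answers = [a for a in (extract_boxed(t) for t in texts) if a is not None]
    if not answers:
        return None
    # Answers are compared as exact strings; no numeric or symbolic normalization.
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical response texts, one per prompt variant.
    sample_texts = [
        "Step by step ... so the result is \\boxed{42}.",
        "After verifying, I get \\boxed{41}.",
        "Concise reasoning gives \\boxed{42}.",
        "I could not finish the problem.",
    ]
    print(majority_vote(sample_texts))
```

Because answers are compared as raw strings, equivalent forms such as `0.5` and `\frac{1}{2}` count as different votes; a normalization step before counting may be worth adding for numeric tasks.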