skills/llm/weighted-attack-strategy-sampling/SKILL.md
Sample from a pool of adversarial prompt strategies with per-strategy probability weights to hedge across judge models
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add wenmin-wu/ds-skills llm-weighted-attack-strategy-sampling
No single adversarial prompt wins against all LLM judges: Claude resists counting traps, Gemini resists metadata injection, and GPT resists self-ID prompts. Instead of picking one, maintain a pool of attack strategies and sample one per row using `random.choices`, with weights learned from a small validation set. The submission becomes a stochastic mixture that is more robust than any single strategy when the judge is unknown, because some fraction of rows always carries an attack that works against whichever judge is in use.
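```python
import random

# Each entry maps a strategy name to (sampling weight, builder function).
# The build_* functions are assumed to be defined elsewhere.
STRATEGIES = {
    "counting_trap": (0.35, build_counting_trap),
    "metadata_inject": (0.25, build_metadata_inject),
    "few_shot_inject": (0.20, build_few_shot_inject),
    "score_variance": (0.10, build_score_variance),
    "plain_essay": (0.10, build_plain_essay),  # baseline to preserve distribution
}

names = list(STRATEGIES.keys())
weights = [w for w, _ in STRATEGIES.values()]
builders = {k: b for k, (_, b) in STRATEGIES.items()}


def attack(row, rng=random.Random()):
    # Draw one strategy for this row according to the weights, then build the text.
    # Pass a seeded random.Random instance for reproducible runs.
    strategy = rng.choices(names, weights=weights, k=1)[0]
    return builders[strategy](row), strategy
```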
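The source does not say how the weights are learned from the validation set. One plausible approach, sketched below as an assumption rather than the author's method, is to average each strategy's validation success rate across judges, apply a small floor so that no strategy is dropped entirely, and normalize. The function `learn_weights`, the `floor` parameter, the judge labels, and all the numbers are hypothetical placeholders.

```python
def learn_weights(success, floor=0.05):
    """Turn {strategy: {judge: success_rate}} into normalized sampling weights.

    Hypothetical helper: weights are proportional to the mean success rate
    across judges, floored so every strategy keeps some probability mass.
    """
    mean_rate = {s: sum(r.values()) / len(r) for s, r in success.items()}
    floored = {s: max(m, floor) for s, m in mean_rate.items()}
    total = sum(floored.values())
    return {s: v / total for s, v in floored.items()}


# Made-up validation results, for illustration only (not measured data).
validation = {
    "counting_trap": {"judge_a": 0.1, "judge_b": 0.8, "judge_c": 0.6},
    "metadata_inject": {"judge_a": 0.7, "judge_b": 0.1, "judge_c": 0.4},
    "plain_essay": {"judge_a": 0.0, "judge_b": 0.0, "judge_c": 0.0},
}

learned = learn_weights(validation)
for name, weight in learned.items():
    print(f"{name}: {weight:.3f}")
```

Averaging across judges treats every judge as equally likely. If some judges are more probable than others, a weighted average over judges would be a natural variation.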
Each builder has the signature `(row) -> essay_text`.
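A minimal usage sketch, assuming each builder takes a row and returns essay text: it applies the sampler to every row with a seeded generator and checks that the empirical strategy frequencies track the configured weights. The two stand-in builders, the `text` field on each row, and the 0.7/0.3 weights are hypothetical and exist only to make the example self-contained.

```python
import random
from collections import Counter


# Hypothetical stand-in builders; real ones would craft the adversarial text.
def build_counting_trap(row):
    return f"[counting_trap] {row['text']}"


def build_plain_essay(row):
    return row["text"]


STRATEGIES = {
    "counting_trap": (0.7, build_counting_trap),
    "plain_essay": (0.3, build_plain_essay),
}
names = list(STRATEGIES)
weights = [w for w, _ in STRATEGIES.values()]
builders = {k: b for k, (_, b) in STRATEGIES.items()}


def attack(row, rng):
    # Sample one strategy per row, then build the essay text with it.
    strategy = rng.choices(names, weights=weights, k=1)[0]
    return builders[strategy](row), strategy


rng = random.Random(0)  # seeded so the assignment of strategies is reproducible
rows = [{"text": f"essay {i}"} for i in range(10_000)]
results = [attack(row, rng) for row in rows]

# Empirical share of rows assigned to each strategy.
counts = Counter(strategy for _, strategy in results)
freqs = {name: counts[name] / len(rows) for name in names}
print(freqs)
```

Returning the strategy name alongside the text makes it possible to log which attack each row received and to compare per-strategy scores afterwards.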