skills/nlp/mbr-decoding-reranking/SKILL.md
Minimum Bayes Risk (MBR) decoding: select the candidate with the highest average chrF++ agreement against all others in the pool
npx skillsauth add wenmin-wu/ds-skills nlp-mbr-decoding-reranking

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Instead of picking the single highest-probability beam, generate a pool of candidates and select the one that agrees most with the others. The utility metric is chrF++ (character n-gram F-score extended with word unigrams and bigrams). In practice this tends to outperform pure beam search on translation tasks.
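The selection rule can also be written compactly. The notation below is an assumption introduced for illustration: $\mathcal{P} = \{y_1, \dots, y_n\}$ is the de-duplicated candidate pool with $n \ge 2$, and $\mathrm{chrF{+}{+}}(h, r)$ scores hypothesis $h$ against reference $r$.

$$
\hat{y} \;=\; \arg\max_{y_i \in \mathcal{P}} \; \frac{1}{n-1} \sum_{j \ne i} \mathrm{chrF{+}{+}}(y_i,\, y_j)
$$

Each candidate acts as the hypothesis in its own score and as a pseudo-reference in every other candidate's score.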
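import sacrebleu
import numpy as np


def mbr_select(candidates, pool_cap=32):
    """Return the candidate with the highest mean chrF++ score against the rest of the pool."""
    # word_order=2 turns chrF into chrF++ (adds word unigrams and bigrams).
    metric = sacrebleu.metrics.CHRF(word_order=2)
    # Strip whitespace, drop empty strings, and de-duplicate while preserving order.
    unique = list(dict.fromkeys(c.strip() for c in candidates if c.strip()))
    pool = unique[:pool_cap]  # cap the pool: pairwise scoring is O(n^2)
    n = len(pool)
    if n <= 1:
        return pool[0] if pool else ""
    scores = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i != j:
                # pool[i] is the hypothesis, pool[j] the pseudo-reference.
                scores[i] += metric.sentence_score(pool[i], [pool[j]]).score
        scores[i] /= (n - 1)
    return pool[int(np.argmax(scores))]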
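To see the selection logic without installing sacrebleu, here is a minimal dependency-free sketch of the same loop with a pluggable utility function. The names `mbr_select_with` and `jaccard_utility` are hypothetical, and the word-set Jaccard overlap is only a toy stand-in for chrF++, chosen so the scores are easy to check by hand.

```python
from typing import Callable, Sequence


def jaccard_utility(hyp: str, ref: str) -> float:
    """Toy stand-in utility: Jaccard overlap of lowercase word sets (not chrF++)."""
    h, r = set(hyp.lower().split()), set(ref.lower().split())
    if not h and not r:
        return 0.0
    return len(h & r) / len(h | r)


def mbr_select_with(
    candidates: Sequence[str],
    utility: Callable[[str, str], float] = jaccard_utility,
    pool_cap: int = 32,
) -> str:
    """Hypothetical variant of the selection loop with a pluggable utility."""
    # Same preprocessing as above: strip, drop empties, de-duplicate, cap.
    pool = list(dict.fromkeys(c.strip() for c in candidates if c.strip()))[:pool_cap]
    if len(pool) <= 1:
        return pool[0] if pool else ""

    def mean_utility(i: int) -> float:
        # Average agreement of pool[i] with every other candidate.
        others = [utility(pool[i], pool[j]) for j in range(len(pool)) if j != i]
        return sum(others) / len(others)

    best = max(range(len(pool)), key=mean_utility)
    return pool[best]


beams = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran in the park",
    "the cat sat on a mat",  # duplicate, removed before scoring
]
print(mbr_select_with(beams))  # the cat sat on a mat
```

With this toy utility the second candidate wins because it shares words with both of the other distinct candidates. A real utility such as chrF++ may rank the same pool differently.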