skills/nlp/char-prob-weighted-blend/SKILL.md
Blend character-level probability arrays from multiple models with OOF-tuned weights before thresholding
npx skillsauth add wenmin-wu/ds-skills nlp-char-prob-weighted-blend
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
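For span extraction tasks, ensemble at the character-probability level rather than at the span level. Each model produces one character-level probability array per sample, aligned to the same text. Blend these arrays with scalar per-model weights tuned on out-of-fold (OOF) predictions, then threshold the blended array once to obtain spans. Because averaging keeps each model's confidence at every character, span boundaries tend to be smoother and the result is usually more accurate than voting over spans that have already been discretized.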
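To make the difference concrete, here is a toy comparison on made-up numbers. The probabilities, the equal weights, and the 0.5 threshold are illustrative assumptions, not values from any real model. The two models disagree on two characters; blending before thresholding lets the more confident model decide each one, whereas intersecting or uniting already-thresholded masks is all-or-nothing.

```python
import numpy as np

# Hypothetical char-level probabilities for one 6-character text.
model_a = np.array([0.1, 0.9, 0.9, 0.45, 0.55, 0.1])
model_b = np.array([0.1, 0.8, 0.9, 0.75, 0.10, 0.1])
threshold = 0.5

# Char-level ensemble: average the probabilities, then threshold once.
blended = 0.5 * model_a + 0.5 * model_b
char_level_mask = blended > threshold

# Span-level alternatives: threshold each model first, then combine the masks.
mask_a = model_a > threshold
mask_b = model_b > threshold
intersection_mask = mask_a & mask_b
union_mask = mask_a | mask_b

print("char-level  :", char_level_mask.astype(int))
print("intersection:", intersection_mask.astype(int))
print("union       :", union_mask.astype(int))
```

In this toy case the char-level blend keeps character 3 (where model B is confident) and drops character 4 (where model A is only marginally above the threshold), a result neither the intersection nor the union reproduces.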
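import numpy as np


def blend_char_probs(model_probs, weights):
    """Weighted blend of char-level probability arrays from multiple models.

    Args:
        model_probs: list with one entry per model; each entry is a sequence of
            n_samples char-prob arrays. Arrays for the same sample must have
            the same length across models.
        weights: list of floats, one per model
    Returns:
        list of n_samples blended char-prob arrays
    """
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    blended = []
    # zip(*model_probs) groups the per-model arrays belonging to the same sample.
    for sample_probs in zip(*model_probs):
        combined = sum(w * np.asarray(p) for w, p in zip(weights, sample_probs))
        blended.append(combined)
    return blended


# Example: 3-model blend
weights = [0.5, 0.4, 0.18]  # tuned on OOF; not normalized (they sum to 1.08)
blended = blend_char_probs(
    [deberta_v3_probs, deberta_v1_probs, deberta_base_probs],
    weights,
)
# get_spans is not defined in this snippet; it stands for your own
# thresholding / span-extraction helper.
spans = get_spans(blended, threshold=0.5)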
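The snippet above relies on `get_spans` and on weights "tuned on OOF", neither of which is shown in the source. The following is a minimal sketch of how both might look, not the original author's implementation. The helper names `char_f1` and `tune_weights`, the micro-averaged character-level F1 metric, and the coarse 0.1-step grid are all assumptions; substitute your task's own metric and search strategy.

```python
import itertools
import numpy as np


def get_spans(char_probs, threshold=0.5):
    """Hypothetical helper: convert char-prob arrays to (start, end) spans.

    A span is a maximal run of characters whose probability exceeds
    `threshold`; `end` is exclusive.
    """
    all_spans = []
    for probs in char_probs:
        mask = np.asarray(probs) > threshold
        # Pad with False so runs touching either edge still produce edges.
        padded = np.concatenate(([False], mask, [False])).astype(int)
        edges = np.diff(padded)
        starts = np.flatnonzero(edges == 1)
        ends = np.flatnonzero(edges == -1)
        all_spans.append(list(zip(starts.tolist(), ends.tolist())))
    return all_spans


def char_f1(pred_probs, true_masks, threshold=0.5):
    """Micro-averaged character-level F1 (an assumed metric)."""
    tp = fp = fn = 0
    for probs, truth in zip(pred_probs, true_masks):
        pred = np.asarray(probs) > threshold
        truth = np.asarray(truth, dtype=bool)
        tp += int(np.sum(pred & truth))
        fp += int(np.sum(pred & ~truth))
        fn += int(np.sum(~pred & truth))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_weights(oof_probs, true_masks, step=0.1, threshold=0.5):
    """Coarse grid search over per-model weights, scored on OOF predictions."""
    grid = np.arange(0.0, 1.0 + step / 2, step)
    best_score, best_weights = -1.0, None
    for weights in itertools.product(grid, repeat=len(oof_probs)):
        if sum(weights) == 0:
            continue  # an all-zero blend carries no information
        blended = [
            sum(w * np.asarray(p) for w, p in zip(weights, sample))
            for sample in zip(*oof_probs)
        ]
        score = char_f1(blended, true_masks, threshold)
        if score > best_score:
            best_score = score
            best_weights = tuple(float(w) for w in weights)
    return best_weights, best_score
```

The grid search costs `len(grid) ** n_models` evaluations, which is manageable for the three-model case here. With more models, a numerical optimizer or random search over the weights is likely a better fit.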