skills/nlp/o-class-threshold-suppression/SKILL.md
Thresholds the O-class (non-entity) softmax probability in NER: if below threshold, overrides with the best non-O class to boost entity recall.
npx skillsauth add wenmin-wu/ds-skills nlp-o-class-threshold-suppression
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
In NER tasks where missing an entity is costly (PII detection, medical NER), plain argmax decoding can be too conservative: the O class dominates the label distribution, so borderline entity tokens end up labeled O. This technique sets a high threshold (e.g., 0.99) on the O-class softmax probability. If the model's probability for "not an entity" falls below the threshold, the prediction falls back to the highest-scoring non-O class; otherwise the argmax prediction is kept. This shifts the precision-recall tradeoff toward recall, catching entities the model was uncertain about at the cost of more false positives.
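To make the decision rule concrete, here is a minimal sketch for a single token. The three-class label set and the probability values are made up for illustration; they are not taken from any particular model.

```python
import numpy as np

# Hypothetical label set for illustration: 0 = B-NAME, 1 = I-NAME, 2 = O.
probs = np.array([0.25, 0.05, 0.70])  # softmax output for one token
o_class_idx = 2
threshold = 0.99

# Plain argmax picks O because 0.70 is the largest probability.
argmax_label = int(probs.argmax())

# Mask out the O class and take the best remaining class.
masked = probs.copy()
masked[o_class_idx] = -np.inf
best_non_o = int(masked.argmax())

# O is kept only when the model is at least `threshold` confident in it.
final_label = best_non_o if probs[o_class_idx] < threshold else argmax_label
print(argmax_label, final_label)
```

Here the O probability (0.70) is below the 0.99 threshold, so the token is relabeled as the best entity class even though argmax alone would have chosen O.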
import numpy as np


def suppress_o_class(probs, o_class_idx=12, threshold=0.99):
    """Override O predictions when the model isn't confident enough.

    Args:
        probs: (batch, seq_len, num_classes) softmax probabilities
            (not raw logits; apply softmax first)
        o_class_idx: index of the O (non-entity) class
        threshold: minimum O-class probability to keep an O prediction
    Returns:
        predictions: (batch, seq_len) label indices
    """
    preds_argmax = probs.argmax(-1)

    # Best non-O prediction: drop the O column, then take the argmax
    non_o_probs = np.concatenate([
        probs[..., :o_class_idx],
        probs[..., o_class_idx + 1:],
    ], axis=-1)
    preds_without_o = non_o_probs.argmax(-1)

    # Shift indices at or above the removed O column back to original label ids
    preds_without_o[preds_without_o >= o_class_idx] += 1

    # Keep the argmax only where O-class confidence reaches the threshold
    o_probs = probs[..., o_class_idx]
    return np.where(o_probs < threshold, preds_without_o, preds_argmax)


predictions = suppress_o_class(softmax_outputs, o_class_idx=12, threshold=0.99)
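The threshold controls how much precision is traded for recall, so it is usually worth checking a few values on held-out data. The sketch below is one way to do that, under stated assumptions: the model emits raw logits (hence the `softmax` helper), the data is a tiny hand-written toy set, and the metric is a simplified token-level precision/recall rather than span-level NER scoring. The helper names `predict_with_o_threshold` and `entity_precision_recall` are hypothetical and not part of the skill.

```python
import numpy as np


def softmax(raw_logits):
    """Numerically stable softmax over the last axis."""
    shifted = raw_logits - raw_logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def predict_with_o_threshold(probs, o_class_idx, threshold):
    """Same decision rule as above, written with masking instead of concatenation."""
    masked = probs.copy()
    masked[..., o_class_idx] = -np.inf  # exclude O from the fallback argmax
    best_non_o = masked.argmax(-1)
    return np.where(probs[..., o_class_idx] < threshold, best_non_o, probs.argmax(-1))


def entity_precision_recall(preds, labels, o_class_idx):
    """Token-level precision/recall over non-O tokens (a simplification)."""
    pred_ent = preds != o_class_idx
    true_ent = labels != o_class_idx
    correct = np.sum(pred_ent & true_ent & (preds == labels))
    precision = correct / max(pred_ent.sum(), 1)
    recall = correct / max(true_ent.sum(), 1)
    return float(precision), float(recall)


# Toy validation data: 1 sequence, 4 tokens, 3 classes (index 2 is O).
raw_logits = np.array([[
    [2.0, 0.0, 0.0],   # confident entity
    [0.0, 0.0, 5.0],   # fairly confident O (about 0.987)
    [1.0, 0.0, 2.0],   # borderline: argmax says O, gold label is an entity
    [0.0, 0.0, 10.0],  # very confident O
]])
labels = np.array([[0, 2, 0, 2]])
o_class_idx = 2

probs = softmax(raw_logits)
results = {}
for threshold in (0.5, 0.9, 0.99):
    preds = predict_with_o_threshold(probs, o_class_idx, threshold)
    results[threshold] = entity_precision_recall(preds, labels, o_class_idx)
    print(threshold, results[threshold])
```

On this toy set, raising the threshold from 0.5 to 0.9 recovers the borderline entity, while 0.99 also relabels the fairly confident O token and costs precision. Real data will behave differently, so the threshold should be tuned against the metric that matters for the task.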