skills/nlp/best-prob-fallback-matching/SKILL.md
When no candidate passes the threshold for a query, fall back to the single highest-scoring match to guarantee at least one prediction per query
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add wenmin-wu/ds-skills nlp-best-prob-fallback-matching
In retrieval/matching tasks scored by MAP or recall, a query with zero predictions scores zero, the worst possible outcome for that query. When a reranker's sigmoid threshold filters out all candidates for some queries, a fallback ensures every query gets at least one prediction: the highest-scoring candidate, even if it falls below the threshold. The fallback still matters with aggressive (low) thresholds, where a few edge-case queries can end up empty.
import pandas as pd

def apply_with_fallback(df, threshold, query_col, doc_col, score_col):
    # Sort by score descending so each query's best candidate comes first
    df = df.sort_values(score_col, ascending=False)
    # Queries with at least one match above threshold
    pos = df[df[score_col] > threshold]
    matched = pos.groupby(query_col)[doc_col].agg(list).reset_index()
    matched_ids = set(matched[query_col])
    # Queries with no match: take top-1 candidate
    remaining = df[~df[query_col].isin(matched_ids)]
    # .copy() avoids pandas' SettingWithCopyWarning on the assignment below
    fallback = remaining.groupby(query_col).head(1)[[query_col, doc_col]].copy()
    # Wrap each doc id in a list so both frames share the same column type
    fallback[doc_col] = fallback[doc_col].apply(lambda x: [x])
    return pd.concat([matched, fallback], ignore_index=True)

# `pairs` is assumed to hold one row per (query, candidate) pair with a reranker score
result = apply_with_fallback(pairs, threshold=0.001,
                             query_col='topic_id', doc_col='content_id', score_col='score')
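
To check the behaviour on a small input, the same rule can also be sketched as a single mask: keep a row if it clears the threshold or if it is the top-ranked row for its query. This is only an illustrative variant, not part of the skill itself; the function name `threshold_or_top1` and the toy `pairs` data below are made up for the example.

```python
import pandas as pd

def threshold_or_top1(df, threshold, query_col, doc_col, score_col):
    """Keep rows above the threshold, plus each query's best row as a fallback."""
    # Order candidates within each query from highest to lowest score
    ranked = df.sort_values([query_col, score_col], ascending=[True, False])
    # cumcount() == 0 marks the first (highest-scoring) row of every query
    is_top1 = ranked.groupby(query_col).cumcount() == 0
    keep = ranked[(ranked[score_col] > threshold) | is_top1]
    # Collect the surviving candidates into one list per query
    return keep.groupby(query_col)[doc_col].agg(list).reset_index()

# Hypothetical toy data: t1 has two candidates above 0.3, t2 has none
pairs = pd.DataFrame({
    "topic_id":   ["t1", "t1", "t1", "t2", "t2"],
    "content_id": ["c1", "c2", "c3", "c4", "c5"],
    "score":      [0.90, 0.40, 0.05, 0.20, 0.10],
})

result = threshold_or_top1(pairs, threshold=0.3, query_col="topic_id",
                           doc_col="content_id", score_col="score")
print(result)
```

Here `t1` keeps its two above-threshold candidates, while `t2`, which would otherwise be empty, falls back to its single best candidate `c4`.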