skills/cv/pfbeta-threshold-optimization/SKILL.md
Grid-searches the optimal classification threshold to maximize probabilistic F-beta score on validation predictions.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add wenmin-wu/ds-skills cv-pfbeta-threshold-optimization
Probabilistic F-beta (pFbeta) extends the standard F-beta score to work with soft predictions. The beta parameter sets the precision/recall trade-off: beta=1 gives F1, and beta > 1 weights recall more heavily (e.g., beta=2 for recall-oriented medical screening). The default threshold of 0.5 is rarely optimal, especially with severe class imbalance (1–2% positive rate). Grid-searching over [0, 1] in 0.01 steps finds the threshold that maximizes pFbeta on validation data, often improving the score by 0.02–0.10. Note that once predictions are binarized at a threshold, pFbeta reduces to the ordinary F-beta of the hard predictions.
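For reference, the score computed by `pfbeta_torch` can be written out as follows. The symbols are introduced here for illustration and are not part of the original skill text: $p_i \in [0, 1]$ is the prediction for sample $i$ and $y_i \in \{0, 1\}$ its label. The small constant the code adds to each denominator is omitted.

$$
\mathrm{pTP} = \sum_i p_i\, y_i, \qquad
\mathrm{pFP} = \sum_i p_i\,(1 - y_i), \qquad
\mathrm{pFN} = \sum_i (1 - p_i)\, y_i
$$

$$
P = \frac{\mathrm{pTP}}{\mathrm{pTP} + \mathrm{pFP}}, \qquad
R = \frac{\mathrm{pTP}}{\mathrm{pTP} + \mathrm{pFN}}, \qquad
\mathrm{pF}_\beta = \frac{(1 + \beta^2)\, P\, R}{\beta^2 P + R}
$$

When every $p_i$ is 0 or 1, these sums are the usual true-positive, false-positive, and false-negative counts, so $\mathrm{pF}_\beta$ coincides with the standard F-beta score.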
import numpy as np
import torch


def pfbeta_torch(preds, labels, beta=1.0):
    """Probabilistic F-beta score for float tensors of the same shape."""
    ptp = (preds * labels).sum()        # probabilistic true positives
    pfp = (preds * (1 - labels)).sum()  # probabilistic false positives
    pfn = ((1 - preds) * labels).sum()  # probabilistic false negatives
    # 1e-10 guards against division by zero (e.g., nothing predicted positive).
    precision = ptp / (ptp + pfp + 1e-10)
    recall = ptp / (ptp + pfn + 1e-10)
    return ((1 + beta**2) * precision * recall /
            (beta**2 * precision + recall + 1e-10))


def optimize_threshold(probs, labels, beta=1.0, n_steps=101):
    """Find the threshold maximizing pFbeta."""
    # Convert inputs to tensors once, outside the loop.
    probs = torch.as_tensor(probs)
    labels = torch.as_tensor(labels).float()
    thresholds = np.linspace(0, 1, n_steps)
    scores = []
    for t in thresholds:
        preds = (probs > float(t)).float()  # binarize at threshold t
        scores.append(pfbeta_torch(preds, labels, beta).item())
    best_idx = int(np.argmax(scores))  # first threshold reaching the best score
    return thresholds[best_idx], scores[best_idx]


# Usage (val_probs, val_labels, and test_probs are assumed to be NumPy arrays)
best_thresh, best_score = optimize_threshold(val_probs, val_labels, beta=1.0)
print(f"Best threshold: {best_thresh:.2f}, pF1: {best_score:.4f}")
test_preds = (test_probs > best_thresh).astype(int)
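To see the effect on a concrete case, the following is a minimal NumPy-only sketch of the same grid search on a toy validation set. The helper names `pfbeta_np` and `optimize_threshold_np` and the eight data points are hypothetical and exist only for illustration; they are not part of the skill itself.

```python
import numpy as np


def pfbeta_np(preds, labels, beta=1.0, eps=1e-10):
    """Probabilistic F-beta on NumPy arrays (same formula as pfbeta_torch)."""
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    ptp = (preds * labels).sum()
    pfp = (preds * (1 - labels)).sum()
    pfn = ((1 - preds) * labels).sum()
    precision = ptp / (ptp + pfp + eps)
    recall = ptp / (ptp + pfn + eps)
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall + eps)


def optimize_threshold_np(probs, labels, beta=1.0, n_steps=101):
    """Grid-search the threshold that maximizes pFbeta of binarized predictions."""
    probs = np.asarray(probs, dtype=float)
    thresholds = np.linspace(0, 1, n_steps)
    scores = np.array([pfbeta_np(probs > t, labels, beta) for t in thresholds])
    best_idx = int(scores.argmax())  # ties resolve to the lowest threshold
    return float(thresholds[best_idx]), float(scores[best_idx])


# Hypothetical toy data: low, poorly calibrated scores; positives rank near the top.
val_probs = np.array([0.055, 0.105, 0.155, 0.205, 0.305, 0.355, 0.405, 0.455])
val_labels = np.array([0, 0, 0, 0, 0, 1, 0, 1])

soft_score = pfbeta_np(val_probs, val_labels)            # raw probabilities
default_score = pfbeta_np(val_probs > 0.5, val_labels)   # default 0.5 cut-off
best_t, best_score = optimize_threshold_np(val_probs, val_labels)

print(f"soft pF1: {soft_score:.3f}, pF1 at 0.5: {default_score:.3f}")
print(f"best threshold: {best_t:.2f}, pF1: {best_score:.3f}")
```

In this toy case no score exceeds 0.5, so the default cut-off predicts no positives and scores 0, while the search settles on a threshold near 0.31. With so few samples the chosen threshold is very noisy; in practice it should be selected on a validation set large enough to contain a meaningful number of positives.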