skills/tabular/weighted-gini-top-recall-metric/SKILL.md
Custom ranking metric combining normalized weighted Gini coefficient with top-K% capture rate for imbalanced classification with class-weighted evaluation
npx skillsauth add wenmin-wu/ds-skills tabular-weighted-gini-top-recall-metric
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
For imbalanced binary classification (e.g. credit default), combine two complementary ranking signals: (1) the normalized weighted Gini coefficient, which measures overall ranking quality, and (2) the top-K% capture rate, which measures the share of all positives found among the highest-risk predictions. Apply class weights (e.g. 20x for negatives) to compensate for known subsampling of the negative class; the top-K% cutoff is taken over the weighted population, not the raw row count. Final score = 0.5 * (normalized Gini + capture_rate).
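Because the cutoff is a fraction of the weighted population, the top 4% can cover far fewer rows than 4% of the data. A minimal worked example on ten toy rows (the labels are made up for illustration and assumed to be already sorted by predicted score):

```python
import numpy as np

# Toy labels, assumed already sorted by predicted score (highest first).
y_sorted = np.array([1, 1, 0, 1, 0, 0, 0, 0, 0, 0])
neg_weight, top_pct = 20, 0.04

weight = np.where(y_sorted == 0, neg_weight, 1)  # positives weigh 1, negatives 20
cum_weight = np.cumsum(weight)                   # 1, 2, 22, 23, 43, ...
cutoff = int(top_pct * weight.sum())             # int(0.04 * 143) = 5
top_mask = cum_weight <= cutoff                  # only the first two rows fit under the cutoff

# Share of all positives that fall inside the weighted top 4%.
capture = y_sorted[top_mask].sum() / y_sorted.sum()
print(capture)  # 2 of the 3 positives are captured
```

The third positive sits behind a single negative, whose weight of 20 already pushes the cumulative weight past the cutoff, so it is not counted.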
import numpy as np

def weighted_gini_top_recall(y_true, y_pred, neg_weight=20, top_pct=0.04):
    """Combined weighted Gini + top-K% capture rate.

    Args:
        y_true: binary labels (0/1)
        y_pred: predicted scores (higher = more positive)
        neg_weight: weight for negative class (compensates subsampling)
        top_pct: fraction of weighted population for capture rate
    """
    # Convert to arrays so the positional indexing below also works for lists and Series
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    # Sort by predicted score, highest first
    idx = np.argsort(y_pred)[::-1]
    y_true, y_pred = y_true[idx], y_pred[idx]
    weight = np.where(y_true == 0, neg_weight, 1)

    # Top-K% capture rate: share of positives inside the weighted top fraction
    cum_weight = np.cumsum(weight)
    cutoff = int(top_pct * weight.sum())
    top_mask = cum_weight <= cutoff
    capture = y_true[top_mask].sum() / y_true.sum()

    # Weighted Gini of the model's ranking
    cum_norm_w = cum_weight / weight.sum()
    total_pos = (y_true * weight).sum()
    lorentz = np.cumsum(y_true * weight) / total_pos
    gini = ((lorentz - cum_norm_w) * weight).sum()

    # Perfect Gini (sort by true labels), used as the normalizer
    idx_perfect = np.argsort(y_true)[::-1]
    w_p = np.where(y_true[idx_perfect] == 0, neg_weight, 1)
    cum_p = np.cumsum(w_p) / w_p.sum()
    lor_p = np.cumsum(y_true[idx_perfect] * w_p) / (y_true[idx_perfect] * w_p).sum()
    gini_max = ((lor_p - cum_p) * w_p).sum()

    return 0.5 * (gini / gini_max + capture)
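A quick way to sanity-check a metric like this is to score a perfect ranking and a fully reversed one. The sketch below is a compact restatement of the same computation, included only so the check runs on its own; the names `_weighted_gini` and `metric_compact` and the synthetic data are illustrative assumptions, not part of the skill.

```python
import numpy as np

def _weighted_gini(labels, order_by, neg_weight):
    """Unnormalized weighted Gini of `labels` ranked by `order_by` (descending)."""
    y = labels[np.argsort(order_by)[::-1]]
    w = np.where(y == 0, neg_weight, 1)
    cum_w = np.cumsum(w) / w.sum()
    lorentz = np.cumsum(y * w) / (y * w).sum()
    return ((lorentz - cum_w) * w).sum()

def metric_compact(y_true, y_pred, neg_weight=20, top_pct=0.04):
    """Hypothetical compact version: 0.5 * (normalized Gini + top-K% capture)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    y = y_true[np.argsort(y_pred)[::-1]]
    w = np.where(y == 0, neg_weight, 1)
    capture = y[np.cumsum(w) <= int(top_pct * w.sum())].sum() / y.sum()
    gini = _weighted_gini(y_true, y_pred, neg_weight)
    gini_max = _weighted_gini(y_true, y_true, neg_weight)
    return 0.5 * (gini / gini_max + capture)

# Synthetic labels: 10 positives among 1000 rows, shuffled.
rng = np.random.default_rng(0)
y_true = np.zeros(1000, dtype=int)
y_true[:10] = 1
rng.shuffle(y_true)

score_perfect = metric_compact(y_true, y_true.astype(float))  # scores equal the labels
score_reversed = metric_compact(y_true, 1.0 - y_true)         # worst possible ordering
print(score_perfect, score_reversed)
```

With this class balance every positive fits inside the weighted top 4%, so the perfect ranking reaches the maximum score, while the reversed ranking has zero capture and a negative Gini term.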
def feval(preds, data): ... returning (name, value, higher_is_better)
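The three-element return value matches the custom evaluation function convention used by LightGBM's `lgb.train(..., feval=...)`; that the wrapper targets LightGBM is an assumption here, since the source elides the body. A minimal sketch of such a wrapper, written as a factory (`make_feval` is a hypothetical helper name) that accepts any `metric_fn(y_true, y_pred)`:

```python
import numpy as np

def make_feval(metric_fn, name="weighted_gini_top_recall"):
    """Build a feval(preds, data) callable from metric_fn(y_true, y_pred) -> float.

    Assumes `data` exposes get_label(), as a LightGBM Dataset does.
    """
    def feval(preds, data):
        y_true = np.asarray(data.get_label())
        value = float(metric_fn(y_true, np.asarray(preds)))
        # Third element marks the metric as higher-is-better.
        return name, value, True
    return feval
```

Under that assumption it could be passed as `feval=make_feval(weighted_gini_top_recall)` when calling `lgb.train` with a validation set, so the metric is reported each round and can drive early stopping.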