skills/tabular/crps-cdf-loss/SKILL.md
Model cumulative distribution via softmax output layer and CRPS loss — for probabilistic regression over discrete bins
npx skillsauth add wenmin-wu/ds-skills tabular-crps-cdf-loss
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
When the model must predict a full probability distribution rather than a point estimate (e.g., "what is the CDF of yards gained?"), encode each label as a step-function CDF over discrete bins, use a softmax output layer, and train with the Continuous Ranked Probability Score (CRPS) as the loss. The cumulative sum of the softmax output is the predicted CDF. CRPS penalizes the squared difference between the predicted and true CDFs at every bin, so it rewards both calibration and sharpness.
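To make the calibration and sharpness claim concrete, here is a small NumPy sketch that scores three forecasts over 5 bins for a single observation. The helper name `crps_single` and the example probabilities are illustrative assumptions, not part of the original skill.

```python
import numpy as np

def crps_single(probs, true_bin):
    """Mean squared gap between a predicted CDF and the step-function CDF of one target."""
    probs = np.asarray(probs, dtype=float)
    pred_cdf = np.cumsum(probs)  # per-bin probabilities -> cumulative distribution
    # Step function: 0 before the observed bin, 1 from the observed bin onward.
    true_cdf = (np.arange(len(probs)) >= true_bin).astype(float)
    return float(np.mean((pred_cdf - true_cdf) ** 2))

true_bin = 2  # hypothetical observation falling in bin index 2 of 5

broad = crps_single([0.1, 0.2, 0.4, 0.2, 0.1], true_bin)  # centered on the truth, spread out
sharp = crps_single([0.0, 0.1, 0.8, 0.1, 0.0], true_bin)  # centered on the truth, concentrated
wrong = crps_single([0.0, 0.0, 0.0, 0.0, 1.0], true_bin)  # concentrated on the wrong bin

print(broad, sharp, wrong)
```

The concentrated forecast centered on the observed bin scores lowest, the broad one scores higher, and the confident forecast on the wrong bin scores worst of the three.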
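import numpy as np
import tensorflow.keras.backend as K

def crps_loss(y_true, y_pred):
    """CRPS loss on cumulative softmax output."""
    # y_true is a step-function CDF; the cumulative sum of the softmax
    # probabilities in y_pred is the predicted CDF.
    return K.mean(K.square(y_true - K.cumsum(y_pred, axis=1)), axis=1)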
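def encode_cdf_target(values, n_bins=199, offset=99):
    """Encode scalar targets as step-function CDFs.

    Args:
        values: array of integer targets
        n_bins: number of discrete bins
        offset: bin index corresponding to target=0
    """
    y = np.zeros((len(values), n_bins))
    for i, v in enumerate(values):
        # Clip so out-of-range targets fall in the first or last bin;
        # a negative start index would otherwise count from the end of the row.
        start = min(max(int(v) + offset, 0), n_bins - 1)
        y[i, start:] = 1.0
    return y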
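For large datasets, the same step-function encoding can be written without a Python loop. The following is a minimal sketch using NumPy broadcasting. The name `encode_cdf_target_vectorized` is hypothetical, and clipping out-of-range targets into the first or last bin is an assumption made here rather than something the original skill specifies.

```python
import numpy as np

def encode_cdf_target_vectorized(values, n_bins=199, offset=99):
    """Vectorized step-function CDF encoding (hypothetical helper)."""
    values = np.asarray(values, dtype=int)
    # Bin index of each target. Clipping is an assumption: out-of-range
    # targets are placed in the first or last bin.
    idx = np.clip(values + offset, 0, n_bins - 1)
    bins = np.arange(n_bins)
    # Row i is 1.0 wherever the bin index is >= the bin index of target i.
    return (bins[None, :] >= idx[:, None]).astype(np.float32)

# Tiny example: 7 bins covering targets -3..3, so target 0 maps to bin 3.
y = encode_cdf_target_vectorized([-2, 0, 3], n_bins=7, offset=3)
print(y)
```

Each row switches from 0 to 1 at the bin of its target, which is exactly the CDF of a distribution that puts all its mass on the observed value.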
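# Model (`hidden` is assumed to be the last hidden-layer tensor)
from tensorflow.keras.layers import Dense

output = Dense(199, activation='softmax')(hidden)  # one unit per bin
# `model` is assumed to be a Keras Model whose output is `output`.
model.compile(optimizer='adam', loss=crps_loss)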
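# CRPS evaluation (as run in a validation callback)
y_pred_cdf = np.clip(np.cumsum(model.predict(X_val), axis=1), 0, 1)
# The clipped cumulative sum leaves y_val unchanged if it is already a
# step-function CDF, and converts it to one if it is one-hot.
y_true_cdf = np.clip(np.cumsum(y_val, axis=1), 0, 1)
n_bins = y_pred_cdf.shape[1]
crps = np.mean(np.sum((y_true_cdf - y_pred_cdf) ** 2, axis=1)) / n_bins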
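A raw CRPS value is easier to interpret next to a reference point. One possible reference, sketched below, is a feature-free baseline that predicts the empirical CDF of the training targets for every validation row. The helpers `mean_crps` and `empirical_cdf` and the synthetic targets are assumptions for illustration only, not part of the original skill.

```python
import numpy as np

def mean_crps(pred_cdf, targets, offset):
    """Mean over rows and bins of the squared gap between predicted and step-function CDFs."""
    pred_cdf = np.asarray(pred_cdf, dtype=float)
    n_bins = pred_cdf.shape[1]
    idx = np.asarray(targets, dtype=int) + offset
    true_cdf = (np.arange(n_bins)[None, :] >= idx[:, None]).astype(float)
    return float(np.mean((pred_cdf - true_cdf) ** 2))

def empirical_cdf(train_targets, n_bins, offset):
    """CDF of the training-target histogram, used as a feature-free forecast."""
    counts = np.bincount(np.asarray(train_targets, dtype=int) + offset, minlength=n_bins)
    return np.cumsum(counts) / counts.sum()

# Synthetic integer targets, made up purely for illustration.
rng = np.random.default_rng(0)
n_bins, offset = 199, 99
train = np.clip(np.round(rng.normal(4, 6, size=5000)), -99, 99).astype(int)
val = np.clip(np.round(rng.normal(4, 6, size=1000)), -99, 99).astype(int)

# Same baseline CDF repeated for every validation row.
baseline_cdf = np.tile(empirical_cdf(train, n_bins, offset), (len(val), 1))
baseline_score = mean_crps(baseline_cdf, val, offset)

# A forecast that puts all mass on the observed bin scores exactly zero.
perfect_cdf = (np.arange(n_bins)[None, :] >= (val + offset)[:, None]).astype(float)
perfect_score = mean_crps(perfect_cdf, val, offset)

print(baseline_score, perfect_score)
```

A trained model would be expected to land between these two scores; a validation CRPS at or above the baseline suggests the features are not being used effectively.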