skills/tabular/regression-to-cdf-smoothing/SKILL.md
Convert a scalar regression prediction into a smoothed CDF over discrete bins using a linear ramp instead of a hard step
npx skillsauth add wenmin-wu/ds-skills tabular-regression-to-cdf-smoothingInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
When the evaluation metric is CRPS on a discrete CDF but your model outputs a scalar prediction, convert it to a smooth CDF using a linear ramp (or sigmoid) centered on the predicted value. A hard step function is overconfident; a ramp with width W spreads probability over ±W bins, hedging against prediction error.
import numpy as np
def scalar_to_cdf(predictions, n_bins=199, offset=99, ramp_width=10):
"""Convert scalar predictions to smoothed CDFs.
Args:
predictions: array of scalar predictions
n_bins: number of CDF bins
offset: bin index for target=0
ramp_width: half-width of linear ramp (bins)
Returns:
(N, n_bins) CDF array
"""
cdf = np.zeros((len(predictions), n_bins))
for i, pred in enumerate(predictions):
center = int(round(pred)) + offset
for j in range(n_bins):
if j >= center + ramp_width:
cdf[i, j] = 1.0
elif j >= center - ramp_width:
cdf[i, j] = (j - center + ramp_width) / (2 * ramp_width)
return np.clip(cdf, 0, 1)
# Usage with LightGBM
y_pred_scalar = np.mean([m.predict(X_test) for m in models], axis=0)
y_pred_cdf = scalar_to_cdf(y_pred_scalar, ramp_width=10)
data-ai
Scaled Pinball Loss (SPL) metric for evaluating quantile forecasts, normalized by mean absolute successive differences of training data
data-ai
Walk backward through a time series and multiplicatively rescale segments when jumps exceed a fraction of the running mean to correct data collection anomalies
testing
Transform forecasting target to next/current ratio minus one so that optimizing MAE or squared error implicitly minimizes SMAPE
tools
Convert point forecasts to prediction intervals by scaling with logit-transformed quantile ratios passed through a Normal CDF