skills/timeseries/k-mode-gaussian-nll-loss/SKILL.md
Negative log-likelihood loss over K isotropic-Gaussian trajectory modes with per-mode confidences and logsumexp stability
npx skillsauth add wenmin-wu/ds-skills timeseries-k-mode-gaussian-nll-loss
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Single-trajectory regression (MSE) assumes one future is correct. Real driving, motion, and forecasting problems are multimodal: a vehicle might turn left or go straight, and both are plausible. The standard fix is to predict K trajectories plus K confidence weights, and optimize the negative log-likelihood of the ground truth under a mixture of K isotropic Gaussians (variance = 1). After dropping the constant normalization term, the loss simplifies to -logsumexp(log(conf) - 0.5 * Σ error²) along the K axis, where the sum runs over valid timesteps and both coordinates. This rewards any mode that nails the ground truth while letting the confidences learn the mixture weights. The formulation is used in Lyft Motion Prediction, Waymo Open, Argoverse, and most AV trajectory benchmarks.
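The simplification can be written out as a short derivation. The symbols here are notation introduced for illustration, not taken from the skill itself: $y_t$ is the ground-truth position at step $t$, $\mu_{k,t}$ the position predicted by mode $k$, $c_k$ the confidence of mode $k$, $a_t \in \{0, 1\}$ the availability mask, and $d$ the number of valid coordinates.

```latex
\begin{aligned}
p(y) &= \sum_{k=1}^{K} c_k \,(2\pi)^{-d/2}
        \exp\!\Big(-\tfrac{1}{2}\sum_{t} a_t \,\lVert y_t - \mu_{k,t}\rVert^2\Big) \\
-\log p(y) &= \tfrac{d}{2}\log(2\pi)
        - \log \sum_{k=1}^{K} \exp\!\Big(\log c_k
        - \tfrac{1}{2}\sum_{t} a_t \,\lVert y_t - \mu_{k,t}\rVert^2\Big) \\
\mathcal{L} &= -\operatorname{logsumexp}_{k}\, z_k,
        \qquad z_k = \log c_k - \tfrac{1}{2}\sum_{t} a_t \,\lVert y_t - \mu_{k,t}\rVert^2 \\
\operatorname{logsumexp}_{k}\, z_k &= m + \log \sum_{k=1}^{K} \exp(z_k - m),
        \qquad m = \max_{k} z_k
\end{aligned}
```

The constant $\tfrac{d}{2}\log(2\pi)$ does not depend on the model outputs, so dropping it leaves the gradients unchanged. The last line is the max-subtraction identity that keeps the exponentials from underflowing when the squared errors are large.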
import torch
import numpy as np


def neg_multi_log_likelihood(gt, pred, confidences, avails):
    """
    gt: (B, T, 2) ground-truth trajectory
    pred: (B, K, T, 2) K candidate trajectories
    confidences: (B, K) softmax over modes
    avails: (B, T) 1/0 mask for valid future steps
    """
    gt = gt.unsqueeze(1)  # (B, 1, T, 2)
    avails = avails[:, None, :, None]  # (B, 1, T, 1)
    # Squared error summed over time and coordinates; masked steps contribute 0
    err = torch.sum(((gt - pred) * avails) ** 2, dim=(2, 3))  # (B, K)
    # A confidence of exactly 0 gives log(0) = -inf, which is acceptable here.
    # Note: np.errstate only affects NumPy; torch.log returns -inf without warning.
    with np.errstate(divide='ignore'):
        err = torch.log(confidences) - 0.5 * err  # (B, K)
    # logsumexp trick for numerical stability: subtract the per-sample max
    max_val, _ = err.max(dim=1, keepdim=True)
    err = -(torch.log(torch.sum(torch.exp(err - max_val), dim=1, keepdim=True))
            + max_val).squeeze(1)  # (B,)
    return err.mean()
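To sanity-check `neg_multi_log_likelihood`, it can help to compare it against a slow scalar version on a tiny hand-made case. The following is a minimal sketch in plain Python for a single sample; `reference_nll` and the toy trajectories are hypothetical names and values chosen for illustration, not part of the skill.

```python
import math


def reference_nll(gt, preds, confidences, avails):
    """Scalar reference for ONE sample (hypothetical helper, for cross-checking).

    gt: list of T (x, y) pairs
    preds: list of K trajectories, each a list of T (x, y) pairs
    confidences: list of K weights summing to 1 (at least one must be > 0)
    avails: list of T 0/1 flags
    """
    scores = []
    for traj, conf in zip(preds, confidences):
        # Masked squared error over time steps and both coordinates
        sq_err = sum(
            a * ((gx - px) ** 2 + (gy - py) ** 2)
            for (gx, gy), (px, py), a in zip(gt, traj, avails)
        )
        log_conf = math.log(conf) if conf > 0 else -math.inf
        scores.append(log_conf - 0.5 * sq_err)
    # Max-subtraction form of logsumexp
    m = max(scores)
    return -(m + math.log(sum(math.exp(s - m) for s in scores)))


# Toy case: the ground truth goes straight; mode 0 matches it, mode 1 veers off.
gt = [(0.0, 0.0), (1.0, 0.0)]
straight = [(0.0, 0.0), (1.0, 0.0)]
veer = [(0.0, 0.0), (1.0, 2.0)]

loss_even = reference_nll(gt, [straight, veer], [0.5, 0.5], [1, 1])
loss_confident = reference_nll(gt, [straight, veer], [0.9, 0.1], [1, 1])
# Masking the second step hides the only disagreement between the modes.
loss_masked = reference_nll(gt, [straight, veer], [0.5, 0.5], [1, 0])
```

Putting more confidence on the matching mode lowers the loss, and masking the step where the modes disagree makes both modes exact, so the loss falls to zero. Feeding the same inputs as batch-of-one tensors to the PyTorch function should give matching values up to floating-point tolerance.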
The model outputs (pred, confidences), where pred has shape (B, K, T, 2) and confidences is softmax-normalized over K.
The per-mode log-likelihood term is log(conf) - 0.5 * sum(err²).
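One way a network can produce outputs in this layout is a single linear layer whose output is split into trajectory coordinates and mode logits. The sketch below assumes a backbone that yields a flat feature vector per sample; `MultiModalHead`, `in_features`, `num_modes`, and `future_len` are hypothetical names and defaults used only for illustration.

```python
import torch
import torch.nn as nn


class MultiModalHead(nn.Module):
    """Hypothetical output head: features -> (pred, confidences)."""

    def __init__(self, in_features, num_modes=3, future_len=50):
        super().__init__()
        self.num_modes = num_modes
        self.future_len = future_len
        # K * T * 2 trajectory coordinates, followed by K mode logits
        self.num_preds = num_modes * future_len * 2
        self.fc = nn.Linear(in_features, self.num_preds + num_modes)

    def forward(self, features):
        out = self.fc(features)  # (B, K*T*2 + K)
        pred, logits = torch.split(out, [self.num_preds, self.num_modes], dim=1)
        pred = pred.reshape(-1, self.num_modes, self.future_len, 2)  # (B, K, T, 2)
        confidences = torch.softmax(logits, dim=1)  # (B, K), rows sum to 1
        return pred, confidences


torch.manual_seed(0)
head = MultiModalHead(in_features=512)
features = torch.randn(4, 512)  # stand-in for backbone features
pred, confidences = head(features)
```

The resulting `pred` and `confidences` have the shapes the loss expects, so they can be passed to it directly together with the ground truth and availability mask.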