skills/timeseries/availability-masked-regression-loss/SKILL.md
Multiply per-timestep regression loss by a 0/1 availability mask so missing future steps contribute zero gradient
npx skillsauth add wenmin-wu/ds-skills timeseries-availability-masked-regression-lossInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
In trajectory forecasting, many ground-truth futures are incomplete: an agent exits the scene partway through the prediction horizon, or a sensor misses frames. Training with naive MSE over a fixed-length future either requires padding with bogus values (biases the model) or dropping whole samples (wastes data). The fix is a per-timestep availability mask that's 1 where GT exists and 0 where it doesn't. Use reduction='none', multiply by the mask, and take the mean. Gradients flow only through valid steps, so the loss is unbiased without truncating the horizon.
import torch
import torch.nn as nn
criterion = nn.MSELoss(reduction='none')
def masked_loss(pred, target, avails):
"""
pred: (B, T, 2)
target: (B, T, 2)
avails: (B, T) 1/0 validity mask
"""
loss = criterion(pred, target) # (B, T, 2)
loss = loss * avails.unsqueeze(-1) # zero-out invalid steps
return loss.mean()
# Training step
pred = model(batch['image']).view(batch['target_positions'].shape)
loss = masked_loss(pred,
batch['target_positions'],
batch['target_availabilities'])
loss.backward()
target_availabilities tensor alongside each target (shape (B, T))reduction='none' on the underlying loss so element-wise values survive.unsqueeze(-1)avails.sum() for a length-normalized version).mean() divides by the total element count including masked zeros, so the effective learning rate scales with sequence length. For length-invariant loss, use loss.sum() / (avails.sum() * D).(B, T, 1) broadcasts cleanly to (B, T, 2) or any feature dim.data-ai
Scaled Pinball Loss (SPL) metric for evaluating quantile forecasts, normalized by mean absolute successive differences of training data
data-ai
Walk backward through a time series and multiplicatively rescale segments when jumps exceed a fraction of the running mean to correct data collection anomalies
testing
Transform forecasting target to next/current ratio minus one so that optimizing MAE or squared error implicitly minimizes SMAPE
tools
Convert point forecasts to prediction intervals by scaling with logit-transformed quantile ratios passed through a Normal CDF