skills/timeseries/multimodal-trajectory-head/SKILL.md
Single linear head that jointly predicts K candidate trajectories and K softmax confidences, sliced and reshaped for multimodal regression
npx skillsauth add wenmin-wu/ds-skills timeseries-multimodal-trajectory-head

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Rather than using K separate prediction heads (one per mode) or a complex MoE router, pack all K trajectory predictions plus K confidence logits into a single linear head with K*T*2 + K outputs, where T is the number of future time steps and 2 covers the (x, y) coordinates. Slice the output tensor into two chunks, reshape the first into (B, K, T, 2), and softmax the second into mode probabilities. This keeps the architecture simple, shares all backbone parameters across modes, and adds only one Linear layer to any CNN/Transformer backbone. Pairs directly with the K-mode Gaussian NLL loss.
```python
import torch
import torch.nn as nn
from torchvision.models import resnet34


class MultiModalTrajModel(nn.Module):
    def __init__(self, backbone_features=512, num_modes=3, future_len=50):
        super().__init__()
        self.num_modes = num_modes
        self.future_len = future_len
        # K modes * T steps * 2 coordinates (x, y)
        self.num_preds = num_modes * 2 * future_len
        # Note: newer torchvision releases deprecate `pretrained=` in favor of `weights=`.
        self.backbone = resnet34(pretrained=True)
        self.backbone.fc = nn.Identity()  # expose the 512-d pooled features
        # One head emits all trajectory coordinates plus K confidence logits.
        self.head = nn.Linear(backbone_features, self.num_preds + num_modes)

    def forward(self, x):
        f = self.backbone(x)
        out = self.head(f)  # (B, K*T*2 + K)
        pred, conf = out[:, :self.num_preds], out[:, self.num_preds:]
        pred = pred.view(-1, self.num_modes, self.future_len, 2)  # (B, K, T, 2)
        conf = torch.softmax(conf, dim=1)  # (B, K), rows sum to 1
        return pred, conf
```
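The slice-and-reshape logic of the head above can be checked in isolation, without downloading pretrained weights. The sketch below swaps the backbone for random features; the sizes and the names `feat_dim` and `features` are assumptions made only for this illustration.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for this shape check.
B, feat_dim, K, T = 4, 512, 3, 50
num_preds = K * T * 2

head = nn.Linear(feat_dim, num_preds + K)  # same output layout as the head above
features = torch.randn(B, feat_dim)        # stand-in for the backbone output

out = head(features)                              # (B, K*T*2 + K)
pred = out[:, :num_preds].reshape(B, K, T, 2)     # K candidate trajectories
conf = torch.softmax(out[:, num_preds:], dim=1)   # mode probabilities

print(pred.shape)  # torch.Size([4, 3, 50, 2])
print(conf.shape)  # torch.Size([4, 3])
```

Each row of `conf` sums to 1, so it can be read directly as a distribution over the K modes.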
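1. Size the head's output as `num_modes * 2 * future_len + num_modes`.
2. Slice off the first `num_modes * 2 * future_len` values as the trajectory chunk.
3. Reshape that chunk to `(B, K, T, 2)`: the K candidate trajectories.
4. Apply softmax over `dim=1` to the second chunk: the mode mixture weights.
5. Feed `(pred, conf)` into the K-mode NLL loss.

- Adds `~K*T*2` extra weights vs. a single-mode model.
- `.view()`: fails when the tensor's memory layout is incompatible with the target shape. Use `.reshape()` if upstream ops produce non-contiguous output.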
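The K-mode NLL loss mentioned above is not spelled out in this skill. One common formulation, assumed here, treats each candidate trajectory as the mean of a unit-variance Gaussian and `conf` as the mixture weights. The function name `multimodal_nll`, the `eps` parameter, and the toy tensors are hypothetical; this is a minimal sketch, not necessarily the loss the skill pairs with.

```python
import torch


def multimodal_nll(gt, pred, conf, eps=1e-12):
    """Assumed formulation: NLL of gt under a K-component mixture of
    unit-variance Gaussians centred on each candidate trajectory.

    gt:   (B, T, 2) ground-truth future positions
    pred: (B, K, T, 2) candidate trajectories
    conf: (B, K) mode probabilities (rows sum to 1)
    """
    # Squared error per mode, summed over time steps and coordinates: (B, K)
    sq_err = ((gt.unsqueeze(1) - pred) ** 2).sum(dim=(2, 3))
    # Log of each weighted Gaussian term, up to an additive constant
    log_terms = torch.log(conf + eps) - 0.5 * sq_err
    # logsumexp over modes keeps the mixture sum numerically stable
    return -torch.logsumexp(log_terms, dim=1).mean()


# Toy check: only mode 0 matches the ground truth.
B, K, T = 2, 3, 5
gt = torch.zeros(B, T, 2)
pred = torch.zeros(B, K, T, 2)
pred[:, 1:] += 10.0
confident = torch.tensor([[0.98, 0.01, 0.01]]).repeat(B, 1)
unsure = torch.full((B, K), 1.0 / K)

loss_confident = multimodal_nll(gt, pred, confident)
loss_unsure = multimodal_nll(gt, pred, unsure)
print(loss_confident.item() < loss_unsure.item())  # True
```

Putting more probability mass on the mode that matches the ground truth lowers the loss, which is what lets the confidences and the trajectories train jointly through the single head.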