skills/nlp/sqrt-mse-loss/SKILL.md
Uses the square root of MSE as the training loss so that training directly optimizes the RMSE evaluation metric.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add wenmin-wu/ds-skills nlp-sqrt-mse-loss
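When the competition metric is RMSE, training with plain MSE optimizes the square of the metric rather than the metric itself. Using sqrt(MSE) as the loss optimizes the reported metric directly. Both losses share the same minimizer, but the gradient of sqrt(MSE) is the MSE gradient scaled by 1/(2·RMSE), so the gradient magnitude no longer shrinks in proportion to the error as the fit improves. Reported gains are typically around 0.01-0.03 RMSE over plain MSE, especially when error magnitudes vary.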
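The relationship between the two gradients can be checked numerically. The following is a minimal sketch, assuming small random tensors that stand in for real predictions and targets (the names and sizes are illustrative, not part of the skill). It compares the autograd gradient of sqrt(MSE) against the MSE gradient divided by 2·RMSE.

```python
import torch

torch.manual_seed(0)
preds = torch.randn(16, requires_grad=True)  # illustrative predictions
targets = torch.randn(16)                    # illustrative targets

mse = torch.mean((preds - targets) ** 2)
rmse = torch.sqrt(mse)

# Gradients of each loss with respect to the predictions.
(grad_mse,) = torch.autograd.grad(mse, preds, retain_graph=True)
(grad_rmse,) = torch.autograd.grad(rmse, preds)

# Chain rule: d sqrt(MSE) = d MSE / (2 * sqrt(MSE)).
expected = grad_mse / (2 * rmse.detach())
grads_match = torch.allclose(grad_rmse, expected, atol=1e-6)
```

Because the two gradients differ only by a positive scalar per batch, switching losses changes the effective step size rather than the update direction, which is worth keeping in mind when tuning the learning rate.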
```python
import torch
import torch.nn as nn


class RMSELoss(nn.Module):
    def __init__(self, eps=1e-8):
        super().__init__()
        self.mse = nn.MSELoss()
        # eps keeps the sqrt gradient finite when the MSE is exactly zero
        self.eps = eps

    def forward(self, preds, targets):
        return torch.sqrt(self.mse(preds, targets) + self.eps)


# Usage: flatten both tensors so their shapes match before computing the loss
criterion = RMSELoss()
loss = criterion(model_output.view(-1), labels.view(-1))
loss.backward()
```
Replace nn.MSELoss() with RMSELoss() in the training loop.
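One caveat when reporting the metric: because of the square root, the average of per-batch RMSE values is generally not the RMSE over the whole validation set. Below is a minimal sketch of an evaluation helper that accumulates squared errors and takes a single square root at the end. The names `evaluate_rmse`, `model`, and `loader` are hypothetical and not part of the skill, and the loader is assumed to yield `(inputs, labels)` pairs.

```python
import torch


@torch.no_grad()
def evaluate_rmse(model, loader):
    """Return the RMSE over every sample yielded by `loader` (hypothetical helper)."""
    model.eval()
    sq_err_sum, n = 0.0, 0
    for inputs, labels in loader:
        preds = model(inputs).view(-1)
        labels = labels.view(-1)
        # Accumulate the sum of squared errors, not per-batch RMSE values.
        sq_err_sum += torch.sum((preds - labels) ** 2).item()
        n += labels.numel()
    # A single square root over the full dataset matches the competition metric.
    return (sq_err_sum / n) ** 0.5
```

Training can still use the per-batch RMSELoss; this helper only affects how the validation score is aggregated.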