skills/nlp/auxiliary-target-multitask/SKILL.md
Train the main target alongside auxiliary sub-type targets as multiple output heads to regularize the model and improve generalization.
npx skillsauth add wenmin-wu/ds-skills nlp-auxiliary-target-multitask
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
When related labels exist (e.g., toxicity + severe_toxicity + obscene + insult), train them as auxiliary outputs sharing the same backbone. The auxiliary heads act as implicit regularization, forcing the shared representation to capture more general patterns. At inference, use only the main head.
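import torch
import torch.nn as nn


class MultiTaskModel(nn.Module):
    def __init__(self, backbone, hidden_dim, n_aux_targets):
        super().__init__()
        self.backbone = backbone  # shared encoder producing (batch, hidden_dim) features
        self.main_head = nn.Linear(hidden_dim, 1)
        self.aux_head = nn.Linear(hidden_dim, n_aux_targets)

    def forward(self, x):
        features = self.backbone(x)
        main_out = self.main_head(features)  # logits, shape (batch, 1)
        aux_out = self.aux_head(features)    # logits, shape (batch, n_aux_targets)
        return main_out, aux_out


# Training
# `train`, `dataloader`, `model`, and `x_test` are assumed to be defined elsewhere.
# `optimizer` is also assumed, e.g. torch.optim.Adam(model.parameters()).
aux_columns = ['severe_toxicity', 'obscene', 'identity_attack', 'insult', 'threat']
y_aux = train[aux_columns].values

criterion = nn.BCEWithLogitsLoss()
for x_batch, y_main, y_aux_batch in dataloader:
    optimizer.zero_grad()
    main_pred, aux_pred = model(x_batch)
    # Reshape the main target to (batch, 1) so it matches the main head's output
    loss_main = criterion(main_pred, y_main.float().view_as(main_pred))
    loss_aux = criterion(aux_pred, y_aux_batch.float())
    loss = loss_main + loss_aux  # equal weighting
    loss.backward()
    optimizer.step()

# Inference: only use main head
model.eval()
with torch.no_grad():
    main_pred, _ = model(x_test)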
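The snippet above weights the two losses equally. As a rough, self-contained sketch of the same idea, the example below runs the whole loop end to end on random toy data and adds a tunable weight on the auxiliary loss. The `multitask_loss` helper, the `aux_weight` parameter and its value of 0.5, the small MLP backbone, and the synthetic tensors are all assumptions made for illustration; none of them come from the original skill.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class MultiTaskModel(nn.Module):
    def __init__(self, backbone, hidden_dim, n_aux_targets):
        super().__init__()
        self.backbone = backbone
        self.main_head = nn.Linear(hidden_dim, 1)
        self.aux_head = nn.Linear(hidden_dim, n_aux_targets)

    def forward(self, x):
        features = self.backbone(x)
        return self.main_head(features), self.aux_head(features)


def multitask_loss(main_logits, aux_logits, y_main, y_aux, aux_weight=0.5):
    """Hypothetical helper: main BCE loss plus a down-weighted auxiliary BCE loss."""
    bce = nn.BCEWithLogitsLoss()
    loss_main = bce(main_logits.squeeze(-1), y_main)
    loss_aux = bce(aux_logits, y_aux)
    return loss_main + aux_weight * loss_aux


# Toy stand-ins for real features and labels (random, for illustration only)
n_samples, in_dim, hidden_dim, n_aux = 64, 16, 32, 5
x = torch.randn(n_samples, in_dim)
y_main = torch.randint(0, 2, (n_samples,)).float()
y_aux = torch.randint(0, 2, (n_samples, n_aux)).float()

# A tiny MLP stands in for a real text encoder
backbone = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
model = MultiTaskModel(backbone, hidden_dim, n_aux)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for _ in range(20):
    optimizer.zero_grad()
    main_logits, aux_logits = model(x)
    loss = multitask_loss(main_logits, aux_logits, y_main, y_aux, aux_weight=0.5)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Inference: keep the main head, discard the auxiliary outputs
model.eval()
with torch.no_grad():
    main_logits, _ = model(x)
    probs = torch.sigmoid(main_logits).squeeze(-1)
```

In practice `aux_weight` would likely be treated as a hyperparameter and tuned on validation performance of the main target, since the auxiliary heads are discarded at inference.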