skills/nlp/masked-bce-span-loss/SKILL.md
Binary cross-entropy loss with mask to ignore special and padding tokens in token-level span classification
npx skillsauth add wenmin-wu/ds-skills nlp-masked-bce-span-loss
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
For token-level span classification, special tokens (CLS, SEP, PAD) should not contribute to the loss. Label them as -1 during preprocessing, then use torch.masked_select to exclude them before computing mean BCE loss.
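import torch
import torch.nn as nn

# Per-element loss, so masked positions can be dropped before averaging
criterion = nn.BCEWithLogitsLoss(reduction="none")

def masked_bce_loss(logits, labels):
    """BCE loss that ignores tokens labeled -1 (special/padding)."""
    # Flatten and match the logits dtype (the loss expects float targets)
    labels = labels.view(-1, 1).to(logits.dtype)
    loss = criterion(logits.view(-1, 1), labels)
    mask = labels != -1
    return torch.masked_select(loss, mask).mean()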
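To check that the -1 positions really drop out, the sketch below runs the loss on a toy sequence and compares it with plain BCE over only the real tokens. The tensor values and the `[CLS] tok tok [SEP] [PAD]` layout are made up for illustration, and `masked_bce_loss` is restated so the snippet runs on its own.

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss(reduction="none")

def masked_bce_loss(logits, labels):
    # Restated from above so this snippet is self-contained
    labels = labels.view(-1, 1).to(logits.dtype)
    loss = criterion(logits.view(-1, 1), labels)
    return torch.masked_select(loss, labels != -1).mean()

# Hypothetical batch: one sequence laid out as [CLS] tok tok [SEP] [PAD]
logits = torch.tensor([[2.0, -1.0, 0.5, 3.0, -4.0]])
labels = torch.tensor([[-1.0, 0.0, 1.0, -1.0, -1.0]])

masked = masked_bce_loss(logits, labels)

# Reference: ordinary mean BCE over the two real tokens only
reference = nn.BCEWithLogitsLoss()(logits[0, 1:3], labels[0, 1:3])

# Changing logits at masked positions should leave the loss unchanged
perturbed = logits.clone()
perturbed[0, 0] = 100.0
masked_perturbed = masked_bce_loss(perturbed, labels)
```

One caveat: if every token in a batch is labeled -1, the mask selects nothing and `.mean()` returns NaN, so callers may want to guard against fully masked batches.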
import numpy as np

def create_token_labels(tokenizer, text, char_spans, max_len):
    # Offsets and sequence_ids() require a fast tokenizer
    enc = tokenizer(text, max_length=max_len, padding="max_length",
                    truncation=True, return_offsets_mapping=True,
                    add_special_tokens=True)
    labels = np.zeros(max_len)
    # Mark non-text tokens (special and padding) as -1
    labels[np.array(enc.sequence_ids()) != 0] = -1
    # Mark tokens that lie fully inside a character span as 1
    for start, end in char_spans:
        for i, (tok_start, tok_end) in enumerate(enc["offset_mapping"]):
            if tok_start >= start and tok_end <= end and labels[i] != -1:
                labels[i] = 1.0
    return labels
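The labeling rule can be traced without loading a tokenizer. The sketch below applies the same logic to hand-written offsets and sequence ids for the text "I love cats". The helper name `label_from_offsets` and the toy tokenization are assumptions for illustration, not part of the skill.

```python
import numpy as np

def label_from_offsets(offsets, sequence_ids, char_spans):
    """Hypothetical helper: same rule as create_token_labels, minus the tokenizer."""
    labels = np.zeros(len(offsets), dtype=np.float32)
    # Special and padding tokens have sequence id None, so they get -1
    special = np.array([sid != 0 for sid in sequence_ids])
    labels[special] = -1
    for start, end in char_spans:
        for i, (tok_start, tok_end) in enumerate(offsets):
            # A token is positive only if it lies fully inside the span
            if labels[i] != -1 and tok_start >= start and tok_end <= end:
                labels[i] = 1.0
    return labels

# Assumed tokenization of "I love cats": [CLS] I love cats [SEP] [PAD]
offsets = [(0, 0), (0, 1), (2, 6), (7, 11), (0, 0), (0, 0)]
sequence_ids = [None, 0, 0, 0, None, None]

# Character span 2..11 covers "love cats"
labels = label_from_offsets(offsets, sequence_ids, [(2, 11)])
print(labels.tolist())  # [-1.0, 0.0, 1.0, 1.0, -1.0, -1.0]
```

Because the rule requires full containment, a token that only partly overlaps a span stays at 0. Whether that is the right choice depends on how the spans were annotated.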