skills/nlp/mean-max-concat-pooling/SKILL.md
Concatenates token-level mean pooling and max pooling from the last hidden state for a richer sequence representation.
npx skillsauth add wenmin-wu/ds-skills nlp-mean-max-concat-pooling
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Mean pooling captures the average semantic content; max pooling highlights the most salient features. Concatenating both gives a representation that is both smooth (mean) and discriminative (max). This 2x-width vector feeds into the classification head, often outperforming either pooling strategy alone.
import torch
import torch.nn as nn


class MeanMaxPooling(nn.Module):
    def forward(self, last_hidden_state, attention_mask):
        # Mean pooling (mask-aware): average over real tokens only
        mask = attention_mask.unsqueeze(-1).float()
        mean_pool = (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        # Max pooling: fill padding with the dtype's most negative value so it
        # never wins the max (a literal -1e9 is outside the fp16 range)
        fill = torch.finfo(last_hidden_state.dtype).min
        masked = last_hidden_state.masked_fill(~attention_mask.unsqueeze(-1).bool(), fill)
        max_pool, _ = masked.max(dim=1)
        return torch.cat([mean_pool, max_pool], dim=1)  # (batch, 2 * hidden_size)


# Usage:
mean_max_pool = MeanMaxPooling()
classifier = nn.Linear(config.hidden_size * 2, num_classes)  # create once, e.g. in __init__
outputs = model(input_ids, attention_mask=attention_mask)
pooled = mean_max_pool(outputs.last_hidden_state, attention_mask)
logits = classifier(pooled)
Input: last_hidden_state from transformer, shape (batch, seq_len, hidden_size)
Output: pooled vector of width hidden_size * 2
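To see the mask handling at work without loading a transformer, the following is a minimal sketch on a hand-made batch. The tensor values, the 3-class head, and the use of `torch.finfo(...).min` as the padding fill are illustrative assumptions rather than part of the skill. The pooling module is restated so the block runs on its own.

```python
import torch
import torch.nn as nn


class MeanMaxPooling(nn.Module):
    """Mask-aware mean + max pooling, restated so this block is self-contained."""

    def forward(self, last_hidden_state, attention_mask):
        mask = attention_mask.unsqueeze(-1).float()
        mean_pool = (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        # Assumption: fill padding with the dtype's minimum so it never wins the max.
        fill = torch.finfo(last_hidden_state.dtype).min
        masked = last_hidden_state.masked_fill(~attention_mask.unsqueeze(-1).bool(), fill)
        max_pool, _ = masked.max(dim=1)
        return torch.cat([mean_pool, max_pool], dim=1)


# Toy batch (made-up values): 2 sequences, 3 positions, hidden_size = 2.
hidden = torch.tensor([
    [[1.0, -2.0], [3.0, 4.0], [100.0, 100.0]],   # third position is padding
    [[-1.0, -5.0], [-3.0, -2.0], [-2.0, -8.0]],  # no padding
])
attention_mask = torch.tensor([[1, 1, 0], [1, 1, 1]])

pooled = MeanMaxPooling()(hidden, attention_mask)
print(pooled.shape)  # torch.Size([2, 4]): 2 * hidden_size per sequence
print(pooled)        # row 0: mean [2, 1], max [3, 4]; row 1: mean [-2, -5], max [-1, -2]

# Hypothetical 3-class head; its input width must be 2 * hidden_size.
head = nn.Linear(2 * hidden.size(-1), 3)
logits = head(pooled)
print(logits.shape)  # torch.Size([2, 3])
```

The padded position holds 100.0 in both features, yet it changes neither the mean nor the max of the first sequence. The second sequence is all negative, which is why padding must be filled with a very negative value instead of zero before taking the max.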