skills/nlp/checkpoint-ensemble-exponential/SKILL.md
Average predictions from each epoch checkpoint with exponentially increasing weights (2^epoch), favoring later, more-converged snapshots.
npx skillsauth add wenmin-wu/ds-skills nlp-checkpoint-ensemble-exponential
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Instead of using only the best checkpoint, collect predictions from every epoch and average them with exponential weights (2^epoch). Later epochs get more weight because they are more converged, while earlier epochs still contribute diversity. This adds no extra training time: the only overhead is one inference pass over the test set per epoch, with the predictions saved each time.
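To make the weighting concrete, the sketch below computes the normalized 2^epoch weights for a 5-epoch run. The helper name `exponential_weights` is hypothetical and used only for illustration; epochs are assumed to be numbered from 0.

```python
import numpy as np


def exponential_weights(n_epochs):
    """Return normalized 2^epoch weights for epochs 0..n_epochs-1.

    Hypothetical helper for illustration; not part of the skill itself.
    """
    raw = np.array([2.0 ** epoch for epoch in range(n_epochs)])
    return raw / raw.sum()


# Raw weights for 5 epochs are 1, 2, 4, 8, 16 (sum 31)
weights = exponential_weights(5)
print(weights)
```

With 5 epochs the final checkpoint carries 16/31 of the total weight (a little over half), and the first carries 1/31, so early epochs act as a small diversity term rather than an equal vote.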
import numpy as np
import torch


def checkpoint_ensemble_predict(model, train_fn, test_loader, n_epochs,
                                device='cuda'):
    """Train model and collect exponentially-weighted checkpoint predictions.

    Args:
        model: PyTorch model (assumed to already be on `device`)
        train_fn: function(model, epoch) that trains one epoch
        test_loader: DataLoader for test data, yielding input tensors
        n_epochs: total training epochs
        device: device used for inference
    Returns:
        averaged predictions array
    """
    all_preds = []
    # Epoch i gets weight 2^i; np.average normalizes by the sum of weights.
    weights = [2 ** epoch for epoch in range(n_epochs)]
    for epoch in range(n_epochs):
        # train_fn is expected to switch the model back to train mode itself.
        train_fn(model, epoch)
        model.eval()
        epoch_preds = []
        with torch.no_grad():
            for x_batch in test_loader:
                logits = model(x_batch.to(device))
                epoch_preds.append(torch.sigmoid(logits).cpu().numpy())
        all_preds.append(np.concatenate(epoch_preds))
    return np.average(all_preds, weights=weights, axis=0)


# Usage
predictions = checkpoint_ensemble_predict(model, train_one_epoch, test_loader, 5)
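If the per-epoch predictions have already been saved as arrays, the same blend can be computed offline without touching the model. This is a minimal sketch under that assumption; `ensemble_saved_predictions` is a hypothetical helper name and the toy arrays are made up for illustration.

```python
import numpy as np


def ensemble_saved_predictions(epoch_preds):
    """Blend per-epoch prediction arrays with 2^epoch weights.

    Hypothetical helper. `epoch_preds` is a list of arrays with identical
    shape, ordered from the first epoch to the last.
    """
    stacked = np.stack(epoch_preds)  # shape: (n_epochs, n_samples, ...)
    weights = 2.0 ** np.arange(len(epoch_preds))
    # np.average divides by the sum of weights, so no manual normalization.
    return np.average(stacked, axis=0, weights=weights)


# Toy data: three epochs, two samples each
preds = [np.array([0.2, 0.8]), np.array([0.4, 0.6]), np.array([0.6, 0.4])]
blended = ensemble_saved_predictions(preds)
print(blended)
```

Here the weights are 1, 2, 4, so the first sample blends to (0.2 + 0.8 + 2.4) / 7 and the second to (0.8 + 1.2 + 1.6) / 7, both pulled toward the last epoch's values.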