skills/cv/warmup-cosine-lr-schedule/SKILL.md
Combines gradual learning rate warmup with cosine annealing decay for stable fine-tuning of pretrained models.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add wenmin-wu/ds-skills cv-warmup-cosine-lr-schedule
Start training at a low learning rate, ramp it up linearly over a warmup period, then decay it along a cosine curve. Warmup keeps large gradient updates in the first steps from destroying the pretrained weights. Cosine decay then lowers the LR smoothly, with no abrupt drops, down to a small floor by the end of training, which often works better than step decay for fine-tuning.
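The decay phase can be written out explicitly. The symbols here are introduced only for illustration and do not come from the skill itself: $e$ is the 0-indexed epoch, $W$ the number of warmup epochs, $T$ the total number of epochs, $\eta_0$ the base learning rate, and $\eta_{\min}$ the floor. Under those assumptions, for $W \le e \le T$:

```latex
\eta(e) = \max\!\left(\eta_{\min},\;
  \frac{\eta_0}{2}\left(1 + \cos\!\left(\pi\,\frac{e - W}{T - W}\right)\right)\right)
```

At $e = W$ the cosine term equals 1, so the rate starts the decay at $\eta_0$; as $e$ approaches $T$ the cosine term approaches $-1$ and the rate settles at $\eta_{\min}$.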
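import math

import torch
from torch.optim.lr_scheduler import LambdaLR


def get_cosine_with_warmup(optimizer, warmup_epochs, total_epochs, min_lr=1e-7):
    """Cosine annealing with linear warmup."""
    base_lr = optimizer.defaults['lr']
    # Guard against division by zero when warmup covers the whole run.
    decay_epochs = max(1, total_epochs - warmup_epochs)

    def lr_lambda(epoch):
        if epoch < warmup_epochs:
            # Linear ramp; the +1 keeps the first epoch from running at LR 0.
            return (epoch + 1) / warmup_epochs
        progress = min(1.0, (epoch - warmup_epochs) / decay_epochs)
        # LambdaLR multiplies the base LR by this value, so the floor
        # is expressed as min_lr relative to the base LR.
        return max(min_lr / base_lr,
                   0.5 * (1 + math.cos(math.pi * progress)))

    return LambdaLR(optimizer, lr_lambda)


# Usage (model, train_loader, and train_one_epoch are defined elsewhere)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
scheduler = get_cosine_with_warmup(optimizer, warmup_epochs=2, total_epochs=30)
for epoch in range(30):
    train_one_epoch(model, train_loader, optimizer)
    scheduler.step()  # once per epoch, after the optimizer updates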
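To see what the schedule looks like with the settings from the usage example (base LR 3e-4, 2 warmup epochs, 30 epochs total), the multiplier can be tabulated without PyTorch. This is a minimal sketch, not the skill's own code: `lr_multiplier` is a hypothetical helper, and it assumes a warmup ramp of `(epoch + 1) / warmup_epochs` so that the first epoch trains at a nonzero rate. A ramp of `epoch / warmup_epochs` would instead give a learning rate of 0 for epoch 0.

```python
import math


def lr_multiplier(epoch, warmup_epochs, total_epochs, base_lr, min_lr=1e-7):
    """Factor applied to base_lr at a 0-indexed epoch (hypothetical helper)."""
    if epoch < warmup_epochs:
        # Assumed convention: the ramp reaches 1.0 on the last warmup epoch.
        return (epoch + 1) / warmup_epochs
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_epochs = max(1, total_epochs - warmup_epochs)
    progress = min(1.0, (epoch - warmup_epochs) / decay_epochs)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    # Never drop below min_lr, expressed as a fraction of base_lr.
    return max(min_lr / base_lr, cosine)


base_lr = 3e-4
schedule = [base_lr * lr_multiplier(e, 2, 30, base_lr) for e in range(30)]

# Print a few representative epochs: warmup, peak, midpoint, last epoch.
for epoch in (0, 1, 2, 16, 29):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.2e}")
```

Under these assumptions the rate is half the base LR in epoch 0, reaches the full 3e-4 in epoch 1, is back at half the base LR at the midpoint of the decay (epoch 16), and is below 1e-6 in the final epoch.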