skills/nlp/cyclical-lr-triangular/SKILL.md
Cyclical learning rate (CLR) Keras callback that oscillates LR between base and max each batch for faster convergence
npx skillsauth add wenmin-wu/ds-skills nlp-cyclical-lr-triangular
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Cyclical Learning Rate (Smith 2017) reduces the need to hand-tune a single fixed learning rate by sweeping it between base_lr and max_lr repeatedly during training. The triangular pattern often converges faster than a static LR, and the periodic increases can help the model escape sharp minima. Three modes are supported: triangular (constant amplitude), triangular2 (amplitude halves each cycle), and exp_range (amplitude decays exponentially, by a factor of gamma per cycle in this implementation).
import numpy as np
import keras.backend as K
from keras.callbacks import Callback


class CyclicLR(Callback):
    """Cyclical learning rate callback; sets a new LR after every batch."""

    # Note: optimizer.lr and K.set_value follow the Keras 2-style API.
    def __init__(self, base_lr=1e-3, max_lr=6e-3, step_size=2000,
                 mode='triangular', gamma=1.0):
        super().__init__()
        self.base_lr = base_lr
        self.max_lr = max_lr
        self.step_size = step_size  # iterations in half a cycle
        self.mode = mode
        self.gamma = gamma
        self.iterations = 0.
        # scale_fn takes the 1-based cycle index and returns the amplitude multiplier.
        if mode == 'triangular':
            self.scale_fn = lambda x: 1.
        elif mode == 'triangular2':
            self.scale_fn = lambda x: 1 / (2. ** (x - 1))
        elif mode == 'exp_range':
            self.scale_fn = lambda x: gamma ** x
        else:
            # Fail early instead of hitting a missing scale_fn during training.
            raise ValueError(f"Unknown mode: {mode!r}")

    def clr(self):
        # cycle: 1-based index of the current cycle.
        cycle = np.floor(1 + self.iterations / (2 * self.step_size))
        # x: distance from the cycle's peak, in units of step_size (0 at the peak, 1 at the ends).
        x = np.abs(self.iterations / self.step_size - 2 * cycle + 1)
        return self.base_lr + (self.max_lr - self.base_lr) * max(0, 1 - x) * self.scale_fn(cycle)

    def on_train_begin(self, logs=None):
        # Start training at the lower bound.
        K.set_value(self.model.optimizer.lr, self.base_lr)

    def on_batch_end(self, batch, logs=None):
        self.iterations += 1
        K.set_value(self.model.optimizer.lr, self.clr())


# Usage
clr = CyclicLR(base_lr=1e-3, max_lr=6e-3, step_size=2000, mode='triangular2')
model.fit(X, y, callbacks=[clr])
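To preview the schedule before training, the same formula can be evaluated without Keras. The following is a minimal sketch that mirrors the clr() method above; `triangular_lr` is a hypothetical helper name introduced here for illustration and is not part of the skill.

```python
import math


def triangular_lr(iteration, base_lr=1e-3, max_lr=6e-3, step_size=2000,
                  mode="triangular", gamma=1.0):
    """Learning rate after `iteration` batches (hypothetical helper mirroring clr())."""
    # 1-based index of the current cycle; a full cycle is 2 * step_size iterations.
    cycle = math.floor(1 + iteration / (2 * step_size))
    # Distance from the cycle's peak: 0 at the peak, 1 at either end.
    x = abs(iteration / step_size - 2 * cycle + 1)
    if mode == "triangular":
        scale = 1.0
    elif mode == "triangular2":
        scale = 1.0 / (2.0 ** (cycle - 1))
    elif mode == "exp_range":
        scale = gamma ** cycle
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x) * scale


# Sample the start, peak, and end of the first two cycles.
for it in (0, 2000, 4000, 6000, 8000):
    print(it, triangular_lr(it, mode="triangular2"))
```

With the defaults shown, the rate starts at base_lr, peaks at max_lr after step_size iterations, and returns to base_lr at the end of the cycle; under triangular2 the second peak reaches only halfway between base_lr and max_lr.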
- Pick the bounds from an LR range test: base_lr = the rate at which the loss starts to decrease, max_lr = the rate at which it starts to diverge.
- Set step_size to 2-8 times num_batches_per_epoch (a full cycle = 2 * step_size iterations).
- Pass the callback to model.fit: the LR updates every batch, no extra code required.
- Choose the mode: triangular2 decays amplitude over cycles for fine-tuning; exp_range suits very long training.
- Use triangular for short runs, triangular2 for longer ones (decays amplitude), exp_range when you want smooth decay.
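The step_size guideline above can be turned into a small calculation. This is a sketch under stated assumptions: `step_size_for` and its parameters are hypothetical names, the sample count and batch size are made-up illustration values, and the last partial batch of each epoch is assumed to be kept (hence the ceiling).

```python
import math


def step_size_for(num_samples, batch_size, epochs_per_half_cycle=4):
    """Iterations in half a cycle, as a multiple of the batches per epoch.

    epochs_per_half_cycle is the multiplier from the guideline (2-8 suggested).
    """
    # Assumes the final partial batch is kept, so round up.
    batches_per_epoch = math.ceil(num_samples / batch_size)
    return epochs_per_half_cycle * batches_per_epoch


# Hypothetical dataset: 50,000 samples, batch size 128.
step_size = step_size_for(50_000, 128, epochs_per_half_cycle=4)
full_cycle = 2 * step_size  # iterations per full cycle
print(step_size, full_cycle)
```

Because a full cycle spans 2 * step_size iterations, a multiplier of 4 means one cycle takes 8 epochs, so training for a whole number of cycles lets the run finish near base_lr.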