skills/timeseries/recursive-multistep-forecasting/SKILL.md
Forecast a multi-step horizon by predicting one day ahead, writing the prediction back into the panel as the new "actual", recomputing all lag and rolling features that depend on it, then predicting the next day — turns a one-step LightGBM regressor into a 28-day forecaster without changing the model
npx skillsauth add wenmin-wu/ds-skills timeseries-recursive-multistep-forecasting
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Direct multi-step forecasting (one model per horizon day) is expensive: a 28-day horizon needs 28 models. Recursive forecasting trains a single one-step model and reuses it for every horizon day by feeding its own predictions back as inputs. The catch is that lag features (sales.shift(7)) and rolling features (sales.rolling(28).mean()) at horizon day H+k depend on predictions from days H..H+k-1, so you must recompute them inside the prediction loop, not once before it. Done correctly, you get 28-day forecasts from the same model that scored well on 1-step validation. Done wrong (features precomputed before any prediction is written back), the lag and rolling inputs for later horizon days are missing or stale, and the forecasts are unreliable.
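To make the dependency concrete, here is a minimal sketch on a toy series. The dates, the 14-day horizon, and the variable names are illustrative assumptions, not part of the original skill. It computes a 7-day lag once, before any prediction exists, and counts how many horizon days end up with a missing feature.

```python
import numpy as np
import pandas as pd

# Toy series (hypothetical): 35 days of known sales, then a 14-day horizon
# whose target is still unknown (NaN).
dates = pd.date_range("2024-01-01", periods=49, freq="D")
sales = pd.Series(
    np.r_[np.arange(35, dtype=float), np.full(14, np.nan)], index=dates
)

# Feature computed once, before any prediction has been written back.
lag_7 = sales.shift(7)

horizon = dates[35:]
# Horizon days 1-7 look back at real sales; days 8-14 look back at NaN.
n_missing = int(lag_7.loc[horizon].isna().sum())
print(n_missing)
```

In this toy setup, the second half of the horizon has no usable lag value until earlier predictions are written back into the series, which is the reason for recomputing features inside the loop.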
import pandas as pd
from datetime import timedelta

# Assumed to exist already: model, train_cols, max_lags, alpha,
# first_forecast_day, and create_features (adds feature columns in place).
base_test = build_panel_with_unknown_target()  # rows for entire horizon, sales=NaN

for h in range(1, 29):  # 28-day horizon
    day = first_forecast_day + timedelta(days=h - 1)
    # window includes max_lags days before today so we can recompute features
    window = base_test[
        (base_test.date >= day - timedelta(days=max_lags)) &
        (base_test.date <= day)
    ].copy()
    create_features(window)  # lags + rollings using up-to-day data
    today = window.loc[window.date == day, train_cols]
    yhat = alpha * model.predict(today)  # alpha = bias-correction multiplier
    # write back so the next iteration's lags and rollings see this prediction
    base_test.loc[base_test.date == day, 'sales'] = yhat
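The loop above relies on several names defined elsewhere. The following is a self-contained sketch of the same idea for a single series, under stated assumptions: `MeanOfFeatures` is a hypothetical stand-in for a fitted one-step regressor such as LightGBM, and `MAX_LAGS`, `TRAIN_COLS`, the feature names, and the toy panel are made up for illustration. A real panel with many series would need the shifts grouped per series id.

```python
from datetime import timedelta

import numpy as np
import pandas as pd

MAX_LAGS = 14  # hypothetical: longest lookback any feature needs (lag 7 + window 7)
TRAIN_COLS = ["lag_7", "rmean_7_7"]  # hypothetical feature names


def create_features(df):
    """Add lag and rolling features in place; df must be sorted by date."""
    df["lag_7"] = df["sales"].shift(7)
    df["rmean_7_7"] = df["sales"].shift(7).rolling(7).mean()


class MeanOfFeatures:
    """Stand-in for a fitted one-step model: predicts the mean of its inputs."""

    def predict(self, X):
        return X.to_numpy().mean(axis=1)


def recursive_forecast(panel, model, first_day, horizon, alpha=1.0):
    panel = panel.sort_values("date").reset_index(drop=True)
    for h in range(horizon):
        day = first_day + timedelta(days=h)
        # Only the last MAX_LAGS days are needed to rebuild today's features.
        mask = (panel["date"] >= day - timedelta(days=MAX_LAGS)) & (panel["date"] <= day)
        window = panel.loc[mask].copy()
        create_features(window)
        today = window.loc[window["date"] == day, TRAIN_COLS]
        # Write the prediction back so later days can use it as a lag input.
        panel.loc[panel["date"] == day, "sales"] = alpha * model.predict(today)
    return panel


# Toy panel: 28 known days at a constant level, then 28 unknown horizon days.
dates = pd.date_range("2024-01-01", periods=56, freq="D")
panel = pd.DataFrame({"date": dates, "sales": 10.0})
panel.loc[panel["date"] >= dates[28], "sales"] = np.nan

result = recursive_forecast(panel, MeanOfFeatures(), dates[28], horizon=28)
```

Because the stand-in model averages its lag inputs and the history is constant, every horizon day is filled with the same level. The point of the sketch is only that no horizon row is left as NaN once predictions are fed back.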
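1. Build the test panel with rows for the whole horizon and sales=NaN.
2. Loop h over the horizon days.
3. Slice the window [day - max_lags, day] so feature creation has enough history.
4. Recompute the lag and rolling features on the window, predict the current day, and apply the bias correction (alpha ≈ 1.02-1.03 for Poisson).
5. Write the prediction back into base_test so the next iteration's features see it.

- Window: features for the current day only need the rows in [day - max_lags, day]. The rest is irrelevant for today's prediction.
- alpha: Poisson and Tweedie LightGBM consistently underpredict by 2-3%; multiply by ~1.02-1.03 (tuned on validation) before writing back, otherwise the bias compounds across the horizon.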
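The text says alpha is tuned on validation but does not say how. One simple option, offered here as an assumption and not as the author's method, is the ratio of total actuals to total predictions on a held-out window. The function name `fit_alpha` and the toy numbers are hypothetical.

```python
import numpy as np


def fit_alpha(y_true, y_pred):
    """Ratio of total actuals to total predictions on a validation window."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(y_true.sum() / y_pred.sum())


# Toy validation data: a hypothetical model that underpredicts by a fixed factor.
y_val = np.array([10.0, 20.0, 30.0, 40.0])
y_hat = y_val / 1.025

alpha = fit_alpha(y_val, y_hat)
print(round(alpha, 3))
```

A grid search over candidate alpha values against the validation metric is another reasonable choice. Either way, the multiplier is applied before the write-back so that later lag features are built from corrected values.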