skills/timeseries/multi-gap-lag-diff-features/SKILL.md
Generate shift/diff features at multiple lag sizes (1,2,3,5,10,20,50,100) over cursor/time/state series, then aggregate statistics per session
npx skillsauth add wenmin-wu/ds-skills timeseries-multi-gap-lag-diff-features
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
In event-stream data, a single-step diff (x.diff(1)) captures local change but misses medium- and long-range dynamics. Generating diffs at a roughly log-spaced ladder of lags (1, 2, 3, 5, 10, 20, 50, 100) gives you a multi-resolution view: in keystroke logs, lag 1 catches individual keystrokes, lag 10 catches roughly word-level motion, and lag 100 catches roughly sentence-level motion. Each lag column is then aggregated with mean/std/min/max per session. One ladder over three series (time, cursor, word count) expands to 3 × 8 × 4 = 96 features, or 128 when the absolute cursor change is aggregated as well, as in the code below. The feature block is expensive but consistently useful in keystroke / sensor / clickstream tasks.
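To make the multi-resolution idea concrete, here is a minimal sketch on a made-up cursor trace for a single session. The series values and the shortened ladder `LAGS` are assumptions chosen only so the toy data is long enough; they are not part of the skill itself.

```python
import pandas as pd

# Hypothetical cursor trace for one session: steady typing, then a jump back to edit.
cursor = pd.Series([0, 1, 2, 3, 4, 5, 6, 7, 2, 3, 4, 5])

# Shortened ladder (assumption) so that every lag fits inside the toy series.
LAGS = [1, 2, 5]

# x.diff(k) is equivalent to x - x.shift(k); the first k rows have no history
# and therefore come out as NaN.
ladder = pd.DataFrame({f"diff{k}": cursor.diff(k) for k in LAGS})

print(ladder)
```

At the jump (row 8) the lag-1 column shows the full backward move, while the lag-5 column shows only a small net displacement, which is the kind of contrast the ladder is meant to expose. Note that a session with `k` or fewer events yields only NaN in the lag-`k` column, so its aggregated statistics are NaN as well.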
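GAPS = [1, 2, 3, 5, 10, 20, 50, 100]

def multi_gap_features(df, id_col='id'):
    # Assumes rows are already in event order within each id.
    df = df.copy()  # do not mutate the caller's frame
    for gap in GAPS:
        # Time from the key release `gap` events back to the current key press
        df[f'up_time_shift{gap}'] = df.groupby(id_col)['up_time'].shift(gap)
        df[f'action_gap{gap}'] = df['down_time'] - df[f'up_time_shift{gap}']
        # Signed and absolute cursor movement over `gap` events
        df[f'cursor_shift{gap}'] = df.groupby(id_col)['cursor_position'].shift(gap)
        df[f'cursor_change{gap}'] = df['cursor_position'] - df[f'cursor_shift{gap}']
        df[f'cursor_abs_change{gap}'] = df[f'cursor_change{gap}'].abs()
        # Word-count change over `gap` events
        df[f'wc_shift{gap}'] = df.groupby(id_col)['word_count'].shift(gap)
        df[f'wc_change{gap}'] = df['word_count'] - df[f'wc_shift{gap}']
    # Aggregate lag columns per id. Match on the generated prefixes so that
    # input columns whose names merely contain 'gap' or 'change' are not picked up.
    prefixes = ('action_gap', 'cursor_change', 'cursor_abs_change', 'wc_change')
    agg_cols = [c for c in df.columns if c.startswith(prefixes)]
    return df.groupby(id_col)[agg_cols].agg(['mean', 'std', 'min', 'max'])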
For each gap in the ladder, compute shift(gap) within the session, then subtract the shifted value from the current value.
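The shift-then-subtract step can also be sketched in a more compact form. The following is an illustrative variant, not the skill's own code: the helper name `lag_diff_stats`, the synthetic two-session frame, and the flattened column naming are all assumptions. It relies on the fact that a grouped `diff(gap)` equals `x - x.shift(gap)` within each session, so it covers only same-column differences; a mixed feature such as `action_gap` (`down_time` minus shifted `up_time`) still needs the explicit shift.

```python
import numpy as np
import pandas as pd

GAPS = [1, 2, 3, 5, 10, 20, 50, 100]

def lag_diff_stats(df, value_cols, id_col="id", gaps=GAPS):
    """Per-session mean/std/min/max of x - x.shift(gap) for each column and gap."""
    grouped = df.groupby(id_col)
    # Grouped diff(gap) is x - x.shift(gap) computed within each session.
    diffs = {
        f"{col}_diff{gap}": grouped[col].diff(gap)
        for col in value_cols
        for gap in gaps
    }
    feats = pd.DataFrame(diffs, index=df.index)
    feats[id_col] = df[id_col]
    out = feats.groupby(id_col).agg(["mean", "std", "min", "max"])
    # Flatten the (column, statistic) MultiIndex into single names.
    out.columns = [f"{col}_{stat}" for col, stat in out.columns]
    return out

# Synthetic data (assumption): two sessions of 150 events each, in event order.
rng = np.random.default_rng(0)
n = 150
demo = pd.DataFrame({
    "id": np.repeat(["a", "b"], n),
    "cursor_position": np.concatenate(
        [rng.integers(0, 5, n).cumsum() for _ in range(2)]
    ),
    "word_count": np.concatenate(
        [rng.integers(0, 2, n).cumsum() for _ in range(2)]
    ),
})

features = lag_diff_stats(demo, ["cursor_position", "word_count"])
print(features.shape)  # 2 sessions x (2 columns * 8 gaps * 4 statistics)
```

Building all diff columns in one dictionary and constructing the frame once avoids repeatedly inserting columns into a growing DataFrame, and flattening the column index makes the result easier to merge with other per-session features.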