skills/timeseries/snap-event-interaction-features/SKILL.md
Build per-state SNAP / event-flag interaction features by multiplying the binary flag with the sales and revenue columns segmented by state, capturing the demand uplift on government-benefit days that affects only specific geographies and product categories
npx skillsauth add wenmin-wu/ds-skills timeseries-snap-event-interaction-features
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Retail forecasts in welfare-supported markets (SNAP days in the US, holiday allowances in the EU) need to account for demand spikes of 30-50% on benefit-credit days, but only in the right state and only for the right product mix. A flat is_snap_day flag isn't enough; you need to localize it: SNAP-CA only matters in California stores, SNAP-grocery only matters for food categories. The cleanest encoding is one pair of interaction columns per state, one for SNAP days and one for non-SNAP days, built by multiplying the per-row sales by (state_id == X) * snap_X and by (state_id == X) * (1 - snap_X). The resulting columns are sparse (most rows are zero) but capture an effect that no single global event flag can. The same pattern works for any geography-conditional event: holidays, weather alerts, regional promotions.
import pandas as pd

# df: long-format sales panel with columns
# date, state_id, sold, sell_price, snap_CA, snap_TX, snap_WI
states = ['CA', 'TX', 'WI']

for state in states:
    snap_col = f'snap_{state}'
    in_state = (df['state_id'] == state)  # boolean mask, acts as 0/1 when multiplied
    # Units sold in this state, split by whether the state's SNAP flag is on
    df[f'sold_{state}_snap'] = df['sold'] * in_state * df[snap_col]
    df[f'sold_{state}_nonsnap'] = df['sold'] * in_state * (1 - df[snap_col])
    # Revenue counterparts of the two columns above
    df[f'rev_{state}_snap'] = df[f'sold_{state}_snap'] * df['sell_price']
    df[f'rev_{state}_nonsnap'] = df[f'sold_{state}_nonsnap'] * df['sell_price']

# Aggregate to (date, state) for cross-state comparison
snap_agg = (
    df.groupby(['date', 'state_id'])
    [[f'sold_{s}_snap' for s in states] +
     [f'sold_{s}_nonsnap' for s in states]]
    .sum()
)
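The loop above can be wrapped in a small helper so the same encoding applies to other event prefixes. The sketch below is illustrative only: `add_event_interactions`, `event_lift`, and the toy panel are hypothetical names and data that are not part of the skill. It assumes one binary flag column per geography named `<prefix>_<geo>`, and it summarises the snap versus non-snap contrast as a ratio of mean daily totals, which is one possible choice among several.

```python
import pandas as pd


def add_event_interactions(df, value_col, geo_col, geos, event_prefix):
    """Add <value>_<geo>_<event> and <value>_<geo>_non<event> columns.

    Assumes a binary flag column f"{event_prefix}_{geo}" exists for each geo.
    """
    out = df.copy()
    for geo in geos:
        in_geo = (out[geo_col] == geo).astype(int)
        flag = out[f"{event_prefix}_{geo}"].astype(int)
        out[f"{value_col}_{geo}_{event_prefix}"] = out[value_col] * in_geo * flag
        out[f"{value_col}_{geo}_non{event_prefix}"] = out[value_col] * in_geo * (1 - flag)
    return out


def event_lift(feats, geo, value_col="sold", event_prefix="snap", date_col="date"):
    """Relative lift of mean daily totals on event days over non-event days.

    Assumes the geo has at least one event day and one non-event day,
    and that non-event sales are non-zero.
    """
    flag = feats[f"{event_prefix}_{geo}"]
    on_days = feats.loc[flag == 1, date_col].nunique()
    off_days = feats.loc[flag == 0, date_col].nunique()
    on_mean = feats[f"{value_col}_{geo}_{event_prefix}"].sum() / on_days
    off_mean = feats[f"{value_col}_{geo}_non{event_prefix}"].sum() / off_days
    return on_mean / off_mean - 1


# Toy two-day, two-state panel (made-up numbers, for illustration only).
toy = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-01", "2016-01-01", "2016-01-02", "2016-01-02"]),
    "state_id": ["CA", "TX", "CA", "TX"],
    "sold": [10, 4, 6, 5],
    "snap_CA": [1, 1, 0, 0],  # calendar-level flag: same value for every row of a date
    "snap_TX": [0, 0, 1, 1],
})

feats = add_event_interactions(toy, "sold", "state_id", ["CA", "TX"], "snap")
lift_ca = event_lift(feats, "CA")
lift_tx = event_lift(feats, "TX")
```

Only the row that is both in the state and on an active flag day carries a non-zero value in the `_snap` column, which is the sparsity noted earlier. Passing a different `event_prefix` (for example a hypothetical `holiday` with `holiday_CA`, `holiday_TX` columns) reuses the same helper unchanged.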
- state_id, snap_CA, snap_TX, snap_WI for M5; analogous for any retail panel
- event_active and event_inactive
- _snap and _nonsnap columns: the difference _snap - _nonsnap is what the model learns the lift from; one column alone hides the baseline.
- state_id; panel-level merges can leak between rows.
- snap_* with holiday_*, promo_*, weather_alert_*.

data-ai
Scaled Pinball Loss (SPL) metric for evaluating quantile forecasts, normalized by mean absolute successive differences of training data
data-ai
Walk backward through a time series and multiplicatively rescale segments when jumps exceed a fraction of the running mean to correct data collection anomalies
testing
Transform forecasting target to next/current ratio minus one so that optimizing MAE or squared error implicitly minimizes SMAPE
tools
Convert point forecasts to prediction intervals by scaling with logit-transformed quantile ratios passed through a Normal CDF