skills/timeseries/locale-scoped-holiday-flag/SKILL.md
Build a single per-row "day off" boolean from a holidays table with National/Regional/Local locale hierarchy and Work Day overrides that flip make-up working weekends back to working days
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add wenmin-wu/ds-skills timeseries-locale-scoped-holiday-flag
Retail and transaction datasets often ship with a holidays table that carries locale (National/Regional/Local) and type (Holiday/Event/Work Day/Transfer) columns. A naive .merge double-counts regional holidays, ignores make-up working days, and misses the fact that "Event" rows are not days off at all. The right recipe is a deterministic rule pass: seed dayoff from the day of week (Saturday and Sunday are off), then apply National holidays globally, Regional holidays by state match, and Local holidays by city match, and finally flip Work Day rows back to working so make-up Saturdays are modeled correctly. The result is one clean boolean that feeds your forecaster. This approach is used in top kernels for the Favorita Grocery Sales Forecasting competition.
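# Assumes `sales` has date, day, state, and city columns and `holidays` has
# date, type, locale, and locale_name columns (both pandas DataFrames).
# The [6, 7] seed assumes `day` uses ISO weekday numbering (Mon=1 ... Sun=7);
# with pandas' dt.dayofweek (Mon=0 ... Sun=6) the weekend would be [5, 6].
sales['dayoff'] = sales['day'].isin([6, 7])  # Sat/Sun seed
holidays = holidays[holidays['type'] != 'Event']  # events are not days off
for d, t, locale, name in zip(holidays.date, holidays.type,
                              holidays.locale, holidays.locale_name):
    mask = (sales.date == d)
    if t != 'Work Day':
        if locale == 'National':
            sales.loc[mask, 'dayoff'] = True
        elif locale == 'Regional':
            sales.loc[mask & (sales.state == name), 'dayoff'] = True
        else:  # Local
            sales.loc[mask & (sales.city == name), 'dayoff'] = True
    else:
        sales.loc[mask, 'dayoff'] = False  # make-up working day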
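The loop above applies Work Day rows in whatever order they appear in the holidays table. As a minimal sketch of a stricter variant, the rule pass can be wrapped in a function that applies Work Day rows in a separate final pass, so they always win. The helper name `add_dayoff_flag` and the toy rows are hypothetical and made up for illustration, and the weekend seed here assumes the date column is a pandas datetime (so `dt.dayofweek` gives Mon=0 ... Sun=6).

```python
import numpy as np
import pandas as pd


def add_dayoff_flag(sales, holidays):
    """Return a copy of `sales` with a boolean `dayoff` column (hypothetical helper)."""
    out = sales.copy()
    # Seed from the calendar: pandas dayofweek is Mon=0 ... Sun=6.
    out["dayoff"] = out["date"].dt.dayofweek >= 5

    hol = holidays[holidays["type"] != "Event"]  # events are not days off
    work = hol[hol["type"] == "Work Day"]
    off = hol[hol["type"] != "Work Day"]

    # Pass 1a: National holidays apply to every store.
    national = off.loc[off["locale"] == "National", "date"]
    out.loc[out["date"].isin(national), "dayoff"] = True

    # Pass 1b: Regional matches on state, Local matches on city.
    for locale, col in [("Regional", "state"), ("Local", "city")]:
        scoped = off[off["locale"] == locale]
        keys = set(zip(scoped["date"], scoped["locale_name"]))
        hit = np.fromiter(
            (pair in keys for pair in zip(out["date"], out[col])),
            dtype=bool,
            count=len(out),
        )
        out.loc[hit, "dayoff"] = True

    # Pass 2: Work Day rows last, so they override anything set above.
    out.loc[out["date"].isin(work["date"]), "dayoff"] = False
    return out


# Toy data, made up for illustration (not taken from the competition files).
dates = pd.to_datetime(["2016-01-01", "2016-01-09", "2016-01-10",
                        "2016-01-12", "2016-01-13", "2016-01-14"])
sales = pd.DataFrame(
    [(d, city, state) for d in dates
     for city, state in [("Quito", "Pichincha"), ("Guayaquil", "Guayas")]],
    columns=["date", "city", "state"],
)
holidays = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-01", "2016-01-09", "2016-01-12",
                            "2016-01-13", "2016-01-14"]),
    "type": ["Holiday", "Work Day", "Holiday", "Event", "Holiday"],
    "locale": ["National", "National", "Local", "National", "Regional"],
    "locale_name": ["Ecuador", "Ecuador", "Quito", "Ecuador", "Guayas"],
})

flagged = add_dayoff_flag(sales, holidays)
print(flagged)
```

In this toy run the Saturday marked as a Work Day ends up as a working day for both stores, the Local row only affects the Quito store, the Regional row only affects the store whose state is Guayas, and the Event row changes nothing.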
- Seed dayoff from day_of_week in {Sat, Sun}.
- Drop rows with type == 'Event': events are not days off.
- Apply Work Day rows last so they override any previously set dayoff (the precedence matters).
- Match locale_name against the store's state/city: this is why regional and local holidays don't leak into other regions.
- A single dayoff column is leak-proof when a date has overlapping entries; one-hot locale flags double-count.
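The double-counting problem is easy to reproduce. The sketch below uses two made-up holiday entries on the same date (the rows and the `unit_sales` values are hypothetical) and joins them to sales on date alone, which is one plausible reading of the "naive merge" the text warns about.

```python
import pandas as pd

sales = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-14", "2016-01-14"]),
    "city": ["Quito", "Guayaquil"],
    "state": ["Pichincha", "Guayas"],
    "unit_sales": [10.0, 20.0],
})

# Two overlapping entries on the same date (made-up rows).
holidays = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-14", "2016-01-14"]),
    "type": ["Holiday", "Event"],
    "locale": ["Regional", "National"],
    "locale_name": ["Guayas", "Ecuador"],
})

# Joining on date alone pairs every sales row with every holiday row.
naive = sales.merge(holidays, on="date", how="left")

rows_before, rows_after = len(sales), len(naive)
total_before = sales["unit_sales"].sum()
total_after = naive["unit_sales"].sum()

# The Quito store (state Pichincha) now carries the Guayas regional holiday.
leaked = naive[(naive["city"] == "Quito") & (naive["locale"] == "Regional")]

print(rows_before, rows_after)    # 2 4
print(total_before, total_after)  # 30.0 60.0
print(len(leaked))                # 1
```

Each sales row is duplicated once per holiday entry, so any aggregate computed on the merged frame is inflated, and the regional row attaches to stores outside its region. Collapsing everything into one per-row boolean avoids both effects.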