skills/timeseries/neighbor-average-nan-interpolation/SKILL.md
Fill gaps in a daily exogenous series (oil prices, sensor feeds) by merging against a full calendar to expose NaNs, then replacing each NaN with the midpoint of its nearest valid left and right neighbors, walking outward past consecutive NaN runs
npx skillsauth add wenmin-wu/ds-skills timeseries-neighbor-average-nan-interpolation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
When you merge a sparse daily series (oil price, weather) against a full calendar for forecasting features, weekends and holidays become NaN rows that break downstream models. ffill propagates Friday's value across Saturday and Sunday, biasing weekend-sensitive signals; interpolate() handles interior gaps but, with its default arguments, leaves leading NaNs unfilled. The midpoint-neighbor recipe is the deterministic workhorse: build the full calendar, expose the NaN positions, walk outward past any consecutive NaN run to find the next valid value on each side, and assign the midpoint of those two values. Edge NaNs fall back to the single available neighbor.
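import numpy as np
import pandas as pd

# raw_oil (columns 'date', 'dcoilwtico'), min_d and max_d come from your data
calendar = pd.DataFrame({'date': pd.date_range(min_d, max_d)})
oil = calendar.merge(raw_oil, on='date', how='left').reset_index(drop=True)

# Look up neighbors in a snapshot so already-filled rows never act as neighbors
orig = oil['dcoilwtico'].copy()
na_idx = oil.index[orig.isnull()].values
for i in na_idx:
    # Walk left past consecutive NaNs to the nearest valid value
    left = i - 1
    while left >= 0 and pd.isna(orig.loc[left]):
        left -= 1
    # Walk right past consecutive NaNs to the nearest valid value
    right = i + 1
    while right < len(oil) and pd.isna(orig.loc[right]):
        right += 1
    if left < 0 and right >= len(oil):
        break  # no valid value anywhere in the series; nothing to fill from
    if left < 0:
        oil.loc[i, 'dcoilwtico'] = orig.loc[right]  # leading edge
    elif right >= len(oil):
        oil.loc[i, 'dcoilwtico'] = orig.loc[left]   # trailing edge
    else:
        oil.loc[i, 'dcoilwtico'] = (orig.loc[left] + orig.loc[right]) / 2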
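The same midpoint rule can also be sketched without an explicit loop. This is a minimal illustration, not part of the original skill: the function name `fill_neighbor_midpoint` and the toy series are hypothetical. It relies on the fact that `ffill()` yields the nearest valid value to the left of each position and `bfill()` the nearest valid value to the right, so every NaN in a run receives the midpoint of the two values bounding that run.

```python
import numpy as np
import pandas as pd


def fill_neighbor_midpoint(s: pd.Series) -> pd.Series:
    """Fill NaNs with the midpoint of the nearest valid value on each side.

    Hypothetical helper for illustration only.
    """
    left = s.ffill()   # nearest valid value at or before each position
    right = s.bfill()  # nearest valid value at or after each position
    mid = (left + right) / 2
    # At the edges only one side exists, so the sum above is NaN there;
    # fall back to whichever neighbor is available.
    return mid.fillna(left).fillna(right)


# Toy series: leading NaN, a two-day gap, a one-day gap, trailing NaN.
prices = pd.Series([np.nan, 10.0, np.nan, np.nan, 14.0, np.nan, 20.0, np.nan])
filled = fill_neighbor_midpoint(prices)
print(filled.tolist())
# [10.0, 10.0, 12.0, 12.0, 14.0, 17.0, 20.0, 20.0]
```

Positions that were already valid are unchanged, because their left and right lookups both return the value itself. On long series this avoids a Python-level loop with per-row `.loc` access.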
- Build a date_range calendar spanning the forecast horizon.
- Prefer pandas' interpolate(method='time') for long gaps or strong trends.
- Merge against the calendar before filling: dates missing from the source are absent rows, not NaNs, so fillna won't touch them.
- .loc[i] requires a sequential integer index, so reset_index(drop=True) after the merge.
- Avoid ffill for weekends: it propagates Friday's value through Sunday, which is exactly wrong for weekend-sensitive signals.
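To make these points concrete, here is a small sketch on an invented weekday-only feed. The `raw` frame, the `price` column, and the dates are assumptions for illustration only. It shows that the weekend gap is invisible until the calendar merge, and contrasts ffill, the neighbor midpoint, and `interpolate(method='time')`, which requires a DatetimeIndex.

```python
import numpy as np
import pandas as pd

# Hypothetical feed: Friday 2024-01-05, then Monday and Tuesday.
raw = pd.DataFrame({
    'date': pd.to_datetime(['2024-01-05', '2024-01-08', '2024-01-09']),
    'price': [70.0, 76.0, 77.0],
})

# Before the merge the weekend rows do not exist, so fillna has nothing to fill.
missing_before = int(raw['price'].isna().sum())

calendar = pd.DataFrame({'date': pd.date_range('2024-01-05', '2024-01-09')})
full = calendar.merge(raw, on='date', how='left').reset_index(drop=True)
missing_after = int(full['price'].isna().sum())  # Saturday and Sunday

# ffill copies Friday's value onto both weekend days.
ffilled = full['price'].ffill().tolist()

# Neighbor midpoint: both weekend days get the Friday/Monday average.
midpoint = ((full['price'].ffill() + full['price'].bfill()) / 2).tolist()

# Time interpolation weights by elapsed time, so the weekend follows the trend.
timed = full.set_index('date')['price'].interpolate(method='time').tolist()

print(missing_before, missing_after)
print(ffilled)
print(midpoint)
print(timed)
```

Here ffill holds the weekend flat at Friday's 70.0, the midpoint assigns 73.0 to both days, and time interpolation steps through roughly 72.0 and 74.0. The longer the gap or the steeper the trend, the more the flat midpoint diverges from the time-weighted values.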