skills/timeseries/wavelet-denoising/SKILL.md
Denoise an erratic 1D series with a discrete wavelet decomposition plus universal thresholding (hard or soft, with sigma estimated from the MAD of the finest detail coefficients) to extract the underlying trend/seasonality without lagging the signal. For spiky retail or sensor data this is a far better trend extractor than rolling means.
npx skillsauth add wenmin-wu/ds-skills timeseries-wavelet-denoising
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Rolling means are the default trend extractor, but they fail in two ways: a trailing window lags the signal by roughly window/2 samples, and averaging smears sharp, legitimate jumps. Wavelet denoising addresses both. Decompose the series with pywt.wavedec, estimate the noise scale sigma from the median absolute deviation (MAD) of the highest-frequency detail coefficients, derive the Donoho-Johnstone universal threshold sigma * sqrt(2 * log(N)) from it, apply hard or soft thresholding to all detail levels, then reconstruct with pywt.waverec. The result preserves discontinuities and does not lag the signal. Use it as a feature (denoised series alongside raw), as a smoothed target for a regression model that is sensitive to noise, or just as a visualization.
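Written out, the noise estimate, the universal threshold, and the two thresholding rules mentioned above look roughly as follows. The notation is introduced here for illustration: $d_1$ denotes the finest-level detail coefficients, $N$ the series length, $\lambda$ the universal threshold (called uthresh in the code), and $\eta$ the rule applied to each detail coefficient $c$.

```latex
\[
\hat{\sigma} = \frac{\operatorname{median}\bigl(\lvert d_1 - \operatorname{median}(d_1) \rvert\bigr)}{0.6745},
\qquad
\lambda = \hat{\sigma}\,\sqrt{2 \ln N}
\]
\[
\eta_{\text{hard}}(c) = c \cdot \mathbf{1}\{\lvert c \rvert \ge \lambda\},
\qquad
\eta_{\text{soft}}(c) = \operatorname{sign}(c)\,\max\bigl(\lvert c \rvert - \lambda,\ 0\bigr)
\]
```

The constant 0.6745 is approximately the 75th percentile of the standard normal distribution, which is why dividing the MAD by it gives a sigma estimate for Gaussian noise.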
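import numpy as np
import pywt

def maddest(d, axis=None):
    # Median absolute deviation, as assumed by the MAD / 0.6745 sigma estimate
    return np.median(np.absolute(d - np.median(d, axis)), axis)

def denoise_signal(x, wavelet='db4', level=1):
    # coeff = [approximation, detail_L, ..., detail_1]
    coeff = pywt.wavedec(x, wavelet, mode='per')
    # Noise scale from the finest detail level (coeff[-1] when level=1)
    sigma = (1 / 0.6745) * maddest(coeff[-level])
    uthresh = sigma * np.sqrt(2 * np.log(len(x)))
    # Threshold every detail level; leave the approximation untouched
    coeff[1:] = [pywt.threshold(c, value=uthresh, mode='hard') for c in coeff[1:]]
    # Odd-length input can reconstruct one sample longer under 'per', so trim
    return pywt.waverec(coeff, wavelet, mode='per')[:len(x)]

# Usage as a feature
df['sales_denoised'] = denoise_signal(df['sales'].values)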
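To get a feel for the behaviour on a series with genuine jumps, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: the helper name `wavelet_denoise`, the synthetic level-shift signal, the noise scale, and the rolling window of 15. It follows the same decompose, threshold, reconstruct recipe and compares the error against a trailing rolling mean.

```python
import numpy as np
import pywt


def wavelet_denoise(x, wavelet="db4", mode="periodization", threshold_mode="hard"):
    """Hypothetical helper: universal-threshold wavelet denoising of a 1D array."""
    x = np.asarray(x, dtype=float)
    coeffs = pywt.wavedec(x, wavelet, mode=mode)
    # Noise scale from the median absolute deviation of the finest detail level
    finest = coeffs[-1]
    sigma = np.median(np.abs(finest - np.median(finest))) / 0.6745
    uthresh = sigma * np.sqrt(2 * np.log(len(x)))
    # Threshold the detail levels only; coeffs[0] is the approximation
    coeffs[1:] = [
        pywt.threshold(c, value=uthresh, mode=threshold_mode) for c in coeffs[1:]
    ]
    # Trim in case reconstruction returns an extra trailing sample
    return pywt.waverec(coeffs, wavelet, mode=mode)[: len(x)]


# Synthetic data (made up for illustration): a flat level with a temporary jump
rng = np.random.default_rng(0)
n = 1024
t = np.arange(n)
clean = np.where((t >= n // 3) & (t < 2 * n // 3), 20.0, 10.0)
noisy = clean + rng.normal(scale=1.0, size=n)

denoised = wavelet_denoise(noisy)

# Trailing rolling mean for comparison (the window size is an arbitrary choice)
window = 15
trailing = np.array([noisy[max(0, i - window + 1): i + 1].mean() for i in range(n)])


def rmse(a, b):
    """Root-mean-square error between two equal-length arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


print("noisy   :", rmse(noisy, clean))
print("rolling :", rmse(trailing, clean))
print("wavelet :", rmse(denoised, clean))
```

Passing `threshold_mode="soft"` shrinks the surviving coefficients as well, which tends to give a smoother result at the cost of slightly damping real jumps.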
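1. Choose a wavelet: db4 (Daubechies-4) is the standard default; sym8 for smoother trends; haar for piecewise-constant signals
2. Decompose with pywt.wavedec to get [approximation, detail_L, detail_L-1, ..., detail_1]
3. Estimate sigma = MAD(detail_1) / 0.6745; the 0.6745 factor converts MAD to a Gaussian sigma
4. Compute uthresh = sigma * sqrt(2 * log(N))
5. Apply pywt.threshold(c, uthresh, mode='hard') to all detail coefficients (keep the approximation untouched)
6. Reconstruct the cleaned signal with pywt.waverec and trim it to the input length, since odd-length input can come back one sample longer under mode='per'

- mode='per' (periodization): treats the series as periodic, which limits edge artifacts when the two ends of the series sit at similar levels; 'symmetric' is the alternative.
- Length: pywt.wavedec accepts arbitrary input lengths for any discrete wavelet; only the stationary transform (pywt.swt) requires a length that is a multiple of 2**level.
- [...] 1/alpha and smears jumps.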
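The difference between the two thresholding modes is easy to see on a handful of numbers. The sketch below uses made-up coefficient values and an arbitrary threshold of 1.5; only `pywt.threshold` itself comes from the library.

```python
import numpy as np
import pywt

# Made-up detail coefficients and threshold, purely for illustration
coeffs = np.array([-3.0, -1.0, 0.5, 2.0, 4.0])
thresh = 1.5

# Hard: zero everything below the threshold in magnitude, keep the rest as-is
hard = pywt.threshold(coeffs, value=thresh, mode="hard")
# Soft: zero the small ones and shrink the survivors toward zero by thresh
soft = pywt.threshold(coeffs, value=thresh, mode="soft")

print(hard)  # values: -3, 0, 0, 2, 4
print(soft)  # values: -1.5, 0, 0, 0.5, 2.5
```

Hard thresholding keeps the full height of real jumps but can leave isolated spikes; soft thresholding is smoother but biases large coefficients toward zero.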