skills/tabular/anomaly-flag-imputation/SKILL.md
Detects sentinel anomaly values in numeric columns, creates a boolean flag feature, then replaces the sentinel with NaN for proper imputation.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add wenmin-wu/ds-skills tabular-anomaly-flag-imputation
Real-world datasets often encode missing or special-status values as sentinel numbers (e.g., 365243 for "not employed", 999 for "unknown", -1 for "missing"). These break statistical features (mean, std) and mislead models. This pattern: (1) creates a binary flag column indicating the anomaly, (2) replaces the sentinel with NaN so imputers and models handle it correctly. The flag preserves the information that the value was special while the NaN lets downstream code treat it as missing.
import numpy as np
import pandas as pd

def flag_and_replace(df, col, sentinel):
    """Flag sentinel value as a 0/1 feature, replace with NaN (modifies df in place)."""
    flag_col = f'{col}_ANOMALY'
    # Build the flag first: once the sentinel is replaced it can no longer be detected.
    df[flag_col] = (df[col] == sentinel).astype(int)
    df[col] = df[col].replace({sentinel: np.nan})
    return df
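# Apply to both train and test identically
# (train and test are assumed to be already-loaded DataFrames)
for df in [train, test]:
    # The helper modifies df in place, so the return value is not needed here.
    flag_and_replace(df, 'DAYS_EMPLOYED', 365243)
    flag_and_replace(df, 'DAYS_LAST_PHONE_CHANGE', 0)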
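To make the effect concrete, here is a minimal sketch on a small toy frame. The data is invented for illustration, the helper is repeated so the block runs on its own, and the median fill at the end is only one possible downstream imputation choice, not something the skill prescribes.

```python
import numpy as np
import pandas as pd


def flag_and_replace(df, col, sentinel):
    """Same helper as in the skill, repeated so this block is self-contained."""
    df[f"{col}_ANOMALY"] = (df[col] == sentinel).astype(int)
    df[col] = df[col].replace({sentinel: np.nan})
    return df


# Hypothetical toy data: 365243 stands for "not employed".
toy = pd.DataFrame({"DAYS_EMPLOYED": [-1200, -300, 365243, -4500, 365243]})

raw_mean = toy["DAYS_EMPLOYED"].mean()    # pulled far upward by the sentinel
toy = flag_and_replace(toy, "DAYS_EMPLOYED", 365243)
clean_mean = toy["DAYS_EMPLOYED"].mean()  # NaN is skipped by default

# Assumed downstream step: fill the NaN with the median of the real values.
median = toy["DAYS_EMPLOYED"].median()
toy["DAYS_EMPLOYED"] = toy["DAYS_EMPLOYED"].fillna(median)

print(raw_mean, clean_mean)
print(toy)
```

After imputation the filled rows look like ordinary values, so the DAYS_EMPLOYED_ANOMALY column is the only remaining record that they were special.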
Identify the sentinel (e.g., with value_counts), create the flag column (col_ANOMALY) before replacing, then replace the sentinel with np.nan.
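Finding the sentinel in the first place usually starts with value_counts. The sketch below is one possible heuristic, not part of the skill: the helper name candidate_sentinels and the min_share threshold are hypothetical, and the rule (a single value that sits at the column minimum or maximum and covers a large share of rows) will miss sentinels that fall inside the normal range, such as 0.

```python
import pandas as pd


def candidate_sentinels(series, min_share=0.2):
    """Return values that are a column extreme and cover at least min_share of rows.

    Hypothetical heuristic: sentinels are often one repeated value
    located at the minimum or maximum of the column.
    """
    shares = series.value_counts(normalize=True)
    extremes = {series.min(), series.max()}
    return [v for v, share in shares.items() if v in extremes and share >= min_share]


# Invented example column in the style of DAYS_EMPLOYED.
days = pd.Series([-1200, -300, 365243, -4500, 365243, -800, 365243, -50])

print(days.value_counts().head(3))   # the repeated extreme stands out at the top
print(candidate_sentinels(days))     # [365243]
```

Any value this returns is only a candidate; whether it is really a sentinel should be confirmed against the dataset's documentation before flagging and replacing it.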