skills/tabular/outlier-aware-two-stage-blending/SKILL.md
When a regression target has a long discrete tail (e.g. ~1% of rows pinned at -33.22 in the Elo Merchant data), train one regressor on the *non-outlier* subset and a separate binary classifier for the outlier flag, then splice the predictions: replace the regressor's output with the outlier value for the top-K rows the classifier is most confident about, where K is calibrated on validation.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add wenmin-wu/ds-skills tabular-outlier-aware-two-stage-blending
The Elo Merchant target had ~1% of rows fixed at -33.21928 (a hard "loyalty churn" sentinel). A single regressor was forced to compromise: either it learned to predict the sentinel and damaged its non-outlier RMSE, or it ignored the sentinel and lost easy points. Every top solution decoupled the problem into (A) a regressor trained with outliers removed, which stays clean and accurate on the bulk, and (B) a binary is_outlier classifier trained on all rows. The two are then spliced: take the regressor's predictions for the full test set, rank the test rows by classifier score, and overwrite the predictions of the K highest-scoring rows with the sentinel value. K is chosen by minimizing RMSE on the validation fold and typically lands around K ≈ 1.05 × val_outlier_count. This pattern beat single-model baselines by 0.01-0.02 RMSE on the public LB.
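The calibration of K described above could look roughly like the sketch below. It is not taken from any particular solution: `calibrate_k` and `rmse` are hypothetical helper names, and the validation data is synthetic, with an idealised classifier that ranks every sentinel row first.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def calibrate_k(y_val, reg_pred, out_score, sentinel, k_max=None):
    """Scan K = 0..k_max and return (best_k, best_rmse) on validation data."""
    order = np.argsort(-np.asarray(out_score))  # most outlier-like rows first
    if k_max is None:
        k_max = len(order)
    spliced = np.asarray(reg_pred, dtype=float).copy()
    best_k, best_rmse = 0, rmse(y_val, spliced)
    for k in range(1, k_max + 1):
        spliced[order[k - 1]] = sentinel  # extend the overwrite by one row
        score = rmse(y_val, spliced)
        if score < best_rmse:
            best_k, best_rmse = k, score
    return best_k, best_rmse

# Synthetic validation fold (assumption): 1% sentinel rows, regressor predicts 0.
SENTINEL = -33.21928
rng = np.random.default_rng(0)
n = 1000
y_val = rng.normal(0.0, 1.5, n)
y_val[:10] = SENTINEL
reg_pred = np.zeros(n)
out_score = rng.uniform(0.0, 0.5, n)
out_score[:10] = rng.uniform(0.6, 1.0, 10)  # idealised classifier scores

best_k, best_rmse = calibrate_k(y_val, reg_pred, out_score, SENTINEL)
print(best_k)  # 10 here, because the toy classifier ranks all ten sentinel rows first
```

With a real classifier the ranking is imperfect, so the RMSE curve over K is noisier and the minimum need not sit exactly at the outlier count.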
import numpy as np
from sklearn.model_selection import KFold
import lightgbm as lgb
OUT_VAL = -33.21928
# Flag sentinel rows with a tolerance: the stored target may carry more
# decimals than OUT_VAL, so an exact float comparison can miss every row
is_out = np.isclose(train['target'], OUT_VAL, atol=1e-4)
# A) regressor on non-outliers
reg = lgb.LGBMRegressor(**reg_params).fit(train.loc[~is_out, feats], train.loc[~is_out, 'target'])
test_reg = reg.predict(test[feats])
# B) outlier classifier on all rows
clf = lgb.LGBMClassifier(**clf_params).fit(train[feats], is_out.astype(int))
test_out_score = clf.predict_proba(test[feats])[:, 1]
# Splice: replace top-K by classifier score
K = int(len(test) * 0.011) # tuned on validation
top_idx = np.argsort(-test_out_score)[:K]
final = test_reg.copy()
final[top_idx] = OUT_VAL
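The snippet hard-codes K as a fraction of the test set. One way to carry a validation-tuned K over to a test set of a different size is to convert it to a fraction first, as in the sketch below. `splice_top_fraction` is a hypothetical helper, and the validation numbers (`best_k_val`, `n_val`) are made up for illustration.

```python
import numpy as np

def splice_top_fraction(reg_pred, out_score, sentinel, frac):
    """Overwrite the top `frac` share of rows (by outlier score) with `sentinel`."""
    reg_pred = np.asarray(reg_pred, dtype=float)
    k = int(round(frac * len(reg_pred)))
    top_idx = np.argsort(-np.asarray(out_score))[:k]
    spliced = reg_pred.copy()
    spliced[top_idx] = sentinel
    return spliced, k

# Hypothetical validation result: the best K was 21 on a 2,000-row fold.
best_k_val, n_val = 21, 2000
frac = best_k_val / n_val  # 0.0105 of rows

# Toy test set: regressor predicts 0 everywhere, scores increase with row index.
test_reg = np.zeros(10_000)
test_out_score = np.linspace(0.0, 1.0, 10_000)
final, k_test = splice_top_fraction(test_reg, test_out_score, -33.21928, frac)
print(k_test)  # 105 rows overwritten on the 10,000-row test set
```

This assumes the outlier rate is similar in validation and test; if it is not, the fraction tuned on validation will over- or under-shoot.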
- Train the regressor on target != sentinel rows only; RMSE should drop substantially vs. the full-train baseline.
- Train the classifier on is_sentinel over all rows.
- Blend 0.6 × spliced + 0.4 × full for variance reduction.
- A soft blend of the form regressor + classifier × sentinel_offset underflows because the sentinel is so far from the bulk; a hard overwrite is cleaner.
- The outlier class is rare (~1% of rows): use class_weight='balanced' or downsample non-outliers before fitting.
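The blend weights and the hard-versus-soft comparison mentioned above can be illustrated on a four-row toy example. All numbers are made up, and the soft-blend formula is one interpretation of "regressor + classifier × sentinel_offset" in which the offset is taken to be `SENTINEL - reg_pred`; the source does not spell this out.

```python
import numpy as np

SENTINEL = -33.21928

# Toy predictions (assumptions): rows 0-2 are ordinary, row 3 is a likely outlier.
reg_pred = np.array([0.2, -0.5, 0.1, -1.0])    # regressor trained without outliers
full_pred = np.array([0.0, -0.9, -0.2, -4.0])  # regressor trained on all rows
out_score = np.array([0.02, 0.10, 0.01, 0.85]) # classifier P(outlier)

# Hard splice with K = 1: only the highest-scoring row is overwritten.
spliced = reg_pred.copy()
spliced[np.argsort(-out_score)[:1]] = SENTINEL

# Blend of the spliced and full-train predictions with the 0.6 / 0.4 weights.
final = 0.6 * spliced + 0.4 * full_pred

# Soft alternative (assumed form): shift every row toward the sentinel
# in proportion to its outlier probability.
soft = reg_pred + out_score * (SENTINEL - reg_pred)

print(spliced)
print(final)
print(soft)
```

In this toy case the soft version moves row 1, which has only a 10% outlier score, by more than three target units, which shows how a modest probability times a very distant sentinel can distort ordinary rows. Note also that the 0.6 / 0.4 blend pulls the overwritten row partway back toward the full-train prediction, so it is worth confirming on validation that the blend actually helps.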