skills/tabular/oof-meta-features/SKILL.md
Generates out-of-fold predictions from auxiliary models and uses them as input features for the final model.
npx skillsauth add wenmin-wu/ds-skills tabular-oof-meta-features
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Train a model to predict an auxiliary target (e.g., a related physical property or an intermediate label), collect its out-of-fold (OOF) predictions, and feed those as features into the final model. This passes what the auxiliary model has learned to the final model without leaking the auxiliary labels: each training sample's meta-feature comes from a model that never saw that sample during training. Test-set meta-features are the average of the fold models' predictions.
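To illustrate why the predictions must be out-of-fold, here is a minimal sketch that uses only NumPy. Everything in it is an assumption made for illustration: the data are synthetic, the auxiliary target is pure noise, and a plain least-squares fit stands in for the auxiliary model. Because there is nothing real to learn, any correlation between the predictions and the auxiliary target comes from the model memorizing its own training rows.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply coefficients from fit_linear to new rows."""
    return np.column_stack([X, np.ones(len(X))]) @ coef

rng = np.random.default_rng(0)
n, d = 60, 40                      # few rows, many features: easy to overfit
X = rng.normal(size=(n, d))
y_aux = rng.normal(size=n)         # hypothetical auxiliary target: pure noise

# In-sample: the model predicts the same rows it was trained on.
in_sample = predict_linear(fit_linear(X, y_aux), X)

# Out-of-fold: each row is predicted by a model that never saw it.
oof = np.zeros(n)
for val_idx in np.array_split(rng.permutation(n), 5):
    train_idx = np.setdiff1d(np.arange(n), val_idx)
    coef = fit_linear(X[train_idx], y_aux[train_idx])
    oof[val_idx] = predict_linear(coef, X[val_idx])

corr_in = np.corrcoef(in_sample, y_aux)[0, 1]
corr_oof = np.corrcoef(oof, y_aux)[0, 1]
print(f"in-sample corr: {corr_in:.2f}, OOF corr: {corr_oof:.2f}")
```

The in-sample predictions track the noise target closely, so using them as a feature would hand the final model a copy of each row's own label. The OOF predictions carry no such memorized signal.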
import numpy as np
from sklearn.model_selection import KFold
import lightgbm as lgb


def generate_oof_feature(X, y_aux, X_test, params, n_splits=5):
    """Train on auxiliary target, return OOF predictions as features."""
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=42)
    oof = np.zeros(len(X))
    test_pred = np.zeros(len(X_test))
    for train_idx, val_idx in folds.split(X):
        model = lgb.LGBMRegressor(**params)
        # The validation fold doubles as the early-stopping set.
        model.fit(X.iloc[train_idx], y_aux.iloc[train_idx],
                  eval_set=[(X.iloc[val_idx], y_aux.iloc[val_idx])],
                  callbacks=[lgb.early_stopping(100)])
        # Each validation row is predicted by a model that never trained on it.
        oof[val_idx] = model.predict(X.iloc[val_idx])
        # Average the fold models' predictions for the test set.
        test_pred += model.predict(X_test) / n_splits
    return oof, test_pred
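
# Usage: predict the auxiliary target, then add the predictions as a feature.
# X and X_test are pandas DataFrames; y_auxiliary and y_target are pandas Series.
oof_aux, test_aux = generate_oof_feature(X, y_auxiliary, X_test, params)
X['meta_aux'] = oof_aux
X_test['meta_aux'] = test_aux
# Train the final model with the meta-feature included.
# final_model is any estimator with a fit method, defined elsewhere.
final_model.fit(X, y_target)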
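When early stopping is not needed, the same out-of-fold feature can be produced more compactly with scikit-learn's `cross_val_predict`. The sketch below is an alternative, not the skill's own implementation. It assumes synthetic data, a `Ridge` model standing in for the auxiliary model, and hypothetical names (`X_train`, `X_new`, `y_aux_train`). It also builds the test feature from a single refit on all training rows rather than by averaging fold models as the function above does.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic stand-ins for the training set, test set, and auxiliary target.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))
X_new = rng.normal(size=(50, 5))
y_aux_train = 2.0 * X_train[:, 0] + rng.normal(scale=0.1, size=200)

cv = KFold(n_splits=5, shuffle=True, random_state=42)
aux_model = Ridge(alpha=1.0)

# cross_val_predict fits a clone of aux_model per fold and returns, for each
# row, the prediction from the fold in which that row was held out.
oof_feature = cross_val_predict(aux_model, X_train, y_aux_train, cv=cv)

# Test feature: refit once on all training rows (differs from fold averaging).
test_feature = aux_model.fit(X_train, y_aux_train).predict(X_new)

# Append the meta-feature as an extra column for the final model.
X_train_meta = np.column_stack([X_train, oof_feature])
X_new_meta = np.column_stack([X_new, test_feature])
```

Refitting on all rows gives the test feature a slightly better-trained model than any single fold model, at the cost of a small distribution gap between train and test meta-features. Fold averaging, as in `generate_oof_feature`, keeps the two closer in how they were produced.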