skills/tabular/polynomial-interaction-features/SKILL.md
Generates polynomial powers and interaction terms from selected numeric features to capture nonlinear relationships with the target.
npx skillsauth add wenmin-wu/ds-skills tabular-polynomial-interaction-features
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Linear models, and sometimes tree models, benefit from explicit polynomial and interaction features when key predictors have nonlinear relationships with the target. Sklearn's PolynomialFeatures generates all combinations up to degree N, including cross-terms (A×B, A×B²) that capture interactions the model might not find on its own. The number of output columns grows combinatorially with the number of inputs: 4 features at degree 3 already yield 34 columns. Select only the top 3–5 most predictive features to avoid combinatorial explosion.
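To make the combinatorial growth concrete, here is a small sketch that compares a closed-form count with what scikit-learn produces. The helper `n_poly_features` is a hypothetical name introduced for illustration; the formula assumes `include_bias=False` and `interaction_only=False`.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures


def n_poly_features(n_features: int, degree: int) -> int:
    """Count of monomials of degree 1..degree in n_features variables (no bias column)."""
    return comb(n_features + degree, degree) - 1


# Compare the closed-form count with the fitted transformer for a few input widths.
for n in (4, 10, 30):
    X = np.zeros((1, n))  # the values do not matter, only the shape
    poly = PolynomialFeatures(degree=3, include_bias=False).fit(X)
    assert poly.n_output_features_ == n_poly_features(n, 3)
    print(n, "inputs ->", poly.n_output_features_, "outputs")
# 4 inputs -> 34 outputs
# 10 inputs -> 285 outputs
# 30 inputs -> 5455 outputs
```

Going from 4 to 30 inputs multiplies the column count by roughly 160, which is why restricting the expansion to a handful of features matters.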
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures
from sklearn.impute import SimpleImputer
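# Assumes `train` and `test` are already-loaded DataFrames and `train` has a TARGET column
# Select only the features most correlated with the target
key_features = ['EXT_SOURCE_1', 'EXT_SOURCE_2', 'EXT_SOURCE_3', 'DAYS_BIRTH']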
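# Impute missing values first, since PolynomialFeatures does not accept NaNs
imputer = SimpleImputer(strategy='median')
X_train_imputed = imputer.fit_transform(train[key_features])
X_test_imputed = imputer.transform(test[key_features])
# Fit the expansion on train only, then apply the same transform to test
poly = PolynomialFeatures(degree=3, include_bias=False)
X_train_poly = poly.fit_transform(X_train_imputed)
X_test_poly = poly.transform(X_test_imputed)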
poly_names = poly.get_feature_names_out(key_features)
train_poly = pd.DataFrame(X_train_poly, columns=poly_names, index=train.index)
test_poly = pd.DataFrame(X_test_poly, columns=poly_names, index=test.index)
# Check new features' correlation with target
correlations = train_poly.corrwith(train['TARGET']).abs().sort_values(ascending=False)
print(correlations.head(10))