skills/tabular/column-shuffle-augmentation/SKILL.md
Augments imbalanced tabular data by independently shuffling each feature column within a class, creating synthetic samples that preserve per-column marginal distributions.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add wenmin-wu/ds-skills tabular-column-shuffle-augmentation
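For imbalanced binary classification, you need more minority samples, but SMOTE-style interpolation tends to work poorly on high-dimensional anonymized features. Instead, independently shuffle each column within the minority class. Every synthetic row takes each feature value from a real row of the same class, so each feature's within-class marginal distribution is preserved exactly, while correlations between features within the class are broken. The technique therefore suits data whose features are roughly independent given the class. Apply it asymmetrically: generate more shuffled copies for the minority class (t_pos) than for the majority class (t_neg).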
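The marginal-preservation claim can be checked on toy data. The sketch below is illustrative only: the data is synthetic, the seed is arbitrary, and it assumes NumPy 1.20 or later for `Generator.permuted`, which shuffles each column independently when called with `axis=0`.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, used only for this demo

# Toy "minority class": two strongly correlated features (synthetic data).
a = rng.normal(size=1000)
x = np.column_stack([a, a + 0.1 * rng.normal(size=1000)])

# Shuffle each column independently of the other.
shuffled = rng.permuted(x, axis=0)

# Each column still holds exactly the same values, only in a different order.
same_marginals = np.array_equal(np.sort(x, axis=0), np.sort(shuffled, axis=0))

# The correlation between the two features before and after shuffling.
corr_before = np.corrcoef(x, rowvar=False)[0, 1]
corr_after = np.corrcoef(shuffled, rowvar=False)[0, 1]
print(same_marginals, round(corr_before, 3), round(corr_after, 3))
```

The sorted columns match exactly, while the correlation between the two features falls from close to 1 to close to 0.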
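import numpy as np


def augment(X, y, t_pos=2, t_neg=1):
    """Append t_pos shuffled copies of class 1 and t_neg shuffled copies of class 0."""
    augmented_X, augmented_y = [X], [y]
    for _ in range(t_pos):
        x1 = X[y == 1].copy()
        for c in range(x1.shape[1]):
            # Shuffle column c in place, independently of the other columns.
            np.random.shuffle(x1[:, c])
        augmented_X.append(x1)
        augmented_y.append(np.ones(len(x1)))
    for _ in range(t_neg):
        x0 = X[y == 0].copy()
        for c in range(x0.shape[1]):
            np.random.shuffle(x0[:, c])
        augmented_X.append(x0)
        augmented_y.append(np.zeros(len(x0)))
    # Original rows come first, followed by the shuffled copies.
    return np.vstack(augmented_X), np.concatenate(augmented_y)


# X_train and y_train are pandas objects here, hence .values to get NumPy arrays.
X_aug, y_aug = augment(X_train.values, y_train.values, t_pos=2, t_neg=1)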
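The function above draws from NumPy's global random state, so results change between runs unless that state is seeded. A minimal sketch of a reproducible variant follows. The name `augment_seeded` and its `seed` parameter are hypothetical and not part of the original skill, and the sketch assumes NumPy 1.20 or later for `Generator.permuted`.

```python
import numpy as np


def augment_seeded(X, y, t_pos=2, t_neg=1, seed=0):
    """Column-shuffle augmentation driven by a seeded Generator (hypothetical variant)."""
    rng = np.random.default_rng(seed)
    parts_X, parts_y = [X], [y]
    # Positives first, then negatives, matching the order used by augment().
    for label, copies in ((1, t_pos), (0, t_neg)):
        rows = X[y == label]
        for _ in range(copies):
            # permuted(axis=0) returns a copy with every column shuffled independently.
            parts_X.append(rng.permuted(rows, axis=0))
            parts_y.append(np.full(len(rows), label, dtype=y.dtype))
    return np.vstack(parts_X), np.concatenate(parts_y)


# Tiny synthetic example: 3 positives, 7 negatives, 4 features.
X_demo = np.arange(40, dtype=float).reshape(10, 4)
y_demo = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
X_out, y_out = augment_seeded(X_demo, y_demo, t_pos=2, t_neg=1, seed=42)
print(X_out.shape, int(y_out.sum()))
```

When cross-validating, it is generally safer to call the function on the training rows of each fold only, so that validation rows stay unaugmented.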