areas/software/mlops/skills/feature-engineering/SKILL.md
# Skill: Feature Engineering
## When to load

When building training datasets, designing feature pipelines, or debugging training-serving skew.
## Declarative Feature Pipeline

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# numeric_features / categorical_features: lists of column names, defined elsewhere
preprocessor = ColumnTransformer(transformers=[
    ('num', Pipeline([
        ('imputer', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler()),
    ]), numeric_features),
    ('cat', Pipeline([
        ('imputer', SimpleImputer(strategy='constant', fill_value='unknown')),
        # sparse_output requires scikit-learn >= 1.2
        ('encoder', OneHotEncoder(handle_unknown='ignore', sparse_output=False)),
    ]), categorical_features),
])

# ✅ Fit ONLY on training data
preprocessor.fit(X_train)
X_train_processed = preprocessor.transform(X_train)
X_test_processed = preprocessor.transform(X_test)  # Uses train statistics
```
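To make the "uses train statistics" comment concrete, here is a minimal plain-Python sketch of what fitting on the training split only means for one numeric column: median imputation followed by standardization with the population standard deviation. It illustrates the idea and is not scikit-learn's implementation; `fit_numeric` and `transform_numeric` are hypothetical names.

```python
from statistics import fmean, median, pstdev
from typing import Dict, List, Optional


def fit_numeric(train: List[Optional[float]]) -> Dict[str, float]:
    """Learn imputation and scaling statistics from the training column only."""
    observed = [v for v in train if v is not None]
    fill = median(observed)
    imputed = [fill if v is None else v for v in train]
    mean = fmean(imputed)
    std = pstdev(imputed) or 1.0  # fall back to 1.0 for constant columns
    return {"fill": fill, "mean": mean, "std": std}


def transform_numeric(values: List[Optional[float]], stats: Dict[str, float]) -> List[float]:
    """Apply previously learned statistics; nothing is re-estimated here."""
    imputed = [stats["fill"] if v is None else v for v in values]
    return [(v - stats["mean"]) / stats["std"] for v in imputed]


train_col = [1.0, 2.0, None, 3.0]
test_col = [None, 10.0]

stats = fit_numeric(train_col)                      # statistics come from train only
train_scaled = transform_numeric(train_col, stats)
test_scaled = transform_numeric(test_col, stats)    # test reuses the train median/mean/std
```

The missing test value is filled with the training median, and the outlier `10.0` is scaled by the training mean and standard deviation. Refitting on the test split would instead leak test information into the features.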
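```python
from datetime import datetime

# Single feature definition used in BOTH training and inference.
# `db`, `count_in_window`, and `avg_in_window` are assumed to be defined elsewhere.
def compute_user_features(user_id: str, reference_date: datetime) -> dict:
    """
    Used by: training pipeline (historical dates) AND inference API (current date).
    Sharing one code path keeps the computation identical, which prevents skew
    from duplicated feature logic.
    """
    # Only rows created before reference_date are read, so no future data leaks in
    orders = db.query(
        "SELECT * FROM orders WHERE user_id = %s AND created_at < %s",
        (user_id, reference_date),
    )
    return {
        "order_count_30d": count_in_window(orders, reference_date, days=30),
        "avg_order_value_90d": avg_in_window(orders, reference_date, days=90),
    }
```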
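The helpers `count_in_window` and `avg_in_window` are not defined in the snippet above. One possible shape for them is sketched below, assuming each order row is a dict with a `created_at` `datetime` and a numeric `amount`; both field names, and the private `_in_window` helper, are assumptions about the schema rather than part of the original skill.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def _in_window(orders: List[Dict], reference_date: datetime, days: int) -> List[Dict]:
    """Orders in the half-open window [reference_date - days, reference_date)."""
    start = reference_date - timedelta(days=days)
    return [o for o in orders if start <= o["created_at"] < reference_date]


def count_in_window(orders: List[Dict], reference_date: datetime, days: int) -> int:
    return len(_in_window(orders, reference_date, days))


def avg_in_window(orders: List[Dict], reference_date: datetime, days: int) -> float:
    window = _in_window(orders, reference_date, days)
    if not window:
        return 0.0  # assumed default for users with no orders in the window
    return sum(o["amount"] for o in window) / len(window)


# Hypothetical rows, used only to exercise the helpers
orders = [
    {"created_at": datetime(2024, 1, 5), "amount": 20.0},
    {"created_at": datetime(2024, 2, 20), "amount": 40.0},
    {"created_at": datetime(2024, 3, 1), "amount": 99.0},  # on the reference date: excluded
]
ref = datetime(2024, 3, 1)

features = {
    "order_count_30d": count_in_window(orders, ref, days=30),
    "avg_order_value_90d": avg_in_window(orders, ref, days=90),
}
```

The upper bound is exclusive so that a feature computed for a historical `reference_date` never sees an order placed at or after that moment. Returning `0.0` for an empty window is a design choice; returning `None` and imputing downstream is an equally reasonable alternative.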