skills/tabular/typed-panel-aggregation/SKILL.md
Aggregate panel/sequential data with type-appropriate statistics: numeric columns get mean/std/min/max/last and categorical columns get count/last/nunique. Then concatenate the results into flat features.
npx skillsauth add wenmin-wu/ds-skills tabular-typed-panel-aggregation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
When customer or entity data arrives as multiple time-stamped rows (panel data), flatten it to one row per entity by applying different aggregation functions per column type. Numeric columns get mean, std, min, max, last; categorical columns get count, last, nunique. Statistics such as mean and std are not meaningful for categories, and count or nunique alone says little about a numeric column, so splitting by type keeps more signal than applying one set of aggregations to every column.
import pandas as pd


def typed_panel_agg(df, group_col, cat_features, num_features=None):
    """Aggregate panel data with type-appropriate statistics.

    Args:
        df: DataFrame with multiple rows per entity
        group_col: entity identifier column
        cat_features: list of categorical column names
        num_features: list of numeric columns (default: all non-cat)
    """
    if num_features is None:
        # Treat every column that is neither categorical nor the key as numeric.
        num_features = [c for c in df.columns
                        if c not in cat_features + [group_col]]
    # 'last' follows row order, so sort df by time within each entity beforehand.
    num_agg = df.groupby(group_col)[num_features].agg(
        ['mean', 'std', 'min', 'max', 'last'])
    # Flatten the (column, statistic) MultiIndex into '<column>_<statistic>' names.
    num_agg.columns = ['_'.join(x) for x in num_agg.columns]
    cat_agg = df.groupby(group_col)[cat_features].agg(
        ['count', 'last', 'nunique'])
    cat_agg.columns = ['_'.join(x) for x in cat_agg.columns]
    # Both frames are indexed by group_col, so they align row by row.
    return pd.concat([num_agg, cat_agg], axis=1)
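As a rough illustration of the pattern above, the following sketch applies the same two aggregations to a toy panel. The column names (`customer_id`, `date`, `balance`, `plan`) and the data are hypothetical, chosen only for this example. It also sorts by time first, on the assumption that `last` should mean "most recent": pandas takes the last row of each group in frame order, not in timestamp order.

```python
import pandas as pd

# Hypothetical toy panel: two customers, rows deliberately out of time order.
panel = pd.DataFrame({
    "customer_id": ["a", "a", "a", "b", "b"],
    "date": pd.to_datetime(["2024-03-01", "2024-01-01", "2024-02-01",
                            "2024-01-01", "2024-02-01"]),
    "balance": [30.0, 10.0, 20.0, 5.0, 7.0],
    "plan": ["gold", "basic", "basic", "basic", "basic"],
})

# Sort by entity and time so that 'last' picks the most recent row.
panel = panel.sort_values(["customer_id", "date"])

grouped = panel.groupby("customer_id")

# Numeric statistics, flattened to names like 'balance_mean'.
num_agg = grouped[["balance"]].agg(["mean", "std", "min", "max", "last"])
num_agg.columns = ["_".join(col) for col in num_agg.columns]

# Categorical statistics, flattened to names like 'plan_nunique'.
cat_agg = grouped[["plan"]].agg(["count", "last", "nunique"])
cat_agg.columns = ["_".join(col) for col in cat_agg.columns]

# One row per customer, numeric and categorical features side by side.
features = pd.concat([num_agg, cat_agg], axis=1)
print(features.columns.tolist())
print(features)
```

Without the `sort_values` call, `balance_last` for customer `a` would be 20.0 (the final row as stored) rather than 30.0 (the most recent statement). Note also that `std` is NaN for any entity with a single row, so downstream models may need to handle missing values.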
groupby().agg() for datasets with millions of rows