skills/tabular/spatial-distance-aggregation/SKILL.md
Compute the Euclidean distance from every entity to a key reference point, then aggregate the distances per group with min/max/mean/std for spatial feature engineering
npx skillsauth add wenmin-wu/ds-skills tabular-spatial-distance-aggregation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
When data has multiple entities per event (players per play, sensors per reading, objects per frame), compute the Euclidean distance from each entity to a key reference point, then aggregate with min/max/mean/std per group. The four aggregates summarize spatial density, isolation, and spread as a handful of dense numeric features.
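Written out, the quantities involved look roughly as follows. The notation is an assumption introduced here for illustration: group $g$ contains $n_g$ entities at positions $(x_i, y_i)$ and has a reference point $(x_g^{\mathrm{ref}}, y_g^{\mathrm{ref}})$. The standard deviation is shown in its sample form (divisor $n_g - 1$), which is what the pandas `std` aggregation uses by default.

```latex
d_i = \sqrt{\left(x_i - x_g^{\mathrm{ref}}\right)^2 + \left(y_i - y_g^{\mathrm{ref}}\right)^2}, \qquad i \in g

d_g^{\min} = \min_{i \in g} d_i, \qquad
d_g^{\max} = \max_{i \in g} d_i, \qquad
\bar{d}_g = \frac{1}{n_g} \sum_{i \in g} d_i, \qquad
s_g = \sqrt{\frac{1}{n_g - 1} \sum_{i \in g} \left(d_i - \bar{d}_g\right)^2}
```

A small $d_g^{\min}$ means at least one entity is close to the reference point, while a large $s_g$ means the entities sit at very different distances from it. For a group with a single entity, $s_g$ is undefined.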
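import numpy as np
import pandas as pd


def spatial_distance_features(df, ref_x, ref_y, x_col='X', y_col='Y',
                              group_cols=None, prefix='dist'):
    """Compute distance aggregates from entities to a reference point.

    Args:
        df: DataFrame with one row per entity
        ref_x, ref_y: names of the columns with reference point coordinates
        x_col, y_col: names of the columns with entity coordinates
        group_cols: columns defining each group (e.g., ['GameId', 'PlayId'])
        prefix: column name prefix

    Returns:
        DataFrame with one row per group and the columns
        {prefix}_min, {prefix}_max, {prefix}_mean, {prefix}_std.
    """
    # groupby(None) raises a TypeError, so fail early with a clearer message
    if not group_cols:
        raise ValueError('group_cols must name at least one grouping column')
    # Accept a single column name or a tuple as well as a list
    group_cols = [group_cols] if isinstance(group_cols, str) else list(group_cols)

    df = df.copy()  # leave the caller's DataFrame untouched
    dist_col = f'{prefix}_to_ref'
    # Euclidean distance from each entity to the reference point on its row
    df[dist_col] = np.sqrt(
        (df[x_col] - df[ref_x]) ** 2 + (df[y_col] - df[ref_y]) ** 2
    )
    # One row per group; std is NaN for groups with a single entity
    aggs = df.groupby(group_cols)[dist_col].agg(
        ['min', 'max', 'mean', 'std']
    ).reset_index()
    aggs.columns = group_cols + [
        f'{prefix}_min', f'{prefix}_max', f'{prefix}_mean', f'{prefix}_std'
    ]
    return aggs


# Usage: distance of all defenders to ball carrier
features = spatial_distance_features(
    defenders, ref_x='carrier_X', ref_y='carrier_Y',
    group_cols=['GameId', 'PlayId'], prefix='def_dist'
)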
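To make the output concrete, here is a minimal self-contained sketch of the same computation on toy data, followed by one way the aggregates might be joined back onto a play-level table. Everything in it is an assumption made for illustration: the coordinates, the hypothetical `plays` table with its `Yards` column, and the choice to fill the undefined standard deviation of single-entity groups with 0.0. It recomputes the distances inline with `np.hypot` and named aggregation rather than calling the function above, so that the block runs on its own.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): play 10 has three defenders, play 11 has one.
defenders = pd.DataFrame({
    'GameId': [1, 1, 1, 1],
    'PlayId': [10, 10, 10, 11],
    'X': [3.0, 0.0, 6.0, 5.0],
    'Y': [4.0, 5.0, 8.0, 12.0],
    'carrier_X': [0.0, 0.0, 0.0, 0.0],
    'carrier_Y': [0.0, 0.0, 0.0, 0.0],
})

# Per-row Euclidean distance from each defender to the ball carrier.
defenders['def_dist_to_ref'] = np.hypot(
    defenders['X'] - defenders['carrier_X'],
    defenders['Y'] - defenders['carrier_Y'],
)

# Named aggregation sets the output column names directly.
features = (
    defenders.groupby(['GameId', 'PlayId'])['def_dist_to_ref']
    .agg(def_dist_min='min', def_dist_max='max',
         def_dist_mean='mean', def_dist_std='std')
    .reset_index()
)

# The sample std of a single-entity group is NaN; filling with 0.0 is one
# possible convention (an assumption here, not part of the original skill).
features['def_dist_std'] = features['def_dist_std'].fillna(0.0)

# Hypothetical play-level table; a left join keeps every play.
plays = pd.DataFrame({'GameId': [1, 1], 'PlayId': [10, 11], 'Yards': [4, 7]})
train = plays.merge(features, on=['GameId', 'PlayId'], how='left')
print(train)
```

In this toy data the three defenders in play 10 sit at distances 5, 5, and 10 from the carrier, and the lone defender in play 11 sits at distance 13. Whether 0.0 is a sensible fill for single-entity groups depends on the downstream model; tree-based models that handle missing values natively may do just as well with the NaN left in place.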