skills/tabular/personnel-count-parsing/SKILL.md
Parse structured text fields like '1 RB, 2 TE, 2 WR' into separate numeric columns per category
npx skillsauth add wenmin-wu/ds-skills tabular-personnel-count-parsing
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Datasets often encode categorical counts as comma-separated text (e.g., "1 RB, 2 TE, 2 WR" or "3 DL, 2 LB, 6 DB"). Parse these into separate numeric columns per category. Works for sports rosters, inventory manifests, ingredient lists, or any structured-text count field.
import pandas as pd

def parse_personnel(text, all_categories=None):
    """Parse '1 RB, 2 TE, 2 WR' into {'RB': 1, 'TE': 2, 'WR': 2}.

    Args:
        text: comma-separated count+category string
        all_categories: list of expected categories (for zero-filling)
    """
    # Start every expected category at zero so each row yields the same columns.
    counts = {c: 0 for c in all_categories} if all_categories else {}
    for item in str(text).split(","):
        # split() with no argument tolerates extra or repeated whitespace.
        parts = item.split()
        # Skip malformed items (e.g. 'nan', 'two RB') instead of raising ValueError.
        if len(parts) == 2 and parts[0].isdigit():
            counts[parts[1]] = int(parts[0])
    return counts

# Usage (assumes `df` is an existing DataFrame with an 'OffensePersonnel' column)
categories = ['DL', 'LB', 'DB', 'OL', 'QB', 'RB', 'TE', 'WR']
parsed = df['OffensePersonnel'].apply(
    lambda x: pd.Series(parse_personnel(x, categories))
)
parsed.columns = [f'offense_{c}' for c in parsed.columns]
df = pd.concat([df, parsed], axis=1)
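On large frames, the row-wise `apply` above can be slow because it builds one `pd.Series` per row. The following is a minimal sketch of a vectorized alternative, not part of the original skill: the helper name `personnel_counts_wide` and its `prefix` parameter are hypothetical, and it assumes category labels are purely alphabetic and that the input index is unique.

```python
import pandas as pd

def personnel_counts_wide(series, all_categories=None, prefix=""):
    """Hypothetical vectorized variant: one integer column per category."""
    # Every "<count> <category>" match becomes a row indexed by
    # (original index, match number). Missing values become strings
    # such as "nan" or "None", which contain no digits and never match.
    matches = series.astype(str).str.extractall(
        r"(?P<count>\d+)\s+(?P<category>[A-Za-z]+)"
    )
    matches["count"] = matches["count"].astype(int)

    # Sum counts per (row, category), then pivot categories into columns.
    counts = (
        matches.droplevel("match")
        .set_index("category", append=True)["count"]
        .groupby(level=[0, 1])
        .sum()
        .unstack(fill_value=0)
    )

    # Restore rows that had no matches (assumes a unique index).
    counts = counts.reindex(series.index, fill_value=0)
    if all_categories:
        # Zero-fill expected categories; note this drops unexpected ones.
        counts = counts.reindex(columns=all_categories, fill_value=0)
    return counts.add_prefix(prefix)

# Example with made-up data
s = pd.Series(["1 RB, 2 TE, 2 WR", "3 WR,  1 RB", None])
out = personnel_counts_wide(s, ["QB", "RB", "TE", "WR"], prefix="offense_")
```

Unlike `parse_personnel`, this sketch sums repeated categories within a row rather than keeping the last value, and restricting to `all_categories` discards any label outside that list, so check which behavior your data needs.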