skills/tabular/content-difficulty-features/SKILL.md
Precomputes item/content difficulty as historical mean accuracy, merged as a static feature for user-item prediction tasks.
npx skillsauth add wenmin-wu/ds-skills tabular-content-difficulty-features

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
For recommendation or knowledge tracing tasks, compute each item's historical difficulty (its mean success rate) from the training data and merge it in as a static feature. This gives the model a strong prior that some questions are inherently harder than others. The approach works for any user-item interaction dataset.
```python
import pandas as pd

def add_content_features(train_df, content_col='content_id', target_col='answered_correctly'):
    """Compute and merge content-level difficulty features."""
    content_stats = (
        train_df.groupby(content_col)[target_col]
        .agg(['mean', 'count', 'std'])
        .reset_index()
    )
    content_stats.columns = [content_col, 'content_mean', 'content_count', 'content_std']
    # Fill NaN std for items with single interaction
    content_stats['content_std'] = content_stats['content_std'].fillna(0)
    return train_df.merge(content_stats, on=content_col, how='left')

# Usage
train = add_content_features(train, 'content_id', 'answered_correctly')
# For test: merge same content_stats (computed from train only)
```
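The usage note says test rows should receive statistics computed from train only, but `add_content_features` returns just the merged frame, so the statistics table is not available for reuse. One possible way to separate the two steps is sketched below. The helper names `fit_content_stats` and `apply_content_stats`, the toy data, and the choice to fill unseen items with the global training mean are all assumptions for illustration, not part of the original skill.

```python
import pandas as pd

def fit_content_stats(train_df, content_col='content_id', target_col='answered_correctly'):
    """Per-item difficulty statistics, computed from training rows only (hypothetical helper)."""
    stats = (
        train_df.groupby(content_col)[target_col]
        .agg(content_mean='mean', content_count='count', content_std='std')
        .reset_index()
    )
    # std is NaN for items with a single interaction
    stats['content_std'] = stats['content_std'].fillna(0)
    return stats

def apply_content_stats(df, stats, global_mean, content_col='content_id'):
    """Merge precomputed statistics onto any frame (hypothetical helper)."""
    out = df.merge(stats, on=content_col, how='left')
    # Assumption: items never seen in training fall back to the global training mean
    out['content_mean'] = out['content_mean'].fillna(global_mean)
    out['content_count'] = out['content_count'].fillna(0)
    out['content_std'] = out['content_std'].fillna(0)
    return out

# Toy data, made up for the example
train = pd.DataFrame({
    'content_id': [1, 1, 2, 2, 2, 3],
    'answered_correctly': [1, 0, 1, 1, 1, 0],
})
test = pd.DataFrame({'content_id': [1, 2, 4]})  # item 4 never appears in train

stats = fit_content_stats(train)
global_mean = train['answered_correctly'].mean()
train_feat = apply_content_stats(train, stats, global_mean)
test_feat = apply_content_stats(test, stats, global_mean)
```

Keeping the statistics table as its own object makes it straightforward to reuse the same values at inference time without touching test labels.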
For items with few interactions, the raw mean can be smoothed toward the global mean: `(count*mean + prior*global) / (count + prior)`
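A minimal sketch of the formula `(count*mean + prior*global) / (count + prior)` in pandas follows. It reads `count` and `mean` as the per-item statistics, `global` as the overall training mean, and `prior` as a pseudo-count. The function name `smoothed_content_mean`, the default `prior=20`, and the toy data are assumptions chosen for illustration.

```python
import pandas as pd

def smoothed_content_mean(train_df, content_col='content_id',
                          target_col='answered_correctly', prior=20):
    """Shrink each item's mean accuracy toward the global mean (hypothetical helper)."""
    global_mean = train_df[target_col].mean()
    stats = (
        train_df.groupby(content_col)[target_col]
        .agg(['mean', 'count'])
        .reset_index()
    )
    # (count*mean + prior*global) / (count + prior)
    stats['content_mean_smoothed'] = (
        (stats['count'] * stats['mean'] + prior * global_mean)
        / (stats['count'] + prior)
    )
    return stats[[content_col, 'content_mean_smoothed']]

# Toy data: item 1 has one interaction, item 2 has four
demo = pd.DataFrame({
    'content_id': [1, 2, 2, 2, 2],
    'answered_correctly': [0, 1, 1, 1, 0],
})
smoothed = smoothed_content_mean(demo, prior=2)
# item 1: raw mean 0.00 is pulled up to 0.4 (global mean is 0.6)
# item 2: raw mean 0.75 is pulled down to 0.7
```

A larger `prior` pulls low-count items harder toward the global mean; the right value depends on the dataset and is usually tuned on validation data.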