skills/tabular/vote-ensemble-outer-join/SKILL.md
Ensemble ranked recommendation lists by outer-joining exploded candidates and re-ranking by weighted vote sum
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add wenmin-wu/ds-skills tabular-vote-ensemble-outer-join
When ensembling multiple recommendation submissions that each predict a ranked list of item IDs per session, explode each list into per-candidate rows, outer-join on (session, item), sum the weighted votes, and re-rank by the total. Each appearance of an item contributes its submission's weight, regardless of the item's position in that list. Unlike rank averaging, this handles disjoint candidate sets naturally: items appearing in more submissions get more votes.
import polars as pl


def vote_ensemble(submission_paths, weights=None, k=20):
    """Ensemble ranked lists by weighted vote counting.

    Args:
        submission_paths: list of CSV paths (columns: session_type, labels)
        weights: per-submission weight (default: equal)
        k: number of items to keep per session
    """
    if weights is None:
        weights = [1] * len(submission_paths)

    subs = []
    for path, w in zip(submission_paths, weights):
        sub = (pl.read_csv(path)
               # Split the space-separated label string, then one row per candidate
               .with_columns(pl.col('labels').str.split(' '))
               .explode('labels')
               .with_columns([
                   pl.col('labels').cast(pl.UInt32).alias('aid'),
                   # Every candidate carries the submission's weight, whatever its rank
                   pl.lit(w).cast(pl.Float32).alias('vote')
               ])
               .select(['session_type', 'aid', 'vote']))
        subs.append(sub)

    # Stack all submissions. Grouping on (session_type, aid) and summing gives the
    # same totals as an outer join on those keys with missing votes treated as 0.
    merged = pl.concat(subs)
    ranked = (merged
              .group_by(['session_type', 'aid'])
              .agg(pl.col('vote').sum().alias('vote_sum'))
              # Highest vote totals first within each session
              .sort(['session_type', 'vote_sum'], descending=[False, True])
              .group_by('session_type')
              # Keep the top-k candidates and rebuild the space-separated string
              .agg(pl.col('aid').head(k).cast(pl.Utf8).alias('labels'))
              .with_columns(pl.col('labels').list.join(' ')))
    return ranked
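
To make the vote arithmetic concrete, here is a minimal pure-Python sketch of the same idea on in-memory data. It is not part of the skill: the name `vote_ensemble_reference`, the dict-based input format, and the sample session key `1_clicks` are all assumptions made for illustration. It also breaks ties by first appearance across submissions, which is a choice of this sketch; the source does not say how ties should be ordered.

```python
def vote_ensemble_reference(submissions, weights=None, k=20):
    """Hypothetical reference: each submission maps session_type -> 'aid aid ...'."""
    if weights is None:
        weights = [1.0] * len(submissions)

    votes = {}  # session_type -> {aid: summed weight}
    for sub, w in zip(submissions, weights):
        for session, labels in sub.items():
            tally = votes.setdefault(session, {})
            for aid in labels.split():
                # An item missing from a submission simply gets no vote from it
                tally[aid] = tally.get(aid, 0.0) + w

    result = {}
    for session, tally in votes.items():
        # sorted() is stable and dicts keep insertion order, so tied items
        # stay in first-seen order (an assumption of this sketch)
        top = sorted(tally, key=lambda aid: -tally[aid])[:k]
        result[session] = " ".join(top)
    return result


# Three submissions with partly disjoint candidates for one session
sub_a = {"1_clicks": "10 11 12"}
sub_b = {"1_clicks": "12 13 10"}
sub_c = {"1_clicks": "14 12"}

result = vote_ensemble_reference([sub_a, sub_b, sub_c], weights=[1.0, 1.0, 0.5], k=3)
print(result)  # {'1_clicks': '12 10 11'}
```

Item 12 appears in all three lists and totals 2.5, item 10 appears in two and totals 2.0, and items 11 and 13 tie at 1.0, so 11 wins the last slot only because it was seen first. Item 14, proposed by the low-weight submission alone, totals 0.5 and is dropped.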