wenmin-wu/cv-iou-weighted-assignment-metric

Name: cv-iou-weighted-assignment-metric
Author: wenmin-wu

skills/cv/iou-weighted-assignment-metric/SKILL.md

npx skillsauth add wenmin-wu/ds-skills cv-iou-weighted-assignment-metric

Overview

For multi-object ID assignment problems (NFL helmets, pedestrians, cells), the natural question is: did the predicted ID match the ground-truth (GT) ID for the correct box? Scoring IoU alone misses identity errors, and scoring ID alone misses localization errors. The combined metric works per frame: for each GT box, find the predicted box with the highest IoU, gate it by an IoU threshold (e.g. 0.35), and check ID equality. It then computes a weighted accuracy in which high-importance rows (impact plays, critical events) receive a 1000× weight. Because the IoU computation is vectorized, the scorer is fast enough to run on every training step.
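Written out, the metric described above is a weighted accuracy over GT boxes. The notation below is introduced here for illustration and does not appear in the source: $G$ is the set of GT boxes, $P_{f(g)}$ the predictions in the same frame as $g$, $\tau$ the IoU threshold (e.g. 0.35), and $w_g$ the importance weight (1000 for high-importance rows, 1 otherwise).

```latex
\hat{p}(g) = \arg\max_{p \in P_{f(g)}} \mathrm{IoU}(g, p),
\qquad
\mathrm{score} =
\frac{\sum_{g \in G} w_g \,
      \mathbf{1}\!\left[\mathrm{IoU}\big(g, \hat{p}(g)\big) \ge \tau
      \;\wedge\; \mathrm{id}\big(\hat{p}(g)\big) = \mathrm{id}(g)\right]}
     {\sum_{g \in G} w_g}
```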

Quick Start

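import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score

def vectorized_iou(df):
    """Add an 'iou' column for every (GT, prediction) pair in the merged frame."""
    # Corners of the intersection rectangle
    ixmin = df[['x1_sub','x1_gt']].max(axis=1)
    iymin = df[['y1_sub','y1_gt']].max(axis=1)
    ixmax = df[['x2_sub','x2_gt']].min(axis=1)
    iymax = df[['y2_sub','y2_gt']].min(axis=1)
    # The +1 treats coordinates as inclusive pixel indices; clamp at 0 for disjoint boxes
    iw = np.maximum(ixmax - ixmin + 1, 0)
    ih = np.maximum(iymax - iymin + 1, 0)
    inter = iw * ih
    area_sub = (df['x2_sub']-df['x1_sub']+1)*(df['y2_sub']-df['y1_sub']+1)
    area_gt  = (df['x2_gt' ]-df['x1_gt' ]+1)*(df['y2_gt' ]-df['y1_gt' ]+1)
    df['iou'] = inter / (area_sub + area_gt - inter)
    return df

def score(sub, gt, iou_thr=0.35, impact_weight=1000):
    # Left merge keeps GT boxes from frames with no predictions; they get IoU NaN
    # and are scored as misses instead of silently dropping out of the metric
    merged = gt.merge(sub, on='video_frame', how='left', suffixes=('_gt','_sub'))
    merged = vectorized_iou(merged)
    # Best-IoU prediction per GT box; head(1) keeps whole rows, whereas first()
    # takes the first non-null value of each column independently
    top = (merged.sort_values('iou', ascending=False)
           .groupby(['video_frame','label_gt']).head(1).copy())
    top['correct'] = (top['label_gt'] == top['label_sub']) & (top['iou'] >= iou_thr)
    top['weight']  = np.where(top['isImpact'].astype(bool), impact_weight, 1)
    # Weighted share of GT boxes whose best match is correct
    return accuracy_score(np.ones(len(top), dtype=bool), top['correct'],
                          sample_weight=top['weight'])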
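As a quick sanity check on the box convention: the `+ 1` terms in `vectorized_iou` treat coordinates as inclusive pixel indices, so a box spanning 0 to 9 is 10 pixels wide. The scalar helper below is a minimal sketch of the same arithmetic on a single pair of boxes; `iou_inclusive` is a hypothetical name used only for this illustration and is not part of the skill.

```python
def iou_inclusive(a, b):
    """IoU of two (x1, y1, x2, y2) boxes with inclusive pixel coordinates."""
    # Width and height of the overlap, clamped at 0 for disjoint boxes
    iw = max(min(a[2], b[2]) - max(a[0], b[0]) + 1, 0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]) + 1, 0)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes offset by 5 pixels share a 5x5 patch: 25 / (100 + 100 - 25)
overlap = iou_inclusive((0, 0, 9, 9), (5, 5, 14, 14))

# Boxes far apart have zero intersection and therefore zero IoU
disjoint = iou_inclusive((0, 0, 9, 9), (20, 20, 29, 29))
```

If your boxes use exclusive or continuous coordinates instead, the `+ 1` terms would need to be dropped consistently from both the intersection and the areas.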

Workflow

1. Merge predictions and GT on the common frame/time key.
2. Compute IoU vectorized across the merged dataframe (no Python loops).
3. Sort by IoU descending and take the top row per (frame, gt_label); this is the GT's best match.
4. Flag correct = (label matches) AND (iou ≥ threshold).
5. Apply per-row importance weights and compute weighted accuracy.
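The steps above can be traced on a toy frame. This is a sketch under stated assumptions: the frame key, labels, and box values are made up, and the column names simply follow the Quick Start code. One frame holds two GT boxes; the prediction overlapping the impact box carries the wrong ID.

```python
import numpy as np
import pandas as pd

# Made-up data: one frame, two GT boxes, two predictions
gt = pd.DataFrame({
    'video_frame': ['v1_1', 'v1_1'],
    'label': ['H1', 'H2'],
    'x1': [0, 100], 'y1': [0, 100], 'x2': [9, 109], 'y2': [9, 109],
    'isImpact': [False, True],
})
sub = pd.DataFrame({
    'video_frame': ['v1_1', 'v1_1'],
    'label': ['H1', 'H7'],  # second prediction has the wrong ID
    'x1': [1, 100], 'y1': [1, 100], 'x2': [10, 109], 'y2': [10, 109],
})

# Step 1: pair every GT box with every prediction in the same frame
m = gt.merge(sub, on='video_frame', suffixes=('_gt', '_sub'))

# Step 2: vectorized IoU with inclusive pixel coordinates
iw = (np.minimum(m.x2_gt, m.x2_sub) - np.maximum(m.x1_gt, m.x1_sub) + 1).clip(lower=0)
ih = (np.minimum(m.y2_gt, m.y2_sub) - np.maximum(m.y1_gt, m.y1_sub) + 1).clip(lower=0)
inter = iw * ih
area_gt = (m.x2_gt - m.x1_gt + 1) * (m.y2_gt - m.y1_gt + 1)
area_sub = (m.x2_sub - m.x1_sub + 1) * (m.y2_sub - m.y1_sub + 1)
m['iou'] = inter / (area_gt + area_sub - inter)

# Step 3: keep the highest-IoU prediction for each GT box
top = (m.sort_values('iou', ascending=False)
        .groupby(['video_frame', 'label_gt']).head(1).copy())

# Step 4: correct only if the ID matches and the IoU clears the gate
top['correct'] = (top.label_gt == top.label_sub) & (top.iou >= 0.35)

# Step 5: weighted accuracy, impact rows count 1000x
top['weight'] = np.where(top.isImpact, 1000, 1)
weighted_acc = np.average(top.correct.astype(float), weights=top.weight)
```

Here H1 is matched correctly (IoU 81/119) but H2, the impact row, is matched to a perfectly localized box with the wrong ID, so the score is 1/1001 even though half the boxes are right.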

Key Decisions

Per-GT top match, not per-prediction: every GT box is scored exactly once, even if multiple predictions overlap it.
Hard IoU gate vs. soft IoU: a hard threshold matches the competition rubric and is easy to interpret; soft IoU rewards near-misses but is harder to tune.
Importance weights: impact plays and critical frames get a 1000× weight, mirroring the downstream cost of errors.
Weighted accuracy vs. per-class F1: assignment problems usually care about identity preservation rather than class recall, and this metric is simpler and more stable than per-class F1.
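To make the hard-gate versus soft-IoU trade-off concrete, the sketch below scores the same made-up best-match results both ways. The source does not define soft IoU; the variant here (credit equal to the IoU whenever the ID matches) is an assumption chosen for illustration only.

```python
import numpy as np

# Made-up best-match results for four GT boxes
iou       = np.array([0.90, 0.40, 0.30, 0.80])
id_match  = np.array([True, True, True, False])
is_impact = np.array([False, False, True, False])
weight    = np.where(is_impact, 1000, 1)

# Hard gate: full credit at or above the threshold, none below it
hard = id_match & (iou >= 0.35)
hard_score = np.average(hard.astype(float), weights=weight)

# Soft variant (assumed definition): credit equals the IoU when the ID matches
soft = np.where(id_match, iou, 0.0)
soft_score = np.average(soft, weights=weight)
```

The impact row sits just under the threshold (IoU 0.30), so the hard gate gives it nothing and the score collapses to 2/1003, while the soft variant still credits it 0.30 and lands near 0.30 overall. That sensitivity near the threshold is the price of matching a hard-gated rubric exactly.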

References

NFL Helmet Assignment - Getting Started Guide

Evaluation scorer that merges predictions with GT per frame, takes top-IoU match per GT, and computes weighted accuracy with IoU threshold gate

24 stars

tools

Updated Apr 17, 2026

Install globally with skillsauth

npx skillsauth add wenmin-wu/ds-skills cv-iou-weighted-assignment-metric

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 17, 2026, 12:20 PM · 95.1s · 1 file scanned

Install manually

# Clone the repo
git clone https://github.com/wenmin-wu/ds-skills.git

# Copy into Claude Code skills folder (global)
cp -r ds-skills/skills/cv/iou-weighted-assignment-metric ~/.claude/skills/

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT