Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

wenmin-wu/cv-exam-level-label-hierarchy-aggregation

Name: cv-exam-level-label-hierarchy-aggregation
Author: wenmin-wu

skills/cv/exam-level-label-hierarchy-aggregation/SKILL.md

Aggregate per-slice predictions into exam-level labels that satisfy a competition's mutual-exclusion hierarchy (positive vs negative vs indeterminate), using a top-down rule cascade — first decide the exam class, then conditionally rescale the dependent labels so the submission stays internally consistent

24 stars

testing

Updated Apr 17, 2026

$ install --global

skillsauth

npx skillsauth add wenmin-wu/ds-skills cv-exam-level-label-hierarchy-aggregation

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy · Container and dependency vulnerability scanner · Clean · 95%
Semgrep · Static code analysis for vulnerabilities · Clean · 95%
mcp-scan (Snyk) · Model Context Protocol security validation · Clean · 95%
Snyk (dep) · Open source security scanning · Skipped · 50%
Socket.dev · Supply chain security analysis · Skipped · 50%
VirusTotal · Multi-engine malware detection · Skipped · 50%
CrowdStrike · Advanced threat intelligence · Skipped · 50%
OSV-Scanner · Open Source Vulnerability database check · Skipped · 50%
OWASP Dep-Check · Skipped · 50%

Last scanned: Apr 24, 2026, 9:03 PM · 1.9s · 1 file scanned

SKILL.md

name: cv-exam-level-label-hierarchy-aggregation
description: Aggregate per-slice predictions into exam-level labels that satisfy a competition's mutual-exclusion hierarchy (positive vs negative vs indeterminate), using a top-down rule cascade: first decide the exam class, then conditionally rescale the dependent labels so the submission stays internally consistent

Overview

Multi-label medical imaging competitions usually impose constraints across labels: a study is exactly one of negative_for_pe, indeterminate, or positive_for_pe, and per-organ severity labels are only meaningful when the parent label is positive. Per-slice CNNs know nothing about these rules and emit independent sigmoid scores that often violate them; negative_exam=0.7 together with positive_exam=0.6 is contradictory and penalized by the metric. The fix at aggregation time is a top-down cascade. First commit to the exam-level decision based on the strongest evidence (any slice at or above 0.5 makes the exam positive). Then rescale the dependent labels conditionally: map winning labels to 0.5 + score/2 and losing labels to score/2. The final submission then satisfies the hierarchy by construction.
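The rescaling arithmetic is easiest to see on a couple of numbers. The following is a minimal sketch, assuming scores lie in [0, 1]; the helper name rescale is hypothetical and not part of the skill.

```python
def rescale(score: float, parent_positive: bool) -> float:
    """Map a score in [0, 1] into [0.5, 1] if the parent label won, else into [0, 0.5]."""
    return 0.5 + score / 2 if parent_positive else score / 2


# The same raw scores land on opposite sides of 0.5 depending on the parent decision,
# while their relative order is preserved on each side.
for s in (0.2, 0.8):
    print(f"raw={s:.1f}  parent positive -> {rescale(s, True):.2f}  "
          f"parent negative -> {rescale(s, False):.2f}")
```

Both branches are increasing in the score, so ranking within a side survives the mapping.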

Quick Start

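import pandas as pd
from scipy.special import softmax

def aggregate_exam(preds: pd.DataFrame, exam_id: str) -> dict:
    """Aggregate per-slice predictions for one exam into hierarchy-consistent labels."""
    rows = preds.loc[preds.StudyInstanceUID == exam_id]
    # A single confident slice is enough to call the whole exam positive.
    is_positive = (rows.pe_present_on_image >= 0.5).any()

    out = {}
    if is_positive:
        # Positive exam: both competing top-level labels are pushed down.
        out['negative_exam_for_pe'] = 0
        out['indeterminate']        = rows.indeterminate.min() / 2
    elif (rows.indeterminate >= 0.5).any():
        # Indeterminate exam: negative must be 0 so the top-level labels stay mutually exclusive.
        out['negative_exam_for_pe'] = 0
        out['indeterminate']        = rows.indeterminate.max()
    else:
        # Negative exam.
        out['negative_exam_for_pe'] = 1
        out['indeterminate']        = rows.indeterminate.min() / 2

    # Paired labels: average per exam, sharpen only when gte_1 leads,
    # then softmax so the pair sums to 1.
    a, b = rows[['rv_lv_ratio_gte_1', 'rv_lv_ratio_lt_1']].mean().values
    if a > b:
        a, b = a * 2, b / 2
    out['rv_lv_ratio_gte_1'], out['rv_lv_ratio_lt_1'] = softmax([a, b])

    # Dependent labels: [0.5, 1] when the exam is positive, [0, 0.5] otherwise.
    for k in ['leftsided_pe', 'rightsided_pe', 'central_pe']:
        s = rows[k].mean()
        out[k] = (0.5 + s / 2) if is_positive else (s / 2)
    return out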
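A per-exam result dictionary can be sanity-checked for hierarchy violations before it goes into a submission. The sketch below is illustrative only: hierarchy_violations and its tol parameter are hypothetical names, not part of the skill, and the rules it encodes are just the ones stated in this document (mutually exclusive top-level labels, a paired label that sums to 1, dependent labels at or below 0.5 when the exam is not positive).

```python
def hierarchy_violations(out: dict, tol: float = 1e-6) -> list:
    """Return human-readable descriptions of hierarchy violations (empty list if none)."""
    problems = []

    # Top-level labels must not both be asserted.
    if out['negative_exam_for_pe'] > 0.5 and out['indeterminate'] > 0.5:
        problems.append('negative_exam_for_pe and indeterminate are both above 0.5')

    # The paired RV/LV labels come out of a softmax, so they should sum to 1.
    pair_sum = out['rv_lv_ratio_gte_1'] + out['rv_lv_ratio_lt_1']
    if abs(pair_sum - 1.0) > tol:
        problems.append(f'rv_lv pair sums to {pair_sum:.3f}, expected 1.0')

    # Dependent labels must stay at or below 0.5 unless the exam is positive.
    exam_not_positive = out['negative_exam_for_pe'] > 0.5 or out['indeterminate'] > 0.5
    if exam_not_positive:
        for k in ('leftsided_pe', 'rightsided_pe', 'central_pe'):
            if out[k] > 0.5:
                problems.append(f'{k} is above 0.5 on a non-positive exam')
    return problems


# Hand-written example dictionaries (made-up values, for illustration only).
good = {'negative_exam_for_pe': 1, 'indeterminate': 0.1,
        'rv_lv_ratio_gte_1': 0.6, 'rv_lv_ratio_lt_1': 0.4,
        'leftsided_pe': 0.2, 'rightsided_pe': 0.3, 'central_pe': 0.1}
bad = dict(good, indeterminate=0.7, rv_lv_ratio_lt_1=0.6, central_pe=0.9)

print(hierarchy_violations(good))
print(hierarchy_violations(bad))
```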

Workflow

Group per-slice predictions by exam id (StudyInstanceUID or an analogous key)
Decide the top-level exam class from the strongest evidence; (slice_score >= 0.5).any() is the rule used here
Set the mutually exclusive top-level labels deterministically from that decision
For dependent labels (severity, location, etc.), rescale to 0.5 + mean/2 if the parent is positive and to mean/2 if it is not; because mean lies in [0, 1], the label never exceeds 0.5 in the negative case
For paired labels that must sum to 1.0 (e.g. rv_lv_ratio_gte_1 vs rv_lv_ratio_lt_1), apply softmax to the per-exam means after asymmetric pre-amplification of the winner
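The first steps of this workflow can also be expressed for all exams at once with a pandas groupby. This is a rough sketch on made-up toy data, not the skill's own implementation; it covers only the grouping, the top-level decision, and the rescaling of a single dependent label (central_pe).

```python
import pandas as pd

# Toy per-slice predictions for three exams (values are invented for illustration).
preds = pd.DataFrame({
    'StudyInstanceUID':    ['a', 'a', 'a', 'b', 'b', 'c', 'c'],
    'pe_present_on_image': [0.1, 0.7, 0.2, 0.1, 0.3, 0.2, 0.1],
    'central_pe':          [0.4, 0.8, 0.6, 0.2, 0.4, 0.1, 0.3],
})

grouped = preds.groupby('StudyInstanceUID')

# max() >= 0.5 is equivalent to (score >= 0.5).any() within each exam.
is_positive = grouped['pe_present_on_image'].max() >= 0.5

# Rescale the per-exam mean: mean/2 for non-positive exams, 0.5 + mean/2 for positive ones.
mean_central = grouped['central_pe'].mean()
central = (mean_central / 2).where(~is_positive, 0.5 + mean_central / 2)

exam = pd.DataFrame({'is_positive': is_positive, 'central_pe': central})
print(exam)
```

Series.where keeps the value where the condition is true and substitutes the alternative elsewhere, which is why the condition is written as ~is_positive.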

Key Decisions

Top-down decision first, then rescale: bottom-up averaging of independent scores gives no guarantee that the hierarchy is satisfied.
0.5 + score/2 and score/2 rescaling: maps every loser to at most 0.5 and every winner to at least 0.5, and preserves the fine-grained ranking inside each side because both maps are monotonic.
.any() for positive detection, not .mean(): a single confident positive slice should flip the exam, whereas averaging dilutes it.
Asymmetric softmax pre-amplification: when rv_lv_ratio_gte_1 leads, doubling it and halving rv_lv_ratio_lt_1 before the softmax sharpens the output distribution without changing which label ranks first.
Persist the rule cascade with the model: if the metric definition changes, you only update one function.
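Two of these decisions can be checked numerically. The sketch below uses invented scores and a small NumPy softmax helper (an assumption made to keep the example dependency-light; it is meant to match scipy.special.softmax for a 1-D input).

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax for a 1-D sequence."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()


# .any() vs .mean(): 19 confidently negative slices and one confident positive slice.
slice_scores = np.array([0.05] * 19 + [0.9])
by_any = bool((slice_scores >= 0.5).any())   # the single positive slice flips the exam
by_mean = bool(slice_scores.mean() >= 0.5)   # the mean is diluted far below 0.5

# Pre-amplification: double the leader and halve the other score before the softmax.
plain = softmax([0.6, 0.4])
sharp = softmax([0.6 * 2, 0.4 / 2])

print(by_any, by_mean)
print(plain.round(3), sharp.round(3))
```

The amplified pair is more peaked (roughly 0.73 vs 0.27 instead of 0.55 vs 0.45), while the leading label stays the same.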

References

PE Detection with Keras - Model Creation

Related Skills

wenmin-wu/timeseries-scaled-pinball-loss
data-ai · Verified · Trusted · Community
Scaled Pinball Loss (SPL) metric for evaluating quantile forecasts, normalized by mean absolute successive differences of training data
31 · SKILL.md · Updated Apr 23, 2026

wenmin-wu/timeseries-retroactive-outlier-rescaling
data-ai · Verified · Trusted · Community
Walk backward through a time series and multiplicatively rescale segments when jumps exceed a fraction of the running mean to correct data collection anomalies
31 · SKILL.md · Updated Apr 23, 2026

wenmin-wu/timeseries-ratio-target-for-smape
testing · Verified · Trusted · Community
Transform forecasting target to next/current ratio minus one so that optimizing MAE or squared error implicitly minimizes SMAPE
31 · SKILL.md · Updated Apr 23, 2026

wenmin-wu/timeseries-quantile-ratio-scaling
tools · Verified · Trusted · Community
Convert point forecasts to prediction intervals by scaling with logit-transformed quantile ratios passed through a Normal CDF
31 · SKILL.md · Updated Apr 23, 2026

Install manually

# Clone the repo
git clone https://github.com/wenmin-wu/ds-skills.git

# Copy into Claude Code skills folder (global)
cp -r ds-skills/skills/cv/exam-level-label-hierarchy-aggregation ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

wenmin-wu/ds-skills

24 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
