skills/cv/open-set-distance-cutoff/SKILL.md
Assign an unknown/novel class when all nearest-neighbor distances exceed a tuned cutoff threshold for open-set recognition
npx skillsauth add wenmin-wu/ds-skills cv-open-set-distance-cutoff
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
In open-set recognition, test images may belong to classes not seen during training (e.g., "new_whale"). After retrieving the k nearest neighbors by embedding distance, compare each neighbor distance to a cutoff: the unknown class is inserted into the ranked prediction list at the point where distances start to exceed the cutoff, so it ranks first when every neighbor is farther than the cutoff. The cutoff is tuned on a validation set to maximize the ranking metric (e.g., MAP@5).
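import numpy as np

def predict_with_unknown(query_dists, query_nbs, train_labels,
                         unknown_label='new_whale', dcut=3.8, top_k=5):
    # query_dists / query_nbs: (n_queries, n_neighbors) arrays of distances and
    # train indices, assumed sorted nearest-first within each row.
    predictions = []
    for i in range(len(query_dists)):
        seen = {}  # label -> score used for ranking (smaller ranks higher)
        for j in range(query_nbs.shape[1]):
            label = train_labels[query_nbs[i, j]]
            dist = query_dists[i, j]
            # First neighbor beyond the cutoff: slot the unknown class in at dcut.
            if dist > dcut and unknown_label not in seen:
                seen[unknown_label] = dcut
            # Record each known label once, at the distance of its first occurrence.
            if label not in seen:
                seen[label] = dist
            if len(seen) >= top_k:
                break
        preds = sorted(seen.items(), key=lambda x: x[1])[:top_k]
        predictions.append([p[0] for p in preds])
    return predictions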
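The function above expects precomputed neighbor distances and indices, but the source does not say how they are produced. As a minimal sketch, assuming Euclidean distance and a training set small enough for brute-force search, they could be built like this. The helper name `brute_force_neighbors` and the toy embeddings are hypothetical, not part of the skill.

```python
import numpy as np

def brute_force_neighbors(query_emb, train_emb, k=5):
    """Return (dists, nbs): the k smallest Euclidean distances per query and
    the matching row indices into train_emb, both sorted nearest-first."""
    # Pairwise differences via broadcasting: shape (n_query, n_train, dim).
    diff = query_emb[:, None, :] - train_emb[None, :, :]
    all_dists = np.sqrt((diff ** 2).sum(axis=-1))
    # Indices of the k closest training rows for each query.
    nbs = np.argsort(all_dists, axis=1)[:, :k]
    # Gather the distances that correspond to those indices.
    dists = np.take_along_axis(all_dists, nbs, axis=1)
    return dists, nbs

# Toy 2-D embeddings, for illustration only.
train_emb = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 0.0]])
query_emb = np.array([[0.0, 1.0]])
query_dists, query_nbs = brute_force_neighbors(query_emb, train_emb, k=2)
```

The resulting `query_dists` and `query_nbs` have the (n_queries, k) shape that `predict_with_unknown` indexes into. For large galleries an approximate nearest-neighbor index would likely replace the full pairwise matrix.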
Tune dcut on a validation set to maximize MAP@k or accuracy.
Try a range of dcut values on the validation set; the optimal value depends on the scale of the embedding space.
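The tuning step can be sketched as a simple grid search. This is an illustration under stated assumptions, not the skill's own procedure: each validation query is assumed to have exactly one true label, and `map_at_k`, `tune_dcut`, `toy_predict`, and the toy data are hypothetical names introduced here.

```python
def map_at_k(predictions, truths, k=5):
    """Mean average precision at k, assuming one true label per query."""
    total = 0.0
    for ranked, truth in zip(predictions, truths):
        # With a single correct label, AP@k is 1/rank of that label (0 if absent).
        for rank, label in enumerate(ranked[:k], start=1):
            if label == truth:
                total += 1.0 / rank
                break
    return total / len(truths)


def tune_dcut(predict_fn, truths, candidates, k=5):
    """Return (best_dcut, best_score) over the candidate cutoffs.

    predict_fn(dcut) must return one ranked label list per validation query.
    """
    best_dcut, best_score = None, -1.0
    for dcut in candidates:
        score = map_at_k(predict_fn(dcut), truths, k=k)
        if score > best_score:  # strict '>' keeps the first of tied candidates
            best_dcut, best_score = dcut, score
    return best_dcut, best_score


# Toy stand-in for the real predictor: two validation queries, each with a
# single neighbor label and its distance. Hypothetical data for illustration.
toy_neighbors = [("whale_a", 1.0), ("whale_b", 4.0)]
toy_truths = ["whale_a", "new_whale"]

def toy_predict(dcut):
    preds = []
    for label, dist in toy_neighbors:
        # The unknown class ranks first when the neighbor is beyond the cutoff.
        preds.append(["new_whale", label] if dist > dcut else [label, "new_whale"])
    return preds

best_dcut, best_score = tune_dcut(toy_predict, toy_truths, [0.5, 2.0, 5.0])
```

With real data, `predict_fn` would wrap the prediction function, for example `lambda d: predict_with_unknown(val_dists, val_nbs, train_labels, dcut=d)`, where `val_dists` and `val_nbs` are neighbor arrays computed for held-out queries. The candidate grid should span the range of distances actually observed, since that range depends on how the embeddings are scaled.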