skills/cv/hair-overlay-augmentation/SKILL.md
Overlay real hair PNGs (masked via threshold) onto dermoscopy images to simulate body-hair occlusion as a domain-specific augmentation
npx skillsauth add wenmin-wu/ds-skills cv-hair-overlay-augmentation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Dermoscopy images of skin lesions often have body hair obscuring the lesion. A model trained mostly on hair-free images generalizes poorly to test images with hair, because it treats hair as out-of-distribution (OOD). Standard cutout/cutmix injects synthetic noise, whereas real hair has color, shape, and directionality. The dermoscopy-specific fix is to maintain a small library of real hair PNGs (hair foreground on a dark background), randomly pick a few per training image, threshold them into binary masks, and composite them onto the dermoscopy image with OpenCV's bitwise_and. Top Kaggle solutions report a lift of roughly 0.5-1 AUC points on melanoma classification from this augmentation.
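import os
import random

import cv2
import numpy as np


class HairOverlay:
    """Randomly paste real hair cut-outs onto an image (modifies `img` in place)."""

    def __init__(self, hairs_folder, max_hairs=5, p=0.5):
        self.hairs_folder = hairs_folder
        self.max_hairs = max_hairs  # upper bound on hairs pasted per image
        self.p = p                  # probability of applying the augmentation at all
        self.files = [f for f in os.listdir(hairs_folder) if f.endswith('.png')]

    def __call__(self, img):
        if random.random() > self.p:
            return img
        n = random.randint(0, self.max_hairs)
        for _ in range(n):
            # cv2.imread returns BGR, so `img` is expected to be BGR as well
            hair = cv2.imread(os.path.join(self.hairs_folder, random.choice(self.files)))
            hair = cv2.flip(hair, random.choice([-1, 0, 1]))
            # 0, 1, 2 = ROTATE_90_CLOCKWISE, ROTATE_180, ROTATE_90_COUNTERCLOCKWISE
            hair = cv2.rotate(hair, random.choice([0, 1, 2]))
            h, w, _ = hair.shape
            # Skip hair patches that are not strictly smaller than the image
            if h >= img.shape[0] or w >= img.shape[1]:
                continue
            y = random.randint(0, img.shape[0] - h)
            x = random.randint(0, img.shape[1] - w)
            roi = img[y:y+h, x:x+w]
            # Pixels brighter than 10 count as hair; the dark background is masked out
            gray = cv2.cvtColor(hair, cv2.COLOR_BGR2GRAY)
            _, mask = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY)
            bg = cv2.bitwise_and(roi, roi, mask=cv2.bitwise_not(mask))  # image minus hair area
            fg = cv2.bitwise_and(hair, hair, mask=mask)                 # hair pixels only
            img[y:y+h, x:x+w] = cv2.add(bg, fg)
        return img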
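The threshold-and-composite step inside `__call__` can be easier to follow on toy arrays. The sketch below restates it in plain NumPy as an illustration, not as the skill's implementation: `composite_hair` is a hypothetical helper, the grayscale value is approximated by a plain channel mean (OpenCV's `COLOR_BGR2GRAY` uses a weighted sum), and the 8x8 arrays are made-up data.

```python
import numpy as np


def composite_hair(roi, hair, threshold=10):
    """Hypothetical helper: copy hair pixels brighter than `threshold` onto a copy of `roi`."""
    # Approximate grayscale with the channel mean (assumption; cv2 uses a weighted sum).
    gray = hair.mean(axis=2)
    # Same rule as cv2.THRESH_BINARY: a pixel is foreground only if strictly above the threshold.
    mask = gray > threshold
    out = roi.copy()          # leave the input untouched
    out[mask] = hair[mask]    # hair where the mask is set, original image elsewhere
    return out, mask


# Toy data: a uniform light ROI and a black patch with one diagonal "hair" strand.
roi = np.full((8, 8, 3), 200, dtype=np.uint8)
hair = np.zeros((8, 8, 3), dtype=np.uint8)
idx = np.arange(8)
hair[idx, idx] = (60, 50, 40)

out, mask = composite_hair(roi, hair)
print(int(mask.sum()), out[0, 0].tolist(), out[0, 1].tolist())
```

Selecting pixels with a boolean mask gives the same result as adding the masked background and masked foreground, because the two masks never overlap. Unlike `HairOverlay.__call__`, this helper returns a copy; copying first is one option when the caller still needs the original image.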