skills/cv/kmeans-dominant-color-extraction/SKILL.md
Extract an image's dominant RGB color via k-means over pixel-color space and emit three dense features capturing the modal color of the subject
npx skillsauth add wenmin-wu/ds-skills cv-kmeans-dominant-color-extraction
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Mean-channel color is a poor feature because a red product on a gray backdrop averages out to brown. K-means in pixel color space finds the modal color cluster — the one a human would name when asked "what color is this?" — and it works without any segmentation. Run k-means (k=5) on the flattened Nx3 pixel matrix, pick the centroid of the most populous cluster, and emit three normalized features dominant_r/g/b. Used in Avito Demand Prediction top kernels to capture product-color signal for listing-quality modeling.
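The averaging problem can be checked with a few lines of arithmetic. This is a minimal sketch with made-up values: the red and gray RGB triples and the 60/40 pixel split are assumptions chosen for illustration, not figures from the original kernels.

```python
import numpy as np

# Hypothetical colors in RGB order: a saturated red product and a mid-gray backdrop.
red = np.array([220.0, 30.0, 30.0])
gray = np.array([128.0, 128.0, 128.0])

# Assume the product covers 60% of the pixels and the backdrop the other 40%.
red_fraction = 0.6

# The mean-channel feature is the area-weighted blend of the two colors.
mean_color = red_fraction * red + (1 - red_fraction) * gray
print(mean_color)  # roughly (183, 69, 69): a dull brownish red that appears nowhere in the image
```

The blend matches neither the product nor the backdrop, which is why the skill clusters pixel colors instead of averaging them.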
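```python
import cv2
import numpy as np

def dominant_color(path, n_colors=5):
    img = cv2.imread(path)  # OpenCV loads pixels in BGR order
    if img is None:  # cv2.imread returns None for a missing or unreadable file
        raise FileNotFoundError(path)
    pixels = np.float32(img.reshape(-1, 3))  # flatten to an Nx3 float32 matrix
    # Stop after 200 iterations or once centroids move by less than 0.1
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 200, .1)
    _, labels, centroids = cv2.kmeans(
        pixels, n_colors, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
    counts = np.bincount(labels.flatten())  # number of pixels in each cluster
    b, g, r = centroids[np.argmax(counts)].astype(np.uint8)  # most populous cluster
    return {'dominant_r': r / 255., 'dominant_g': g / 255., 'dominant_b': b / 255.}
```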
1. Flatten the image to an Nx3 float32 matrix
2. Run cv2.kmeans with k=5, 10 attempts, EPS + MAX_ITER criteria
3. Count cluster sizes with np.bincount and pick the argmax cluster, which is the modal color
4. Emit dominant_r/g/b as three features

argmax(counts) picks the color a human would name; the mean of centroids just reproduces average color.
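The difference between picking the most populous cluster and averaging the centroids can be seen on synthetic data. The sketch below is an illustration under stated assumptions, not the skill's implementation: it swaps cv2.kmeans for a small NumPy-only Lloyd's k-means with a deterministic farthest-point initialization so it runs without OpenCV, and the helper name `modal_vs_centroid_mean`, the pixel values, and the 600/400 split are all hypothetical.

```python
import numpy as np

def modal_vs_centroid_mean(pixels, k=2, n_iter=10):
    """Return (centroid of the largest cluster, unweighted mean of all centroids)."""
    pixels = np.asarray(pixels, dtype=np.float64)
    # Farthest-point initialization: start from the first pixel, then repeatedly
    # add the pixel farthest from all centroids chosen so far.
    centroids = [pixels[0]]
    for _ in range(k - 1):
        d = np.min([((pixels - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(pixels[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Squared distance from every pixel to every centroid, shape (N, k).
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centroid unchanged
                centroids[j] = pixels[labels == j].mean(axis=0)
    counts = np.bincount(labels, minlength=k)
    return centroids[counts.argmax()], centroids.mean(axis=0)

# Synthetic "image" in RGB order: 600 reddish product pixels, 400 gray backdrop pixels.
rng = np.random.default_rng(0)
red = np.array([220, 30, 30]) + rng.integers(-5, 6, size=(600, 3))
gray = np.array([128, 128, 128]) + rng.integers(-5, 6, size=(400, 3))
pixels = np.vstack([red, gray])

dominant, centroid_mean = modal_vs_centroid_mean(pixels, k=2)
print("largest cluster:", dominant.round(1))
print("mean of centroids:", centroid_mean.round(1))
```

In this setup the largest cluster's centroid stays close to the red the pixels were drawn from, while the mean of the centroids lands on a muddy in-between color. The most populous cluster is only the product color when the product covers more pixels than any other region; if the backdrop dominates the frame, the same argmax returns the backdrop.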