skills/cv/coco-annotation-conversion/SKILL.md
Convert per-instance RLE or polygon annotations to COCO JSON format for seamless use with Detectron2 and MMDetection
npx skillsauth add wenmin-wu/ds-skills cv-coco-annotation-conversion
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Detectron2 and MMDetection expect COCO-format JSON annotations (images, annotations, categories arrays). Competition data often comes as CSV with RLE strings or per-image annotation lists. Converting to COCO JSON enables direct use of register_coco_instances and standard data loaders, avoiding custom dataset classes.
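As a rough illustration of the three-array layout described above, the sketch below checks a COCO-style dict for the cross-references that loaders generally rely on. The function name `check_coco_structure` is hypothetical and not part of Detectron2, MMDetection, or pycocotools; it only covers the basic fields used in this skill.

```python
def check_coco_structure(coco):
    """Return a list of problems found in a COCO-style dict (empty if none).

    Hypothetical helper for illustration; it checks only the basics.
    """
    problems = []
    # The three top-level arrays must exist and be lists.
    for key in ('images', 'annotations', 'categories'):
        if not isinstance(coco.get(key), list):
            problems.append(f'missing or non-list top-level key: {key}')
    if problems:
        return problems

    image_ids = {img['id'] for img in coco['images']}
    category_ids = {cat['id'] for cat in coco['categories']}
    seen_ann_ids = set()
    for ann in coco['annotations']:
        # Annotation ids should be unique across the whole file.
        if ann['id'] in seen_ann_ids:
            problems.append(f"duplicate annotation id: {ann['id']}")
        seen_ann_ids.add(ann['id'])
        # Every annotation must point at a known image and category.
        if ann['image_id'] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann['category_id'] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
    return problems
```

Running a check like this on the dict before `json.dump` can surface mismatched ids earlier than a failed training run would.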
import json
import numpy as np
from pycocotools import mask as mask_util
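def build_coco_json(df, image_dir, output_path):
    # rle_decode(rle_string, (h, w)) is assumed to be defined elsewhere and to
    # return a binary mask; image_dir is accepted but not used here.
    images, annotations = [], []
    ann_id = 1
    for img_id, (image_name, group) in enumerate(df.groupby('image_id')):
        # Cast to plain int: NumPy integers are not JSON serializable
        h, w = int(group.iloc[0]['height']), int(group.iloc[0]['width'])
        images.append({
            'id': img_id, 'file_name': f'{image_name}.png',
            'height': h, 'width': w
        })
        for _, row in group.iterrows():
            mask = rle_decode(row['annotation'], (h, w))
            # pycocotools expects a Fortran-ordered uint8 array
            rle = mask_util.encode(np.asfortranarray(mask.astype(np.uint8)))
            rle['counts'] = rle['counts'].decode('utf-8')  # bytes -> str for JSON
            bbox = mask_util.toBbox(rle).tolist()  # [x, y, width, height]
            annotations.append({
                'id': ann_id, 'image_id': img_id,
                'category_id': int(row.get('class_id', 0)),
                'segmentation': rle, 'bbox': bbox,
                # 1 = BoxMode.XYWH_ABS in Detectron2, matching toBbox output;
                # bbox_mode is not a standard COCO field
                'bbox_mode': 1, 'area': int(mask.sum()),
                'iscrowd': 0
            })
            ann_id += 1
    coco = {
        'images': images,
        'annotations': annotations,
        'categories': [{'id': 0, 'name': 'cell'}]
    }
    with open(output_path, 'w') as f:
        json.dump(coco, f)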
# Register with Detectron2
from detectron2.data.datasets import register_coco_instances
register_coco_instances('train', {}, 'annotations_train.json', 'images/')
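The conversion function above calls `rle_decode` without defining it. A minimal sketch follows, assuming the competition RLE is a space-separated string of `start length` pairs with 1-indexed starts over the flattened image. That format, the `order` parameter, and the row-major default are assumptions; some competitions flatten column-major, so check the data description before relying on this.

```python
import numpy as np


def rle_decode(rle_string, shape, order='C'):
    """Decode a 'start length start length ...' string into a binary mask.

    shape is (height, width). Starts are assumed to be 1-indexed positions in
    the flattened image. order='C' means row-major flattening, order='F'
    means column-major. This is an illustrative sketch, not a library API.
    """
    tokens = np.asarray(rle_string.split(), dtype=np.int64)
    starts = tokens[0::2] - 1   # convert 1-indexed starts to 0-indexed
    lengths = tokens[1::2]
    flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for start, length in zip(starts, lengths):
        flat[start:start + length] = 1
    return flat.reshape(shape, order=order)
```

For example, `rle_decode('1 3 10 2', (3, 4))` marks the first three pixels of the top row and two pixels in the bottom row, five pixels in total.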
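- Top-level arrays: images, annotations, categories
- register_coco_instances loads the resulting JSON directly, so no custom dataset class is needed
- Store counts as a UTF-8 string, not bytes; decode after encoding
- In Detectron2, BoxMode.XYXY_ABS is mode 0 and BoxMode.XYWH_ABS is mode 1; COCO uses XYWH, so check your framework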
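To make the box-format difference concrete, here is a small sketch of converting between COCO-style `[x, y, width, height]` boxes and `[x0, y0, x1, y1]` corner boxes. The helper names are hypothetical; in Detectron2 the same conversion is available through `BoxMode.convert`.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x0, y0, x1, y1]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def xyxy_to_xywh(box):
    """Convert [x0, y0, x1, y1] to [x, y, width, height]."""
    x0, y0, x1, y1 = box
    return [x0, y0, x1 - x0, y1 - y0]
```

A box written in one convention but read in the other will usually still parse, which is why a mismatch tends to show up as poor training results rather than an error.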