skills/cv/per-organ-multihead-sigmoid-softmax/SKILL.md
Single CNN backbone with one shallow Dense neck per organ and mixed sigmoid (binary) + softmax (multi-class severity) heads, trained with a dict of losses so each organ is calibrated independently while sharing visual features
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add wenmin-wu/ds-skills cv-per-organ-multihead-sigmoid-softmax
Multi-organ trauma classification has a heterogeneous label structure: some organs are binary (injured / not), while others have ordered severity grades (healthy / low / high). The naive answer, one big sigmoid head with all classes flattened, destroys the mutual exclusivity inside each grade group: every label is scored independently, so the three grades of one organ are not constrained to sum to 1 and can all fire at once. The better structure is one shared backbone, a tiny per-organ "neck" Dense layer, and a head whose activation matches the label semantics: sigmoid for binary organs, softmax for severity-graded ones. Keras `compile(loss={...})` accepts a dict mapping head names to losses, so each head gets its correct loss without a hand-written combined loss function.
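To make the mutual-exclusivity point concrete, here is a minimal standard-library sketch comparing the two activations on one organ's three grade logits. The logit values are hypothetical and chosen only for illustration; they do not come from any trained model.

```python
import math

def sigmoid(z):
    """Independent per-label probability."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Probabilities over mutually exclusive classes (numerically stable form)."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one organ: healthy / low / high.
liver_logits = [2.0, 1.5, -1.0]

flat_sigmoid = [sigmoid(z) for z in liver_logits]   # each grade scored on its own
grouped_softmax = softmax(liver_logits)             # grades share one distribution

print("sigmoid sum:", sum(flat_sigmoid))     # not constrained to 1
print("softmax sum:", sum(grouped_softmax))  # 1 up to floating-point error
```

With these logits the sigmoid scores call both "healthy" and "low" likely at the same time, which is the inconsistency a per-organ softmax head rules out by construction.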
```python
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.losses import BinaryCrossentropy, CategoricalCrossentropy

# `backbone` is assumed to be an existing Keras CNN feature extractor.
x = GlobalAveragePooling2D()(backbone.output)

# One shallow neck per organ on top of the shared pooled features.
necks = {n: Dense(32, activation='silu', name=f'{n}_neck')(x)
         for n in ['bowel', 'extra', 'liver', 'kidney', 'spleen']}

# Head layer names must match the keys of the loss dict below.
outs = [
    Dense(1, activation='sigmoid', name='bowel')(necks['bowel']),    # binary
    Dense(1, activation='sigmoid', name='extra')(necks['extra']),    # binary
    Dense(3, activation='softmax', name='liver')(necks['liver']),    # 3 grades
    Dense(3, activation='softmax', name='kidney')(necks['kidney']),  # 3 grades
    Dense(3, activation='softmax', name='spleen')(necks['spleen']),  # 3 grades
]

model = Model(backbone.inputs, outs)
model.compile(
    optimizer='adam',
    loss={
        'bowel': BinaryCrossentropy(label_smoothing=0.05),
        'extra': BinaryCrossentropy(label_smoothing=0.05),
        'liver': CategoricalCrossentropy(label_smoothing=0.05),
        'kidney': CategoricalCrossentropy(label_smoothing=0.05),
        'spleen': CategoricalCrossentropy(label_smoothing=0.05),
    },
)
```
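The targets passed to training need the same per-head layout as the losses. The following is a sketch of one way to build them with NumPy, assuming integer class ids per sample; the helper `build_targets` and the example label values are hypothetical and not part of the original skill. Binary heads get float arrays of shape `(N, 1)`, and the softmax heads get one-hot arrays of shape `(N, 3)`, which is the format `CategoricalCrossentropy` expects.

```python
import numpy as np

BINARY_HEADS = ["bowel", "extra"]
GRADED_HEADS = ["liver", "kidney", "spleen"]
NUM_GRADES = 3  # healthy / low / high

def build_targets(labels):
    """Hypothetical helper: map {head name: integer class ids} to
    {head name: target array} keyed by the same names as the model heads."""
    targets = {}
    for name in BINARY_HEADS:
        # Sigmoid heads: one float per sample, shape (N, 1).
        targets[name] = np.asarray(labels[name], dtype="float32").reshape(-1, 1)
    for name in GRADED_HEADS:
        # Softmax heads: one-hot rows, shape (N, NUM_GRADES).
        ids = np.asarray(labels[name], dtype="int64")
        targets[name] = np.eye(NUM_GRADES, dtype="float32")[ids]
    return targets

# Made-up labels for four samples, for illustration only.
labels = {
    "bowel":  [0, 1, 0, 0],
    "extra":  [1, 0, 0, 1],
    "liver":  [0, 2, 1, 0],
    "kidney": [0, 0, 0, 1],
    "spleen": [2, 0, 0, 0],
}
y = build_targets(labels)
for name, arr in y.items():
    print(name, arr.shape)
```

A dict like `y` can then be passed as the targets to `model.fit`, with Keras matching each entry to the output of the same name. If the grades are kept as integer ids instead, `SparseCategoricalCrossentropy` is the matching loss, though it does not take a `label_smoothing` argument in the Keras versions I am aware of.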
- Pass a dict to `compile(loss=...)` whose keys match the head names; Keras routes each loss to the output with the same name.
- Use `label_smoothing=0.05` across all heads to prevent any one organ from collapsing onto a saturated 0/1 prediction.
- … `cv-multihead-softmax-to-flat-submission` patterns)
- Prefer `silu` over `relu` in the neck: smoother gradients on a tiny 32-unit layer; the difference is small but consistently positive.
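The effect of `label_smoothing=0.05` on the targets can be worked out by hand. The sketch below reproduces the smoothing formulas as the Keras loss documentation describes them (binary targets are pulled toward 0.5, one-hot targets toward the uniform distribution); the function names are hypothetical, and this is an illustration of the arithmetic rather than the library's actual code path.

```python
def smooth_binary(y, eps=0.05):
    """Binary target after smoothing: squeezed toward 0.5."""
    return y * (1.0 - eps) + 0.5 * eps

def smooth_categorical(one_hot, eps=0.05):
    """One-hot target after smoothing: eps of the mass spread uniformly."""
    k = len(one_hot)
    return [v * (1.0 - eps) + eps / k for v in one_hot]

positive = smooth_binary(1.0)               # about 0.975
negative = smooth_binary(0.0)               # about 0.025
high_grade = smooth_categorical([0, 0, 1])  # true class keeps most of the mass

print(positive, negative)
print(high_grade, sum(high_grade))
```

Because no target is exactly 0 or 1 after smoothing, the cross-entropy is minimized at a finite logit, which is why the heads are discouraged from saturating.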