skills/cv/multi-efficientnet-shared-input/SKILL.md
Combine EfficientNetB0..B6 into one Keras model with a shared image input and one sigmoid head per backbone, training all N models in a single fit() call on TPU
Training N separate CNN backbones (B0 through B6) in sequence leaves the TPU idle between runs and wastes checkpoint bandwidth. A cleaner pattern: build one Keras model with N parallel branches that share the same input tensor, give each branch its own EfficientNet backbone and sigmoid head, and compile with N copies of the loss. One fit() call then trains every branch with the same LR schedule on the same batches, and at inference a single predict() call returns N predictions per image, ready for ensembling. Used in top Kaggle SIIM-ISIC Melanoma solutions to fine-tune 7 EfficientNets in a single training run.
import tensorflow as tf
import efficientnet.tfkeras as efn


def multi_effnet(img_size, n_nets=7, label_smoothing=0.05):
    inp = tf.keras.Input(shape=(img_size, img_size, 3))
    # Identity layer so every backbone reads from the same shared tensor
    dummy = tf.keras.layers.Lambda(lambda x: x)(inp)
    outputs = []
    for i in range(n_nets):
        # Look up EfficientNetB0, EfficientNetB1, ... by name
        Ctor = getattr(efn, f'EfficientNetB{i}')
        x = Ctor(include_top=False, weights='noisy-student',
                 input_shape=(img_size, img_size, 3), pooling='avg')(dummy)
        x = tf.keras.layers.Dense(1, activation='sigmoid', name=f'head_b{i}')(x)
        outputs.append(x)
    model = tf.keras.Model(inp, outputs)
    # One loss per output head
    losses = [tf.keras.losses.BinaryCrossentropy(label_smoothing=label_smoothing)
              for _ in range(n_nets)]
    model.compile(optimizer='adam', loss=losses,
                  metrics=[tf.keras.metrics.AUC()])
    return model


# Assumes `ds` is a batched tf.data.Dataset of (image, label) pairs and
# `model = multi_effnet(img_size)` has been built for your image size.
# Broadcast a single label to one per head (7 must match n_nets)
ds = ds.map(lambda img, y: (img, tuple([y] * 7)))
model.fit(ds, epochs=10)
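As a rough illustration of what the compiled model minimizes: with one loss per output and no loss weights, Keras sums the per-output losses, so the objective is the sum of N label-smoothed binary cross-entropies against the same target. The NumPy sketch below mirrors that arithmetic on toy numbers. `smoothed_bce` and `total_loss` are hypothetical helper names, and the smoothing formula is assumed to match what Keras's `BinaryCrossentropy` applies.

```python
import numpy as np


def smoothed_bce(y_true, y_pred, label_smoothing=0.05, eps=1e-7):
    """Mean binary cross-entropy with label smoothing (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    # Pull hard 0/1 targets toward 0.5 by the smoothing amount (assumed formula).
    y_smooth = y_true * (1.0 - label_smoothing) + 0.5 * label_smoothing
    losses = -(y_smooth * np.log(y_pred)
               + (1.0 - y_smooth) * np.log(1.0 - y_pred))
    return float(losses.mean())


def total_loss(y_true, head_preds, label_smoothing=0.05):
    """Sum of per-head losses: every head is scored against the same labels."""
    return sum(smoothed_bce(y_true, p, label_smoothing) for p in head_preds)


# Toy batch of 4 labels and two heads' predicted probabilities.
y = np.array([0.0, 1.0, 1.0, 0.0])
head_preds = [np.array([0.1, 0.8, 0.7, 0.2]),
              np.array([0.3, 0.6, 0.9, 0.1])]
per_head = [smoothed_bce(y, p) for p in head_preds]
combined = total_loss(y, head_preds)
```

Because the heads share no weights, each backbone only receives gradient from its own term in the sum; the shared input is what lets them train on identical batches.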
1. Create a single Input layer and pass it through a Lambda(x: x) so TF treats the input as shareable
2. Build each EfficientNet backbone with include_top=False, pooling='avg'
3. Attach a Dense(1, sigmoid) output to each backbone and collect the list
4. Build Model(inp, outputs) and compile with one loss per output
5. Map each label to tuple([y] * N) so every head sees the same target

Broadcasting labels with tuple([y] * N) is faster than building N datasets.
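Since a multi-output Keras model's predict() returns a list with one array per output, ensembling could be sketched as a plain mean over the heads. Simple averaging is an assumption here, as the source does not say how the heads were combined; `ensemble_mean` is a hypothetical helper, and the arrays below stand in for real model output.

```python
import numpy as np


def ensemble_mean(head_preds):
    """Average per-head probabilities into one score per image (hypothetical helper)."""
    # Each head yields shape (batch, 1); flatten, stack to (n_heads, batch),
    # then average over the head axis.
    stacked = np.stack([np.asarray(p).reshape(-1) for p in head_preds], axis=0)
    return stacked.mean(axis=0)


# Stand-in for `model.predict(images)` on a 3-head model with a batch of 2 images.
fake_preds = [np.array([[0.2], [0.9]]),
              np.array([[0.4], [0.7]]),
              np.array([[0.3], [0.8]])]
scores = ensemble_mean(fake_preds)
```

A weighted mean or rank averaging would slot into the same place if some heads validate better than others.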