skills/nlp/tpu-distribution-strategy/SKILL.md
Auto-detects TPU vs CPU/GPU at runtime and wraps model construction in the appropriate TensorFlow distribution strategy with scaled batch size.
Kaggle and cloud environments may or may not have TPUs available. Instead of maintaining separate code paths, auto-detect the accelerator at startup and select the correct TF distribution strategy. Scale batch size by the number of replicas so each device gets a consistent local batch.
import tensorflow as tf

try:
    # TPUClusterResolver() raises ValueError when no TPU is reachable.
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(tpu)
    tf.tpu.experimental.initialize_tpu_system(tpu)
    strategy = tf.distribute.TPUStrategy(tpu)
except ValueError:
    strategy = tf.distribute.get_strategy()  # default strategy: CPU or single GPU

BATCH_SIZE = 16 * strategy.num_replicas_in_sync  # 16 per replica, scaled to a global batch

with strategy.scope():
    # build_model() is a user-supplied function that returns a Keras model.
    model = build_model()
    model.compile(optimizer="adam", loss="binary_crossentropy")
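The batch-size scaling can be checked with plain arithmetic. The sketch below is illustrative only: the helper names `global_batch_size` and `steps_per_epoch` and the dataset size of 10,000 are assumptions, not part of the skill. It compares a single-replica run with an 8-replica run at 16 samples per replica, assuming incomplete final batches are dropped.

```python
def global_batch_size(per_replica_batch: int, num_replicas: int) -> int:
    """Global batch = per-replica batch times the number of replicas in sync."""
    return per_replica_batch * num_replicas


def steps_per_epoch(num_examples: int, global_batch: int) -> int:
    """Full batches per epoch when the incomplete final batch is dropped."""
    return num_examples // global_batch


# Hypothetical dataset size, used only for illustration.
NUM_EXAMPLES = 10_000

for replicas in (1, 8):  # 1 = CPU or single GPU, 8 = TPU v3-8
    gb = global_batch_size(16, replicas)
    print(replicas, gb, steps_per_epoch(NUM_EXAMPLES, gb))
# prints:
# 1 16 625
# 8 128 78
```

Each replica still sees 16 samples per step in both cases; only the global batch and the number of steps per epoch change.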
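- `TPUClusterResolver()` looks for a TPU and raises `ValueError` if none exists, which triggers the fallback.
- When a TPU is found, the strategy is a `TPUStrategy`; otherwise the default strategy is used.
- The batch size is scaled by `num_replicas_in_sync` (8 for TPU v3-8).
- Build and compile the model inside `strategy.scope()`.
- Use `tf.data` with `drop_remainder=True` for TPU (requires fixed shapes).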
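The `tf.data` point might look like the following minimal sketch. It assumes TensorFlow 2.4 or later (for `tf.data.AUTOTUNE`); the toy arrays and the `make_dataset` helper are hypothetical and only illustrate how `drop_remainder=True` gives every batch a fixed leading dimension. It uses the default strategy so it runs anywhere; in practice the detected strategy would take its place.

```python
import numpy as np
import tensorflow as tf

PER_REPLICA_BATCH = 16
strategy = tf.distribute.get_strategy()  # stand-in for the detected strategy
global_batch = PER_REPLICA_BATCH * strategy.num_replicas_in_sync

# Hypothetical toy data: 100 samples, 8 features, binary labels.
features = np.random.rand(100, 8).astype("float32")
labels = np.random.randint(0, 2, size=(100, 1)).astype("float32")


def make_dataset(x, y, batch_size, training=True):
    """Build a batched pipeline whose batches all have the same static shape."""
    ds = tf.data.Dataset.from_tensor_slices((x, y))
    if training:
        ds = ds.shuffle(len(x)).repeat()
    # drop_remainder=True discards a short final batch, so the batch
    # dimension is statically known, as TPU compilation expects.
    ds = ds.batch(batch_size, drop_remainder=True)
    return ds.prefetch(tf.data.AUTOTUNE)


train_ds = make_dataset(features, labels, global_batch)
steps_per_epoch = len(features) // global_batch
print(train_ds.element_spec)
```

Because the training dataset repeats indefinitely, `steps_per_epoch` would need to be passed to `model.fit` so that Keras knows where an epoch ends.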