skills/43-wentorai-research-plugins/skills/domains/ai-ml/keras-deep-learning/SKILL.md
Build and debug deep learning models with Keras and TensorFlow backend
npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research keras-deep-learningInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Keras is the high-level deep learning API that ships as part of TensorFlow 2.x and is the recommended interface for building, training, and deploying neural networks. Its Sequential and Functional APIs provide a progressive disclosure of complexity: beginners can stack layers in minutes, while researchers can build arbitrary DAG architectures, custom training loops, and multi-output models with the same framework.
This guide covers practical patterns for academic research with Keras, from image classification and sequence modeling to custom loss functions and experiment reproducibility. The focus is on patterns that appear repeatedly in published work -- data loading pipelines, callback orchestration, hyperparameter search, and model introspection -- rather than toy examples.
Keras is particularly strong in rapid prototyping for research papers. Its integration with TensorBoard, Weights & Biases, and tf.data pipelines makes it straightforward to go from idea to reproducible experiment to publication-quality results.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# Image classification baseline
model = keras.Sequential([
layers.Input(shape=(224, 224, 3)),
layers.Rescaling(1.0 / 255),
layers.Conv2D(32, 3, activation="relu", padding="same"),
layers.BatchNormalization(),
layers.MaxPooling2D(2),
layers.Conv2D(64, 3, activation="relu", padding="same"),
layers.BatchNormalization(),
layers.MaxPooling2D(2),
layers.Conv2D(128, 3, activation="relu", padding="same"),
layers.GlobalAveragePooling2D(),
layers.Dropout(0.3),
layers.Dense(256, activation="relu"),
layers.Dense(10, activation="softmax"),
])
model.compile(
optimizer=keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-4),
loss="sparse_categorical_crossentropy",
metrics=["accuracy"],
)
# Multi-input model for multimodal research
image_input = keras.Input(shape=(224, 224, 3), name="image")
text_input = keras.Input(shape=(128,), dtype="int32", name="text")
# Image branch
x_img = keras.applications.EfficientNetV2B0(
include_top=False, weights="imagenet", input_tensor=image_input
).output
x_img = layers.GlobalAveragePooling2D()(x_img)
# Text branch
x_txt = layers.Embedding(10000, 128)(text_input)
x_txt = layers.Bidirectional(layers.LSTM(64))(x_txt)
# Merge
merged = layers.Concatenate()([x_img, x_txt])
merged = layers.Dense(256, activation="relu")(merged)
merged = layers.Dropout(0.4)(merged)
output = layers.Dense(5, activation="softmax", name="classification")(merged)
model = keras.Model(inputs=[image_input, text_input], outputs=output)
Efficient data loading is critical for GPU utilization in research experiments:
def build_dataset(file_pattern, batch_size=32, training=True):
"""Build a tf.data pipeline with augmentation for research experiments."""
dataset = tf.data.Dataset.list_files(file_pattern, shuffle=training)
def parse_image(path):
img = tf.io.read_file(path)
img = tf.image.decode_jpeg(img, channels=3)
img = tf.image.resize(img, [256, 256])
label = tf.strings.split(path, os.sep)[-2]
return img, label
dataset = dataset.map(parse_image, num_parallel_calls=tf.data.AUTOTUNE)
if training:
dataset = dataset.shuffle(1000)
dataset = dataset.map(
lambda x, y: (tf.image.random_flip_left_right(x), y),
num_parallel_calls=tf.data.AUTOTUNE,
)
dataset = dataset.batch(batch_size)
dataset = dataset.prefetch(tf.data.AUTOTUNE)
return dataset
import os
import random
import numpy as np
def set_seed(seed=42):
"""Ensure reproducibility across runs for paper results."""
os.environ["PYTHONHASHSEED"] = str(seed)
random.seed(seed)
np.random.seed(seed)
tf.random.set_seed(seed)
set_seed(42)
callbacks = [
keras.callbacks.ModelCheckpoint(
"best_model.keras", monitor="val_loss", save_best_only=True
),
keras.callbacks.EarlyStopping(
monitor="val_loss", patience=10, restore_best_weights=True
),
keras.callbacks.ReduceLROnPlateau(
monitor="val_loss", factor=0.5, patience=5, min_lr=1e-6
),
keras.callbacks.TensorBoard(log_dir="./logs", histogram_freq=1),
keras.callbacks.CSVLogger("training_log.csv"),
]
history = model.fit(
train_dataset,
validation_data=val_dataset,
epochs=100,
callbacks=callbacks,
)
@tf.function
def train_step(model, optimizer, x, y, loss_fn):
with tf.GradientTape() as tape:
predictions = model(x, training=True)
loss = loss_fn(y, predictions)
gradients = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))
return loss
# Custom metric tracking
train_loss = keras.metrics.Mean(name="train_loss")
for epoch in range(num_epochs):
train_loss.reset_state()
for x_batch, y_batch in train_dataset:
loss = train_step(model, optimizer, x_batch, y_batch, loss_fn)
train_loss.update_state(loss)
print(f"Epoch {epoch+1}, Loss: {train_loss.result():.4f}")
| Issue | Symptom | Solution |
|-------|---------|----------|
| Exploding gradients | Loss becomes NaN | Add gradient clipping, reduce learning rate |
| Overfitting | Val loss diverges from train loss | Add Dropout, data augmentation, weight decay |
| Underfitting | Both losses plateau high | Increase model capacity, reduce regularization |
| Slow training | Low GPU utilization | Use tf.data with prefetch, increase batch size |
| Memory errors | OOM on GPU | Reduce batch size, use mixed precision |
| Non-deterministic results | Different results per run | Call set_seed(), set TF_DETERMINISTIC_OPS=1 |
# Enable mixed precision for 2x speedup on modern GPUs
keras.mixed_precision.set_global_policy("mixed_float16")
# Ensure the output layer uses float32 for numerical stability
output = layers.Dense(10, activation="softmax", dtype="float32")(x)
tensorflow, keras, numpy, and cuda versions in your paper appendix.keras.utils.set_random_seed(42) for full determinism (TF 2.12+)..keras format (not HDF5) for forward compatibility.tf.debugging.enable_check_numerics() during development to catch NaN/Inf early.tf.saved_model for deployment; export ONNX for cross-framework comparison.development
Conduct rigorous thematic analysis (TA) of qualitative data following Braun and Clarke's (2006) six-phase framework. Use whenever the user mentions 'thematic analysis', 'TA', 'Braun and Clarke', 'qualitative coding', 'identifying themes', or asks for help analysing interviews, focus groups, open-ended survey responses, or transcripts to identify patterns. Also trigger for questions about inductive vs theoretical coding, semantic vs latent themes, essentialist vs constructionist epistemology, building a thematic map, or writing up a qualitative findings section. Covers all six phases, the four upfront analytic decisions, the 15-point quality checklist, and the five common pitfalls. Produces a Word document write-up and an annotated thematic map. Does NOT cover IPA, grounded theory, discourse analysis, conversation analysis, or narrative analysis — use a different method for those.
development
Guide users through writing a systematic literature review (SLR) following the PRISMA 2020 framework. Use this skill whenever the user mentions 'systematic review', 'systematic literature review', 'SLR', 'PRISMA', 'PRISMA 2020', 'PRISMA flow diagram', 'PRISMA checklist', or asks for help writing, structuring, or auditing a literature review that follows reporting guidelines. Also trigger when the user asks about inclusion/exclusion criteria for a review, search strategies for databases like Scopus/WoS/PubMed, study selection processes, risk of bias assessment, or narrative synthesis for a review paper. This skill covers the full PRISMA 2020 checklist (27 items), produces a Word document manuscript in strict journal article format, generates an annotated PRISMA flow diagram, and enforces APA 7th Edition referencing throughout. It does NOT cover meta-analysis or statistical pooling. By Chuah Kee Man.
testing
Performs placebo-in-time sensitivity analysis with hierarchical null model and optional Bayesian assurance. Use when checking model robustness, verifying lack of pre-intervention effects, or estimating study power.
data-ai
Fit, summarize, plot, and interpret a chosen CausalPy experiment. Use after the causal method has been selected, including when configuring PyMC/sklearn models and scale-aware custom priors.