skills/notebook-ml-architect/SKILL.md
Expert guidance for auditing, refactoring, and designing machine learning Jupyter notebooks with production-quality patterns. Use when: (1) Analyzing notebook structure and identifying anti-patterns, (2) Detecting data leakage and reproducibility issues, (3) Refactoring messy notebooks into modular pipelines, (4) Generating templates for ML workflows (EDA, classification, experiments), (5) Adding reproducibility instrumentation (seeding, logging, env capture), (6) Converting notebooks to Python scripts, (7) Generating experiment summary reports. Triggers on: ML notebook, Jupyter audit, notebook refactor, data leakage, experiment template, ipynb best practices, notebook to script, reproducibility.
npx skillsauth add bjornmelin/dev-skills notebook-ml-architect
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Expert guidance for production-quality ML notebooks.
| Operation | Use Case |
|-----------|----------|
| audit | Analyze notebook for anti-patterns, leakage, reproducibility issues |
| refactor | Transform notebook into modular Python pipeline |
| template | Generate new notebook from EDA/classification/experiment template |
| report | Create markdown summary from executed notebook |
| convert | Extract Python script from notebook |
When auditing a notebook:
python scripts/analyze_notebook.py <notebook.ipynb>
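One reproducibility check an audit can include is whether code cells were executed top to bottom. The sketch below reads the parsed notebook JSON and flags code cells whose `execution_count` is missing or goes backwards. It is an illustration only, not the behavior of `scripts/analyze_notebook.py`; the names `find_execution_order_issues` and `audit_file` are hypothetical.

```python
import json
from pathlib import Path


def find_execution_order_issues(notebook: dict) -> list[str]:
    """Return warnings about cell execution order.

    `notebook` is the parsed JSON of an .ipynb file (nbformat 4 layout assumed).
    """
    issues = []
    last_count = 0
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have no execution count
        count = cell.get("execution_count")
        if count is None:
            issues.append(f"cell {index}: never executed")
        elif count < last_count:
            issues.append(
                f"cell {index}: executed out of order ({count} after {last_count})"
            )
        else:
            last_count = count
    return issues


def audit_file(path: str) -> list[str]:
    """Load a notebook from disk and run the check (hypothetical helper)."""
    notebook = json.loads(Path(path).read_text(encoding="utf-8"))
    return find_execution_order_issues(notebook)
```

An empty result suggests the notebook was run linearly; it says nothing about leakage or seeding, which need separate checks.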
Transform notebooks into production pipelines:
Look for markdown headers that indicate logical sections:
Convert repeated or complex cell code into functions:
# Before: inline code
import pandas as pd

df = pd.read_csv('data.csv')
df = df.dropna()
df['feature'] = df['a'] * df['b']

# After: function
def load_and_prepare_data(path: str) -> pd.DataFrame:
    """Load the CSV, drop rows with missing values, and add the derived feature."""
    df = pd.read_csv(path)
    df = df.dropna()
    df['feature'] = df['a'] * df['b']
    return df
project/
├── data.py # Data loading and preprocessing
├── features.py # Feature engineering
├── model.py # Model definition
├── train.py # Training loop
├── evaluate.py # Evaluation metrics
├── config.py # Configuration parameters
└── main.py # Pipeline entry point
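As a rough sketch of how `config.py` and `main.py` could fit together, assuming a frozen dataclass holds the settings: every field name below (`data_path`, `seed`, `test_size`) and the `run_pipeline` function are hypothetical placeholders, not part of this skill.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical settings object that would live in config.py."""

    data_path: str = "data.csv"
    seed: int = 42
    test_size: float = 0.2


def run_pipeline(config: Config) -> dict:
    """Hypothetical entry point that would live in main.py.

    A real pipeline would call into data.py, features.py, train.py and
    evaluate.py here; this stub only echoes the settings it was given.
    """
    return {"config": asdict(config), "status": "not implemented"}


if __name__ == "__main__":
    print(run_pipeline(Config()))
```

Keeping the settings in one immutable object makes it easy to log the exact configuration alongside each run's results.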
python scripts/convert_to_script.py notebook.ipynb output.py --group-by-sections
Generate new notebooks from templates:
EDA Template (assets/templates/eda_template.ipynb)
Classification Template (assets/templates/classification_template.ipynb)
Experiment Template (assets/templates/experiment_template.ipynb)
Copy template to project and customize:
cp ~/.claude/skills/notebook-ml-architect/assets/templates/classification_template.ipynb ./my_experiment.ipynb
Or generate programmatically with modifications.
Random Seeds
Use the reproducibility header snippet:
# Copy from assets/snippets/reproducibility_header.py
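The contents of `assets/snippets/reproducibility_header.py` are not reproduced here. As a minimal sketch of what such a header commonly does, assuming a hypothetical `set_seed` helper that seeds Python's `random` module and, when they are installed, NumPy and PyTorch:

```python
import os
import random


def set_seed(seed: int = 42) -> int:
    """Seed the common sources of randomness (hypothetical helper)."""
    # Only affects subprocesses started later; the hash seed of the
    # current interpreter is fixed at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)  # seeds NumPy's legacy global generator
    except ImportError:
        pass
    try:
        import torch

        torch.manual_seed(seed)
    except ImportError:
        pass
    return seed
```

Calling `set_seed(42)` in the first cell, and passing the same value as `random_state` to scikit-learn functions, keeps the seed in one place.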
Environment Capture
import sys
print(f"Python: {sys.version}")
for pkg in ['numpy', 'pandas', 'sklearn', 'torch']:
    try:
        mod = __import__(pkg)
        print(f"{pkg}: {mod.__version__}")
    except ImportError:
        pass
Dependency File
pip freeze > requirements.txt
# Or for conda:
conda env export > environment.yml
Data Versioning
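One lightweight way to record which data a run used, assuming no dedicated versioning tool is in place, is to log a checksum of each input file next to the results. The `file_sha256` helper below is a hypothetical name for that idea:

```python
import hashlib
from pathlib import Path


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in fixed-size chunks so large datasets do not need to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Printing `file_sha256('data.csv')` in the notebook's first cells lets a later reader confirm they are running against the same file.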
When you need accurate API information:
1. Call resolve-library-id with library name
2. Call get-library-docs with the returned ID and topic
Examples:
When you need up-to-date recommendations:
- web_search_exa for discovery
- crawling_exa to pull full content from good URLs
- deep_search_exa for focused queries
Examples:
When you need to see how others do it:
searchGitHub with:
- query: specific code pattern
- language: ["Python"]
- path: ".ipynb" for notebooks
Examples:
Parse notebook and extract structure:
python scripts/analyze_notebook.py <notebook.ipynb> [--output json|text]
Output includes:
Execute notebook with parameters:
python scripts/run_notebook.py input.ipynb output.ipynb \
--params '{"learning_rate": 0.01, "epochs": 100}' \
--timeout 3600
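How `scripts/run_notebook.py` handles its arguments is not shown. As a sketch, a command line of this shape could be parsed with `argparse`, decoding `--params` as JSON; the `parse_args` function and its defaults are assumptions.

```python
import argparse
import json


def parse_args(argv=None) -> argparse.Namespace:
    """Parse an `input output --params JSON --timeout N` command line (sketch)."""
    parser = argparse.ArgumentParser(
        description="Execute a notebook with parameters (sketch)."
    )
    parser.add_argument("input", help="path of the notebook to run")
    parser.add_argument("output", help="path for the executed copy")
    # json.loads turns the JSON string into a dict; invalid JSON raises a
    # ValueError subclass, which argparse reports as a usage error.
    parser.add_argument("--params", type=json.loads, default={})
    parser.add_argument("--timeout", type=int, default=3600, help="seconds")
    return parser.parse_args(argv)
```

Quoting the JSON in single quotes on the shell, as in the command above, keeps the inner double quotes intact.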
Extract Python from notebook:
python scripts/convert_to_script.py notebook.ipynb output.py \
--include-markdown \
--group-by-sections \
--add-main
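The core of a notebook-to-script conversion can be sketched with the standard library alone, since an `.ipynb` file is JSON with a `cells` list. This is an illustration, not the implementation of `scripts/convert_to_script.py`; `notebook_to_source`, `convert_file`, and the `include_markdown` parameter are hypothetical names.

```python
import json


def _cell_text(cell: dict) -> str:
    # nbformat stores `source` either as one string or as a list of lines
    source = cell.get("source", "")
    return source if isinstance(source, str) else "".join(source)


def notebook_to_source(notebook: dict, include_markdown: bool = False) -> str:
    """Join code cells into one script, optionally keeping markdown as comments."""
    parts = []
    for cell in notebook.get("cells", []):
        text = _cell_text(cell).rstrip()
        if not text:
            continue  # skip empty cells
        if cell.get("cell_type") == "code":
            parts.append(text)
        elif cell.get("cell_type") == "markdown" and include_markdown:
            commented = (f"# {line}".rstrip() for line in text.splitlines())
            parts.append("\n".join(commented))
    return "\n\n".join(parts) + "\n"


def convert_file(src: str, dst: str, include_markdown: bool = False) -> None:
    """Read a notebook from `src` and write the extracted script to `dst`."""
    with open(src, encoding="utf-8") as handle:
        notebook = json.load(handle)
    with open(dst, "w", encoding="utf-8") as handle:
        handle.write(notebook_to_source(notebook, include_markdown))
```

This sketch copies cell text verbatim, so IPython magics such as `%matplotlib inline` would still need to be removed or rewritten before the output runs as plain Python.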
Problem: Preprocessing on full dataset before split
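# BAD
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # Fits on all data, so test-set statistics leak into training
X_train, X_test = train_test_split(X_scaled)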
Fix: Split first, fit on train only
# GOOD
X_train, X_test = train_test_split(X)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test) # Transform only
Problem: Variables from previous runs affect results
# Cell 1 run multiple times
results.append(model.score(X_test, y_test)) # results grows each run
Fix: Initialize state in cell
results = [] # Always start fresh
results.append(model.score(X_test, y_test))
Problem: Different results each run
X_train, X_test, y_train, y_test = train_test_split(X, y)  # Random each time
Fix: Set seeds explicitly
SEED = 42
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=SEED)