bjornmelin/notebook-ml-architect

Name: notebook-ml-architect
Author: bjornmelin

skills/notebook-ml-architect/SKILL.md

npx skillsauth add bjornmelin/dev-skills notebook-ml-architect

Notebook ML Architect

Expert guidance for production-quality ML notebooks.

Quick Reference

| Operation | Use Case |
|-----------|----------|
| audit | Analyze notebook for anti-patterns, leakage, reproducibility issues |
| refactor | Transform notebook into modular Python pipeline |
| template | Generate new notebook from EDA/classification/experiment template |
| report | Create markdown summary from executed notebook |
| convert | Extract Python script from notebook |

Audit Workflow

When auditing a notebook:

1. Read the notebook using the Read tool
2. Check structure against ml-workflow-guide.md
3. Detect anti-patterns using anti-patterns.md
4. Check for data leakage using leakage-checklist.md
5. Run the analysis script if deeper inspection is needed:

python scripts/analyze_notebook.py <notebook.ipynb>

Audit Checklist

[ ] Execution order: Cell execution counts are sequential (no gaps, no out-of-order runs)
[ ] Random seeds: Set early (np.random.seed, torch.manual_seed, random.seed)
[ ] Imports at top: All imports in first code cell(s)
[ ] No hardcoded paths: Use relative paths or config variables
[ ] Train/test split: Clear separation before any modeling
[ ] No data leakage: Preprocessing fit after the split on training data only, no test data peeking
[ ] Modularization: Functions/classes for reusable logic
[ ] Dependencies documented: requirements.txt or environment.yml referenced
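
The execution-order item can be checked mechanically, because a notebook file is JSON and each code cell stores an `execution_count`. The following is a minimal sketch of that one check, not the skill's own analysis script. The function name `check_execution_order` and the sample notebook are hypothetical.

```python
def check_execution_order(notebook: dict) -> list[str]:
    """Return one message per code cell whose execution count breaks the 1, 2, 3, ... sequence."""
    problems = []
    expected = 1
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        count = cell.get("execution_count")
        if count is None:
            # The cell was never run (or its outputs were cleared).
            problems.append(f"cell {index}: never executed")
            continue
        if count != expected:
            problems.append(f"cell {index}: execution_count {count}, expected {expected}")
        # Resynchronize so a single gap is reported once, not for every later cell.
        expected = count + 1
    return problems


# Hypothetical notebook: the second code cell was run out of sequence.
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "execution_count": 1, "source": ["import numpy as np"]},
        {"cell_type": "code", "execution_count": 3, "source": ["x = 1"]},
    ]
}
problems = check_execution_order(sample_notebook)
print(problems)
```

For a real file, load the JSON with `json.load` and pass the resulting dictionary to the function.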

Severity Levels

CRITICAL: Data leakage, missing train/test split, results unreproducible
HIGH: No seeds, hardcoded paths, execution order issues
MEDIUM: Missing modularization, no dependency docs
LOW: Naming conventions, missing comments, style issues
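
One way to apply these levels is to attach a severity to each finding and report the most serious ones first. The sketch below assumes a set of issue keys (`data_leakage`, `no_seeds`, and so on) invented here for illustration. Only the level assignments come from the list above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Issue keys are hypothetical; the level for each follows the severity list above.
SEVERITY_BY_ISSUE = {
    "data_leakage": Severity.CRITICAL,
    "missing_train_test_split": Severity.CRITICAL,
    "unreproducible_results": Severity.CRITICAL,
    "no_seeds": Severity.HIGH,
    "hardcoded_paths": Severity.HIGH,
    "execution_order": Severity.HIGH,
    "missing_modularization": Severity.MEDIUM,
    "no_dependency_docs": Severity.MEDIUM,
    "naming_conventions": Severity.LOW,
    "missing_comments": Severity.LOW,
    "style": Severity.LOW,
}


@dataclass(frozen=True)
class Finding:
    issue: str
    detail: str

    @property
    def severity(self) -> Severity:
        return SEVERITY_BY_ISSUE[self.issue]


def sort_findings(findings: list[Finding]) -> list[Finding]:
    """Order findings from most to least severe, keeping input order within a level."""
    return sorted(findings, key=lambda finding: finding.severity, reverse=True)


findings = [
    Finding("style", "inconsistent variable names"),
    Finding("data_leakage", "scaler fit before the split"),
    Finding("no_seeds", "train_test_split without random_state"),
]
for finding in sort_findings(findings):
    print(f"{finding.severity.name}: {finding.detail}")
```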

Refactoring Guide

Transform notebooks into production pipelines:

Step 1: Identify Sections

Look for markdown headers that indicate logical sections:

Data loading
Preprocessing
Feature engineering
Model definition
Training
Evaluation
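
Section boundaries can be located by scanning markdown cells for header lines. This is a rough sketch that works on the notebook's JSON structure directly. The helper names `cell_source` and `find_sections` are hypothetical, and the regular expression only covers `#`-style headers.

```python
import re

# Matches markdown ATX headers such as "## Data loading".
HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def cell_source(cell: dict) -> str:
    """Notebook cells store source as a string or as a list of lines; normalize to a string."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def find_sections(notebook: dict) -> list[tuple[int, int, str]]:
    """Return (cell_index, header_level, title) for every markdown header in the notebook."""
    sections = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue
        for line in cell_source(cell).splitlines():
            match = HEADER_RE.match(line)
            if match:
                sections.append((index, len(match.group(1)), match.group(2)))
    return sections


# Hypothetical notebook with two sections.
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["## Data loading\n"]},
        {"cell_type": "code", "execution_count": 1, "source": ["df = load()\n"]},
        {"cell_type": "markdown", "source": "## Preprocessing\nDrop missing rows."},
    ]
}
sections = find_sections(sample_notebook)
print(sections)
```

The code cells between two consecutive header indices are candidates for one module or function in the refactored pipeline.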

Step 2: Extract Functions

Convert repeated or complex cell code into functions:

# Before: inline code
import pandas as pd

df = pd.read_csv('data.csv')
df = df.dropna()
df['feature'] = df['a'] * df['b']

# After: function
def load_and_prepare_data(path: str) -> pd.DataFrame:
    """Load the CSV at `path`, drop rows with missing values, and add the interaction feature."""
    df = pd.read_csv(path)
    df = df.dropna()
    df['feature'] = df['a'] * df['b']
    return df

Step 3: Create Module Structure

project/
├── data.py          # Data loading and preprocessing
├── features.py      # Feature engineering
├── model.py         # Model definition
├── train.py         # Training loop
├── evaluate.py      # Evaluation metrics
├── config.py        # Configuration parameters
└── main.py          # Pipeline entry point

Step 4: Use convert script

python scripts/convert_to_script.py notebook.ipynb output.py --group-by-sections

Template Generation

Generate new notebooks from templates:

Available Templates

EDA Template (assets/templates/eda_template.ipynb)
- Data loading, basic info, missing values, distributions, correlations
Classification Template (assets/templates/classification_template.ipynb)
- Full supervised learning pipeline with evaluation metrics
Experiment Template (assets/templates/experiment_template.ipynb)
- Parameterized notebook for experiment tracking

Using Templates

Copy template to project and customize:

cp ~/.claude/skills/notebook-ml-architect/assets/templates/classification_template.ipynb ./my_experiment.ipynb

Or generate programmatically with modifications.

Reproducibility Checklist

Required Elements

Random Seeds

Use the reproducibility header snippet:

# Copy from assets/snippets/reproducibility_header.py
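
The contents of `reproducibility_header.py` are not shown here, so the following is only a sketch of what a seeding header commonly looks like, built from the three calls named in the audit checklist. The function name `set_seed` is an assumption, and NumPy and PyTorch are seeded only if they are installed.

```python
import os
import random


def set_seed(seed: int = 42) -> None:
    """Seed the random number generators a typical ML notebook relies on."""
    random.seed(seed)
    # PYTHONHASHSEED only affects interpreters started after it is set,
    # so this line matters for subprocesses, not for the current kernel.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


set_seed(42)
first_draw = random.random()
set_seed(42)
second_draw = random.random()
print(first_draw == second_draw)
```

Calling the function in the first code cell, before any data is shuffled or any model is initialized, keeps later cells from drawing unseeded random numbers.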

Environment Capture

import sys
print(f"Python: {sys.version}")
for pkg in ['numpy', 'pandas', 'sklearn', 'torch']:
    try:
        mod = __import__(pkg)
        print(f"{pkg}: {mod.__version__}")
    except ImportError:
        pass

Dependency File

pip freeze > requirements.txt
# Or for conda:
conda env export > environment.yml

Data Versioning
- Record data source, download date, preprocessing steps
- Use relative paths from project root
- Consider DVC for large datasets
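
For datasets small enough not to need DVC, the items above can be captured in a small manifest stored next to the data. This is a sketch under stated assumptions: the function names and manifest field names are invented for illustration, and a SHA-256 checksum is added as one way to detect silent changes to the file.

```python
import hashlib
from datetime import date
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large datasets do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(
    data_path: Path,
    project_root: Path,
    source: str,
    download_date: date,
    preprocessing_steps: list[str],
) -> dict:
    """Describe one data file: where it came from, when, and what was done to it."""
    relative_path = data_path.resolve().relative_to(project_root.resolve())
    return {
        "path": relative_path.as_posix(),  # relative to the project root, not absolute
        "source": source,
        "download_date": download_date.isoformat(),
        "sha256": file_sha256(data_path),
        "preprocessing_steps": preprocessing_steps,
    }
```

Writing the returned dictionary with `json.dump` to a file such as `data/manifest.json` (a hypothetical location) lets a later run compare checksums before training.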

MCP Tool Usage

Context7 - Library API Lookups

When you need accurate API information:

1. Call resolve-library-id with library name
2. Call get-library-docs with the returned ID and topic

Examples:

sklearn train_test_split parameters
papermill execute_notebook options
nbformat cell structure

Exa Search - Current Best Practices

When you need up-to-date recommendations:

Use web_search_exa for discovery
Use crawling_exa to pull full content from good URLs
Use deep_search_exa for focused queries

Examples:

"PyTorch reproducibility best practices 2024"
"How to handle class imbalance"
"MLflow notebook integration"

GitHub Search - Real-World Patterns

When you need to see how others do it:

searchGitHub with:
- query: specific code pattern
- language: ["Python"]
- path: ".ipynb" for notebooks

Examples:

Production notebook seeding patterns
Evaluation metric implementations
Config management in notebooks

Script Reference

analyze_notebook.py

Parse notebook and extract structure:

python scripts/analyze_notebook.py <notebook.ipynb> [--output json|text]

Output includes:

Cell counts by type
Import statements
Function/class definitions
Detected issues
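
To make the output fields concrete, here is a rough sketch of how such a summary could be derived from notebook JSON with the standard library. It is not the implementation of `analyze_notebook.py`. The function name `summarize_notebook` and the returned keys are assumptions, and dropping lines that start with `%` or `!` is a simple heuristic for IPython magics.

```python
import ast
from collections import Counter


def summarize_notebook(notebook: dict) -> dict:
    """Collect cell counts, imported modules, definitions, and parse problems."""
    cell_counts = Counter()
    imports, definitions, issues = [], [], []
    for index, cell in enumerate(notebook.get("cells", [])):
        cell_type = cell.get("cell_type", "unknown")
        cell_counts[cell_type] += 1
        if cell_type != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        # Magics and shell escapes are not valid Python, so skip those lines.
        code = "\n".join(
            line for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        )
        try:
            tree = ast.parse(code)
        except SyntaxError as exc:
            issues.append(f"cell {index}: could not parse ({exc.msg})")
            continue
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                imports.extend(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom):
                imports.append(node.module or ".")
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                definitions.append(node.name)
    return {
        "cell_counts": dict(cell_counts),
        "imports": sorted(set(imports)),
        "definitions": definitions,
        "issues": issues,
    }


# Hypothetical notebook with one unparseable cell.
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Demo"]},
        {"cell_type": "code", "source": ["import numpy as np\n", "from sklearn.model_selection import train_test_split\n"]},
        {"cell_type": "code", "source": ["def load(path):\n", "    return path\n", "\n", "class Model:\n", "    pass\n"]},
        {"cell_type": "code", "source": ["%matplotlib inline\n", "x = (\n"]},
    ]
}
summary = summarize_notebook(sample_notebook)
print(summary)
```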

run_notebook.py

Execute notebook with parameters:

python scripts/run_notebook.py input.ipynb output.ipynb \
  --params '{"learning_rate": 0.01, "epochs": 100}' \
  --timeout 3600

convert_to_script.py

Extract Python from notebook:

python scripts/convert_to_script.py notebook.ipynb output.py \
  --include-markdown \
  --group-by-sections \
  --add-main

Common Issues and Fixes

Data Leakage

Problem: Preprocessing on full dataset before split

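# BAD
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # Fits on all data, so test-set statistics leak into training
X_train, X_test = train_test_split(X_scaled)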

Fix: Split first, fit on train only

# GOOD
X_train, X_test = train_test_split(X)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)  # Transform only
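
The same rule can be shown without scikit-learn. The sketch below is a dependency-free illustration of "fit on train, transform test", using synthetic data and hypothetical helper names (`fit_standardizer`, `transform`). It uses the population standard deviation, which is also what `StandardScaler` computes.

```python
import random
import statistics


def fit_standardizer(values: list[float]) -> tuple[float, float]:
    """Learn the mean and population standard deviation from the given values only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return mean, std or 1.0  # guard against division by zero for constant data


def transform(values: list[float], mean: float, std: float) -> list[float]:
    """Apply previously learned statistics; nothing is re-estimated here."""
    return [(value - mean) / std for value in values]


# Synthetic data standing in for one feature column.
rng = random.Random(42)
data = [rng.gauss(10.0, 2.0) for _ in range(100)]
rng.shuffle(data)
train, test = data[:80], data[80:]

mean, std = fit_standardizer(train)        # statistics come from the training split only
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)   # the test split reuses the training statistics

print(round(statistics.fmean(train_scaled), 6), round(statistics.pstdev(train_scaled), 6))
```

The scaled training split has mean 0 and standard deviation 1 by construction. The scaled test split generally does not, which is expected: it is measured on the training data's scale.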

Hidden State

Problem: Variables from previous runs affect results

# Cell 1 run multiple times
results.append(model.score(X_test, y_test))  # results grows each run

Fix: Initialize state in cell

results = []  # Always start fresh
results.append(model.score(X_test, y_test))

Missing Seeds

Problem: Different results each run

X_train, X_test, y_train, y_test = train_test_split(X, y)  # Random each time

Fix: Set seeds explicitly

SEED = 42
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=SEED)

Expert guidance for auditing, refactoring, and designing machine learning Jupyter notebooks with production-quality patterns. Use when: (1) Analyzing notebook structure and identifying anti-patterns, (2) Detecting data leakage and reproducibility issues, (3) Refactoring messy notebooks into modular pipelines, (4) Generating templates for ML workflows (EDA, classification, experiments), (5) Adding reproducibility instrumentation (seeding, logging, env capture), (6) Converting notebooks to Python scripts, (7) Generating experiment summary reports. Triggers on: ML notebook, Jupyter audit, notebook refactor, data leakage, experiment template, ipynb best practices, notebook to script, reproducibility.

2 stars

development

Updated Jun 11, 2026

npx skillsauth add bjornmelin/dev-skills notebook-ml-architect

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Listed percentage |
|---------|-------------|--------|-------------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Jun 11, 2026, 3:47 AM (160.6s, 13 files scanned)

Install manually

# Clone the repo
git clone https://github.com/bjornmelin/dev-skills.git

# Copy into Claude Code skills folder (global)
cp -r dev-skills/skills/notebook-ml-architect ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

bjornmelin/dev-skills

2 stars

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT