Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

aipoch/TorchDrug (English)

Name: TorchDrug (English)
Author: aipoch

scientific-skills/Data Analysis/TorchDrug-English/SKILL.md

npx skillsauth add aipoch/medical-research-skills TorchDrug (English)

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

TorchDrug

When to Use

Use this skill when you need pytorch-native graph neural network framework for molecules and proteins. suitable for building custom gnn architectures for drug discovery, protein modeling, or knowledge graph reasoning. best for custom model development, protein property prediction, and retrosynthesis. if you need pretrained models and diverse feature extractors, use deepchem; if you need benchmark datasets, use pytdc in a reproducible workflow.
Use this skill when a data analytics task needs a packaged method instead of ad-hoc freeform output.
Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
Use this skill when the documented workflow in this package is the most direct path to complete the request.
Use this skill when you need the TorchDrug (English) package behavior rather than a generic answer.

Key Features

Scope-focused workflow aligned to: PyTorch-native Graph Neural Network framework for molecules and proteins. Suitable for building custom GNN architectures for drug discovery, protein modeling, or knowledge graph reasoning. Best for custom model development, protein property prediction, and retrosynthesis. If you need pretrained models and diverse feature extractors, use deepchem; if you need benchmark datasets, use pytdc.
Documentation-first workflow with no packaged script requirement.
Reference material available in references/ for task-specific guidance.
Structured execution path designed to keep outputs consistent and reviewable.

Dependencies

Python: 3.10+. Repository baseline for current packaged skills.
Third-party packages: not explicitly version-pinned in this skill package. Add pinned versions if this skill needs stricter environment control.

Example Usage

Skill directory: 20260316/scientific-skills/Data Analytics/TorchDrug-English
No packaged executable script was detected.
Use the documented workflow in SKILL.md together with the references/assets in this folder.

Example run plan:

Read the skill instructions and collect the required inputs.
Follow the documented workflow exactly.
Use packaged references/assets from this folder when the task needs templates or rules.
Return a structured result tied to the requested deliverable.

Implementation Details

See ## Overview above for related details.

Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
Primary implementation surface: instruction-only workflow in SKILL.md.
Reference guidance: references/ contains supporting rules, prompts, or checklists.
Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.

Overview

TorchDrug is a PyTorch-based comprehensive machine learning toolbox designed for drug discovery and molecular science. It applies graph neural networks, pretrained models, and task definitions to molecules, proteins, and biological knowledge graphs, covering molecular property prediction, protein modeling, knowledge graph reasoning, molecular generation, retrosynthesis planning, and more, including 40+ curated datasets and 20+ model architectures.

When to Use This Skill

Use this skill when dealing with the following:

Data Types:

SMILES strings or molecular structures
Protein sequences or 3D structures (PDB files)
Reactions and retrosynthesis
Biomedical knowledge graphs
Drug discovery datasets

Tasks:

Predict molecular properties (solubility, toxicity, activity)
Protein function or structure prediction
Drug-target binding prediction
Generate new molecular structures
Plan chemical synthesis routes
Link prediction in biomedical knowledge bases
Train graph neural networks on scientific data

Libraries and Integration:

TorchDrug as core library
Often with RDKit for cheminformatics
PyTorch and PyTorch Lightning compatibility
Integrated AlphaFold and ESM for proteins

Getting Started

Installation

pip install torchdrug

Or install full version with optional dependencies

pip install torchdrug[full]

Quick Example

from torchdrug import datasets, models, tasks
from torch.utils.data import DataLoader
import torch

# Load molecular dataset
dataset = datasets.BBBP("~/molecule-datasets/")
train_set, valid_set, test_set = dataset.split()

# Define GNN model
model = models.GIN(
    input_dim=dataset.node_feature_dim,
    hidden_dims=[256, 256, 256],
    edge_input_dim=dataset.edge_feature_dim,
    batch_norm=True,
    readout="mean"
)

# Create property prediction task
task = tasks.PropertyPrediction(
    model,
    task=dataset.tasks,
    criterion="bce",
    metric=["auroc", "auprc"]
)

# Train with PyTorch
optimizer = torch.optim.Adam(task.parameters(), lr=1e-3)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

for epoch in range(100):
    for batch in train_loader:
        loss = task(batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Core Capabilities

1. Molecular Property Prediction

Predict chemical, physical, and biological properties from molecular structures.

Use Cases:

Drug-likeness and ADMET properties
Toxicity screening
Quantum chemical properties
Binding affinity prediction

Core Components:

20+ molecular datasets (BBBP, HIV, Tox21, QM9, etc.)
GNN models (GIN, GAT, SchNet)
PropertyPrediction and MultipleBinaryClassification tasks

Reference: See molecular_property_prediction.md

2. Protein Modeling

Process protein sequences, structures, and properties.

Use Cases:

Enzyme function prediction
Protein stability and solubility
Subcellular localization
Protein-protein interactions
Structure prediction

Core Components:

15+ protein datasets (EnzymeCommission, GeneOntology, PDBBind, etc.)
Sequence models (ESM, ProteinBERT, ProteinLSTM)
Structure models (GearNet, SchNet)
Multiple task types for different prediction levels

Reference: See protein_modeling.md

3. Knowledge Graph Reasoning

Predict missing links and relations in biomedical knowledge graphs.

Use Cases:

Drug repurposing
Disease mechanism discovery
Gene-disease associations
Multi-hop biomedical reasoning

Core Components:

General KGs (FB15k, WN18) and biomedical KGs (Hetionet)
Embedding models (TransE, RotatE, ComplEx)
KnowledgeGraphCompletion task

Reference: See knowledge_graphs.md

4. Molecular Generation

Generate novel molecular structures with desired properties.

Use Cases:

De novo drug design
Lead compound optimization
Chemical space exploration
Property-directed generation

Core Components:

Autoregressive generation
GCPN (policy-based generation)
GraphAutoregressiveFlow
Property optimization workflows

Reference: See molecular_generation.md

5. Retrosynthesis

Predict synthesis routes from target molecules to starting materials.

Use Cases:

Synthesis planning
Route optimization
Synthesizability assessment
Multi-step planning

Core Components:

USPTO-50k reaction dataset
CenterIdentification (reaction center prediction)
SynthonCompletion (reactant prediction)
End-to-end retrosynthesis pipeline

Reference: See retrosynthesis.md

6. Graph Neural Network Models

Comprehensive catalog of GNN architectures for different data types and tasks.

Available Models:

General GNN: GCN, GAT, GIN, RGCN, MPNN
3D-aware: SchNet, GearNet
Protein-specific: ESM, ProteinBERT, GearNet
Knowledge graphs: TransE, RotatE, ComplEx, SimplE
Generative: GraphAutoregressiveFlow

Reference: See models_architectures.md

7. Datasets

40+ curated datasets covering chemistry, biology, and knowledge graphs.

Categories:

Molecular properties (drug discovery, quantum chemistry)
Protein properties (function, structure, interactions)
Knowledge graphs (general and biomedical)
Retrosynthesis reactions

Reference: See datasets.md

Common Workflows

Workflow 1: Molecular Property Prediction

Scenario: Predict blood-brain barrier permeability for drug candidates. Steps:

Load dataset: datasets.BBBP()
Choose model: GNN for molecular graphs (e.g., GIN)
Define task: PropertyPrediction with binary classification
Train using scaffold split for realistic evaluation
Evaluate with AUROC and AUPRC

Navigation: references/molecular_property_prediction.md → Dataset Selection → Model Selection → Training

Workflow 2: Protein Function Prediction

Scenario: Predict enzyme function from sequence. Steps:

Load dataset: datasets.EnzymeCommission()
Choose model: pretrained ESM or GearNet with structure
Define task: PropertyPrediction with multi-class classification
Finetune pretrained model or train from scratch
Evaluate with accuracy and per-class metrics

Navigation: references/protein_modeling.md → Model Selection (Sequence vs Structure) → Pretraining Strategies

Workflow 3: Drug Repurposing via Knowledge Graph

Scenario: Find new disease treatments in Hetionet. Steps:

Load dataset: datasets.Hetionet()
Choose model: RotatE or ComplEx
Define task: KnowledgeGraphCompletion
Train with negative sampling
Query predictions for compound-treats-disease
Filter by plausibility and mechanism

Navigation: references/knowledge_graphs.md → Hetionet Dataset → Model Selection → Biomedical Applications

aipoch/TorchDrug (English)

scientific-skills/Data Analysis/TorchDrug-English/SKILL.md

PyTorch-native Graph Neural Network framework for molecules and proteins. Suitable for building custom GNN architectures for drug discovery, protein modeling, or knowledge graph reasoning. Best for custom model development, protein property prediction, and retrosynthesis. If you need pretrained models and diverse feature extractors, use deepchem; if you need benchmark datasets, use pytdc.

37 stars

development

Updated Mar 26, 2026

$ install --global

skillsauth

npx skillsauth add aipoch/medical-research-skills TorchDrug (English)

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

70%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 5, 2026, 9:43 AM212.6s1 file scanned

SKILL.md

name:: TorchDrug (English)
description:: PyTorch-native Graph Neural Network framework for molecules and proteins. Suitable for building custom GNN architectures for drug discovery, protein modeling, or knowledge graph reasoning. Best for custom model development, protein property prediction, and retrosynthesis. If you need pretrained models and diverse feature extractors, use deepchem; if you need benchmark datasets, use pytdc.
license:: MIT
skill-author:: AIPOCH

TorchDrug

When to Use

Use this skill when you need pytorch-native graph neural network framework for molecules and proteins. suitable for building custom gnn architectures for drug discovery, protein modeling, or knowledge graph reasoning. best for custom model development, protein property prediction, and retrosynthesis. if you need pretrained models and diverse feature extractors, use deepchem; if you need benchmark datasets, use pytdc in a reproducible workflow.
Use this skill when a data analytics task needs a packaged method instead of ad-hoc freeform output.
Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
Use this skill when the documented workflow in this package is the most direct path to complete the request.
Use this skill when you need the TorchDrug (English) package behavior rather than a generic answer.

Key Features

Scope-focused workflow aligned to: PyTorch-native Graph Neural Network framework for molecules and proteins. Suitable for building custom GNN architectures for drug discovery, protein modeling, or knowledge graph reasoning. Best for custom model development, protein property prediction, and retrosynthesis. If you need pretrained models and diverse feature extractors, use deepchem; if you need benchmark datasets, use pytdc.
Documentation-first workflow with no packaged script requirement.
Reference material available in references/ for task-specific guidance.
Structured execution path designed to keep outputs consistent and reviewable.

Dependencies

Python: 3.10+. Repository baseline for current packaged skills.
Third-party packages: not explicitly version-pinned in this skill package. Add pinned versions if this skill needs stricter environment control.

Example Usage

Skill directory: 20260316/scientific-skills/Data Analytics/TorchDrug-English
No packaged executable script was detected.
Use the documented workflow in SKILL.md together with the references/assets in this folder.

Example run plan:

Read the skill instructions and collect the required inputs.
Follow the documented workflow exactly.
Use packaged references/assets from this folder when the task needs templates or rules.
Return a structured result tied to the requested deliverable.

Implementation Details

See ## Overview above for related details.

Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
Primary implementation surface: instruction-only workflow in SKILL.md.
Reference guidance: references/ contains supporting rules, prompts, or checklists.
Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.

Overview

When to Use This Skill

Use this skill when dealing with the following:

Data Types:

SMILES strings or molecular structures
Protein sequences or 3D structures (PDB files)
Reactions and retrosynthesis
Biomedical knowledge graphs
Drug discovery datasets

Tasks:

Predict molecular properties (solubility, toxicity, activity)
Protein function or structure prediction
Drug-target binding prediction
Generate new molecular structures
Plan chemical synthesis routes
Link prediction in biomedical knowledge bases
Train graph neural networks on scientific data

Libraries and Integration:

TorchDrug as core library
Often with RDKit for cheminformatics
PyTorch and PyTorch Lightning compatibility
Integrated AlphaFold and ESM for proteins

Getting Started

Installation

pip install torchdrug

Or install full version with optional dependencies

pip install torchdrug[full]

Quick Example

from torchdrug import datasets, models, tasks
from torch.utils.data import DataLoader
import torch

# Load molecular dataset
dataset = datasets.BBBP("~/molecule-datasets/")
train_set, valid_set, test_set = dataset.split()

# Define GNN model
model = models.GIN(
    input_dim=dataset.node_feature_dim,
    hidden_dims=[256, 256, 256],
    edge_input_dim=dataset.edge_feature_dim,
    batch_norm=True,
    readout="mean"
)

# Create property prediction task
task = tasks.PropertyPrediction(
    model,
    task=dataset.tasks,
    criterion="bce",
    metric=["auroc", "auprc"]
)

# Train with PyTorch
optimizer = torch.optim.Adam(task.parameters(), lr=1e-3)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

for epoch in range(100):
    for batch in train_loader:
        loss = task(batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Core Capabilities

1. Molecular Property Prediction

Predict chemical, physical, and biological properties from molecular structures.

Use Cases:

Drug-likeness and ADMET properties
Toxicity screening
Quantum chemical properties
Binding affinity prediction

Core Components:

20+ molecular datasets (BBBP, HIV, Tox21, QM9, etc.)
GNN models (GIN, GAT, SchNet)
PropertyPrediction and MultipleBinaryClassification tasks

Reference: See molecular_property_prediction.md

2. Protein Modeling

Process protein sequences, structures, and properties.

Use Cases:

Enzyme function prediction
Protein stability and solubility
Subcellular localization
Protein-protein interactions
Structure prediction

Core Components:

15+ protein datasets (EnzymeCommission, GeneOntology, PDBBind, etc.)
Sequence models (ESM, ProteinBERT, ProteinLSTM)
Structure models (GearNet, SchNet)
Multiple task types for different prediction levels

Reference: See protein_modeling.md

3. Knowledge Graph Reasoning

Predict missing links and relations in biomedical knowledge graphs.

Use Cases:

Drug repurposing
Disease mechanism discovery
Gene-disease associations
Multi-hop biomedical reasoning

Core Components:

General KGs (FB15k, WN18) and biomedical KGs (Hetionet)
Embedding models (TransE, RotatE, ComplEx)
KnowledgeGraphCompletion task

Reference: See knowledge_graphs.md

4. Molecular Generation

Generate novel molecular structures with desired properties.

Use Cases:

De novo drug design
Lead compound optimization
Chemical space exploration
Property-directed generation

Core Components:

Autoregressive generation
GCPN (policy-based generation)
GraphAutoregressiveFlow
Property optimization workflows

Reference: See molecular_generation.md

5. Retrosynthesis

Predict synthesis routes from target molecules to starting materials.

Use Cases:

Synthesis planning
Route optimization
Synthesizability assessment
Multi-step planning

Core Components:

USPTO-50k reaction dataset
CenterIdentification (reaction center prediction)
SynthonCompletion (reactant prediction)
End-to-end retrosynthesis pipeline

Reference: See retrosynthesis.md

6. Graph Neural Network Models

Comprehensive catalog of GNN architectures for different data types and tasks.

Available Models:

General GNN: GCN, GAT, GIN, RGCN, MPNN
3D-aware: SchNet, GearNet
Protein-specific: ESM, ProteinBERT, GearNet
Knowledge graphs: TransE, RotatE, ComplEx, SimplE
Generative: GraphAutoregressiveFlow

Reference: See models_architectures.md

7. Datasets

40+ curated datasets covering chemistry, biology, and knowledge graphs.

Categories:

Molecular properties (drug discovery, quantum chemistry)
Protein properties (function, structure, interactions)
Knowledge graphs (general and biomedical)
Retrosynthesis reactions

Reference: See datasets.md

Common Workflows

Workflow 1: Molecular Property Prediction

Scenario: Predict blood-brain barrier permeability for drug candidates. Steps:

Load dataset: datasets.BBBP()
Choose model: GNN for molecular graphs (e.g., GIN)
Define task: PropertyPrediction with binary classification
Train using scaffold split for realistic evaluation
Evaluate with AUROC and AUPRC

Navigation: references/molecular_property_prediction.md → Dataset Selection → Model Selection → Training

Workflow 2: Protein Function Prediction

Scenario: Predict enzyme function from sequence. Steps:

Load dataset: datasets.EnzymeCommission()
Choose model: pretrained ESM or GearNet with structure
Define task: PropertyPrediction with multi-class classification
Finetune pretrained model or train from scratch
Evaluate with accuracy and per-class metrics

Navigation: references/protein_modeling.md → Model Selection (Sequence vs Structure) → Pretraining Strategies

Workflow 3: Drug Repurposing via Knowledge Graph

Scenario: Find new disease treatments in Hetionet. Steps:

Load dataset: datasets.Hetionet()
Choose model: RotatE or ComplEx
Define task: KnowledgeGraphCompletion
Train with negative sampling
Query predictions for compound-treats-disease
Filter by plausibility and mechanism

Navigation: references/knowledge_graphs.md → Hetionet Dataset → Model Selection → Biomedical Applications

Related Skills

aipoch/conventional-oncology-hub-gene

tools

VerifiedTrustedCommunity

Generates complete conventional oncology bulk-transcriptome biomarker and hub-gene research designs from a user-provided cancer type and study direction. Always use this skill whenever a user wants to design, plan, or build a tumor bioinformatics study centered on differential expression, prognostic filtering or risk modeling, PPI-based hub-gene prioritization, diagnostic/prognostic evaluation, clinical association, immune infiltration context, methylation context, and optional tissue or cell validation. Covers five study patterns (signature-first prognostic workflow, hub-gene-first biomarker workflow, hybrid signature-to-hub workflow, immune-context biomarker workflow, translational validation workflow) and always outputs four workload configs (Lite / Standard / Advanced / Publication+) with recommended primary plan, step-by-step workflow, figure plan, validation strategy, minimal executable version, publication upgrade path...

348SKILL.mdUpdated Apr 28, 2026

aipoch/conventional-oncology-hub-gene

aipoch/conventional-non-oncology-hub-gene

development

VerifiedTrustedCommunity

Generates complete conventional non-oncology bioinformatics research designs from a user-provided disease context, process-related gene family or biological theme, and validation direction. Use when a study centers on multi-dataset bulk transcriptome integration, DEG analysis, process-gene intersection, enrichment analysis, GSEA, PPI hub-gene prioritization, TF/miRNA regulatory networks, ROC-based biomarker evaluation, and immune infiltration analysis. Covers five study patterns (process-DEG discovery, enrichment/GSEA interpretation, hub-gene prioritization, regulatory-network and immune interpretation, multi-layer public validation) and always outputs Lite / Standard / Advanced / Publication+ with a recommended primary plan, stepwise workflow, figure plan, validation hierarchy, minimal executable version, publication upgrade path, and strictly verified literature retrieval.

348SKILL.mdUpdated Apr 28, 2026

aipoch/conventional-non-oncology-hub-gene

aipoch/confounder-and-bias-control-planner

tools

VerifiedTrustedCommunity

Plans confounder control, variable adjustment logic, and bias mitigation strategies at the protocol stage for clinical, epidemiologic, translational, observational, and biomarker studies. Always use this skill when a user needs to identify major confounders, decide which variables should or should not be adjusted for, compare matching/stratification/weighting approaches, anticipate selection or measurement bias, or pressure-test a study design before execution. Focus on bias sensing, causal structure awareness, variable-role classification, and critical design review rather than generic statistical advice.

348SKILL.mdUpdated Apr 28, 2026

aipoch/confounder-and-bias-control-planner

aipoch/comparative-network-toxicology-shared-mechanism-reference-grounded

testing

VerifiedTrustedCommunity

Generates complete comparative network-toxicology research designs from a user-provided exposure pair, shared toxic phenotype, and validation direction. Use when a study centers on two related exposures under one outcome and needs target collection, shared-vs-specific target decomposition, enrichment, PPI hub prioritization, docking, optional transcriptomic cross-checks, and conservative mechanistic synthesis. Covers five study patterns and always outputs Lite / Standard / Advanced / Publication+ with a recommended primary plan, stepwise workflow, figure plan, validation hierarchy, minimal executable version, publication upgrade path, and strictly verified literature retrieval.

348SKILL.mdUpdated Apr 28, 2026

aipoch/comparative-network-toxicology-shared-mechanism-reference-grounded

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/aipoch/medical-research-skills.git

# Copy into Claude Code skills folder (global)
cp -r medical-research-skills/scientific-skills/Data Analysis/TorchDrug-English ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

aipoch/medical-research-skills

37 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT