Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

drugclaw/pharma-ml-tools

Name: pharma-ml-tools
Author: drugclaw

skills/pharma/pharma-ml-tools/SKILL.md

npx skillsauth add drugclaw/drugclaw pharma-ml-tools

Pharma ML Tools

Use this skill when the user asks for compound-library profiling, chemistry ML feature generation, medicinal-chemistry screening, or benchmark dataset preparation.

Typical triggers:

standardize and profile a compound library before QSAR or screening
featurize molecules with molfeat for downstream ML
pull benchmark-ready ADME, toxicity, DTI, or DDI datasets with PyTDC
apply medicinal-chemistry rules or alert filters before prioritization
compare scaffolds, duplicates, or diversity in a virtual-screening library
prepare a docking or QSAR campaign with better compound hygiene

Environment Check

which python3 || true
python3 - <<'PY'
mods = ["pandas", "numpy", "datamol", "molfeat", "medchem"]
for name in mods:
    try:
        __import__(name)
        print(f"{name}: ok")
    except Exception as exc:
        print(f"{name}: missing ({exc})")
try:
    import tdc
    print("PyTDC: ok")
except Exception as exc:
    print(f"PyTDC: missing ({exc})")
PY

If a requested module is missing, say so explicitly. Do not claim the screen, featurization, or dataset pull completed.
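The same check can be done from Python without actually importing the packages. This is a minimal sketch, assuming the module names listed in the environment check above (PyTDC imports as tdc); the helper name missing_modules is hypothetical and not part of the bundled templates.

```python
import importlib.util

# Module names taken from the environment check above; PyTDC imports as "tdc".
REQUIRED = ["pandas", "numpy", "datamol", "molfeat", "medchem", "tdc"]


def missing_modules(names):
    """Return the names that cannot be located, without importing them."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken parents or odd sys.modules entries
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    gaps = missing_modules(REQUIRED)
    print("missing:", ", ".join(gaps) if gaps else "none")
```

Because find_spec only locates a package, it will not catch an install that is present but fails at import time. The heredoc check above, which really imports each module, remains the stricter test.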

Bundled Assets

templates/datamol_library_profile.py
templates/molfeat_featurize.py
templates/pytdc_dataset_fetch.py
templates/medchem_screen.py

Preferred Workflow

Normalize the input table first and identify the exact SMILES column.
Run datamol_library_profile.py before building models so duplicates, invalid structures, and scaffold concentration are visible.
Use medchem_screen.py before large docking or QSAR jobs to flag problematic chemotypes.
Use molfeat_featurize.py when the user needs model-ready features rather than only descriptor summaries.
Use pytdc_dataset_fetch.py when the user needs reproducible public benchmark datasets rather than ad hoc CSV collection.
Keep outputs under a dedicated directory such as ./pharma_ml/.
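The first workflow step (normalize the table, pin down the SMILES column) might look roughly like the sketch below. It uses only the standard library; the candidate column names and both function names are assumptions for illustration, not something the bundled templates are known to do.

```python
import csv
import io

# Hypothetical candidate names; adjust to the headers actually seen in the input.
SMILES_CANDIDATES = ("smiles", "canonical_smiles", "smi")


def find_smiles_column(header):
    """Return the header entry matching a candidate name, ignoring case and padding."""
    matches = [col for col in header if col.strip().lower() in SMILES_CANDIDATES]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one SMILES column, found {matches}")
    return matches[0]


def normalize_rows(text):
    """Strip whitespace from every cell and drop rows whose SMILES cell is empty."""
    reader = csv.DictReader(io.StringIO(text))
    smiles_col = find_smiles_column(reader.fieldnames or []).strip()
    rows = []
    for row in reader:
        clean = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None  # ignore overflow cells from ragged rows
        }
        if clean.get(smiles_col):
            rows.append(clean)
    return smiles_col, rows
```

Failing loudly when zero or several columns match keeps the "exact SMILES column" explicit instead of silently guessing.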

Library Profiling With Datamol

python3 templates/datamol_library_profile.py \
  --input libraries/kinase_hits.csv \
  --smiles-column smiles \
  --id-column compound_id \
  --output pharma_ml/kinase_hits_profile.csv \
  --summary pharma_ml/kinase_hits_profile.json

Use this first for:

canonical SMILES and InChIKey generation
invalid structure detection
scaffold counts
molecular-property summaries before modeling
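To make "duplicates, invalid structures, and scaffold concentration" concrete, here is a small sketch of how those numbers can be derived once each row already has a structure key and a scaffold label. In practice those would come from datamol; here they are passed in as plain strings, the keys in the usage example are placeholders rather than real InChIKeys, and the function name and output field names are hypothetical (the bundled template's real output schema is not shown in this document).

```python
from collections import Counter, defaultdict


def profile_library(records):
    """Summarize (compound_id, structure_key, scaffold) tuples.

    A structure_key of None marks a row whose structure could not be parsed.
    """
    by_key = defaultdict(list)
    scaffolds = Counter()
    invalid = []
    for compound_id, structure_key, scaffold in records:
        if structure_key is None:
            invalid.append(compound_id)
            continue
        by_key[structure_key].append(compound_id)
        scaffolds[scaffold] += 1
    duplicates = {key: ids for key, ids in by_key.items() if len(ids) > 1}
    n_valid = sum(scaffolds.values())
    top_scaffold, top_count = scaffolds.most_common(1)[0] if scaffolds else (None, 0)
    return {
        "n_valid": n_valid,
        "invalid_ids": invalid,
        "duplicate_groups": duplicates,
        "top_scaffold": top_scaffold,
        "top_scaffold_share": top_count / n_valid if n_valid else 0.0,
    }


if __name__ == "__main__":
    demo = [
        ("c1", "KEY-A", "benzene"),
        ("c2", "KEY-A", "benzene"),
        ("c3", "KEY-B", "pyridine"),
        ("c4", None, None),
    ]
    print(profile_library(demo))
```

A high top_scaffold_share is the signal to watch before modeling: a library dominated by one scaffold tends to give optimistic random-split metrics.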

Molfeat Featurization

python3 templates/molfeat_featurize.py \
  --input libraries/kinase_hits.csv \
  --smiles-column smiles \
  --id-column compound_id \
  --featurizer ecfp \
  --output pharma_ml/kinase_hits_ecfp.csv \
  --summary pharma_ml/kinase_hits_ecfp.json

Supported baseline featurizers in the bundled template:

ecfp
maccs
rdkit2d

Use this for local QSAR, ranking, clustering, or embedding handoff.
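As an illustration of the ranking use case, the sketch below scores library compounds against a query by Tanimoto similarity over binary fingerprints. It assumes each fingerprint has already been reduced to the set of its on-bit indices (for example from an ecfp or maccs output row); the function names are hypothetical and the bit sets in the test data are toy values, not real fingerprints.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(query_bits, library):
    """Rank compounds by similarity to the query, most similar first.

    library maps compound_id -> set of on-bit indices. Ties are broken by id
    so the ordering is reproducible.
    """
    scored = [(tanimoto(query_bits, bits), cid) for cid, bits in library.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [(cid, score) for score, cid in ordered]
```

Tanimoto is only meaningful for bit-vector featurizers such as ecfp and maccs; continuous descriptors like rdkit2d would need a different distance.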

PyTDC Benchmark Datasets

python3 templates/pytdc_dataset_fetch.py \
  --task adme \
  --dataset Caco2_Wang \
  --split-method scaffold \
  --out-dir pharma_ml/caco2_wang

Good use cases:

ADME or toxicity baselines
DTI or DDI dataset retrieval
reproducible train/valid/test splits for benchmarking
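The idea behind a scaffold split can be sketched without any chemistry dependencies: compounds sharing a scaffold are kept together and whole groups are assigned to train, valid, or test. This is a simplified greedy version under the assumption that scaffold labels are already computed; it is not claimed to match PyTDC's own implementation, and the function name and default fractions are illustrative.

```python
from collections import defaultdict


def scaffold_group_split(scaffold_by_id, frac_train=0.8, frac_valid=0.1):
    """Assign whole scaffold groups to train/valid/test, largest groups first.

    scaffold_by_id maps compound_id -> scaffold label. Sorting is deterministic,
    so the same input always yields the same split.
    """
    groups = defaultdict(list)
    for compound_id, scaffold in scaffold_by_id.items():
        groups[scaffold].append(compound_id)
    ordered = sorted(
        (sorted(ids) for ids in groups.values()),
        key=lambda ids: (-len(ids), ids[0]),
    )
    n_total = len(scaffold_by_id)
    train_cap = frac_train * n_total
    valid_cap = frac_valid * n_total
    split = {"train": [], "valid": [], "test": []}
    for ids in ordered:
        if len(split["train"]) + len(ids) <= train_cap:
            split["train"].extend(ids)
        elif len(split["valid"]) + len(ids) <= valid_cap:
            split["valid"].extend(ids)
        else:
            split["test"].extend(ids)
    return split
```

Because no scaffold appears in more than one partition, test performance reflects generalization to unseen chemotypes rather than near-duplicates of training compounds.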

Medicinal-Chemistry Screening

python3 templates/medchem_screen.py \
  --input libraries/kinase_hits.csv \
  --smiles-column smiles \
  --id-column compound_id \
  --output pharma_ml/kinase_hits_medchem.csv \
  --summary pharma_ml/kinase_hits_medchem.json

Use this for:

Rule-of-Five and lead-like checks
alert-oriented library triage
quick pass/fail summaries before wet-lab nomination

Treat these filters as prioritization heuristics, not hard truth.
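For reference, a Rule-of-Five check reduces to four threshold comparisons once the descriptors are known. The sketch below assumes the descriptors were computed elsewhere (for example by RDKit or datamol) and uses the common convention of tolerating one violation; the class and function names are hypothetical and independent of how medchem_screen.py is implemented.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float   # molecular weight in daltons
    logp: float         # calculated octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donor count
    h_acceptors: int    # hydrogen-bond acceptor count


def ro5_violations(d):
    """List which of Lipinski's four criteria the compound exceeds."""
    checks = [
        ("mol_weight > 500", d.mol_weight > 500),
        ("logp > 5", d.logp > 5),
        ("h_donors > 5", d.h_donors > 5),
        ("h_acceptors > 10", d.h_acceptors > 10),
    ]
    return [label for label, failed in checks if failed]


def passes_ro5(d, max_violations=1):
    """Pass if the compound breaks at most max_violations criteria."""
    return len(ro5_violations(d)) <= max_violations
```

Returning the list of violated criteria, not just a boolean, fits the advice above: the reader can see why a compound was flagged and decide whether the heuristic applies.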

DiffDock Boundary

If the user asks for diffusion docking or deep pose generation, acknowledge that this runtime already includes docking-tools for Vina-style workflows, but DiffDock-class workflows require a heavier environment with PyTorch Geometric, model weights, and usually GPU acceleration. Do not pretend that support is bundled unless the environment is confirmed.

Output Expectations

Good answers should mention:

the exact input file and SMILES column
which template ran
valid versus invalid molecule counts
whether outputs are profiling, features, dataset splits, or medchem filters
what files were written
any module, network, or dataset-license caveats
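One way to make sure none of these items is forgotten is to collect them in a small record and render the answer from it. This is only a sketch; the RunReport class and its field names are hypothetical, and any values passed in are whatever the actual run produced.

```python
from dataclasses import dataclass, field


@dataclass
class RunReport:
    input_file: str
    smiles_column: str
    template: str
    output_kind: str  # "profiling", "features", "dataset splits", or "medchem filters"
    n_valid: int
    n_invalid: int
    files_written: list = field(default_factory=list)
    caveats: list = field(default_factory=list)

    def render(self):
        """Return a plain-text summary covering every expected item."""
        lines = [
            f"Input: {self.input_file} (SMILES column: {self.smiles_column})",
            f"Template: {self.template}",
            f"Output type: {self.output_kind}",
            f"Molecules: {self.n_valid} valid, {self.n_invalid} invalid",
            "Files written: " + (", ".join(self.files_written) or "none"),
            "Caveats: " + ("; ".join(self.caveats) or "none"),
        ]
        return "\n".join(lines)
```

An empty files_written list renders as "none", which makes it harder to imply that a run completed when nothing was written.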

Related Skills

For public APIs such as PubChem, ChEMBL, openFDA, ClinicalTrials.gov, or OpenAlex, activate pharma-db-tools. For RDKit descriptors, ADMET heuristics, DrugBank, QSAR, or structure-aware affinity, activate chem-tools. For docking and pose-level workflows, activate docking-tools.

drugclaw/pharma-ml-tools

skills/pharma/pharma-ml-tools/SKILL.md

Pharmaceutical machine-learning workflow guide for library profiling, molecular featurization, benchmark dataset fetch, medicinal-chemistry filtering, and optional pose-generation handoff. Use when the user asks for datamol, molfeat, PyTDC, medchem, compound-library triage, dataset preparation, or chemistry-ML baselines beyond simple descriptor calculation.

52 stars

tools

Updated Apr 4, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add drugclaw/drugclaw pharma-ml-tools

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Clean, 95%
Semgrep: Static code analysis for vulnerabilities. Clean, 95%
mcp-scan (Snyk): Model Context Protocol security validation. Clean, 95%
Snyk (dep): Open source security scanning. Skipped, 50%
Socket.dev: Supply chain security analysis. Skipped, 50%
VirusTotal: Multi-engine malware detection. Skipped, 50%
CrowdStrike: Advanced threat intelligence. Skipped, 50%
OSV-Scanner: Open Source Vulnerability database check. Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 4, 2026, 6:25 PM (24.7s, 5 files scanned)

SKILL.md

name: pharma-ml-tools
description: Pharmaceutical machine-learning workflow guide for library profiling, molecular featurization, benchmark dataset fetch, medicinal-chemistry filtering, and optional pose-generation handoff. Use when the user asks for datamol, molfeat, PyTDC, medchem, compound-library triage, dataset preparation, or chemistry-ML baselines beyond simple descriptor calculation.
source: drugclaw
updated_at: 2026-03-11

Related Skills

drugclaw/survival-analysis-tools

tools

Survival and time-to-event workflow guide for Kaplan-Meier summaries, log-rank tests, and Cox proportional hazards models with reproducible outputs. Use when the user asks for time-to-event analysis, censored data summaries, hazard ratios, or survival-group comparison for research datasets.

52 stars, SKILL.md, Updated Apr 4, 2026

drugclaw/stat-modeling-tools

tools

Statistical modeling workflow guide for hypothesis tests, effect-size reporting, statsmodels regression, diagnostics, and structured result export. Use when the user asks for statistical test selection, OLS or logistic regression, coefficient tables, inference, or reproducible statistical summaries for scientific datasets.

52 stars, SKILL.md, Updated Apr 4, 2026

drugclaw/scientific-workflow-tools

tools

Research-method workflow guide for hypothesis framing, peer-review style critique, reproducibility planning, study-design checks, and scientific-writing structure. Use when the user asks for manuscript critique, research-gap framing, hypothesis generation, reproducibility checklists, or study-planning support that should stay on the research side rather than patient-care decisions.

52 stars, SKILL.md, Updated Apr 4, 2026

drugclaw/scientific-visualization-tools

tools

Scientific visualization workflow guide for publication-ready static figures with seaborn or matplotlib and interactive figures with Plotly. Use when the user asks for scientific plots, cohort or assay figures, publication graphics, dashboards, or reusable plotting scripts for research datasets.

52 stars, SKILL.md, Updated Apr 4, 2026

Install manually

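# Clone the repo
git clone https://github.com/drugclaw/drugclaw.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into the Claude Code skills folder (global)
cp -r drugclaw/skills/pharma/pharma-ml-tools ~/.claude/skills/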

Repository

drugclaw/drugclaw

52 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT