.claude/skills/scientific-skills/SKILL.md
Comprehensive scientific research toolkit with 139 specialized skills for biology, chemistry, medicine, data science, and computational research. Transforms Claude into an AI research assistant with access to scientific databases, analysis tools, and domain-specific workflows.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add oimiragieo/agent-studio scientific-skills
A comprehensive collection of 139 ready-to-use scientific skills that transform Claude into an AI research assistant capable of executing complex multi-step scientific workflows across biology, chemistry, medicine, and related fields.
Invoke this skill when:
// Invoke the main skill catalog
Skill({ skill: 'scientific-skills' });
// Or invoke specific sub-skills directly
Skill({ skill: 'scientific-skills/rdkit' }); // Cheminformatics
Skill({ skill: 'scientific-skills/scanpy' }); // Single-cell analysis
Skill({ skill: 'scientific-skills/biopython' }); // Bioinformatics
Skill({ skill: 'scientific-skills/literature-review' }); // Literature review
| Skill | Description |
| ------------------------- | --------------------------------------- |
| pubchem | Chemical compound database |
| chembl-database | Bioactivity database for drug discovery |
| uniprot-database | Protein sequence and function database |
| pdb | Protein Data Bank structures |
| drugbank-database | Drug and drug target information |
| kegg | Pathway and genome database |
| clinvar-database | Clinical variant interpretations |
| cosmic-database | Cancer mutation database |
| ensembl-database | Genome browser and annotations |
| geo-database | Gene expression data |
| gwas-database | Genome-wide association studies |
| reactome-database | Biological pathways |
| string-database | Protein-protein interactions |
| alphafold-database | Protein structure predictions |
| biorxiv-database | Preprint server for biology |
| clinicaltrials-database | Clinical trial registry |
| ena-database | European Nucleotide Archive |
| fda-database | FDA drug approvals and labels |
| gene-database | Gene information from NCBI |
| zinc-database | Commercially available compounds |
| brenda-database | Enzyme database |
| clinpgx-database | Pharmacogenomics annotations |
| uspto-database | Patent database |
| Skill | Description |
| ----------------------------------- | ---------------------------- |
| rdkit | Cheminformatics toolkit |
| scanpy | Single-cell RNA-seq analysis |
| anndata | Annotated data matrices |
| biopython | Computational biology tools |
| pytorch-lightning | Deep learning framework |
| scikit-learn | Machine learning library |
| transformers | NLP and deep learning models |
| pandas / polars / vaex | Data manipulation |
| matplotlib / seaborn / plotly | Visualization |
| deepchem | Deep learning for chemistry |
| esm | Evolutionary Scale Modeling |
| datamol | Molecular data processing |
| pymatgen | Materials science |
| qiskit | Quantum computing |
| pymoo | Multi-objective optimization |
| statsmodels | Statistical modeling |
| sympy | Symbolic mathematics |
| networkx | Network analysis |
| geopandas | Geospatial analysis |
| shap | Model explainability |
| Skill | Description |
| ------------------ | ------------------------------- |
| gget | Gene and transcript information |
| pysam | SAM/BAM file manipulation |
| deeptools | NGS data analysis |
| pydeseq2 | Differential expression |
| scvi-tools | Deep learning for single-cell |
| etetoolkit | Phylogenetic analysis |
| scikit-bio | Bioinformatics algorithms |
| bioservices | Web services for biology |
| cellxgene-census | Cell atlas exploration |
| Skill | Description |
| ----------- | ------------------------- |
| rdkit | Molecular manipulation |
| datamol | Molecular data handling |
| molfeat | Molecular featurization |
| diffdock | Molecular docking |
| torchdrug | Drug discovery ML |
| pytdc | Therapeutics Data Commons |
| cobrapy | Metabolic modeling |
| Skill | Description |
| ----------------------- | ----------------------------- |
| literature-review | Systematic literature reviews |
| scientific-writing | Academic writing assistance |
| scientific-schematics | AI-generated figures |
| scientific-slides | Presentation generation |
| hypothesis-generation | Hypothesis development |
| venue-templates | Journal-specific formatting |
| citation-management | Reference management |
| Skill | Description |
| --------------------------- | ------------------------- |
| clinical-decision-support | Clinical reasoning |
| clinical-reports | Medical report generation |
| treatment-plans | Treatment planning |
| pyhealth | Healthcare ML |
| pydicom | Medical imaging |
| Skill | Description |
| ----------------------- | ------------------------ |
| benchling-integration | Lab informatics platform |
| dnanexus-integration | Genomics cloud platform |
| pylabrobot | Laboratory automation |
| flowio | Flow cytometry data |
| omero-integration | Bioimaging platform |
# 7-phase systematic literature review
# 1. Planning with PICO framework
# 2. Multi-database search execution
# 3. Screening with PRISMA flow
# 4. Data extraction and quality assessment
# 5. Thematic synthesis
# 6. Citation verification
# 7. PDF generation
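
The screening phase (step 3) is largely bookkeeping: a PRISMA flow reports how many records were identified, de-duplicated, screened, and included. Below is a minimal sketch of that tally in plain Python. It assumes records are dictionaries with hypothetical `doi` and `title` keys and that the relevance check is supplied by the caller. None of these names come from the skill itself.

```python
from dataclasses import dataclass


@dataclass
class PrismaCounts:
    identified: int
    duplicates_removed: int
    screened: int
    excluded: int
    included: int


def screen_records(records, is_relevant):
    """Deduplicate by DOI (case-insensitive), then screen with is_relevant."""
    seen = set()
    unique = []
    for rec in records:
        key = rec["doi"].strip().lower()
        if key in seen:
            continue
        seen.add(key)
        unique.append(rec)

    included = [rec for rec in unique if is_relevant(rec)]
    counts = PrismaCounts(
        identified=len(records),
        duplicates_removed=len(records) - len(unique),
        screened=len(unique),
        excluded=len(unique) - len(included),
        included=len(included),
    )
    return included, counts


# Hypothetical records merged from two database searches.
records = [
    {"doi": "10.1000/a1", "title": "Kinase inhibitor screening"},
    {"doi": "10.1000/A1", "title": "Kinase inhibitor screening"},  # duplicate
    {"doi": "10.1000/b2", "title": "Soil microbiome survey"},
]
included, counts = screen_records(
    records, lambda rec: "kinase" in rec["title"].lower()
)
print(counts)
```

Keeping the counts in one structure makes it straightforward to carry them into the final report alongside the included records.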
# Using RDKit + ChEMBL + datamol
from rdkit import Chem
from rdkit.Chem import Descriptors, AllChem
# 1. Query ChEMBL for bioactivity data
# 2. Calculate molecular properties
# 3. Filter by drug-likeness (Lipinski)
# 4. Similarity screening
# 5. Substructure analysis
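
Steps 3 and 4 reduce to simple arithmetic once descriptors and fingerprints are available. The sketch below shows that arithmetic in plain Python, without calling RDKit. In a real run the descriptor values would presumably come from RDKit functions such as `Descriptors.MolWt` and `Descriptors.MolLogP`. Here they are hand-entered, illustrative numbers. The thresholds are the usual rule-of-five limits. Tolerating one violation is a common convention and an assumption here, as is modelling fingerprints as sets of on-bit indices.

```python
def lipinski_violations(desc):
    """Count rule-of-five violations for one compound's descriptor dict."""
    checks = [
        desc["mol_wt"] > 500,       # molecular weight in daltons
        desc["logp"] > 5,           # octanol-water partition coefficient
        desc["h_donors"] > 5,       # hydrogen-bond donors
        desc["h_acceptors"] > 10,   # hydrogen-bond acceptors
    ]
    return sum(checks)


def is_drug_like(desc, max_violations=1):
    """Assumed convention: allow at most one rule-of-five violation."""
    return lipinski_violations(desc) <= max_violations


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0
    return len(bits_a & bits_b) / len(union)


# Illustrative descriptor values (not computed here).
compounds = {
    "small_example": {"mol_wt": 180.2, "logp": 1.2, "h_donors": 1, "h_acceptors": 4},
    "large_example": {"mol_wt": 1202.6, "logp": 3.0, "h_donors": 5, "h_acceptors": 12},
}
drug_like = [name for name, desc in compounds.items() if is_drug_like(desc)]
print(drug_like)  # ['small_example']

# Hypothetical fingerprints as sets of on-bit indices.
query = {1, 4, 7, 9}
candidate = {1, 4, 8, 9, 12}
print(tanimoto(query, candidate))  # 0.5
```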
# Using scanpy + anndata
import scanpy as sc
# 1. Load and QC data
# 2. Normalization and feature selection
# 3. Dimensionality reduction (PCA, UMAP)
# 4. Clustering (Leiden algorithm)
# 5. Marker gene identification
# 6. Cell type annotation
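
The normalization in step 2 typically means scaling each cell's counts to a common total and then log-transforming. The sketch below spells out that arithmetic in plain Python on a tiny hypothetical matrix. scanpy offers the equivalent operations on AnnData objects through `sc.pp.normalize_total` and `sc.pp.log1p`. The target total of 10,000 and the helper name `normalize_and_log` are assumptions for illustration.

```python
import math


def normalize_and_log(counts, target_sum=10_000):
    """Scale each cell (row) to target_sum total counts, then apply log1p."""
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # Leave empty cells as zeros instead of dividing by zero.
            normalized.append([0.0] * len(cell))
            continue
        scale = target_sum / total
        normalized.append([math.log1p(value * scale) for value in cell])
    return normalized


# Hypothetical raw counts: 3 cells (rows) x 4 genes (columns).
raw_counts = [
    [10, 0, 5, 5],
    [100, 0, 50, 50],
    [0, 0, 0, 0],
]
normalized = normalize_and_log(raw_counts)
```

The first two cells differ tenfold in sequencing depth but have the same gene proportions, so they end up with identical normalized profiles. Removing that depth effect before clustering is the purpose of the step.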
# 8-step systematic process
# 1. Understand phenomenon
# 2. Literature search
# 3. Synthesize evidence
# 4. Generate competing hypotheses
# 5. Evaluate quality
# 6. Design experiments
# 7. Formulate predictions
# 8. Generate report
Each sub-skill follows a consistent structure:
scientific-skills/
├── SKILL.md # This file (catalog/index)
├── skills/ # Individual skill directories
│ ├── rdkit/
│ │ ├── SKILL.md # Skill documentation
│ │ ├── references/ # API references, patterns
│ │ └── scripts/ # Example scripts
│ ├── scanpy/
│ ├── biopython/
│ └── ... (139 total)
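
Because every sub-skill sits in its own directory with a `SKILL.md`, the catalog can be enumerated straight from the file system. The sketch below assumes the layout shown above, rooted at a caller-supplied path. `list_skills` is a hypothetical helper, not part of the toolkit.

```python
from pathlib import Path
import tempfile


def list_skills(root):
    """Return names of directories under root/skills that contain a SKILL.md."""
    skills_dir = Path(root) / "skills"
    return sorted(path.parent.name for path in skills_dir.glob("*/SKILL.md"))


# Demonstration against a throwaway directory that mimics the layout.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("rdkit", "scanpy", "biopython"):
        skill_dir = Path(tmp) / "skills" / name
        skill_dir.mkdir(parents=True)
        (skill_dir / "SKILL.md").write_text(f"# {name}\n", encoding="utf-8")
    # A directory without SKILL.md is ignored.
    (Path(tmp) / "skills" / "drafts").mkdir()

    found = list_skills(tmp)

print(found)  # ['biopython', 'rdkit', 'scanpy']
```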
// Invoke specific skill
Skill({ skill: 'scientific-skills/rdkit' });
Skill({ skill: 'scientific-skills/scanpy' });
// Multi-skill workflow
Skill({ skill: 'scientific-skills/literature-review' });
Skill({ skill: 'scientific-skills/hypothesis-generation' });
Skill({ skill: 'scientific-skills/scientific-schematics' });
| Agent | Scientific Skills |
| -------------------- | ------------------------------------- |
| data-engineer | polars, dask, vaex, zarr-python |
| python-pro | All Python-based skills |
| database-architect | Database skills for schema design |
| technical-writer | literature-review, scientific-writing |
Task({
task_id: 'task-1',
subagent_type: 'python-pro',
description: 'Analyze molecular dataset with RDKit',
prompt: `You are the PYTHON-PRO agent with scientific research expertise.
## Task
Analyze the molecular dataset for drug-likeness properties.
## Skills to Invoke
1. Skill({ skill: "scientific-skills/rdkit" })
2. Skill({ skill: "scientific-skills/datamol" })
## Workflow
1. Load molecular data
2. Calculate descriptors
3. Apply Lipinski filters
4. Generate visualization
5. Report findings
`,
});
- skills/*/SKILL.md - Individual skill documentation
- skills/*/references/ - API references and patterns
- skills/*/scripts/ - Example scripts and templates

| Anti-Pattern | Why It Fails | Correct Approach |
| ----------------------------------------------------------- | ----------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- |
| Performing analysis without querying databases first | Missing context from existing literature duplicates known work and misses prior art | Query PubMed/ChEMBL/UniProt before analysis to ground work in existing scientific knowledge |
| Using a single tool for complex multi-domain analysis | Single-tool analysis misses domain boundary interdependencies | Chain multiple domain-specific skills (rdkit for chemistry, scanpy for single-cell, biopython for genomics) |
| Skipping intermediate visualization during data processing | Errors and outliers propagate silently from preprocessing to final results | Visualize data distribution and quality metrics after each major transformation step |
| Generating hypotheses without reviewing existing literature | Reinvents known solutions and ignores contradictory prior findings | Always invoke literature-review skill first; only generate hypotheses after reviewing existing evidence |
| Reporting findings without documenting analysis provenance | Research cannot be reproduced, verified, or extended by other researchers | Log all data sources, version numbers, parameters, and transformation steps in the research report |
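
The last anti-pattern asks for provenance logging. One possible shape for such a record is sketched below using only the standard library. The field names, the `build_provenance` helper, and the example parameters are all assumptions for illustration, not a format the toolkit prescribes.

```python
import json
import platform
from datetime import datetime, timezone
from importlib import metadata


def build_provenance(data_sources, parameters, packages=()):
    """Collect a JSON-serializable record describing how an analysis was run."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "data_sources": list(data_sources),
        "parameters": dict(parameters),
        "package_versions": versions,
    }


# Hypothetical inputs for a screening run.
record = build_provenance(
    data_sources=["ChEMBL query (hypothetical target)"],
    parameters={"max_lipinski_violations": 1, "similarity_cutoff": 0.7},
    packages=["rdkit", "scanpy"],
)
report_json = json.dumps(record, indent=2, sort_keys=True)
print(report_json)
```

Writing the record as JSON next to the report keeps sources, versions, and parameters in one machine-readable place.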
Before starting:
Read .claude/context/memory/learnings.md
After completing:
- .claude/context/memory/learnings.md
- .claude/context/memory/issues.md
- .claude/context/memory/decisions.md

ASSUME INTERRUPTION: If it's not in memory, it didn't happen.
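
The read-before, write-after protocol can be sketched as two small helpers. Only the file path comes from the text above. The function names and the timestamped-bullet entry format are assumptions, and the demonstration writes into a temporary directory rather than a real project.

```python
from datetime import datetime, timezone
from pathlib import Path
import tempfile

MEMORY_DIR = Path(".claude/context/memory")


def read_learnings(base_dir="."):
    """Return the text of learnings.md, or an empty string if it does not exist."""
    path = Path(base_dir) / MEMORY_DIR / "learnings.md"
    return path.read_text(encoding="utf-8") if path.exists() else ""


def record_learning(note, base_dir="."):
    """Append one dated bullet to learnings.md, creating the file if needed."""
    path = Path(base_dir) / MEMORY_DIR / "learnings.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"- {stamp}: {note}\n")


with tempfile.TemporaryDirectory() as tmp:
    before = read_learnings(tmp)  # before starting: nothing recorded yet
    record_learning("Example learning (hypothetical)", tmp)
    after = read_learnings(tmp)   # after completing: one entry persisted
```

Appending rather than rewriting means an interrupted session still leaves earlier entries intact.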
MIT License - Open source and freely available for research and commercial use.