plugin/skills/tooluniverse-systems-biology/SKILL.md
Systems biology and pathway analysis integrating Reactome, KEGG, WikiPathways, BioCarta, NCI-Nature Pathway Interaction Database. Multi-database pathway enrichment, protein-pathway relationships, network reasoning. Use for pathway analysis on a gene list, multi-source pathway concordance, and systems-level interpretation across databases.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add mims-harvard/tooluniverse tooluniverse-systems-biology
Comprehensive pathway and systems biology analysis that integrates multiple curated databases to provide a multi-dimensional view of biological systems, pathway enrichment, and protein-pathway relationships.
Triggers:
Use Cases:
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Pathway analysis answers: which biological processes are enriched in my gene list? But enrichment is not causation. A pathway being enriched means your gene list overlaps it more than expected by chance. Ask: is the enrichment driven by a few hub genes, or by many genes distributed across the pathway? A pathway with 3 input genes but 200 annotated members is less informative than one where 15 of 40 members are in your list.
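The contrast between a 3-of-200 and a 15-of-40 overlap can be made concrete with a one-sided hypergeometric (over-representation) test. The sketch below is illustrative only: the background size of 20,000 genes and the input list size of 100 genes are assumptions chosen for the example, not values from any database, and `hypergeom_sf` is a hypothetical helper name.

```python
from math import comb


def hypergeom_sf(k, K, n, N):
    """P(X >= k): chance of seeing at least k pathway genes in a list of n,
    drawn from a background of N genes of which K belong to the pathway."""
    upper = min(K, n)
    numer = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numer / comb(N, n)


N_BACKGROUND = 20000  # assumed background gene count
N_LIST = 100          # assumed size of the input gene list

# Pathway with 200 annotated members, 3 of them in the list
p_sparse = hypergeom_sf(3, 200, N_LIST, N_BACKGROUND)
# Pathway with 40 annotated members, 15 of them in the list
p_dense = hypergeom_sf(15, 40, N_LIST, N_BACKGROUND)

print(f"3 of 200 members: coverage {3 / 200:.1%}, p = {p_sparse:.3g}")
print(f"15 of 40 members: coverage {15 / 40:.1%}, p = {p_dense:.3g}")
```

Under these assumptions the 15-of-40 pathway has a far smaller p-value and far higher coverage than the 3-of-200 pathway. Real analyses also need multiple-testing correction across all pathways tested.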
LOOK UP DON'T GUESS: pathway membership, gene-to-pathway assignments, and enrichment statistics. Do not assume a gene is in a pathway — use Reactome, KEGG, or Enrichr to verify. Pathway databases disagree on membership; cross-validate key findings across at least two sources.
| Database | Strengths |
|----------|-----------|
| Reactome | Detailed mechanistic pathways with reactions; human-curated |
| KEGG | Metabolic maps, disease pathways, drug targets |
| WikiPathways | Emerging and community-curated pathways |
| Pathway Commons | Meta-database aggregating multiple sources |
| BioModels | Mathematical/computational SBML models |
| Enrichr | Statistical over-representation analysis |
Input → Phase 1: Enrichment → Phase 2: Protein Mapping → Phase 3: Keyword Search → Phase 4: Top Pathways → Report
When: Gene list provided (from experiments, screens, differentially expressed genes)
Objective: Identify biological pathways statistically over-represented in gene list
| Tool | Input | Use |
|------|-------|-----|
| ReactomeAnalysis_pathway_enrichment | identifiers (newline-separated symbols), page_size | FDR-corrected Reactome enrichment (recommended) |
| enrichr_gene_enrichment_analysis | gene_list (array), libs (array) | Over-representation with KEGG/Reactome/WikiPathways |
| STRING_functional_enrichment | protein_ids (array), species, category | Functional enrichment from PPI networks |
| intact_get_interactions | identifier (UniProt accession) | Binary protein interactions with evidence |
When: Protein UniProt ID provided
Objective: Map protein to all known pathways it participates in
Reactome_map_uniprot_to_pathways:
uniprot_id: UniProt accession (e.g., "P53350")

Reactome_get_pathway_reactions:
stId: Reactome pathway stable ID (e.g., "R-HSA-73817")

When: User provides keyword or biological process name
Objective: Search multiple pathway databases to find relevant pathways
| Tool | Key Params | Coverage |
|------|-----------|----------|
| kegg_search_pathway | keyword | Reference, metabolic, disease pathways |
| kegg_get_pathway_info | pathway_id (e.g., "hsa04930") | Detailed genes/compounds for a pathway |
| WikiPathways_search | query, organism | Community-curated, emerging pathways |
| PathwayCommons_search | action="search_pathways", keyword | Meta-database aggregating multiple sources |
| biomodels_search | query, limit | SBML computational models |
Search all databases in parallel. Group results by pathway concept. BioModels often returns empty — this is normal.
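Grouping hits by pathway concept can be sketched as a name-normalization step followed by bucketing. Everything here is an assumption for illustration: the hit shape (`source`, `id`, `name`), the placeholder IDs, and the normalization rule (lowercase, strip punctuation, drop the word "pathway"). Real tool responses will need their own field mapping.

```python
import re
from collections import defaultdict


def concept_key(name):
    """Normalize a pathway name so near-identical names share one key."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(w for w in words if w not in {"pathway", "pathways"})


def group_hits(hits):
    """Bucket search hits from all databases by normalized pathway name."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[concept_key(hit["name"])].append(hit)
    return dict(grouped)


# Placeholder hits; IDs are hypothetical, not real database identifiers.
hits = [
    {"source": "KEGG", "id": "EXAMPLE-K1", "name": "Apoptosis"},
    {"source": "WikiPathways", "id": "EXAMPLE-W1", "name": "Apoptosis pathway"},
    {"source": "PathwayCommons", "id": "EXAMPLE-P1", "name": "apoptosis"},
    {"source": "KEGG", "id": "EXAMPLE-K2", "name": "Glycolysis / Gluconeogenesis"},
]
sources_queried = ["KEGG", "WikiPathways", "PathwayCommons", "BioModels"]

grouped = group_hits(hits)
# Sources that returned nothing are listed explicitly rather than dropped.
empty_sources = [s for s in sources_queried if all(h["source"] != s for h in hits)]

for concept, members in grouped.items():
    print(concept, "<-", sorted({m["source"] for m in members}))
print("No results from:", empty_sources)
```

Simple string normalization will miss synonyms, so concepts supported by only one source may still deserve a manual check.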
When: Always included to provide context
Objective: Show major biological systems/pathways for organism
Reactome_list_top_pathways:
species (e.g., "Homo sapiens")

Create a markdown report progressively: header → Phase 1 enrichment results → Phase 2 protein mapping → Phase 3 keyword search → Phase 4 top pathway catalog. Note empty results explicitly; never silently omit them. Include pathway IDs for follow-up.
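One possible way to assemble such a report is sketched below. The function names, the row shape (`id`, `name`), and the placeholder values are hypothetical; the point is only that each phase is rendered in order and that an empty phase produces an explicit note instead of disappearing.

```python
def render_phase(title, rows):
    """Render one report section; an empty phase is stated, not omitted."""
    lines = [f"## {title}", ""]
    if rows:
        lines += ["| Pathway ID | Name |", "|------------|------|"]
        lines += [f"| {row['id']} | {row['name']} |" for row in rows]
    else:
        lines.append("No results returned for this phase.")
    return "\n".join(lines)


def build_report(query, phases):
    """Join the header and every phase section in the order given."""
    parts = [f"# Pathway analysis report: {query}"]
    for title, rows in phases:
        parts.append(render_phase(title, rows))
    return "\n\n".join(parts) + "\n"


# Placeholder content; the ID and name below are not real lookup results.
phases = [
    ("Phase 1: Enrichment", []),
    ("Phase 2: Protein Mapping", [{"id": "EXAMPLE-ID", "name": "example pathway"}]),
    ("Phase 3: Keyword Search", []),
    ("Phase 4: Top Pathways", []),
]
report = build_report("example query", phases)
print(report)
```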
Critical Parameter Notes (from testing):
| Tool | Correct Parameter | Common Mistake |
|------|-------------------|----------------|
| Reactome_map_uniprot_to_pathways | uniprot_id | id |
| PathwayCommons_search | action + keyword (both required) | omitting action |
| enrichr_gene_enrichment_analysis | gene_list (array) | string |
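The parameter shapes listed in the tables can be captured in small argument builders, which makes the common mistakes fail early. This is a sketch only: the helper function names, the default `page_size` of 20, and the library name `KEGG_2021_Human` are assumptions, and how the resulting dictionaries are passed to the tools depends on your ToolUniverse setup.

```python
def reactome_enrichment_args(genes, page_size=20):
    """ReactomeAnalysis_pathway_enrichment: newline-separated symbols."""
    return {"identifiers": "\n".join(genes), "page_size": page_size}


def enrichr_args(genes, libs):
    """enrichr_gene_enrichment_analysis: gene_list must be an array."""
    if isinstance(genes, str):
        raise TypeError("gene_list must be a list of symbols, not a string")
    return {"gene_list": list(genes), "libs": list(libs)}


def pathwaycommons_search_args(keyword):
    """PathwayCommons_search: action and keyword are both required."""
    return {"action": "search_pathways", "keyword": keyword}


def reactome_map_args(uniprot_id):
    """Reactome_map_uniprot_to_pathways: the key is uniprot_id, not id."""
    return {"uniprot_id": uniprot_id}


genes = ["TP53", "BRCA1", "ATM"]  # example symbols
print(reactome_enrichment_args(genes))
print(enrichr_args(genes, ["KEGG_2021_Human"]))  # library name is an assumption
print(pathwaycommons_search_args("apoptosis"))
print(reactome_map_args("P53350"))
```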
Response Format Notes:
{status, data})
total_hits and pathways
{status: "success", data: [...]} format

LOOK UP DON'T GUESS: Km values, kcat values, cofactor requirements, and optimal pH/temperature for specific enzymes. Use BindingDB_search_by_target, ChEMBL_get_molecule, BRENDA_get_enzyme_info (if available; requires the BRENDA_EMAIL and BRENDA_PASSWORD env vars, with free academic registration at brenda-enzymes.org), or EuropePMC_search_articles to retrieve published kinetic parameters. Do not estimate Km from first principles.
The foundational model: v = Vmax * [S] / (Km + [S])
To determine Km and Vmax from data: use Lineweaver-Burk (1/v vs 1/[S]), Eadie-Hofstee (v vs v/[S]), or nonlinear regression (preferred — avoids distortion from reciprocal transforms). See enzyme_kinetics.py in skills/tooluniverse-computational-biophysics/scripts/.
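As a dependency-free illustration of the nonlinear-regression route, the sketch below fits Km and Vmax directly to the Michaelis-Menten equation. It is not the `enzyme_kinetics.py` script; the function names are hypothetical, the data are synthetic (generated from Vmax = 10 and Km = 2), and the search assumes the squared error has a single minimum in Km. For a fixed Km the best Vmax has a closed form, so only Km needs a one-dimensional search.

```python
import math


def michaelis_menten(s, vmax, km):
    """v = Vmax * [S] / (Km + [S])"""
    return vmax * s / (km + s)


def best_vmax(km, S, V):
    """Least-squares Vmax for a fixed Km (the model is linear in Vmax)."""
    x = [s / (km + s) for s in S]
    return sum(v * xi for v, xi in zip(V, x)) / sum(xi * xi for xi in x)


def sse(km, S, V):
    """Sum of squared residuals at this Km, using its best Vmax."""
    vmax = best_vmax(km, S, V)
    return sum((v - michaelis_menten(s, vmax, km)) ** 2 for s, v in zip(S, V))


def fit_mm(S, V, km_lo=1e-6, km_hi=None, iters=200):
    """Golden-section search over log(Km); returns (Vmax, Km)."""
    km_hi = 100 * max(S) if km_hi is None else km_hi
    a, b = math.log(km_lo), math.log(km_hi)
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(math.exp(c), S, V) < sse(math.exp(d), S, V):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2)
    return best_vmax(km, S, V), km


# Synthetic, noise-free data for illustration only
S = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
V = [michaelis_menten(s, 10.0, 2.0) for s in S]

vmax_fit, km_fit = fit_mm(S, V)
print(f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.3f}")
```

With real, noisy measurements a library routine such as a general least-squares fitter would also give parameter uncertainties, which this sketch does not.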
Not all enzymes follow Michaelis-Menten. Sigmoidal v-vs-[S] curves indicate cooperativity.
| Type | Effect on Km | Effect on Vmax | Lineweaver-Burk pattern |
|------|-------------|----------------|------------------------|
| Competitive | Increases (Km_app = Km * (1 + [I]/Ki)) | Unchanged | Lines intersect on y-axis |
| Uncompetitive | Decreases | Decreases | Parallel lines |
| Noncompetitive (pure) | Unchanged | Decreases (Vmax_app = Vmax / (1 + [I]/Ki)) | Lines intersect on x-axis |
| Mixed | Changes | Decreases | Lines intersect in quadrant II or III |
To determine Ki: measure v at multiple [I] and [S], fit to the appropriate model. The enzyme_kinetics.py script handles competitive, uncompetitive, and noncompetitive inhibition calculations.
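The table's relationships can be written out as a small calculator for apparent parameters. This is a sketch, not the `enzyme_kinetics.py` script: the function names are hypothetical, and the uncompetitive case uses the standard textbook form in which both Km and Vmax are divided by (1 + [I]/Ki), which the table states only qualitatively. Mixed inhibition is left out because it needs two inhibition constants.

```python
def apparent_parameters(vmax, km, inhibitor, ki, mode):
    """Return (Vmax_app, Km_app) for one inhibitor concentration."""
    factor = 1 + inhibitor / ki
    if mode == "competitive":
        return vmax, km * factor           # Km rises, Vmax unchanged
    if mode == "uncompetitive":
        return vmax / factor, km / factor  # both fall; Km/Vmax unchanged
    if mode == "noncompetitive":
        return vmax / factor, km           # Vmax falls, Km unchanged
    raise ValueError(f"unsupported inhibition mode: {mode}")


def ki_from_competitive(km, km_app, inhibitor):
    """Rearranged Km_app = Km * (1 + [I]/Ki) for competitive inhibition."""
    return inhibitor / (km_app / km - 1)


# Illustrative numbers only: Vmax = 10, Km = 2, [I] = 5, Ki = 5
for mode in ("competitive", "uncompetitive", "noncompetitive"):
    vmax_app, km_app = apparent_parameters(10.0, 2.0, 5.0, 5.0, mode)
    print(f"{mode:15s} Vmax_app = {vmax_app:.2f}  Km_app = {km_app:.2f}")

print("Ki recovered:", ki_from_competitive(2.0, 4.0, 5.0))
```

The unchanged Km/Vmax ratio in the uncompetitive case is what produces parallel Lineweaver-Burk lines, since that ratio is the slope of the plot.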
When a purified enzyme shows no catalytic activity, systematically check:
Metabolic flux analysis (MFA) quantifies the rates of metabolic reactions in vivo, not just enzyme activities in vitro.
Key concepts:
biomodels_search to find published SBML models for the organism.

LOOK UP DON'T GUESS: stoichiometric coefficients, pathway topology, and published flux distributions. Use KEGG (kegg_get_pathway_info), Reactome (Reactome_get_pathway_reactions), and BioModels (biomodels_search) for these data.
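One standard relationship behind flux analysis is steady-state mass balance: for internal metabolites, the stoichiometric matrix times the flux vector is zero. The sketch below checks that condition on a hypothetical five-reaction toy network. The network, its coefficients, and the flux values are invented for illustration and must be replaced with looked-up stoichiometry in any real analysis.

```python
# Toy network (hypothetical):
#   v1: -> A      v2: A -> B      v3: A -> C      v4: B ->      v5: C ->
# Rows are metabolites A, B, C; columns are reactions v1..v5.
S = [
    [1, -1, -1, 0, 0],   # A
    [0, 1, 0, -1, 0],    # B
    [0, 0, 1, 0, -1],    # C
]


def mass_balance(S, v):
    """Net production rate of each metabolite, i.e. the product S * v."""
    return [sum(coef * flux for coef, flux in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True when every metabolite's net production is zero within tol."""
    return all(abs(r) <= tol for r in mass_balance(S, v))


balanced = [10.0, 6.0, 4.0, 6.0, 4.0]    # uptake of 10 split 6:4 at A
unbalanced = [10.0, 6.0, 3.0, 6.0, 3.0]  # A accumulates at rate 1

print(mass_balance(S, balanced), is_steady_state(S, balanced))
print(mass_balance(S, unbalanced), is_steady_state(S, unbalanced))
```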
Best for: Gene set analysis, protein function investigation, pathway discovery, systems-level biology