skills/tooluniverse-metabolomics-pathway/SKILL.md
Metabolomics pathway analysis — metabolite identification (HMDB, KEGG, ChEBI), pathway mapping (Reactome, KEGG, MetaCyc), disease associations, enzyme/gene linkage. Use for metabolite-to-pathway-to-disease connections, BridgeDb-based ID conversion, and integrating metabolomics with gene-level pathway analyses.
npx skillsauth add mims-harvard/tooluniverse tooluniverse-metabolomics-pathway
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Identify metabolites, map to metabolic pathways, find disease associations, and connect to enzymes/genes.
Metabolite-to-pathway mapping requires correct, database-specific identifiers. HMDB IDs link to KEGG/Reactome but must be converted via BridgeDb; PubChem CIDs need explicit cross-referencing. Always verify metabolite identity first: the same common name can refer to structurally distinct isomers, and PubChem names frequently differ from CTD/KEGG names.
MetaCyc_get_compound, KEGG_get_compound, or ReactomeContent_search
BridgeDb_xrefs
CTD_get_chemical_gene_interactions or KEGG_get_compound
Metabolite_get_diseases or CTD_get_chemical_diseases

When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do; execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
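As an illustration of the kind of computation meant here, the sketch below computes a one-sided hypergeometric over-representation p-value for a single pathway using only the standard library. The function name `enrichment_p_value` and the toy counts are hypothetical and not part of the skill; `scipy.stats.hypergeom.sf(hits - 1, background_size, pathway_size, sample_size)` should give the same tail probability.

```python
from math import comb


def enrichment_p_value(hits, sample_size, pathway_size, background_size):
    """One-sided hypergeometric tail P(X >= hits).

    hits            -- measured metabolites that fall in the pathway
    sample_size     -- number of measured (query) metabolites
    pathway_size    -- pathway members present in the background
    background_size -- total metabolites in the background set
    """
    if sample_size > background_size or pathway_size > background_size:
        raise ValueError("sample and pathway must fit in the background")
    upper = min(sample_size, pathway_size)
    if not 0 <= hits <= upper:
        raise ValueError("hits must lie between 0 and min(sample, pathway)")

    total = comb(background_size, sample_size)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, sample_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers (made up for illustration): 5 of 5 query metabolites hit a
# 5-member pathway in a background of 10 metabolites.
p = enrichment_p_value(hits=5, sample_size=5, pathway_size=5, background_size=10)
print(f"p = {p:.6f}")
```

When many pathways are tested, the resulting p-values would still need multiple-testing correction (for example with `statsmodels`), which this sketch leaves out.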
Phase 0: Identify & Resolve → Phase 1: Characterize → Phase 2: Pathway Map →
Phase 3: Enzyme/Gene Linkage → Phase 4: Disease Associations → Phase 5: Cross-DB Enrichment → Report
Metabolite_search: query (REQUIRED), search_type ("name"/"formula"). Returns PubChem matches with CID, name, formula, MW, SMILES.
MetabolomicsWorkbench_search_compound_by_name: name (REQUIRED). Cross-reference with RefMet.
MetabolomicsWorkbench_search_by_mz: mz (REQUIRED), adduct (e.g., "M+H"), tolerance. Uses moverz/REFMET/{mz}/{adduct}/{tolerance}.
MetabolomicsWorkbench_search_by_exact_mass: exact_mass (REQUIRED), tolerance. Uses moverz/REFMET/{mass}/M/{tolerance}.
Metabolite_get_info: compound_name, hmdb_id (e.g., "HMDB0000122"), or pubchem_cid. Returns HMDB ID, CID, InChIKey, classification.
KEGG_get_compound: compound_id (e.g., "C00031"). Returns linked pathways, enzymes, reactions.
BridgeDb_xrefs: identifier (REQUIRED), source (REQUIRED: "Ch"=HMDB, "Cs"=ChemSpider, "Ck"=KEGG, "Ce"=ChEBI), target (optional).
BridgeDb_search: query (REQUIRED), organism. Free-text metabolite search.
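The `source` codes listed for BridgeDb_xrefs can often be inferred from the shape of an identifier. The helper below is a minimal sketch: `bridgedb_source_code` is a hypothetical name, and the regular expressions are assumptions about common ID formats (HMDB with 5 or 7 digits, KEGG compound as `C` plus 5 digits, ChEBI with a `CHEBI:` prefix). Purely numeric IDs are left unresolved because they could be ChemSpider or PubChem.

```python
import re

# (pattern, BridgeDb source code) pairs; the codes come from the tool notes above,
# the patterns are assumptions about typical identifier formats.
_SOURCE_PATTERNS = [
    (re.compile(r"^HMDB\d{5}(\d{2})?$"), "Ch"),  # HMDB, legacy 5-digit or 7-digit
    (re.compile(r"^C\d{5}$"), "Ck"),             # KEGG compound
    (re.compile(r"^CHEBI:\d+$"), "Ce"),          # ChEBI
]


def bridgedb_source_code(identifier):
    """Return the BridgeDb source code for an identifier, or None if ambiguous."""
    ident = identifier.strip()
    for pattern, code in _SOURCE_PATTERNS:
        if pattern.match(ident):
            return code
    return None  # e.g. bare numbers: ChemSpider vs PubChem cannot be told apart


for example in ["HMDB0000122", "C00031", "CHEBI:17234", "5793"]:
    print(example, "->", bridgedb_source_code(example))
```

Returning `None` instead of guessing keeps ambiguous identifiers visible, so they can be resolved by name through Metabolite_search first.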
Metabolite_get_info: classification (super_class/class/sub_class), biological_roles, cellular_locations.
MetabolomicsWorkbench_get_refmet_info: refmet_name (REQUIRED). Standardized RefMet classification.
KEGG_get_compound: linked enzyme/reaction/pathway IDs.
MetaCyc_search_pathways: query (keyword search, e.g., "glycolysis")
MetaCyc_get_pathway: pathway_id (e.g., "GLYCOLYSIS") -- reactions, enzymes, compounds
MetaCyc_get_compound: compound_id (e.g., "PYRUVATE") -- pathways it participates in
MetaCyc_get_reaction: reaction_id -- substrates, products, enzymes
KEGG_get_gene_pathways: gene_id (e.g., "hsa:5230") -- pathways for enzyme gene
KEGG_get_pathway_genes: pathway_id (e.g., "hsa00010") -- all genes in pathway
ReactomeContent_search: query, types (e.g., "Pathway"), species
Reactome_get_pathway: id (e.g., "R-HSA-70171")
ReactomeAnalysis_pathway_enrichment: identifiers (space-separated string, NOT array)
Reactome_map_uniprot_to_pathways: uniprot_id
CTD_get_chemical_gene_interactions: input_terms (chemical name). Returns interacting genes.
KEGG_get_gene_pathways: which pathways an enzyme gene participates in.
BridgeDb_attributes: identifier, source, organism. Get attributes for identifier.
Workflow: KEGG compound -> enzyme IDs -> MetaCyc reaction -> enzyme names -> Reactome uniprot -> pathways -> MyGene for gene info.
CTD_get_chemical_diseases: input_terms (chemical name, MeSH, CAS RN). Curated associations with direct/inferred evidence.
CTD_get_gene_diseases: input_terms (gene name). For metabolite-processing genes from Phase 3.
Metabolite_get_diseases: compound_name/hmdb_id/pubchem_cid, limit (default 50). CTD-backed.
MetabolomicsWorkbench_get_study: study_id (e.g., "ST000001").
MetabolomicsWorkbench_get_compound_by_pubchem_cid: pubchem_cid.
PubMed_search_articles / EuropePMC_search_articles: literature context.
For metabolite list enrichment: (1) convert names to gene/enzyme IDs via CTD, (2) run ReactomeAnalysis_pathway_enrichment with space-separated identifiers, (3) use KEGG_get_gene_pathways per enzyme.
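Step (2) expects a single space-separated string rather than a list. A minimal sketch of that preparation follows; `to_identifier_string` is a hypothetical helper, and the UniProt accessions in the example are illustrative inputs only.

```python
def to_identifier_string(identifiers):
    """Join identifiers into one space-separated string.

    Strips whitespace, drops empty entries, and removes duplicates while
    keeping first-seen order (dicts preserve insertion order).
    """
    unique = {}
    for raw in identifiers:
        ident = raw.strip()
        if ident:
            unique.setdefault(ident, None)
    return " ".join(unique)


# Illustrative input; in practice these would be the gene/enzyme IDs from step (1).
ids = ["P04406", " P00558 ", "P04406", ""]
print(to_identifier_string(ids))
```

The resulting string is what would be passed as the `identifiers` argument of ReactomeAnalysis_pathway_enrichment.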
| Mistake | Correction |
|---------|-----------|
| Array to ReactomeAnalysis_pathway_enrichment | Must be space-separated string |
| HMDB IDs in CTD_get_chemical_diseases | CTD uses common names or MeSH IDs |
| Not resolving names first | Always start with Metabolite_search |
| gene_id without organism prefix for KEGG | Need "hsa:5230" not "5230" |
| Expecting HMDB API | No open API; use Metabolite_get_info (PubChem-backed) |
| PubChem title to CTD when names differ | Try both PubChem name and common synonyms |
| MetabolomicsWorkbench exactmass | Use moverz/REFMET/{mass}/M/{tolerance} (exactmass broken) |
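The KEGG prefix mistake is easy to guard against before calling KEGG_get_gene_pathways. The sketch below assumes human (`hsa`) as the default organism; `normalize_kegg_gene_id` is a hypothetical helper, not a ToolUniverse function.

```python
def normalize_kegg_gene_id(gene_id, organism="hsa"):
    """Return a KEGG gene ID with an organism prefix, e.g. "hsa:5230".

    IDs that already contain a prefix are returned unchanged; bare IDs get
    the given organism code (assumed human by default).
    """
    gid = str(gene_id).strip()
    if ":" in gid:
        return gid
    return f"{organism}:{gid}"


print(normalize_kegg_gene_id("5230"))      # bare Entrez-style ID
print(normalize_kegg_gene_id("hsa:5230"))  # already prefixed
```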
| Tier | Criteria | Sources |
|------|----------|---------|
| T1 | Curated disease association, direct evidence | CTD curated, OMIM |
| T2 | Multiple database pathway concordance | MetaCyc + KEGG + Reactome agreement |
| T3 | Inferred or single-database | CTD inferred, single pathway DB |
| T4 | Computational prediction or text-mining | Literature, RefMet classification |
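One way to apply these tiers consistently in a report is a small grading function. This is a sketch under stated assumptions, not the skill's own rule set: the name `assign_tier`, its inputs, the precedence order, and the reading of "multiple" as two or more pathway databases are all choices made for illustration.

```python
def assign_tier(curated_direct, pathway_dbs, inferred=False):
    """Map evidence flags to a tier label (T1 strongest, T4 weakest).

    curated_direct -- True if a curated, direct-evidence disease association exists
    pathway_dbs    -- names of pathway databases that agree on the mapping
    inferred       -- True if the association is only inferred (e.g. CTD inferred)
    """
    n_dbs = len(set(pathway_dbs))
    if curated_direct:
        return "T1"
    if n_dbs >= 2:  # assumption: "multiple" means at least two databases
        return "T2"
    if inferred or n_dbs == 1:
        return "T3"
    return "T4"  # nothing but predictions or text-mining


print(assign_tier(True, []))                     # curated direct evidence
print(assign_tier(False, ["KEGG", "Reactome"]))  # cross-database concordance
print(assign_tier(False, ["MetaCyc"]))           # single pathway database
print(assign_tier(False, []))                    # literature only
```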