
mims-harvard/tooluniverse-metagenomics-analysis

Name: tooluniverse-metagenomics-analysis
Author: mims-harvard

skills/tooluniverse-metagenomics-analysis/SKILL.md

npx skillsauth add mims-harvard/tooluniverse tooluniverse-metagenomics-analysis


Metagenomics & Microbiome Analysis

Integrated pipeline for exploring microbiome studies, classifying taxa, assessing genome quality, linking microbial composition to clinical phenotypes, and interpreting findings through pathway analysis and literature context.

Guiding principles:

Study context first -- understand biome, sequencing method, and metadata before diving into taxa
Taxonomic consistency -- GTDB taxonomy as reference standard; reconcile NCBI where needed
Genome quality matters -- CheckM completeness/contamination thresholds determine trustworthy MAGs
Interpretation over enumeration -- explain what taxa mean for the biological question
English-first queries -- use English terms in tool calls

LOOK UP, DON'T GUESS

When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory.

COMPUTE, DON'T DESCRIBE

When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.

Core Databases

| Database | Best For |
|----------|---------|
| MGnify | Processed metagenomics studies, taxonomic/functional results |
| GTDB | Standardized bacterial/archaeal taxonomy, species-level resolution |
| GMrepo | Gut species-to-human-health phenotype associations |
| ENA | Raw sequencing datasets and study metadata |
| KEGG | Pathway mapping for microbial functional annotations |
| PubMed/EuropePMC | Published microbiome-disease studies |
| CTD | Chemical-microbiome-disease relationships |

Workflow

Phase 0: Parse query → organism, biome, phenotype, or accession
Phase 1: Study Discovery → MGnify_search_studies, ENAPortal_search_studies
Phase 2: Taxonomic Classification → GTDB_search_genomes, GTDB_get_species, GTDB_search_taxon
Phase 3: Genome Quality → MGnify_search_genomes, MGnify_get_genome (CheckM metrics)
Phase 4: Functional Annotation → MGnify GO terms + KEGG pathway mapping
Phase 5: Clinical Associations → GMrepo species-phenotype links
Phase 6: Literature → PubMed/EuropePMC + CTD gene-disease
Phase 7: Interpretation & Report Synthesis

Key Phase Notes

Phase 1: ENA requires structured queries (e.g., study_title="*IBD*"), not free text. If ENA fails, fall back to MGnify.
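As a rough illustration of the structured form mentioned in the Phase 1 note, the sketch below builds a `field="*term*"` filter string. The helper name `ena_field_query` is hypothetical and not part of ToolUniverse or ENA; only the `study_title="*IBD*"` shape comes from the note itself.

```python
def ena_field_query(field: str, term: str, wildcard: bool = True) -> str:
    """Build a structured filter such as study_title="*IBD*".

    Hypothetical helper: it only formats the string and does not
    validate the field name against ENA's list of searchable fields.
    """
    term = term.strip()
    if not term:
        raise ValueError("term must be non-empty")
    if '"' in term:
        # A double quote would break the quoted pattern.
        raise ValueError("term must not contain double quotes")
    pattern = f"*{term}*" if wildcard else term
    return f'{field}="{pattern}"'


print(ena_field_query("study_title", "IBD"))
```

The resulting string would be passed as the query argument of the ENA search tool; if that call fails or returns nothing, the note's advice is to fall back to MGnify.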

Phase 2: GTDB uses its own naming (e.g., s__Bacteroides_A fragilis vs NCBI Bacteroides fragilis). Always note discrepancies. Use GTDB_search_taxon(operation="search_taxon", query=name).
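One way to flag the naming discrepancy described in the Phase 2 note is to strip GTDB's rank prefix and alphabetic placeholder suffix before comparing against the NCBI name. This is a name-based heuristic only, and the function names are hypothetical; real reconciliation should rely on the taxonomy metadata returned by the GTDB tools.

```python
import re

# Rank prefixes such as d__, p__, c__, o__, f__, g__, s__.
_RANK_PREFIX = re.compile(r"^[dpcofgs]__")
# Placeholder suffixes such as the _A in Bacteroides_A.
_PLACEHOLDER_SUFFIX = re.compile(r"_[A-Z]+\b")


def strip_gtdb_decorations(gtdb_name: str) -> str:
    """Remove the rank prefix and placeholder suffixes from a GTDB label."""
    name = _RANK_PREFIX.sub("", gtdb_name.strip())
    return _PLACEHOLDER_SUFFIX.sub("", name)


def compare_names(gtdb_name: str, ncbi_name: str) -> dict:
    """Report whether two labels differ, and whether only decorations differ."""
    gtdb_plain = _RANK_PREFIX.sub("", gtdb_name.strip())
    ncbi_plain = ncbi_name.strip()
    return {
        "gtdb": gtdb_name,
        "ncbi": ncbi_name,
        # True when the labels are not literally identical.
        "discrepancy": gtdb_plain != ncbi_plain,
        # True when they match once GTDB decorations are removed.
        "same_base_name": strip_gtdb_decorations(gtdb_name).lower()
        == ncbi_plain.lower(),
    }


print(compare_names("s__Bacteroides_A fragilis", "Bacteroides fragilis"))
```

A result with `discrepancy` true and `same_base_name` true is the case worth noting in the report: the same organism under two naming schemes.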

Phase 3 - Quality tiers (MIMAG):

High: >= 90% complete, <= 5% contamination, rRNA + >= 18 tRNAs
Medium: >= 50% complete, <= 10% contamination
Low: below medium -- flag but don't exclude
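The tiers above translate directly into a small classifier. This is a minimal sketch: the function and parameter names are assumptions, and how the rRNA and tRNA information is obtained from MGnify_get_genome output is not covered here.

```python
def mimag_tier(
    completeness: float,
    contamination: float,
    has_rrna: bool = False,
    trna_count: int = 0,
) -> str:
    """Classify a MAG using the thresholds listed above (percent values)."""
    if (
        completeness >= 90
        and contamination <= 5
        and has_rrna
        and trna_count >= 18
    ):
        return "high"
    if completeness >= 50 and contamination <= 10:
        return "medium"
    # Low-tier genomes are flagged, not dropped.
    return "low"


for genome in [(95.0, 2.0, True, 20), (95.0, 2.0, False, 20), (40.0, 3.0, False, 0)]:
    print(genome, "->", mimag_tier(*genome))
```

Note that a genome with excellent CheckM metrics but no rRNA evidence lands in the medium tier under this reading.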

Phase 4 - Functional interpretation: Don't just list GO terms. Connect to biology:

| Functional Category | Key KEGG Pathways | Significance |
|---|---|---|
| SCFA production | map00650, map00640 | Gut barrier, anti-inflammatory |
| LPS biosynthesis | map00540 | Pro-inflammatory, endotoxemia |
| Bile acid metabolism | map00120 | Fat absorption, FXR signaling |
| Tryptophan metabolism | map00380 | Serotonin, AhR, immune |
| Vitamin biosynthesis | map00730/740/760 | Host nutritional contribution |

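Call kegg_search_pathway(keyword=...); the parameter is keyword, NOT query. Pathway IDs need an organism or reference prefix (hsa, ko, eco), NOT the bare map prefix, so the map IDs in the table above must be converted before lookup (for example, map00650 becomes ko00650).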
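A small sketch of that conversion, assuming pathway IDs follow the usual pattern of a short lowercase prefix plus five digits. Both helper names are hypothetical, and nothing here checks whether the chosen organism actually has the pathway.

```python
import re

# Short lowercase prefix (map, ko, hsa, eco, ...) followed by five digits.
_PATHWAY_ID = re.compile(r"^[a-z]{2,4}(\d{5})$")


def with_prefix(pathway_id: str, prefix: str = "ko") -> str:
    """Swap the prefix of a KEGG pathway ID, e.g. map00650 -> ko00650."""
    match = _PATHWAY_ID.match(pathway_id.strip())
    if match is None:
        raise ValueError(f"not a recognizable pathway ID: {pathway_id!r}")
    return f"{prefix}{match.group(1)}"


def expand_shorthand(cell: str) -> list[str]:
    """Expand slash shorthand such as map00730/740/760 into full IDs."""
    first, *rest = [part.strip() for part in cell.split("/")]
    # Each later part replaces the same number of trailing characters.
    return [first] + [first[: len(first) - len(part)] + part for part in rest]


vitamin_ids = expand_shorthand("map00730/740/760")
print([with_prefix(pid) for pid in vitamin_ids])
```

The `ko` default targets the reference pathway; pass `hsa` or `eco` when an organism-specific map is wanted.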

Phase 5: GMrepo uses MeSH terms: "Crohn Disease" not "IBD", "Colitis, Ulcerative" not "UC", "Colorectal Neoplasms" not "colorectal cancer". Try NCBI taxon IDs if species name fails.
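The term pairs in the Phase 5 note can be kept in a lookup table so that informal labels are normalized before querying GMrepo. The sketch below contains only the three pairs given above; the table and function names are hypothetical, and any other term passes through unchanged.

```python
# Informal label -> MeSH term, using only the pairs stated in the note above.
MESH_TERMS = {
    "ibd": "Crohn Disease",
    "uc": "Colitis, Ulcerative",
    "colorectal cancer": "Colorectal Neoplasms",
}


def to_mesh(term: str) -> str:
    """Return the MeSH form of a known informal label, else the input as given."""
    key = term.strip().lower()
    return MESH_TERMS.get(key, term.strip())


for label in ["IBD", "UC", "colorectal cancer", "Obesity"]:
    print(label, "->", to_mesh(label))
```

Since IBD as a clinical label also covers ulcerative colitis, a caller may want to query both "Crohn Disease" and "Colitis, Ulcerative" when the user asks about IBD in general.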

Phase 6 - Evidence grading:

Strong: Meta-analysis or >5 studies, consistent direction
Moderate: 2-5 studies consistent, or 1 large cohort
Preliminary: Single study or conflicting
Mechanistic only: In vitro/animal, no human epidemiology
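The grading rules above can be applied mechanically once the studies have been counted. The sketch below is one reading of those rules: it treats any conflicting evidence as Preliminary, requires consistency for Strong even when a meta-analysis exists, and returns `None` when there is no evidence at all. The function and parameter names are assumptions.

```python
from typing import Optional


def grade_evidence(
    human_studies: int,
    consistent: bool = True,
    meta_analysis: bool = False,
    large_cohort: bool = False,
    mechanistic_studies: int = 0,
) -> Optional[str]:
    """Map study counts to the evidence grades listed above."""
    if human_studies == 0 and not meta_analysis:
        # Only in vitro/animal work, or nothing at all.
        return "Mechanistic only" if mechanistic_studies > 0 else None
    if not consistent:
        return "Preliminary"
    if meta_analysis or human_studies > 5:
        return "Strong"
    if human_studies >= 2 or large_cohort:
        return "Moderate"
    return "Preliminary"


print(grade_evidence(8))
print(grade_evidence(1, large_cohort=True))
print(grade_evidence(0, mechanistic_studies=3))
```

What counts as a "large" cohort is left to the caller, since the rules above do not give a sample-size cutoff.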

Phase 7 - Report: Executive summary, study landscape, GTDB taxonomy, functional interpretation (not GO term lists), clinical relevance with evidence grades, mechanistic model, genome catalog with quality tiers, data gaps.

Edge Cases & Fallbacks

Taxon not in GTDB: Try partial search or fall back to MGnify (NCBI taxonomy)
No GMrepo data: Normal for non-gut organisms; use literature
GMrepo 0 results: Use formal MeSH terms or NCBI taxon IDs
No KEGG match: Check MetaCyc or literature

Limitations

GMrepo: Gut-only
GTDB: Bacteria/Archaea only
ENA: Raw data only, strict query syntax
No sequence analysis: Queries databases, not raw FASTQ/FASTA


Microbiome and metagenomics analysis using MGnify, GTDB taxonomy, ENA sequencing data, and EuropePMC literature. Covers taxonomic classification, genome quality assessment, biome-clinical phenotype linkage, and pathway interpretation. Use for amplicon/shotgun metagenomics study analysis.

1,393 stars

tools

Updated May 30, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add mims-harvard/tooluniverse tooluniverse-metagenomics-analysis

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Clean, 95%. Container and dependency vulnerability scanner.
Semgrep: Clean, 95%. Static code analysis for vulnerabilities.
mcp-scan (Snyk): Clean, 95%. Model Context Protocol security validation.
Snyk (dep): Skipped, 50%. Open source security scanning.
Socket.dev: Skipped, 50%. Supply chain security analysis.
VirusTotal: Skipped, 50%. Multi-engine malware detection.
CrowdStrike: Skipped, 50%. Advanced threat intelligence.
OSV-Scanner: Skipped, 50%. Open Source Vulnerability database check.
OWASP Dep-Check: Skipped, 50%.

Last scanned: May 26, 2026, 7:13 AM (13.6s, 1 file scanned)

SKILL.md

name: tooluniverse-metagenomics-analysis
description: Microbiome and metagenomics analysis using MGnify, GTDB taxonomy, ENA sequencing data, and EuropePMC literature. Covers taxonomic classification, genome quality assessment, biome-clinical phenotype linkage, and pathway interpretation. Use for amplicon/shotgun metagenomics study analysis.
disable-model-invocation: true


Related Skills

mims-harvard/tooluniverse-self-review

tools

Verified, Trusted, Community

Generate the success criteria for a task or question, then review work against them. Given a task, goal, or open-ended question, decompose it into scenarios, evaluation perspectives, and fine-grained weighted YES/NO criteria using the Recursive Expansion Tree (RET) method; if work is supplied, score it criterion-by-criterion and surface what is missing or could be better. Use when asked to self-review or check your own work, judge whether a task is done well or completely, build a definition-of-done or completeness checklist, create an evaluation rubric or grading criteria, score or grade answers to a question, set up an LLM-as-judge rubric, or when the user mentions self-review, completeness check, success criteria, evaluation criteria, scoring rubric, Qworld, or the RET algorithm.

1,583

SKILL.md

Updated Jul 22, 2026

mims-harvard/tooluniverse-peptide-target-deorphanization

tools

Verified, Trusted, Community

Find the real protein target(s) of a peptide from its sequence — peptide target deorphanization / off-target identification, for ANY target class (GPCR, ion channel, protease, cytokine/growth-factor receptor, enzyme, integrin), not only GPCRs. Use when a peptide has a phenotype but does not bind its hypothesized target, when a peptide binds a target in one species or assay but not another, or to screen candidate targets for an orphan peptide. A target-class router steers a multi-route keyless pipeline (PROSITE/ELM motif, BLAST homology, HGNC/InterPro/GPCRdb/GtoPdb target-family enumeration, OpenTargets phenotype anchor, EnsemblCompara/Alliance cross-species reconciliation) plus optional NVIDIA-NIM co-folding (Boltz2, AlphaFold2-Multimer, OpenFold3) for structural confirmation.

1,583

SKILL.md

Updated Jul 22, 2026

mims-harvard/tooluniverse-cs-setup

tools

Verified, Trusted, Community

Install or update ToolUniverse in Claude Science — create the conda env, install the tooluniverse pip package, and (re)build the tooluniverse-research skill by fetching the current workflow library from GitHub. Use for first-time setup, upgrading the ToolUniverse version, refreshing the bundled workflows after an upstream release, or reinstalling on a new machine.

1,583

SKILL.md

Updated Jul 22, 2026

mims-harvard/tooluniverse-codex-plugin

tools

Verified, Trusted, Community

Install, set up, verify, update, pin, uninstall, or troubleshoot the ToolUniverse plugin on OpenAI Codex. ALWAYS consult this skill for any of those — don't answer from memory, because the exact marketplace name (mims-harvard/ToolUniverse), the "codex plugin marketplace add" then "codex plugin add -m tooluniverse" flow, Codex's startup auto-upgrade behavior, the uvx tooluniverse MCP server, and the API-key env vars are easy to get wrong. Use it whenever someone wants to get ToolUniverse (or "the 1000+ scientific tools" / "the harvard tools") working on Codex, says the Codex plugin or its tools/skills won't load, hits a uvx or MCP-server startup error, asks how Codex updates it, wants to pin or remove it, or finds it running an old tool version — even if they never say the word "plugin". Not for the Claude Code plugin (use tooluniverse-claude-code-plugin), for running research with the tools, or for authoring new tools or skills.

1,583

SKILL.md

Updated Jul 22, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


# Clone the repo
git clone https://github.com/mims-harvard/tooluniverse.git

# Make sure the global Claude Code skills folder exists, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r tooluniverse/skills/tooluniverse-metagenomics-analysis ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

mims-harvard/tooluniverse

1,393 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT