plugin/skills/tooluniverse-rare-disease-diagnosis/SKILL.md
Rare disease differential diagnosis from patient phenotype — HPO term matching to candidate diseases (Orphanet, OMIM), gene panel prioritization, ACMG variant interpretation, and structure-based variant analysis. Use for diagnostic odyssey assistance, phenotype-to-disease ranking, and genetic-counseling differential generation.
npx skillsauth add mims-harvard/tooluniverse tooluniverse-rare-disease-diagnosis
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Systematic diagnosis support for rare diseases using phenotype matching, gene panel prioritization, and variant interpretation across Orphanet, OMIM, HPO, ClinVar, and structure-based analysis.
KEY PRINCIPLES:
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory.
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Apply these strategies to form a 3-5 candidate differential, then use tools to confirm/refute:
Common pitfalls: Felty's (RA+splenomegaly+neutropenia) mimics infection; SLE nephritis mimics PSGN (check ASO); occupational exposures trigger autoimmunity (silica→scleroderma/RA/SLE).
| Tool | WRONG | CORRECT |
|------|-------|---------|
| OpenTargets_get_associated_drugs_by_target_ensemblID | ensemblID | ensemblId |
| ClinVar_get_variant_details | variant_id | id |
| MyGene_query_genes | gene | q |
| gnomad_get_variant | variant | variant_id |
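The corrections in the table above can be applied mechanically before a tool call. The following is a minimal sketch, assuming tool arguments are passed as a plain dictionary; `PARAM_FIXES` and `fix_params` are hypothetical names introduced here for illustration and are not part of ToolUniverse.

```python
# Corrections copied from the table above: {tool: {wrong_name: correct_name}}.
# The mapping is per tool because the same name can be wrong for one tool
# and correct for another (e.g. variant_id).
PARAM_FIXES = {
    "OpenTargets_get_associated_drugs_by_target_ensemblID": {"ensemblID": "ensemblId"},
    "ClinVar_get_variant_details": {"variant_id": "id"},
    "MyGene_query_genes": {"gene": "q"},
    "gnomad_get_variant": {"variant": "variant_id"},
}


def fix_params(tool_name, params):
    """Return a copy of params with known-wrong keys renamed for tool_name."""
    fixes = PARAM_FIXES.get(tool_name, {})
    return {fixes.get(key, key): value for key, value in params.items()}


print(fix_params("MyGene_query_genes", {"gene": "FBN1"}))
```

Tools that are not listed in the mapping pass through unchanged, so the helper is safe to apply to every call.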
Phase 0: Clinical Reasoning → 3-5 candidate differential
Phase 1: Phenotype → HPO terms (HPO_search_terms), core vs variable, onset, family history
Phase 2: Disease Matching → Orphanet_search_diseases, OMIM_search, DisGeNET_search_gene
Phase 3: Gene Panel → MARRVEL_get_gene (aggregated IDs) + MARRVEL_get_omim_phenotypes (OMIM disease+inheritance), ClinGen validation, GTEx expression, prioritization scoring
Phase 3.5: Expression Context → CELLxGENE, ChIPAtlas for tissue/cell-type confirmation
Phase 3.6: Pathway Analysis → KEGG, IntAct for convergent pathways
Phase 4: Variant Interpretation → FAVOR_annotate_variant (one-call: freq + CADD/SIFT/PolyPhen/AlphaMissense + ClinVar + conservation), then ClinVar, gnomAD frequency, EVE/SpliceAI, ACMG criteria
Phase 5: Structure Analysis → AlphaFold2, InterPro domains (for VUS)
Phase 6: Literature → PubMed, BioRxiv/MedRxiv, OpenAlex
Phase 7: Report Synthesis → Prioritized differential with next steps
Phase 2 - Disease Matching: Orphanet_search_diseases(operation="search_diseases", query=keyword) then Orphanet_get_genes(operation="get_genes", orpha_code=code). Score overlap: Excellent >80%, Good 60-80%, Possible 40-60%.
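The overlap bands above can be sketched in Python. The source does not say how overlap is computed, so this example assumes it is the share of a disease's annotated HPO terms that the patient also shows. The function names and the "Below threshold" label are hypothetical, and the HPO IDs are used only as sample input.

```python
def phenotype_overlap(patient_hpo, disease_hpo):
    """Percent of the disease's HPO terms also present in the patient.

    Assumption: the denominator is the disease term set; the source
    does not define the overlap formula.
    """
    disease_terms = set(disease_hpo)
    if not disease_terms:
        return 0.0
    shared = set(patient_hpo) & disease_terms
    return 100.0 * len(shared) / len(disease_terms)


def overlap_label(percent):
    """Map an overlap percentage onto the bands from Phase 2."""
    if percent > 80:
        return "Excellent"
    if percent >= 60:
        return "Good"
    if percent >= 40:
        return "Possible"
    return "Below threshold"  # hypothetical label for <40%


patient = {"HP:0001166", "HP:0001083", "HP:0002616"}
disease = {"HP:0001166", "HP:0001083", "HP:0002616", "HP:0000768"}
pct = phenotype_overlap(patient, disease)
print(pct, overlap_label(pct))
```

A symmetric measure such as Jaccard similarity would penalize extra patient findings as well; which denominator fits best depends on how complete the patient's phenotyping is.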
Phase 3 - Gene Panel: For each candidate gene, MARRVEL_get_gene(symbol) resolves OMIM/HGNC/Ensembl/Entrez/UniProt IDs in one call, and MARRVEL_get_omim_phenotypes(symbol) lists the Mendelian diseases linked to the gene with mode of inheritance — use the inheritance pattern to filter candidates against the pedigree (e.g. drop AR genes for a clearly dominant pedigree). Then ClinGen classification drives inclusion (Definitive/Strong/Moderate = include; Limited = flag; Disputed/Refuted = exclude). Scoring: Tier 1 (top disease gene +5), Tier 2 (multi-disease +3), Tier 3 (ClinGen Definitive +3), Tier 4 (tissue expression +2), Tier 5 (pLI >0.9 +1).
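The inclusion rule and point values above could be combined as follows. This is a sketch under two assumptions: the tier points are additive, and each gene is described by a dictionary whose keys (`top_disease_gene`, `multi_disease`, `clingen`, `tissue_expressed`, `pli`) are hypothetical names chosen for this example rather than fields returned by any tool.

```python
INCLUDE = {"Definitive", "Strong", "Moderate"}
EXCLUDE = {"Disputed", "Refuted"}


def clingen_decision(classification):
    """Translate a ClinGen validity classification into a panel action."""
    if classification in INCLUDE:
        return "include"
    if classification == "Limited":
        return "flag"
    if classification in EXCLUDE:
        return "exclude"
    return "unclassified"  # assumption: anything else needs manual review


def gene_score(gene):
    """Sum the tier points from Phase 3 (assumed to be additive)."""
    score = 0
    if gene.get("top_disease_gene"):
        score += 5  # Tier 1
    if gene.get("multi_disease"):
        score += 3  # Tier 2
    if gene.get("clingen") == "Definitive":
        score += 3  # Tier 3
    if gene.get("tissue_expressed"):
        score += 2  # Tier 4
    if gene.get("pli", 0.0) > 0.9:
        score += 1  # Tier 5
    return score


candidate = {"top_disease_gene": True, "clingen": "Definitive", "pli": 0.95}
print(clingen_decision(candidate["clingen"]), gene_score(candidate))
```

Sorting the panel by `gene_score` after dropping genes whose decision is "exclude" gives a prioritized list to carry into Phase 4.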
Phase 4 - Variants: Start with FAVOR_annotate_variant("chr-pos-ref-alt") (GRCh38) for a single-call snapshot — population frequencies (gnomAD by ancestry, BRAVO), GENCODE consequence, CADD/SIFT/PolyPhen-2/AlphaMissense scores, conservation, and ClinVar significance — then drill into ClinVar/gnomAD/EVE/SpliceAI for detail. gnomAD frequency classes: ultra-rare <0.00001, rare <0.0001, low-freq <0.01. ACMG: PVS1 (null), PS1 (same AA), PM2 (absent pop), PP3 (computational), BA1 (>5% AF). 2+ concordant predictors strengthen PP3.
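The frequency classes and the two simplest ACMG checks above can be written as small helpers. This is an illustrative sketch, not a full ACMG classifier: the function names are hypothetical, the "common" label for allele frequencies of 0.01 and above is an assumption, and predictor calls are assumed to arrive as booleans (True meaning the tool predicts a damaging effect).

```python
def frequency_class(allele_frequency):
    """Bin a gnomAD allele frequency using the cutoffs from Phase 4."""
    if allele_frequency < 0.00001:
        return "ultra-rare"
    if allele_frequency < 0.0001:
        return "rare"
    if allele_frequency < 0.01:
        return "low-freq"
    return "common"  # assumption: the source names no class at or above 0.01


def ba1_applies(allele_frequency):
    """BA1 (stand-alone benign): allele frequency above 5%."""
    return allele_frequency > 0.05


def pp3_supported(predictor_calls):
    """PP3 is strengthened when two or more predictors agree on 'damaging'."""
    return sum(1 for damaging in predictor_calls.values() if damaging) >= 2


calls = {"CADD": True, "SIFT": True, "PolyPhen-2": False}
print(frequency_class(0.00005), ba1_applies(0.00005), pp3_supported(calls))
```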
| Tier | Criteria |
|------|----------|
| T1 (High) | Phenotype match >80% + gene match |
| T2 (Medium-High) | Phenotype match 60-80% OR likely pathogenic variant |
| T3 (Medium) | Phenotype match 40-60% OR VUS in candidate gene |
| T4 (Low) | Phenotype <40% OR uncertain gene |
| Primary | Fallback 1 | Fallback 2 |
|---------|------------|------------|
| get_joint_associated_diseases_by_HPO_ID_list | Orphanet_search_diseases | PubMed phenotype search |
| MARRVEL_get_omim_phenotypes | OMIM_search | Orphanet gene-disease |
| FAVOR_annotate_variant | ClinVar_get_variant_details | gnomad_get_variant |
| ClinVar_get_variant_details | gnomad_get_variant | VEP annotation |
| GTEx_get_expression_summary | HPA_search_genes_by_query | Tissue-specific literature |
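The fallback table above amounts to trying each option in order until one returns data. Below is a minimal sketch of that pattern, assuming a caller-supplied `call_tool(name)` function. Both `call_tool` and `call_with_fallback` are hypothetical and stand in for however tools are actually invoked. Some table entries (for example "VEP annotation") describe a step rather than name a tool, so only a few rows are shown.

```python
# Subset of the fallback table: primary tool -> ordered fallbacks.
FALLBACKS = {
    "MARRVEL_get_omim_phenotypes": ["OMIM_search"],
    "FAVOR_annotate_variant": ["ClinVar_get_variant_details", "gnomad_get_variant"],
    "ClinVar_get_variant_details": ["gnomad_get_variant"],
    "GTEx_get_expression_summary": ["HPA_search_genes_by_query"],
}


def call_with_fallback(primary, call_tool):
    """Try the primary tool, then each fallback, until one returns data.

    call_tool is a hypothetical callable taking a tool name. A raised
    exception or an empty result both count as a failure.
    """
    for name in [primary] + FALLBACKS.get(primary, []):
        try:
            result = call_tool(name)
        except Exception:
            continue
        if result:
            return name, result
    return None, None


def fake_call_tool(name):
    """Stand-in for a real tool call, used only to exercise the chain."""
    if name == "FAVOR_annotate_variant":
        raise RuntimeError("service unavailable")
    if name == "ClinVar_get_variant_details":
        return {}
    return {"source": name}


print(call_with_fallback("FAVOR_annotate_variant", fake_call_tool))
```

Returning the name of the tool that answered keeps the report honest about which source each finding came from.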
scripts/clinical_patterns.py - Clinical pattern lookup (syndromes, differentials, red flags, occupational exposures)