plugin/skills/tooluniverse-gene-disease-association/SKILL.md
Gene-disease association analysis across DisGeNET, OpenTargets, Monarch, OMIM, GenCC, Orphanet. Cross-references multiple sources for evidence-graded association reports with concordance scoring (5/5 sources agree → strong, 1/5 → weak). Use for 'which diseases is gene X associated with' or 'which genes cause disease Y' queries with quantitative confidence.
Systematically query and compare gene-disease associations across 6+ databases to produce a unified, evidence-graded report. Cross-references DisGeNET scores, OpenTargets evidence, Monarch Initiative cross-species data, OMIM Mendelian mappings, GenCC curated validity, and Orphanet rare disease links.
IMPORTANT: Always use English gene names and disease terms in tool calls. Respond in the user's language.
When uncertain about any scientific fact, SEARCH databases first (PubMed, UniProt, ChEMBL, ClinVar, etc.) rather than reasoning from memory. A database-verified answer is always more reliable than a guess.
Phase 1: Gene/Disease Identification & ID Resolution
Resolve gene symbol to Ensembl ID, HGNC CURIE, MIM number
OR resolve disease name to UMLS CUI, EFO ID, MONDO ID, ORPHA code
|
Phase 2: DisGeNET Associations (scored, multi-evidence)
Gene-disease association scores with evidence type filtering
|
Phase 3: OpenTargets Associations (integrated evidence)
Disease phenotypes and genetic associations from OpenTargets
|
Phase 4: Monarch Initiative (cross-species evidence)
Gene-disease associations integrating OMIM, ClinVar, model organisms
|
Phase 5: Mendelian Disease Evidence (curated)
OMIM gene-disease map, GenCC validity classifications, Orphanet rare diseases
|
Phase 6: Variant-Disease Associations (optional, if gene query)
DisGeNET variant-disease links, ClinVar pathogenic variants
|
Phase 7: Evidence Synthesis
Unified table, concordance scoring, confidence levels, final report
from tooluniverse import ToolUniverse
tu = ToolUniverse()
tu.load_tools()
# Gene query: resolve IDs
gene_info = tu.tools.MyGene_query_genes(
    query=f"symbol:{gene_symbol}", species="human",
    fields="symbol,ensembl.gene,entrezgene,name", size=5)  # -> ensembl_id
monarch_search = tu.tools.MonarchV3_search(query=gene_symbol, category="biolink:Gene", limit=5) # -> HGNC CURIE
omim_result = tu.tools.OMIM_search(query=gene_symbol, limit=5) # -> MIM number
gene_summary = tu.tools.Harmonizome_get_gene(gene_symbol=gene_symbol)
# Disease query: resolve IDs
monarch_disease = tu.tools.MonarchV3_search(query=disease_name, category="biolink:Disease", limit=5) # -> MONDO CURIE
mappings = tu.tools.MonarchV3_get_mappings(entity_id=mondo_id, limit=20) # -> OMIM, ICD10, SNOMED, Orphanet
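The ID-resolution calls above return nested records, so pulling out the Ensembl ID takes a small amount of unpacking. The sketch below assumes the response follows the public MyGene.info query layout (a `hits` list whose `ensembl` field is either a dict or a list of dicts). The ToolUniverse wrapper may shape its output differently, and `extract_ensembl_id` is a hypothetical helper, not part of the skill.

```python
from typing import Any, Optional


def extract_ensembl_id(mygene_response: dict[str, Any], gene_symbol: str) -> Optional[str]:
    """Return the Ensembl gene ID for an exact symbol match, or None.

    Assumed layout: {"hits": [{"symbol": ..., "ensembl": ...}]}, where
    "ensembl" is a dict or a list of dicts, each with a "gene" key.
    """
    for hit in mygene_response.get("hits", []):
        if hit.get("symbol", "").upper() != gene_symbol.upper():
            continue  # skip partial or alias matches
        ensembl = hit.get("ensembl")
        if isinstance(ensembl, list):  # several Ensembl records for one gene
            ensembl = ensembl[0] if ensembl else None
        if isinstance(ensembl, dict) and "gene" in ensembl:
            return ensembl["gene"]
    return None


# Hand-written example response, not live API output.
example = {"hits": [{"symbol": "BRCA1", "ensembl": {"gene": "ENSG00000012048"}}]}
print(extract_ensembl_id(example, "BRCA1"))
```

Matching on the exact symbol guards against a fuzzy search returning a related gene first; when the function returns None, it is safer to stop and re-resolve than to query OpenTargets with a guessed ID.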
API KEY REQUIRED: DisGeNET tools require the DISGENET_API_KEY environment variable. Without it, all DisGeNET calls will fail. Register at https://www.disgenet.org/api/#/Authorization for a free academic key. Fallback if no key: Skip this phase and rely on OpenTargets (Phase 3) + Monarch (Phase 4), which are free and cover much of the same data.
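The fallback rule can be checked before any DisGeNET call is made. This is a minimal sketch: `phases_to_run` is a hypothetical helper, and the phase names are labels for this example rather than ToolUniverse identifiers.

```python
import os
from typing import Mapping, Optional


def phases_to_run(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the association phases to query, skipping DisGeNET without a key."""
    env = os.environ if env is None else env
    phases = ["OpenTargets", "Monarch"]  # free sources named in the fallback note
    if env.get("DISGENET_API_KEY"):  # an empty string counts as missing
        phases.insert(0, "DisGeNET")
    return phases


print(phases_to_run({}))
print(phases_to_run({"DISGENET_API_KEY": "example-key"}))
```

Passing the environment as an argument keeps the check testable without touching the real process environment.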
# Gene -> diseases
disgenet_diseases = tu.tools.DisGeNET_search_gene(gene=gene_symbol, limit=20)
disgenet_gda = tu.tools.DisGeNET_get_gda(gene=gene_symbol, source="CURATED", min_score=0.3, limit=25)
# Disease -> genes (accepts name or UMLS CUI like "C0006142")
disgenet_genes = tu.tools.DisGeNET_search_disease(disease=disease_name, limit=20)
disgenet_ranked = tu.tools.DisGeNET_get_disease_genes(disease=disease_name, min_score=0.3, limit=50)
Interpreting DisGeNET scores: Higher scores reflect more evidence sources and stronger curation. Rather than memorizing cutoffs, ask: is this score driven by curated sources or text-mining? Use source="CURATED" to distinguish.
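One way to apply this question is to compare an unfiltered result set with the source="CURATED" result set and flag associations that appear only in the former. The sketch assumes both responses have already been reduced to plain `{disease: score}` dicts; that reduction step and the `label_score_origin` helper are hypothetical, and the scores are illustrative.

```python
def label_score_origin(all_scores: dict[str, float],
                       curated_scores: dict[str, float]) -> dict[str, str]:
    """Label each association by whether a curated source backs it."""
    labels = {}
    for disease in all_scores:
        if disease in curated_scores:
            labels[disease] = "curated"
        else:
            # Present only in the unfiltered set: text-mining or other non-curated sources.
            labels[disease] = "non-curated only"
    return labels


# Illustrative values, not real DisGeNET output.
all_scores = {"Breast cancer": 0.82, "Pancreatic cancer": 0.35}
curated_scores = {"Breast cancer": 0.80}
print(label_score_origin(all_scores, curated_scores))
```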
ot_diseases = tu.tools.OpenTargets_get_diseases_phenotypes_by_target_ensembl(ensemblId=ensembl_id)
ot_evidence = tu.tools.OpenTargets_target_disease_evidence(ensemblId=ensembl_id, efoId=efo_id)
# Both require pre-resolved Ensembl/EFO IDs. Use OpenTargets_multi_entity_search_by_query_string to discover IDs.
# Gene -> diseases (integrates OMIM, ClinVar, Orphanet, model organisms)
monarch_diseases = tu.tools.MonarchV3_get_associations(
    subject=hgnc_curie, category="biolink:CausalGeneToDiseaseAssociation", limit=20)
# Disease -> genes
monarch_genes = tu.tools.MonarchV3_get_associations(
    subject=mondo_id, category="biolink:CorrelatedGeneToDiseaseAssociation", limit=20)
histopheno = tu.tools.MonarchV3_get_histopheno(entity_id=mondo_id) # phenotypes by body system
entity = tu.tools.MonarchV3_get_entity(entity_id=hgnc_curie) # details, synonyms, xrefs
API KEY REQUIRED: OMIM tools require OMIM_API_KEY. Register at https://omim.org/api for academic access. Fallback if no key: Use Monarch Initiative (biolink:CausalGeneToDiseaseAssociation from Phase 4), which includes OMIM data without requiring a key. Also use GenCC (below), which is fully open.
# OMIM: Mendelian gene-disease mapping (use gene MIM number, not phenotype MIM)
omim_entry = tu.tools.OMIM_get_entry(mim_number=mim_number)
omim_gene_map = tu.tools.OMIM_get_gene_map(mim_number=mim_number)
omim_clinical = tu.tools.OMIM_get_clinical_synopsis(mim_number=phenotype_mim)
# GenCC: curated validity (Definitive/Strong/Moderate/Limited/Disputed/Refuted)
gencc_result = tu.tools.GenCC_search_gene(gene_symbol=gene_symbol) # handles gene renames
gencc_disease = tu.tools.GenCC_search_disease(disease="Marfan syndrome") # word-tokenized matching
gencc_classifications = tu.tools.GenCC_get_classifications(gene_symbol="BRCA1", disease="breast cancer")
# Orphanet: rare disease associations (filter results by exact gene.symbol match)
orphanet_result = tu.tools.Orphanet_get_gene_diseases(gene_name=gene_symbol)
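GenCC often returns several submissions for the same gene-disease pair, so it helps to reduce them to one summary value. The sketch below ranks the six labels in the order listed in the comment above and flags pairs where a supportive classification coexists with a Disputed or Refuted one. It assumes the classifications have already been extracted as plain strings. The exact label strings in the real response may differ, and `summarize_gencc` is a hypothetical helper.

```python
from typing import Optional

# Order taken from the list above: strongest support first.
GENCC_ORDER = ["Definitive", "Strong", "Moderate", "Limited", "Disputed", "Refuted"]
NEGATIVE = {"Disputed", "Refuted"}


def summarize_gencc(classifications: list[str]) -> tuple[Optional[str], bool]:
    """Return (highest-ranked classification, contested flag)."""
    known = [c for c in classifications if c in GENCC_ORDER]
    if not known:
        return None, False
    best = min(known, key=GENCC_ORDER.index)
    # Contested: at least one submitter disputes a link that others support.
    contested = best not in NEGATIVE and any(c in NEGATIVE for c in known)
    return best, contested


print(summarize_gencc(["Moderate", "Definitive", "Disputed"]))
```

Reporting the contested flag alongside the best classification avoids hiding a dissenting submission behind a "Definitive" label.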
Run when the query is gene-based and variant-level evidence adds value.
vda_result = tu.tools.DisGeNET_get_vda(gene=gene_symbol, limit=25) # variant-disease links
clinvar_result = tu.tools.ClinVar_search_variants(gene=gene_symbol, max_results=20)
clinvar_detail = tu.tools.ClinVar_get_variant_details(variant_id="12345") # detailed variant info
Compile all results into a single table per gene-disease pair:
## Gene-Disease Associations for BRCA1
| Disease | DisGeNET Score | OpenTargets Score | Monarch | OMIM | GenCC | Orphanet | Sources |
|---------|---------------|-------------------|---------|------|-------|----------|---------|
| Breast cancer | 0.82 | 0.95 | Yes | #114480 | Definitive | ORPHA:227535 | 6/6 |
| Ovarian cancer | 0.78 | 0.91 | Yes | #604370 | Definitive | ORPHA:213500 | 6/6 |
| Pancreatic cancer | 0.35 | 0.42 | Yes | - | Moderate | - | 3/6 |
| Fanconi anemia | 0.45 | 0.38 | Yes | #605724 | Strong | ORPHA:84 | 5/6 |
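The Sources column can be computed rather than filled in by hand. The source does not state how it was derived; the sketch below uses one reading that reproduces the four example rows: a numeric score counts as support only at or above 0.4, and any non-empty categorical cell counts as support. The 0.4 cutoff is an inference from the table, not a documented rule, and `count_supporting` is a hypothetical helper.

```python
from typing import Any

MIN_SCORE = 0.4  # assumption inferred from the example rows, not stated in the source


def count_supporting(row: dict[str, Any]) -> int:
    """Count how many of the six sources support one gene-disease pair."""
    supporting = 0
    for key in ("disgenet", "opentargets"):  # numeric scores
        score = row.get(key)
        if score is not None and score >= MIN_SCORE:
            supporting += 1
    for key in ("monarch", "omim", "gencc", "orphanet"):  # presence/absence
        if row.get(key):
            supporting += 1
    return supporting


# Values copied from the example table above.
rows = {
    "Breast cancer": {"disgenet": 0.82, "opentargets": 0.95, "monarch": True,
                      "omim": "#114480", "gencc": "Definitive", "orphanet": "ORPHA:227535"},
    "Pancreatic cancer": {"disgenet": 0.35, "opentargets": 0.42, "monarch": True,
                          "omim": None, "gencc": "Moderate", "orphanet": None},
    "Fanconi anemia": {"disgenet": 0.45, "opentargets": 0.38, "monarch": True,
                       "omim": "#605724", "gencc": "Strong", "orphanet": "ORPHA:84"},
}
for disease, row in rows.items():
    print(f"{disease}: {count_supporting(row)}/6")
```

If a different counting rule is intended, only `MIN_SCORE` and the two key tuples need to change.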
Evidence strength reasoning: A gene-disease association supported by multiple independent lines of evidence (genetic, functional, model organism) is stronger than one supported by a single study. Ask: how many independent sources support this link? Do they converge on the same mechanism?
Genetic evidence hierarchy: Mendelian segregation (a gene mutation tracks with disease within families) > GWAS (statistical association in a population) > candidate gene study (hypothesis-driven). The first gives the strongest evidence for causation, the second shows correlation, and the third only tests a prior hypothesis. OMIM/GenCC "Definitive" entries represent the top of this hierarchy; DisGeNET text-mining hits represent the bottom.
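The hierarchy can be encoded as an ordered enumeration so that the strongest evidence type for an association is picked consistently. This is a sketch: the numeric ranks only express the ordering described above and carry no quantitative meaning, and `strongest` is a hypothetical helper.

```python
from enum import IntEnum
from typing import Optional


class GeneticEvidence(IntEnum):
    """Evidence types ordered from weakest to strongest, per the hierarchy above."""
    TEXT_MINING = 0
    CANDIDATE_GENE = 1
    GWAS = 2
    MENDELIAN = 3


def strongest(evidence: list[GeneticEvidence]) -> Optional[GeneticEvidence]:
    """Return the highest-ranked evidence type, or None if there is none."""
    return max(evidence, default=None)


found = [GeneticEvidence.GWAS, GeneticEvidence.TEXT_MINING, GeneticEvidence.MENDELIAN]
print(strongest(found).name)
```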
Cross-database concordance: If DisGeNET, OpenTargets, AND OMIM all link gene X to disease Y, that's strong concordance. If only one database shows the link, check why -- is it a single study indexed by that database? Concordance across databases does not equal independent evidence if they all cite the same primary study. Count the number of databases supporting each association, but reason about whether they represent truly independent evidence.
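The distinction between database count and independent evidence can be made explicit by grouping databases that cite overlapping primary studies. The sketch assumes the supporting PMIDs for each database have already been collected into sets; the tool calls above are not guaranteed to expose them in this form. Databases with no recorded citations are treated as independent, which may overcount. `independent_groups` is a hypothetical helper and the PMIDs are placeholders.

```python
def independent_groups(citations: dict[str, set[str]]) -> list[set[str]]:
    """Group databases whose supporting PMIDs overlap, directly or transitively."""
    groups: list[tuple[set[str], set[str]]] = []  # (databases, pmids) pairs
    for db, pmids in citations.items():
        merged_dbs, merged_pmids = {db}, set(pmids)
        remaining = []
        for group_dbs, group_pmids in groups:
            if group_pmids & merged_pmids:  # shared primary study: same evidence line
                merged_dbs |= group_dbs
                merged_pmids |= group_pmids
            else:
                remaining.append((group_dbs, group_pmids))
        remaining.append((merged_dbs, merged_pmids))
        groups = remaining
    return [dbs for dbs, _ in groups]


# Placeholder PMIDs for illustration only.
citations = {
    "DisGeNET": {"111"},
    "OpenTargets": {"111", "222"},
    "OMIM": {"333"},
}
groups = independent_groups(citations)
print(f"{len(citations)} databases, {len(groups)} independent evidence groups")
```

Reporting both numbers side by side keeps a 3/6 concordance from being read as three independent confirmations when two of the databases rest on the same paper.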
Mechanism reasoning: Knowing the gene's function helps evaluate the association. A gene encoding a liver enzyme being linked to liver disease is mechanistically plausible. The same gene being linked to a psychiatric disorder needs stronger evidence because the mechanism is less obvious. Use Harmonizome gene summaries and Monarch phenotype profiles to assess mechanistic plausibility.
_gene_matches(). Other tools require current HGNC symbol from MyGene_query_genes.fields="ensembl.gene".
For comprehensive disease reports: tooluniverse-disease-research
For rare disease diagnosis: tooluniverse-rare-disease-diagnosis
For variant interpretation: tooluniverse-variant-interpretation
For drug-target validation: tooluniverse-drug-target-validation