plugin/skills/tooluniverse-cancer-variant-interpretation/SKILL.md
Clinical interpretation of somatic cancer mutations for precision oncology. Transforms a gene + variant + cancer-type input into an actionable report: clinical evidence tier (CIViC, OncoKB), therapeutic options (FDA-approved + investigational), resistance mechanisms, prognosis, and matching clinical trials. Use for tumor-board variant calls, somatic-mutation actionability assessment, and treatment selection. Always cancer-type-specific.
npx skillsauth add mims-harvard/tooluniverse tooluniverse-cancer-variant-interpretation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Comprehensive clinical interpretation of somatic mutations in cancer. Transforms a gene + variant input into an actionable precision oncology report covering clinical evidence, therapeutic options, resistance mechanisms, clinical trials, and prognostic implications.
KEY PRINCIPLES:
When uncertain about any scientific fact, SEARCH databases first (PubMed, UniProt, ChEMBL, ClinVar, etc.) rather than reasoning from memory. A database-verified answer is always more reliable than a guess.
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Apply when user asks:
Required: Gene symbol + variant notation (e.g., "EGFR L858R", "BRAF p.V600E", "EML4-ALK fusion", "HER2 amplification")
Optional: Cancer type (improves specificity)
Parse the gene symbol and variant separately. For fusions, use the kinase partner as the primary gene. For amplifications/deletions, use the gene name directly. Normalize common aliases: HER2 -> ERBB2, PD-L1 -> CD274, VEGF -> VEGFA.
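A minimal sketch of the parsing and alias normalization described above. The function name `parse_alteration`, the returned keys, and the `small_variant` label are hypothetical, and the code assumes the kinase partner of a fusion is written second (as in "EML4-ALK"), which will not hold for every fusion.

```python
import re

# Alias map taken from the normalization rule above.
GENE_ALIASES = {"HER2": "ERBB2", "PD-L1": "CD274", "VEGF": "VEGFA"}


def parse_alteration(text: str) -> dict:
    """Split free text such as 'EGFR L858R' into gene, variant, and alteration type."""
    tokens = text.strip().split()
    if not tokens:
        raise ValueError("empty input")
    head, rest = tokens[0], " ".join(tokens[1:])
    lowered = rest.lower()

    if "fusion" in lowered and "-" in head:
        # ASSUMPTION: the kinase partner is the second gene (EML4-ALK -> ALK).
        partner, kinase = head.split("-", 1)
        gene, variant, kind = kinase, f"{partner}-{kinase} fusion", "fusion"
    elif lowered in ("amplification", "deletion"):
        # Copy-number events use the gene name directly.
        gene, variant, kind = head, lowered, lowered
    else:
        # Everything else is treated as a small variant; drop an optional 'p.' prefix.
        gene, kind = head, "small_variant"
        variant = re.sub(r"^p\.", "", rest)

    gene = GENE_ALIASES.get(gene.upper(), gene.upper())
    return {"gene": gene, "variant": variant, "type": kind}


if __name__ == "__main__":
    for example in ("EGFR L858R", "BRAF p.V600E", "EML4-ALK fusion", "HER2 amplification"):
        print(example, "->", parse_alteration(example))
```

Keeping the alteration type alongside the gene makes it easier to decide later which lookups apply, since a fusion or amplification will not match a protein-change string.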
BEFORE calling ANY tool for the first time, verify its parameters.
| Tool | WRONG Parameter | CORRECT Parameter |
|------|-----------------|-------------------|
| OpenTargets_get_associated_drugs_by_target_ensemblID | ensemblID | ensemblId (camelCase) |
| OpenTargets_get_drug_chembId_by_generic_name | genericName | drugName |
| OpenTargets_target_disease_evidence | ensemblID | ensemblId + efoId |
| MyGene_query_genes | q | query |
| search_clinical_trials | disease, biomarker | condition, query_term (required) |
| civic_get_variants_by_gene | gene_symbol | gene_id (CIViC numeric ID) |
| drugbank_* | any 3 params | ALL 4 required: query, case_sensitive, exact_match, limit |
| ChEMBL_get_drug_mechanisms | chembl_id | drug_chembl_id__exact |
| ensembl_lookup_gene | no species | species='homo_sapiens' is REQUIRED |
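The table above can be turned into a small pre-flight check. This is a sketch only: `fix_arguments` and the dictionaries are hypothetical helpers built from the rows of the table, not part of ToolUniverse, and they cover only the renames and required parameters listed there.

```python
# Wrong -> correct parameter names, copied from the table above.
PARAM_RENAMES = {
    "OpenTargets_get_associated_drugs_by_target_ensemblID": {"ensemblID": "ensemblId"},
    "OpenTargets_get_drug_chembId_by_generic_name": {"genericName": "drugName"},
    "OpenTargets_target_disease_evidence": {"ensemblID": "ensemblId"},
    "MyGene_query_genes": {"q": "query"},
    "civic_get_variants_by_gene": {"gene_symbol": "gene_id"},
    "ChEMBL_get_drug_mechanisms": {"chembl_id": "drug_chembl_id__exact"},
}

# Parameters the table marks as required.
REQUIRED_PARAMS = {
    "OpenTargets_target_disease_evidence": {"ensemblId", "efoId"},
    "search_clinical_trials": {"query_term"},
    "ensembl_lookup_gene": {"species"},
}
DRUGBANK_REQUIRED = {"query", "case_sensitive", "exact_match", "limit"}


def fix_arguments(tool: str, args: dict) -> dict:
    """Rename known-wrong parameter names and fail early on missing required ones."""
    renames = PARAM_RENAMES.get(tool, {})
    fixed = {renames.get(key, key): value for key, value in args.items()}

    if tool.startswith("drugbank_"):
        required = DRUGBANK_REQUIRED
    else:
        required = REQUIRED_PARAMS.get(tool, set())

    missing = required - set(fixed)
    if missing:
        raise ValueError(f"{tool} is missing required parameters: {sorted(missing)}")
    return fixed


if __name__ == "__main__":
    print(fix_arguments("MyGene_query_genes", {"q": "EGFR"}))
```

Note that renaming `gene_symbol` to `gene_id` for `civic_get_variants_by_gene` only fixes the key; the value must still be the CIViC numeric ID, not the symbol.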
Input: Gene symbol + Variant notation + Optional cancer type
Phase 1: Gene Disambiguation & ID Resolution
- Resolve gene to Ensembl ID, UniProt accession, Entrez ID
- Get gene function, pathways, protein domains
- Identify cancer type EFO ID (if cancer type provided)
Phase 2: Clinical Variant Evidence (CIViC)
- Find gene in CIViC (via Entrez ID matching)
- Get all variants for the gene, match specific variant
- Retrieve evidence items (predictive, prognostic, diagnostic)
Phase 3: Mutation Prevalence (cBioPortal)
- Frequency across cancer studies
- Co-occurring mutations, cancer type distribution
Phase 4: Therapeutic Associations (OpenTargets + ChEMBL + FDA + DrugBank)
- FDA-approved targeted therapies
- Clinical trial drugs (phase 2-3), drug mechanisms
- Combination therapies
Phase 5: Resistance Mechanisms
- Known resistance variants (CIViC, literature)
- Bypass pathway analysis (Reactome)
Phase 6: Clinical Trials
- Active trials recruiting for this mutation
- Trial phase, status, eligibility
Phase 7: Prognostic Impact & Pathway Context
- Survival associations (literature)
- Pathway context (Reactome), Expression data (GTEx)
Phase 8: Report Synthesis
- Executive summary, clinical actionability score
- Treatment recommendations (prioritized), completeness checklist
For detailed code snippets and API call patterns for each phase, see ANALYSIS_DETAILS.md.
Not every mutation in a tumor is driving the cancer. Before querying databases, form a hypothesis:
ESM_explain_variant_mechanism(sequence=wt_protein_seq, position=..., ref_aa=..., alt_aa=..., top_k_features=5) answers how the substitution disrupts function: catalytic, ligand-binding, PTM, or structural-stability loss. A unique missense that disrupts the same SAE feature category as a known driver hotspot in the same gene is more likely a driver than a missense that disrupts unrelated features. Requires ESM_API_KEY; missense only.

Actionable means a therapy exists that targets this alteration. Think in tiers based on evidence strength:
When synthesizing, state the tier and explain WHY you assigned it based on the evidence you found, not just which database returned a hit.
If the patient has already been treated, ask: could this mutation be a resistance mechanism?
Form your clinical hypothesis FIRST based on gene function and mutation type, THEN use tools to validate:
- civic_search_genes, civic_get_variants_by_gene: Your primary source for clinical evidence. Returns curated evidence items with evidence levels, clinical significance, and associated therapies. Start here for any variant with potential clinical relevance.
- cBioPortal_get_mutations: Use to assess mutation prevalence. Is this a hotspot? How common is it across cancer types? This informs your driver vs. passenger assessment.
- OpenTargets_get_associated_drugs_by_target_ensemblID: Use for actionability. What drugs target this gene? Cross-reference with CIViC evidence to assign tiers.
- PubMed_search_articles: Use when CIViC lacks entries for your variant, or to find resistance mechanism reports and recent clinical trial results.
- search_clinical_trials: Use after establishing the variant is potentially actionable, to find enrollment opportunities.

| Tool | Key Parameters | Response Key Fields |
|------|---------------|-------------------|
| MyGene_query_genes | query, species | hits[].ensembl.gene, .entrezgene, .symbol |
| UniProt_search | query, organism, limit | results[].accession |
| OpenTargets_get_target_id_description_by_name | targetName | data.search.hits[].id |
| ensembl_lookup_gene | gene_id, species (REQUIRED) | data.id, .version |
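As a sketch of Phase 1 ID resolution, the helper below pulls the Entrez and Ensembl IDs out of a MyGene_query_genes response using the fields listed in the table. The function name `extract_gene_ids` is hypothetical. It assumes the tool passes through the MyGene.info shape, in which `ensembl` may be either a single object or a list of objects; the sample response is hand-written for illustration.

```python
def extract_gene_ids(response: dict, symbol: str) -> dict:
    """Return symbol, Entrez ID, and Ensembl gene ID for the hit matching `symbol`."""
    for hit in response.get("hits", []):
        # Match on the exact symbol so that e.g. EGFR-AS1 is not taken for EGFR.
        if str(hit.get("symbol", "")).upper() != symbol.upper():
            continue
        ensembl = hit.get("ensembl")
        if isinstance(ensembl, list):
            # ASSUMPTION: when several Ensembl records exist, the first is used.
            ensembl = ensembl[0] if ensembl else None
        return {
            "symbol": hit["symbol"],
            "entrez_id": hit.get("entrezgene"),
            "ensembl_id": ensembl.get("gene") if ensembl else None,
        }
    return {}


if __name__ == "__main__":
    # Hand-written sample in the documented shape (hits[].ensembl.gene, .entrezgene, .symbol).
    sample = {
        "hits": [
            {"symbol": "EGFR-AS1", "entrezgene": "100507500"},
            {"symbol": "EGFR", "entrezgene": "1956", "ensembl": {"gene": "ENSG00000146648"}},
        ]
    }
    print(extract_gene_ids(sample, "EGFR"))
```

The Entrez ID returned here is what Phase 2 uses to find the matching CIViC gene record.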
| Tool | Key Parameters | Response Key Fields |
|------|---------------|-------------------|
| civic_search_genes | query, limit | data.genes.nodes[].id, .entrezId |
| civic_get_variants_by_gene | gene_id (CIViC numeric) | data.gene.variants.nodes[] |
| civic_get_variant | variant_id | data.variant |
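The CIViC lookup in Phase 2 (find the gene by Entrez ID, then match the specific variant) can be sketched as two small functions over the response paths in the table. Both function names are hypothetical, the `name` field on variant nodes is an assumption about the response, and the IDs in the sample data are invented.

```python
def find_civic_gene_id(genes_response: dict, entrez_id) -> "int | None":
    """Return the CIViC numeric gene ID whose entrezId matches, or None."""
    nodes = genes_response.get("data", {}).get("genes", {}).get("nodes", [])
    for node in nodes:
        # Compare as strings because IDs may arrive as int or str.
        if str(node.get("entrezId")) == str(entrez_id):
            return node.get("id")
    return None


def match_variants(variants_response: dict, variant: str) -> list:
    """Return variant nodes whose name contains the requested variant (case-insensitive)."""
    nodes = (
        variants_response.get("data", {})
        .get("gene", {})
        .get("variants", {})
        .get("nodes", [])
    )
    wanted = variant.upper()
    # ASSUMPTION: each node carries a 'name' field such as 'L858R'.
    return [node for node in nodes if wanted in str(node.get("name", "")).upper()]


if __name__ == "__main__":
    # Invented IDs, for illustration only.
    genes = {"data": {"genes": {"nodes": [{"id": 101, "entrezId": 1956}]}}}
    variants = {"data": {"gene": {"variants": {"nodes": [
        {"id": 1, "name": "L858R"},
        {"id": 2, "name": "T790M"},
    ]}}}}
    gene_id = find_civic_gene_id(genes, "1956")
    print(gene_id, match_variants(variants, "l858r"))
```

Substring matching is deliberately loose so that grouped entries are not missed; the matched nodes should still be reviewed before their evidence items are used.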
| Tool | Key Parameters | Response Key Fields |
|------|---------------|-------------------|
| OpenTargets_get_associated_drugs_by_target_ensemblID | ensemblId, size | data.target.drugAndClinicalCandidates.rows[] |
| FDA_get_indications_by_drug_name | drug_name, limit | results[].indications_and_usage |
| drugbank_get_drug_basic_info_by_drug_name_or_id | query, case_sensitive, exact_match, limit (ALL required) | results[] |
| Tool | Key Parameters | Response Key Fields |
|------|---------------|-------------------|
| cBioPortal_get_mutations | study_id, gene_list | data[].proteinChange |
| cBioPortal_get_cancer_studies | limit | [].studyId, .cancerTypeId |
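To judge whether a variant looks like a hotspot, the `proteinChange` values returned by cBioPortal_get_mutations can be tallied. The sketch below uses hypothetical helper names and hand-written sample records. It reports the share of that gene's mutation records in one study, which is not the same as prevalence across all profiled samples.

```python
from collections import Counter


def protein_change_counts(response: dict, top_n: int = 5) -> list:
    """Return the most frequent protein changes as (change, count) pairs."""
    records = response.get("data", [])
    counts = Counter(r["proteinChange"] for r in records if r.get("proteinChange"))
    return counts.most_common(top_n)


def variant_fraction(response: dict, variant: str) -> float:
    """Fraction of the gene's mutation records that carry `variant`."""
    changes = [r["proteinChange"] for r in response.get("data", []) if r.get("proteinChange")]
    if not changes:
        return 0.0
    hits = sum(1 for change in changes if change.upper() == variant.upper())
    return hits / len(changes)


if __name__ == "__main__":
    # Hand-written records in the documented shape (data[].proteinChange).
    sample = {"data": [
        {"proteinChange": "L858R"},
        {"proteinChange": "L858R"},
        {"proteinChange": "L858R"},
        {"proteinChange": "T790M"},
    ]}
    print(protein_change_counts(sample))
    print(variant_fraction(sample, "L858R"))
```

A change that dominates the tally across several studies supports a driver hypothesis; a change seen once among many distinct alterations does not.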
| Tool | Key Parameters | Response Key Fields |
|------|---------------|-------------------|
| search_clinical_trials | query_term (required), condition | studies[] |
| PubMed_search_articles | query, limit, include_abstract | Returns list of dicts (NOT wrapped) |
| Reactome_map_uniprot_to_pathways | id (UniProt accession) | Pathway mappings |
| GTEx_get_median_gene_expression | gencode_id, operation="median" | Expression by tissue |
When a primary tool returns no results, fall back rather than reporting "no data found":
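A minimal sketch of the general fallback pattern, independent of which tools are paired. `first_non_empty` is a hypothetical helper, and the lambdas in the usage example are stand-ins for real tool calls.

```python
def first_non_empty(steps):
    """Run (label, callable) steps in order until one returns a non-empty result.

    Returns (label, result, tried), where `tried` records what was attempted
    before the winning step so the report can state which sources were empty.
    """
    tried = []
    for label, call in steps:
        try:
            result = call()
        except Exception as exc:  # a failing source should not stop the chain
            tried.append(f"{label}: error ({exc})")
            continue
        if result:
            return label, result, tried
        tried.append(f"{label}: empty")
    return None, None, tried


if __name__ == "__main__":
    # Stand-in callables; in practice each would wrap one tool call.
    steps = [
        ("primary", lambda: []),
        ("fallback", lambda: [{"title": "example record"}]),
    ]
    source, result, tried = first_non_empty(steps)
    print(source, result, tried)
```

Recording the empty and failed attempts lets the final report say where evidence was looked for, which is more useful than a bare "no data found".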