skills/tooluniverse-cancer-classification/SKILL.md
Translate free-text tumor descriptions to OncoTree codes and resolve cancer subtypes/tissue hierarchy. Cross-references UMLS/NCI vocabularies. Use for standardizing cancer-type nomenclature in EHR free-text, building cohorts in OncoKB or GDC, mapping tumor-board notes to ontology codes, and ensuring consistent terminology across cancer-genomics pipelines.
Install this skill globally with one command: `npx skillsauth add mims-harvard/tooluniverse tooluniverse-cancer-classification`. Works with Claude Code, Cursor, and Windsurf.
Standardize cancer type nomenclature using the OncoTree ontology. Resolves free-text tumor descriptions to structured codes with UMLS/NCI cross-references, enabling downstream use in OncoKB variant annotation and GDC cohort selection.
Apply when researcher asks about:
| Tool | Purpose | Key Params |
|------|---------|-----------|
| OncoTree_search | Free-text search for cancer types | query (tumor name or description) |
| OncoTree_get_type | Full details for a known OncoTree code | code (e.g., "LUAD", "AML") |
| OncoTree_list_tissues | List all 32 tissue categories | (no params) |
| OncoKB_annotate_variant | Variant annotation using OncoTree code | gene, variant, tumor_type |
| GDC_get_mutation_frequency | Pan-cancer mutation frequency (TCGA) | gene_symbol |
Start with free-text search to find matching OncoTree codes:
OncoTree_search(query="breast cancer")
-> Returns list: code, name, main_type, tissue, parent, level, external_references
Key response fields:
- code: OncoTree code (e.g., "BRCA", "IBC"); use this in OncoKB calls
- level: hierarchy depth (1=tissue, 2=main type, 3-5=subtypes)
- parent: parent node code for navigating the hierarchy
- external_references.UMLS: UMLS CUI list
- external_references.NCI: NCI thesaurus code list

Search tips:
Once you have a candidate code, retrieve full details:
OncoTree_get_type(code="LUAD")
-> Returns: name, main_type, tissue, color, parent, level, history, external_references
Note: Common acronyms are not always valid OncoTree codes. "GBM" returns 404; the correct code is "GB" (Glioblastoma, IDH-Wildtype).
Always validate a code via OncoTree_get_type before passing it to downstream tools.
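
The validate-then-fall-back step described above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: `resolve_valid_code`, `stub_get_type`, and `stub_search` are hypothetical names, the two tools are passed in as plain callables so the logic can run without network access, and the error shape for a rejected code (`status` other than `"success"`) is an assumption based on the `detail["status"] == "success"` check used elsewhere in this document.

```python
from typing import Callable, Optional


def resolve_valid_code(
    candidate: str,
    get_type: Callable[[str], dict],
    search: Callable[[str], dict],
    fallback_query: str,
) -> Optional[str]:
    """Return a validated OncoTree code, or None if nothing can be confirmed."""
    if get_type(candidate).get("status") == "success":
        return candidate
    # Candidate was rejected (e.g. a 404 for "GBM"): fall back to free-text
    # search and re-validate each hit before trusting it.
    for hit in search(fallback_query).get("data") or []:
        code = hit.get("code")
        if code and get_type(code).get("status") == "success":
            return code
    return None


# Stubs standing in for OncoTree_get_type / OncoTree_search (illustration only;
# the response shapes are assumptions).
_KNOWN = {"GB": "Glioblastoma, IDH-Wildtype", "LUAD": "Lung Adenocarcinoma"}


def stub_get_type(code: str) -> dict:
    if code in _KNOWN:
        return {"status": "success", "data": {"code": code, "name": _KNOWN[code]}}
    return {"status": "error", "error": "404"}


def stub_search(query: str) -> dict:
    q = query.lower()
    hits = [{"code": c, "name": n} for c, n in _KNOWN.items() if q in n.lower()]
    return {"status": "success", "data": hits}


resolved = resolve_valid_code("GBM", stub_get_type, stub_search, "glioblastoma")
print(resolved)
```

Re-validating each search hit costs one extra lookup per candidate, but it keeps an unverified code from ever reaching OncoKB or GDC.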
When the user wants all cancers in a tissue category:
OncoTree_list_tissues()
-> Returns 32 tissue names: "Breast", "CNS/Brain", "Lung", "Myeloid", ...
OncoTree_search(query="CNS/Brain")
-> All cancer types with tissue="CNS/Brain"
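
Once a tissue-level search has returned its records, the `parent` field can be used to rebuild the hierarchy locally. The sketch below assumes each record carries `code` and `parent` as listed in the response fields; the helper names and the placeholder codes (`T1`, `M1`, ...) are hypothetical and do not correspond to real OncoTree entries.

```python
from collections import defaultdict


def build_children_index(records):
    """Map each parent code to the sorted codes of its direct children."""
    children = defaultdict(list)
    for rec in records:
        parent = rec.get("parent")
        if parent:  # top-level nodes have no parent
            children[parent].append(rec["code"])
    return {p: sorted(c) for p, c in children.items()}


def descendants(code, children):
    """Return every code below `code`, depth-first, parents before children."""
    out = []
    for child in children.get(code, []):
        out.append(child)
        out.extend(descendants(child, children))
    return out


# Placeholder records shaped like search results; not real API output.
sample = [
    {"code": "T1", "parent": None, "level": 1},
    {"code": "M1", "parent": "T1", "level": 2},
    {"code": "S1", "parent": "M1", "level": 3},
    {"code": "S2", "parent": "M1", "level": 3},
    {"code": "M2", "parent": "T1", "level": 2},
]
index = build_children_index(sample)
subtree = descendants("T1", index)
print(subtree)
```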
Pass validated OncoTree code to OncoKB for cancer-type-specific therapeutic levels:
OncoKB_annotate_variant(gene="EGFR", variant="L858R", tumor_type="LUAD")
-> highestSensitiveLevel: "1" (FDA-approved therapy for this tumor+variant)
Without tumor_type, OncoKB returns pan-cancer levels which may be less specific.
| Tool | Required | Optional | Notes |
|------|---------|---------|-------|
| OncoTree_search | query | — | Free text; returns list sorted by relevance |
| OncoTree_get_type | code | — | Case-sensitive; "BRCA" not "brca". Returns 404 for invalid codes |
| OncoTree_list_tissues | — | — | No params; returns list of 32 tissue strings |
| OncoKB_annotate_variant | gene, variant | tumor_type | tumor_type is OncoTree code; omit for pan-cancer |
| GDC_get_mutation_frequency | gene_symbol | — | Pan-cancer TCGA only; no per-subtype breakdown |
| Code | Name | Tissue |
|------|------|--------|
| BRCA | Invasive Breast Carcinoma | Breast |
| LUAD | Lung Adenocarcinoma | Lung |
| LUSC | Lung Squamous Cell Carcinoma | Lung |
| MEL | Melanoma | Skin |
| CRC | Colorectal Cancer | Bowel |
| PAAD | Pancreatic Adenocarcinoma | Pancreas |
| GBM | (invalid — use GB) | CNS/Brain |
| GB | Glioblastoma, IDH-Wildtype | CNS/Brain |
| AML | Acute Myeloid Leukemia | Myeloid |
| PRAD | Prostate Adenocarcinoma | Prostate |
# Pattern: Resolve free-text to OncoTree code
results = OncoTree_search(query="pancreatic ductal adenocarcinoma")
# Results are sorted by relevance, so take the top hit; a higher level number means a more specific subtype
code = results["data"][0]["code"] # e.g., "PAAD"
# Pattern: Get all subtypes within a main type
results = OncoTree_search(query="Glioma")
subtypes = [r for r in results["data"] if r["main_type"] == "Glioma"]
# Pattern: Validate code before OncoKB call
detail = OncoTree_get_type(code="GB")
if detail["status"] == "success":
    OncoKB_annotate_variant(gene="IDH1", variant="R132H", tumor_type="GB")
LOOK UP DON'T GUESS -- tumor classification determines treatment. Always verify codes and biomarker interpretation via tools rather than relying on memory.
Tumors are classified on TWO axes -- both matter for treatment selection:
Two tumors can be histologically identical yet molecularly different, and so require different treatment. Example: two lung adenocarcinomas (both LUAD), one EGFR-mutant (targeted therapy) and the other KRAS-mutant (a different targeted therapy). Always check both axes.
When interpreting cancer biomarkers, use OncoKB for actionability:
- OncoKB_annotate_variant(gene="ERBB2", variant="Amplification", tumor_type="BRCA") for therapeutic level
- OncoKB_annotate_variant(gene="Other Biomarkers", variant="TMB-H")
- OncoKB_annotate_variant(gene="Other Biomarkers", variant="MSI-H")

After classifying the tumor, assess whether findings are clinically actionable:
| Grade | Criteria | Example |
|-------|----------|---------|
| Confirmed | Exact OncoTree code validated via OncoTree_get_type, UMLS + NCI cross-refs present | LUAD: validated, UMLS C0152013, NCI C3512 |
| Probable | OncoTree search returns match, but code not yet validated or missing cross-refs | Search for "cholangiocarcinoma" returns CHOL with partial external refs |
| Ambiguous | Multiple OncoTree codes match the description at different hierarchy levels | "Breast cancer" matches BRCA (invasive), BREAST (tissue), IBC (inflammatory) |
| Unresolved | No OncoTree match; tumor type too rare or novel for the ontology | Ultra-rare sarcoma subtype not in OncoTree |
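
The grading table above can be expressed as a small decision function. This is a sketch under stated assumptions: `grade_classification` is a hypothetical helper, and the order of checks (Unresolved, then Ambiguous, then Confirmed, else Probable) is one reasonable reading of the table rather than a rule stated by the source. The LUAD cross-references in the example are the ones quoted in the table.

```python
def grade_classification(matches, validated, external_references=None):
    """Assign an evidence grade to a tumor-type resolution.

    matches: candidate OncoTree codes still in play after the search step
    validated: True if OncoTree_get_type confirmed the chosen code
    external_references: dict such as {"UMLS": [...], "NCI": [...]}
    """
    refs = external_references or {}
    if not matches:
        return "Unresolved"
    if len(matches) > 1:
        # Several codes at different hierarchy levels still fit the description.
        return "Ambiguous"
    if validated and refs.get("UMLS") and refs.get("NCI"):
        return "Confirmed"
    # Single match, but not yet validated or missing a cross-reference.
    return "Probable"


grade = grade_classification(
    ["LUAD"], validated=True,
    external_references={"UMLS": ["C0152013"], "NCI": ["C3512"]},
)
print(grade)
```

Checking ambiguity before validation means a description such as "breast cancer" stays flagged until it has been narrowed to one code, even if each candidate code is individually valid.
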
- Validate every code with OncoTree_get_type before downstream use. Some common acronyms (e.g., "GBM") are NOT valid OncoTree codes (correct code is "GB"). A validated code with UMLS and NCI cross-references is highest confidence.
- The history field in the OncoTree_get_type response shows prior names. Always use the current code.

| Primary | Fallback | When |
|---------|---------|------|
| OncoTree_get_type(code="GBM") | OncoTree_search(query="glioblastoma") | 404 for common aliases |
| OncoTree_search (no results) | OncoTree_list_tissues + tissue-level search | Very rare/novel tumor types |
| OncoTree code for OncoKB | Omit tumor_type param | Code not recognized by OncoKB |
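
The second fallback row (no search results, then a tissue-level search) might look like the sketch below. The function and stub names are hypothetical, the `{"data": [...]}` response shape for `OncoTree_list_tissues` is an assumption, and matching a tissue by checking whether part of its name appears in the query is a deliberately naive heuristic chosen for illustration.

```python
def search_with_tissue_fallback(query, search, list_tissues):
    """Free-text search first; if empty, retry with any tissue named in the query."""
    hits = search(query).get("data") or []
    if hits:
        return hits
    q = query.lower()
    collected = []
    for tissue in list_tissues().get("data") or []:
        # Assumption: a tissue is relevant if any "/"-separated part of its
        # name (e.g. "CNS" or "Brain") occurs in the query text.
        parts = [p.strip().lower() for p in tissue.split("/")]
        if any(p and p in q for p in parts):
            collected.extend(search(tissue).get("data") or [])
    return collected


# Stubs standing in for OncoTree_list_tissues / OncoTree_search (illustration only).
def stub_list_tissues():
    return {"status": "success", "data": ["Breast", "CNS/Brain", "Lung"]}


def stub_tissue_search(query):
    table = {"CNS/Brain": [{"code": "GB", "tissue": "CNS/Brain"}]}
    return {"status": "success", "data": table.get(query, [])}


fallback_hits = search_with_tissue_fallback(
    "unusual brain tumor", stub_tissue_search, stub_list_tissues
)
print([h["code"] for h in fallback_hits])
```

Results obtained this way are candidates only; each still needs validation and, per the grading table, would rate no higher than Probable until confirmed.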