skills/tooluniverse-hla-immunogenomics/SKILL.md
HLA gene-family analysis and MHC-peptide binding for transplant compatibility, vaccine epitope coverage, and cancer immunotherapy. Uses IMGT (HLA polymorphism), IEDB (epitope-MHC binding), UniProt (annotation), DGIdb (druggability). Use for HLA typing/imputation review, vaccine HLA coverage, and immunotherapy prediction biomarkers (HLA-LOH, neoantigen presentation).
npx skillsauth add mims-harvard/tooluniverse tooluniverse-hla-immunogenomics
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Pipeline for exploring HLA gene families, MHC-peptide binding, epitope associations, and their clinical implications in transplantation, vaccine development, and cancer immunotherapy. Bridges immunogenetic databases (IMGT, IEDB) with functional annotation (UniProt) and druggability data (DGIdb).
HLA analysis is fundamentally about peptide presentation: the polymorphism of HLA molecules determines which peptides are displayed to T cells, which in turn governs disease susceptibility, transplant rejection, drug hypersensitivity, and vaccine immunogenicity. HLA type affects disease susceptibility for autoimmune conditions (HLA-B27 and ankylosing spondylitis), transplant rejection (HLA mismatch drives alloresponse), drug hypersensitivity (abacavir causes severe hypersensitivity reactions only in HLA-B*57:01 carriers), and vaccine design (epitopes must be presented by the recipient's HLA alleles to elicit a T-cell response). Class I and Class II HLA molecules have fundamentally different binding grooves, peptide lengths, and T-cell partners — never conflate them. The absence of an epitope from IEDB means it has not been tested, not that it cannot bind.
LOOK UP DON'T GUESS: Never assume an allele's binding properties or population frequency — query IEDB for experimental binding data and IMGT for allele annotation. Do not guess which HLA alleles are common in a population; look up published frequency data via PubMed.
Guiding principles:
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Typical triggers:
Not this skill: For full neoantigen prediction pipelines, use tooluniverse-immunotherapy-response-prediction. For general gene function lookup, use tooluniverse-drug-target-validation.
| Database | Scope | Best For |
|----------|-------|----------|
| IMGT | International ImMunoGeneTics; HLA/MHC gene nomenclature and sequences | Authoritative HLA gene info, allele nomenclature, sequence data |
| IEDB | Immune Epitope Database; experimentally validated epitope-MHC data | Epitope binding, MHC restriction, T-cell assay results |
| BVBRC | BV-BRC (formerly PATRIC/IRD); pathogen epitopes | Pathogen-derived epitopes with host MHC context |
| UniProt | Protein function and structure annotations | HLA protein features, domains, variants |
| DGIdb | Drug-Gene Interaction Database | Druggability of HLA-pathway genes |
| PubMed | Biomedical literature | Clinical HLA studies, transplant outcomes |
Phase 0: Query Parsing & HLA Disambiguation
Resolve allele names, identify MHC class, confirm species
|
Phase 1: HLA Gene Lookup
IMGT gene info, allele details, sequence data
|
Phase 2: MHC Binding & Restriction
IEDB MHC binding data, allele-specific peptide repertoire
|
Phase 3: Epitope-MHC Associations
IEDB/BVBRC epitope search, pathogen-specific epitopes
|
Phase 4: Functional Annotation
UniProt protein features, structural domains
|
Phase 5: Clinical & Therapeutic Context
DGIdb druggability, PubMed clinical evidence
|
Phase 6: Report Synthesis
Integrated immunogenomics report
Parse the user's input to identify:
HLA nomenclature quick reference:
HLA-A*02:01 = gene A, allele group 02, specific protein 01
Objective: Get authoritative gene and allele information from IMGT.
Tools:
IMGT_search_genes -- search for HLA/MHC genes
query (gene name or keyword), optional species, locus
IMGT_get_gene_info -- get detailed gene/allele information
gene_name (IMGT gene name)
Workflow:
If allele not found: Check nomenclature -- older names may have been reassigned. Try searching by the gene name alone (e.g., "HLA-A") and filtering results.
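The allele-name resolution from Phase 0 and the gene-name fallback above can be sketched as a small parser. This is an illustrative helper, not part of ToolUniverse or IMGT: `parse_hla_allele` and `HlaAllele` are hypothetical names, and the class assignment covers only the classical loci (HLA-A/B/C as Class I, HLA-DR/DQ/DP as Class II).

```python
import re
from typing import NamedTuple, Optional

# Assumption: only classical loci are classified; anything else gets mhc_class=None.
CLASS_I_GENES = {"A", "B", "C"}
CLASS_II_PREFIXES = ("DR", "DQ", "DP")

# Matches names such as "HLA-A*02:01" or "DRB1*15:01"; extra fields after the
# second one (e.g. "HLA-B*57:01:01") are accepted and ignored.
_ALLELE_RE = re.compile(
    r"^(?:HLA-)?(?P<gene>[A-Z]+[0-9]*)\*(?P<group>\d{2,3})(?::(?P<protein>\d{2,3}))?"
)


class HlaAllele(NamedTuple):
    gene: str                  # e.g. "HLA-A", usable for a gene-level fallback search
    allele_group: str          # first field, e.g. "02"
    protein: Optional[str]     # second field, e.g. "01"; None if not given
    mhc_class: Optional[str]   # "I", "II", or None when the locus is not classified here


def parse_hla_allele(name: str) -> HlaAllele:
    """Split an HLA allele name into gene, allele group, and specific protein."""
    match = _ALLELE_RE.match(name.strip().upper())
    if match is None:
        raise ValueError(f"Not a recognisable HLA allele name: {name!r}")
    gene = match.group("gene")
    if gene in CLASS_I_GENES:
        mhc_class = "I"
    elif gene.startswith(CLASS_II_PREFIXES):
        mhc_class = "II"
    else:
        mhc_class = None
    return HlaAllele(f"HLA-{gene}", match.group("group"), match.group("protein"), mhc_class)
```

If an allele lookup fails, `parse_hla_allele(name).gene` gives the gene name to retry with. Serological or legacy names without an asterisk (for example "HLA-B27") raise `ValueError` and need manual resolution.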
Objective: Find what peptides bind to a specific MHC molecule, or what MHC molecules present a given peptide.
Tools:
iedb_search_mhc -- search for MHC molecules in IEDB
mhc_restriction (allele name), optional mhc_class
iedb_get_epitope_mhc -- get MHC binding details for an epitope
epitope_id (IEDB epitope ID)
Workflow:
Binding affinity interpretation (Class I):
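A minimal sketch of how measured Class I affinities might be binned. The cut-offs of 50, 500, and 5000 nM IC50 are the widely used convention and are an assumption here, as are the function name and the category labels; adjust them to whatever thresholds the analysis actually adopts.

```python
def classify_class_i_affinity(ic50_nm: float) -> str:
    """Bin a Class I IC50 value (in nM) into a qualitative binding category.

    Thresholds are assumed conventional cut-offs, not values taken from IEDB output.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration in nM")
    if ic50_nm < 50:
        return "strong"
    if ic50_nm < 500:
        return "intermediate"
    if ic50_nm < 5000:
        return "weak"
    return "non-binder"
```

Lower IC50 means tighter binding, so the comparison runs from the strictest threshold down. These bins apply to Class I only; Class II measurements should not be pushed through the same function.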
Objective: Find epitopes from specific pathogens or antigens and their MHC restriction.
Tools:
iedb_search_epitopes -- search for experimentally validated epitopes
organism_name (source organism), source_antigen_name (protein name)
BVBRC_search_epitopes -- search pathogen-derived epitopes
query (pathogen or antigen keyword), optional host, limit
Workflow:
Important: IEDB epitopes are experimentally validated, not predicted. The absence of an epitope does not mean it won't bind -- it may simply be untested.
Population coverage for vaccine design: When selecting epitopes for a vaccine, check how common the restricting HLA allele is in the target population. An epitope restricted to HLA-A*02:01 covers ~50% of Europeans but <15% of some African populations. For broad population coverage, select epitopes across multiple HLA supertypes (A2, A3, B7, B44 cover >95% of most populations).
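The coverage reasoning above can be sketched as a short calculation. It assumes Hardy-Weinberg proportions within each locus and independence between loci (linkage disequilibrium is ignored), so it is a rough estimate rather than a substitute for a dedicated population-coverage tool. The function name is hypothetical, and allele frequencies must come from published data for the target population.

```python
from collections import defaultdict
from typing import Mapping


def population_coverage(allele_freqs: Mapping[str, float]) -> float:
    """Estimate the fraction of individuals carrying at least one listed allele.

    allele_freqs maps allele names (e.g. "HLA-A*02:01") to allele frequencies in
    the target population. Assumes Hardy-Weinberg proportions per locus and
    independence between loci.
    """
    per_locus = defaultdict(float)
    for allele, freq in allele_freqs.items():
        if not 0.0 <= freq <= 1.0:
            raise ValueError(f"Frequency for {allele} must be between 0 and 1")
        locus = allele.split("*", 1)[0]  # "HLA-A*02:01" -> "HLA-A"
        per_locus[locus] += freq

    uncovered = 1.0
    for locus, total in per_locus.items():
        if total > 1.0:
            raise ValueError(f"Allele frequencies at {locus} sum to more than 1")
        # A diploid individual is uncovered at this locus only if both
        # chromosomes carry an allele outside the selected set.
        uncovered *= (1.0 - total) ** 2
    return 1.0 - uncovered
```

For example, a single allele with a placeholder frequency of 0.3 gives 1 - 0.7² = 0.51, which shows how an allele frequency near 30% translates into roughly half of individuals carrying the allele. Adding alleles from other loci or supertypes shrinks the uncovered fraction multiplicatively.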
Objective: Get protein-level features for HLA molecules and related proteins.
Tools:
UniProt_search -- search for HLA protein entries
query (protein/gene name), optional organism, limit
Workflow:
Objective: Connect HLA findings to drug interactions and clinical evidence.
Tools:
DGIdb_get_drug_gene_interactions -- find drugs targeting HLA-pathway genes
genes (list of gene names, e.g., ["HLA-A", "B2M"])
PubMed_search_articles -- find clinical HLA studies
query (search term), optional limit
Workflow:
Well-known HLA-drug associations (for context, always verify with current data):
Structure the report as: