plugin/skills/tooluniverse-structural-proteomics/SKILL.md
Structural biology plus proteomics integration for drug target validation. Combines PDB experimental structures, AlphaFold predictions, GPCRdb, SAbDab antibody structures, ProteinsPlus binding-site prediction, and BindingDB ligand-affinity data. Use for druggability assessment, binding-site characterization, ligand-pocket analysis, structural-confidence scoring (resolution, pLDDT), and antibody-target interface analysis.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add mims-harvard/tooluniverse tooluniverse-structural-proteomics
Comprehensive structural data integration using ToolUniverse tools across PDB, AlphaFold, GPCRdb, SAbDab, and proteomics databases for drug target validation.
PDBeSIFTS_get_best_structures and RCSBGraphQL_get_structure_summary
alphafold_get_summary
PDBe_get_structure_ligands and BindingDB_get_ligands_by_uniprot
ProteinsPlus_predict_binding_sites
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do; execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Resolution determines valid conclusions: <2 Å = atom positions visible; 2-3 Å = side chains reliable, drug design supported; >3 Å = backbone only, binding site unreliable. Do not over-interpret low-resolution structures.
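The resolution bands above can be sketched as a small lookup. This is a minimal illustration, not part of ToolUniverse: the function name `interpret_resolution` is hypothetical, and treating exactly 2.0 and 3.0 as part of the middle band is an assumption, since the text does not say where the boundaries fall.

```python
def interpret_resolution(resolution_angstrom: float) -> str:
    """Return the conclusion a structure supports at a given resolution.

    Assumption: boundary values (exactly 2.0 and 3.0) are placed in the
    middle band; the source text does not specify this.
    """
    if resolution_angstrom < 2.0:
        return "atom positions visible"
    if resolution_angstrom <= 3.0:
        return "side chains reliable, drug design supported"
    return "backbone only, binding site unreliable"


if __name__ == "__main__":
    for value in (1.6, 2.4, 3.8):
        print(f"{value} A: {interpret_resolution(value)}")
```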
RCSBAdvSearch_search_structures (query_type, query_value, rows), RCSBData_get_entry (entry_id), RCSBGraphQL_get_structure_summary (pdb_id), RCSBGraphQL_get_ligand_info (pdb_id), RCSB_get_chemical_component (comp_id)
pdbe_get_entry_summary (pdb_id), PDBe_get_structure_ligands (pdb_id), PDBe_get_bound_molecules (pdb_id), PDBeSearch_search_structures (query, rows), PDBeSIFTS_get_best_structures (uniprot_id), PDBeSIFTS_get_all_structures (uniprot_id), PDBe_KB_get_ligand_sites (pdb_id), PDBe_KB_get_interface_residues (pdb_id), PDBeValidation_get_quality_scores (pdb_id)
PDBePISA_get_interfaces (pdb_id), PDBePISA_get_assemblies (pdb_id)
alphafold_get_prediction (qualifier=UniProt), alphafold_get_summary (qualifier), alphafold_get_annotations (qualifier)
ProteinsPlus_predict_binding_sites (pdb_id, chain), BindingDB_get_ligands_by_uniprot (uniprot_id), BindingDB_get_ligands_by_pdb (pdb_id), BindingDB_get_targets_by_compound (smiles)
Foldseek_search_structure (sequence, mode="tmalign"), Foldseek_get_result (ticket)
GPCRdb_get_protein (protein), GPCRdb_get_structures (protein), GPCRdb_get_ligands (protein), GPCRdb_get_mutations (protein). Accepts entry names, gene symbols (auto-converted to {symbol.lower()}_human), or UniProt accessions.
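The gene-symbol conversion rule can be illustrated directly. This sketch only reproduces the `{symbol.lower()}_human` pattern stated above; the helper name `gpcrdb_entry_name` is hypothetical, and the tools are described as doing this conversion themselves, so calling code would not normally need it.

```python
def gpcrdb_entry_name(symbol: str) -> str:
    """Build a GPCRdb-style entry name from a gene symbol.

    Mirrors the documented pattern {symbol.lower()}_human. Entry names and
    UniProt accessions are not handled here (hypothetical helper).
    """
    return f"{symbol.lower()}_human"


if __name__ == "__main__":
    print(gpcrdb_entry_name("ADRB2"))
```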
SAbDab_search_structures (query/antigen), SAbDab_get_structure (pdb_id), TheraSAbDab_search_therapeutics (query), TheraSAbDab_search_by_target (target)
InterPro_get_protein_domains (uniprot_id), Pfam_get_protein_annotations (uniprot_id), UniProt_get_entry_by_accession (accession)
ProteomeXchange_search_datasets (query), ProteomeXchange_get_dataset (dataset_id)
Phase 0: Resolve protein → UniProt ID, gene symbol, organism
Phase 1: PDBeSIFTS_get_best_structures → RCSBGraphQL_get_structure_summary → PDBeValidation
Phase 2: alphafold_get_prediction/summary → compare pLDDT with experimental coverage
Phase 3: IF GPCR → GPCRdb; IF antibody target → SAbDab/TheraSAbDab
Phase 4: InterPro/Pfam domain mapping → identify unresolved regions
Phase 5: Summary table (PDB ID, method, resolution, ligands, coverage, quality)
Decisions: Require resolution <2.5 Å for drug design. For binding sites, prefer X-ray > cryo-EM > NMR > AlphaFold, and holo over apo structures.
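One way to apply these decision rules to a retrieved structure list is a sort key. This is a sketch under stated assumptions: the record fields (`method`, `resolution`, `has_ligand`) and the placeholder IDs are hypothetical and do not reflect any tool's actual output schema, and the tie-break order (method, then holo over apo, then resolution) is one reasonable reading of the rules rather than a documented priority.

```python
# Method preference for binding-site work, as listed in the decisions.
METHOD_RANK = {"X-ray": 0, "Cryo-EM": 1, "NMR": 2, "AlphaFold": 3}


def rank_structures(structures):
    """Sort structure records from most to least preferred.

    Assumed order of criteria: method, holo before apo, lower resolution value.
    Records without a resolution (e.g. NMR, AlphaFold) sort last within a tie.
    """
    def sort_key(record):
        resolution = record.get("resolution")
        return (
            METHOD_RANK.get(record["method"], len(METHOD_RANK)),
            0 if record.get("has_ligand") else 1,
            resolution if resolution is not None else float("inf"),
        )
    return sorted(structures, key=sort_key)


def suitable_for_drug_design(record, cutoff=2.5):
    """Apply the <2.5 angstrom rule; records lacking a resolution fail it."""
    resolution = record.get("resolution")
    return resolution is not None and resolution < cutoff


if __name__ == "__main__":
    # Placeholder records with made-up IDs, for illustration only.
    candidates = [
        {"id": "CRYO1", "method": "Cryo-EM", "resolution": 3.2, "has_ligand": True},
        {"id": "XRAY_APO", "method": "X-ray", "resolution": 1.9, "has_ligand": False},
        {"id": "XRAY_HOLO", "method": "X-ray", "resolution": 2.3, "has_ligand": True},
        {"id": "AF_MODEL", "method": "AlphaFold", "resolution": None, "has_ligand": False},
    ]
    for rec in rank_structures(candidates):
        print(rec["id"], suitable_for_drug_design(rec))
```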
Phase 1: PDBe_get_structure_ligands + RCSBGraphQL_get_ligand_info + PDBe_KB_get_ligand_sites
Phase 2: ProteinsPlus_predict_binding_sites → druggability score, pocket residues
Phase 3: BindingDB_get_ligands_by_pdb/uniprot → Ki, Kd, IC50
Phase 4: RCSB_get_chemical_component for key ligands
Filter artifacts: GOL, EDO, SO4, PEG, ACT, CL, NA. Keep cofactors (ATP, NAD, HEM) and catalytic metals (ZN, MG) if relevant.
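The artifact filter can be sketched as a set lookup on chemical component IDs. The artifact set is copied from the list above; the function name and the assumption that ligands arrive as a plain list of component-ID strings are illustrative only.

```python
# Crystallization and buffer artifacts named in the filter rule above.
ARTIFACT_COMP_IDS = {"GOL", "EDO", "SO4", "PEG", "ACT", "CL", "NA"}


def filter_ligands(comp_ids):
    """Drop known artifacts, keeping everything else in the original order.

    Cofactors (ATP, NAD, HEM) and catalytic metals (ZN, MG) pass through;
    deciding whether they are relevant is left to the analyst.
    """
    return [cid for cid in comp_ids if cid.upper() not in ARTIFACT_COMP_IDS]


if __name__ == "__main__":
    print(filter_ligands(["GOL", "ATP", "SO4", "ZN", "HEM", "CL"]))
```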
Phase 1: Find co-crystal structures → filter for drug/analogs
Phase 2: BindingDB affinity data (Ki, Kd, IC50)
Phase 3: ProteinsPlus + PDBe-KB binding site characterization
Phase 4: PDBeValidation quality → binding site well-resolved?
Phase 5: AlphaFold + Foldseek structural comparison
Phase 6: GPCR-specific (if applicable) → active/inactive states, pharmacology, resistance mutations
Phase 7: Antibody-specific (if applicable) → epitope mapping
Phase 8: Evidence integration
| Tool | Mistake | Correct |
|------|---------|---------|
| alphafold_get_prediction/summary | uniprot_id | qualifier |
| GPCRdb_get_protein | gene_name | protein |
| PDBeSIFTS_get_best_structures | gene symbol | uniprot_id (e.g., "P04637") |
| Foldseek_search_structure | mode="3diaa" | mode="tmalign" |
| SAbDab_search_structures | name | query or antigen |
| RCSB_get_chemical_component | ligand_id | comp_id |
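The parameter-name rows of this table could be enforced with a small rename step before a call is made. This is a hypothetical helper, not a ToolUniverse feature: it covers only the rows that are pure renames, and for SAbDab_search_structures it picks `query` where the table allows either `query` or `antigen`.

```python
# Wrong -> correct parameter names, taken from the rename rows of the table.
PARAM_RENAMES = {
    "alphafold_get_prediction": {"uniprot_id": "qualifier"},
    "alphafold_get_summary": {"uniprot_id": "qualifier"},
    "GPCRdb_get_protein": {"gene_name": "protein"},
    "SAbDab_search_structures": {"name": "query"},  # assumption: query, not antigen
    "RCSB_get_chemical_component": {"ligand_id": "comp_id"},
}


def fix_params(tool_name, params):
    """Return a copy of params with known wrong keys renamed for tool_name."""
    renames = PARAM_RENAMES.get(tool_name, {})
    return {renames.get(key, key): value for key, value in params.items()}


if __name__ == "__main__":
    print(fix_params("alphafold_get_summary", {"uniprot_id": "P04637"}))
    print(fix_params("RCSB_get_chemical_component", {"ligand_id": "ATP"}))
```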
| Tier | Confidence |
|------|------------|
| T1 | Co-crystal (<2.5 Å) + binding affinity data |
| T2 | Experimental structure + computational prediction |
| T3 | AlphaFold + pocket analysis + known ligand analogs |
| T4 | Homology model or low-resolution only |
| Metric | High | Acceptable | Caution |
|--------|------|-----------|---------|
| Resolution | <2.0 Å (X-ray) / <3.0 Å (cryo-EM) | 2.0-2.5 Å / 3.0-4.0 Å | >3.0 Å / >4.5 Å |
| R-free | <0.25 | 0.25-0.30 | >0.30 |
| AlphaFold pLDDT | >90 | 70-90 | <70 (disordered) |
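The R-free and pLDDT rows of this table translate into simple threshold checks. This is a minimal sketch with hypothetical function names; boundary values (0.25, 0.30, 70, 90) are assigned to the "Acceptable" band, matching the ranges as written. The resolution row is left out because its bands depend on the experimental method.

```python
def grade_r_free(r_free: float) -> str:
    """Grade an R-free value: <0.25 High, 0.25-0.30 Acceptable, >0.30 Caution."""
    if r_free < 0.25:
        return "High"
    if r_free <= 0.30:
        return "Acceptable"
    return "Caution"


def grade_plddt(plddt: float) -> str:
    """Grade an AlphaFold pLDDT score: >90 High, 70-90 Acceptable, <70 Caution."""
    if plddt > 90:
        return "High"
    if plddt >= 70:
        return "Acceptable"
    return "Caution"


if __name__ == "__main__":
    print(grade_r_free(0.22), grade_r_free(0.28), grade_r_free(0.33))
    print(grade_plddt(94.1), grade_plddt(81.0), grade_plddt(55.5))
```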
DoGSiteScorer >0.6 = druggable; <0.4 = unlikely druggable. PISA assemblies should be cross-validated with SEC-MALS/native MS.
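The DoGSiteScorer cutoffs can be applied to pocket scores as follows. The function name is hypothetical, and the "borderline" label for scores between 0.4 and 0.6 is an assumption, since the text only defines the two outer bands.

```python
def classify_druggability(score: float) -> str:
    """Classify a DoGSiteScorer druggability score.

    >0.6 is druggable and <0.4 is unlikely druggable, per the cutoffs above.
    The label for 0.4-0.6 is an assumption; the source leaves it undefined.
    """
    if score > 0.6:
        return "druggable"
    if score < 0.4:
        return "unlikely druggable"
    return "borderline"


if __name__ == "__main__":
    for pocket_score in (0.81, 0.5, 0.32):
        print(pocket_score, classify_druggability(pocket_score))
```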
operation is internal, not a public parameter.