plugin/skills/tooluniverse-electron-microscopy/SKILL.md
Search and analyze electron microscopy data — cryo-EM density maps (EMDB), fitted atomic models (PDB), raw micrograph datasets (EMPIAR), and cryo-electron tomography volumes (CryoET Data Portal). Use for finding 3D structural data on a protein/complex, comparing experimental EM resolution to AlphaFold confidence, and accessing raw EM data for re-processing.
npx skillsauth add mims-harvard/tooluniverse tooluniverse-electron-microscopy
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Pipeline for discovering and analyzing electron microscopy data across the full resolution spectrum: from 3D density maps (EMDB) to fitted atomic models (PDB), raw micrograph datasets (EMPIAR), and cryo-electron tomography volumes (CryoET Data Portal). Connects EM data to structural biology context via PDB and AlphaFold.
Guiding principles:
EM resolution determines what you can see. TEM resolves individual protein complexes (~2 nm). Cryo-EM achieves near-atomic resolution (< 4 Å) for large complexes. SEM shows surface topology. Choose the right EM modality for the question.
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Typical triggers:
Not this skill: For X-ray crystallography or NMR structures, use PDB search tools directly. For protein structure prediction, use tooluniverse-protein-structure.
| Database | Content | Best For |
|----------|---------|----------|
| EMDB | 3D EM density maps (>40K entries) | Finding processed maps, resolution data, fitting info |
| EMPIAR | Raw micrograph/tilt series datasets | Accessing original image data for reprocessing |
| CryoET Data Portal | Cryo-electron tomography data | Tomographic volumes, cellular context, in-situ structures |
| PDB (RCSB) | Atomic models fitted to EM maps | Structural models derived from EM data |
| AlphaFold | AI-predicted protein structures | Complementary models when EM resolution is limited |
Phase 0: Query Parsing
Identify target protein/complex, method preference, resolution needs
|
Phase 1: Map & Image Search (EMDB)
Find EM density maps, resolution, method, sample details
|
Phase 2: Structure Fitting (EMDB + PDB)
Identify fitted atomic models, fitting quality
|
Phase 3: Raw Data Access (EMPIAR)
Find raw micrographs, tilt series, particle stacks
|
Phase 4: Tomography (CryoET Data Portal)
Search cryo-ET datasets, reconstructed volumes
|
Phase 5: Cross-Reference & Context (PDB + AlphaFold)
Connect to atomic models, predicted structures, literature
|
Phase 6: Report Synthesis
Integrated EM data landscape for the target
Identify from the user's request:
Objective: Find EM density maps matching the query.
Tools:
EMDB_search_structures -- search EMDB by keyword, organism, resolution
query (search term), optional resolution_min, resolution_max, method, limit
EMDB_get_structure -- get full details for an EMDB entry
emdb_id (e.g., "EMD-1234")
EMDB_get_map_info -- get map-specific info (resolution, contour, dimensions)
emdb_id
EMDB_get_sample_info -- get sample preparation details
emdb_id
Workflow:
Resolution interpretation:
8.0 Å: shape; overall architecture only
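The Phase 1 tools listed above take a small set of named parameters. A minimal sketch of how those calls might be packaged is given here. The `{"name": ..., "arguments": ...}` payload shape, the helper `build_tool_call`, the query string "ribosome", and the numeric value passed as `resolution_max` are all assumptions made for illustration; only the tool names, parameter names, and the "EMD-1234" example ID come from the tool list.

```python
from typing import Any, Dict


def build_tool_call(name: str, **arguments: Any) -> Dict[str, Any]:
    """Package a tool name with its arguments, dropping options left as None.

    Hypothetical helper: the payload shape is an assumption, not a
    confirmed ToolUniverse interface.
    """
    kept = {key: value for key, value in arguments.items() if value is not None}
    return {"name": name, "arguments": kept}


# Step 1: keyword search with an upper resolution bound (values are illustrative).
search_call = build_tool_call(
    "EMDB_search_structures",
    query="ribosome",
    resolution_max=4.0,
    method=None,  # left unset, so it is omitted from the payload
    limit=10,
)

# Step 2: for one hit, request entry details, map info, and sample info.
detail_calls = [
    build_tool_call(tool, emdb_id="EMD-1234")
    for tool in ("EMDB_get_structure", "EMDB_get_map_info", "EMDB_get_sample_info")
]

print(search_call)
for call in detail_calls:
    print(call)
```

Dropping unset options keeps the payload limited to the filters the user actually asked for, which makes it easier to see why a search returned what it did.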
Objective: Find atomic models fitted into EM maps and assess fitting quality.
Tools:
EMDB_get_validation -- get fitting/validation data for an EMDB entry
emdb_id
RCSBData_get_entry -- get PDB entry details
entry_id (PDB ID)
RCSBAdvSearch_search_structures -- advanced PDB search
query (search term), optional experimental_method, resolution_max, limit
Workflow:
Fitting quality indicators:
Objective: Locate raw micrograph data for potential reprocessing.
Tools:
EMPIAR_search_entries -- search EMPIAR archive
query (search term), optional limit
EMPIAR_get_entry -- get detailed entry information
empiar_id (e.g., "EMPIAR-10028")
Workflow:
Data types in EMPIAR:
Objective: Find cryo-electron tomography datasets for cellular and in-situ structural biology.
Tools:
CryoET_list_datasets -- search CryoET Data Portal
query (search term), optional organism, limit
CryoET_get_dataset -- get dataset details
dataset_id
CryoET_list_runs -- search individual tomography runs
dataset_id or query, optional limit
Workflow:
Tomography vs single particle: Tomography preserves cellular context (in situ) but typically achieves lower resolution. Single particle gives higher resolution but requires purified samples.
Objective: Connect EM data to broader structural biology context.
Tools:
alphafold_get_prediction -- get AlphaFold predicted structure
qualifier (UniProt accession)
PubMed_search_articles -- find publications describing the EM work
query (search term), optional limit
Workflow:
Don't just list maps — help the user choose the RIGHT map for their purpose.
Decision matrix: Which map should I use?
| Purpose | Best Resolution | Method | Priority Criteria |
|---------|----------------|--------|-------------------|
| Atomic model building | < 3.5 Å | Single particle | Highest resolution with fitted PDB model |
| Drug binding site analysis | < 3.0 Å | Single particle | Must resolve side chains in binding pocket |
| Domain architecture | 4-8 Å | Single particle or subtomogram avg | Large complexes where domains need fitting |
| Conformational states | < 4.5 Å | Single particle (multiple classes) | Look for entries with multiple maps from same dataset |
| Cellular context | 15-40 Å | Cryo-ET | Tomographic datasets showing in-situ arrangement |
| Reprocessing | Any | Any | Must have EMPIAR raw data; prefer recent datasets (better detectors) |
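The resolution column of the decision matrix can be applied mechanically as a first filter. The sketch below encodes only those thresholds; the function name `suitable_purposes` and the `has_empiar` flag are hypothetical, the range endpoints are treated as inclusive by assumption, and the Method and Priority Criteria columns are deliberately left to manual review.

```python
from typing import List


def suitable_purposes(resolution_a: float, has_empiar: bool = False) -> List[str]:
    """Return the decision-matrix purposes a map of this resolution (in Å) could serve.

    Only the "Best Resolution" column is checked; method and priority
    criteria still need to be confirmed per entry.
    """
    purposes = []
    if resolution_a < 3.0:
        purposes.append("Drug binding site analysis")
    if resolution_a < 3.5:
        purposes.append("Atomic model building")
    if resolution_a < 4.5:
        purposes.append("Conformational states")
    if 4.0 <= resolution_a <= 8.0:  # endpoints assumed inclusive
        purposes.append("Domain architecture")
    if 15.0 <= resolution_a <= 40.0:  # endpoints assumed inclusive
        purposes.append("Cellular context")
    if has_empiar:  # reprocessing depends on raw data, not on resolution
        purposes.append("Reprocessing")
    return purposes


for res in (2.8, 4.2, 6.0, 20.0):
    print(res, suitable_purposes(res))
```

A map can qualify for several purposes at once (for example, 4.2 Å falls in both the conformational-state and domain-architecture bands), so the function returns a list, not a single label.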
Quality assessment checklist:
Resolution trend analysis: If multiple maps of the same target have been deposited over time, note the resolution trajectory. An improvement from 6 Å (2015) to 2.8 Å (2023) suggests the sample is amenable to high-resolution single particle analysis with modern hardware.
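The trend check described above amounts to taking the best map per year and comparing the earliest and latest values. A minimal sketch follows, assuming each map has already been reduced to a `(year, resolution_in_Å)` pair; the function name `resolution_trajectory` is hypothetical, and the two 2019 entries in the sample data are made up for illustration (only the 2015 and 2023 values echo the example in the text).

```python
from typing import Dict, Iterable, List, Tuple


def resolution_trajectory(
    maps: Iterable[Tuple[int, float]],
) -> Tuple[List[Tuple[int, float]], float]:
    """Return (best resolution per year sorted by year, overall improvement in Å).

    Lower Å values mean better resolution, so "best" is the per-year minimum.
    A positive improvement means later maps are better resolved than earlier ones.
    """
    best_by_year: Dict[int, float] = {}
    for year, resolution in maps:
        previous = best_by_year.get(year, resolution)
        best_by_year[year] = min(previous, resolution)
    series = [(year, best_by_year[year]) for year in sorted(best_by_year)]
    if len(series) < 2:
        return series, 0.0
    improvement = series[0][1] - series[-1][1]
    return series, improvement


# Illustrative deposits: (release year, reported resolution in Å).
deposits = [(2015, 6.0), (2019, 4.3), (2019, 3.9), (2023, 2.8)]
series, improvement = resolution_trajectory(deposits)
print(series)
print(f"Improvement: {improvement:.1f} Å")
```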
Assemble findings into an actionable report:
| Pattern | Description | Key Phases |
|---------|-------------|------------|
| Structure Discovery | Find all EM data for a protein | 0, 1, 2, 5 |
| Reprocessing Prep | Find raw data for re-analysis | 0, 1, 3 |
| Tomography Survey | Explore in-situ structural data | 0, 4 |
| Resolution Comparison | Track resolution improvements over time | 0, 1, 2 |
| Map-Model Validation | Assess quality of fitted atomic models | 0, 1, 2, 5 |
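The pattern table can be read as a lookup from a usage pattern to the ordered phases it runs. The sketch below transcribes the phase names from the pipeline outline and the phase numbers from the table; the names `PHASES`, `PATTERNS`, and `plan_for` are hypothetical and exist only for this illustration.

```python
from typing import Dict, List

# Phase names as given in the pipeline outline.
PHASES: Dict[int, str] = {
    0: "Query Parsing",
    1: "Map & Image Search (EMDB)",
    2: "Structure Fitting (EMDB + PDB)",
    3: "Raw Data Access (EMPIAR)",
    4: "Tomography (CryoET Data Portal)",
    5: "Cross-Reference & Context (PDB + AlphaFold)",
    6: "Report Synthesis",
}

# Key phases per usage pattern, as listed in the table.
PATTERNS: Dict[str, List[int]] = {
    "Structure Discovery": [0, 1, 2, 5],
    "Reprocessing Prep": [0, 1, 3],
    "Tomography Survey": [0, 4],
    "Resolution Comparison": [0, 1, 2],
    "Map-Model Validation": [0, 1, 2, 5],
}


def plan_for(pattern: str) -> List[str]:
    """Return the ordered phase names for a usage pattern (KeyError if unknown)."""
    return [f"Phase {number}: {PHASES[number]}" for number in PATTERNS[pattern]]


for step in plan_for("Reprocessing Prep"):
    print(step)
```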
RCSBAdvSearch_search_structures with method filter