skills/tooluniverse-clinical-data-integration/SKILL.md
End-to-end drug safety review integrating FDA labels, FAERS adverse event reports, PRR/ROR disproportionality, pharmacogenomic biomarkers, clinical trial data, and published literature. Use for regulatory drug safety reviews, comprehensive pharmacovigilance reports, label-vs-real-world AE comparison, and clinical decision support for drug safety.
End-to-end drug safety review pipeline that integrates FDA label information, FAERS spontaneous reports, disproportionality signal detection, pharmacogenomic biomarkers, clinical trial data, and published literature. Designed for regulatory assessments, pharmacovigilance, and clinical decision support.
Guiding principles:
Clinical data integration starts with data harmonization. Different hospitals code the same diagnosis differently (ICD-10 vs SNOMED). Before merging datasets, verify the coding system. Missing data is informative — a missing lab value may mean the test wasn't ordered (patient was stable) not that the result was normal.
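The two checks above can be sketched as follows. This is a minimal illustration, assuming each record is a plain dict with hypothetical `code_system` and `lab_value` fields (these names are not from the skill itself), and the example codes are only illustrative.

```python
from collections import Counter

def coding_systems(records):
    """Count how many records use each coding system (e.g. ICD-10, SNOMED)."""
    return Counter(r.get("code_system", "UNKNOWN") for r in records)

def safe_to_merge(*datasets):
    """Return True only if every record across all datasets shares one coding system."""
    systems = set()
    for dataset in datasets:
        systems.update(coding_systems(dataset))
    return len(systems) == 1

def flag_missing(records, field):
    """Add an explicit '<field>_missing' flag so missingness is kept as a signal."""
    return [{**r, f"{field}_missing": r.get(field) is None} for r in records]

# Hypothetical records from two sites that code the same diagnosis differently.
hospital_a = [{"code_system": "ICD-10", "code": "I10", "lab_value": 5.1}]
hospital_b = [{"code_system": "SNOMED", "code": "38341003", "lab_value": None}]

print(safe_to_merge(hospital_a, hospital_b))  # False
print(flag_missing(hospital_b, "lab_value")[0]["lab_value_missing"])  # True
```

Keeping the missingness flag as its own column lets a later analysis treat "test not ordered" separately from any imputed value.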
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.
Differentiation: This skill emphasizes regulatory-grade data integration across the full drug lifecycle. For focused FAERS signal detection with quantitative scoring, see tooluniverse-adverse-event-detection. For general pharmacovigilance workflows, see tooluniverse-pharmacovigilance.
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Typical triggers:

| Source | Type | Best For |
|--------|------|----------|
| FDA Labels (DailyMed) | Regulatory | Approved safety information, boxed warnings, drug interactions |
| FAERS | Spontaneous reports | Post-market adverse event signals, demographic patterns |
| CPIC | Guidelines | Pharmacogenomic dosing recommendations |
| FDA PGx Biomarkers | Regulatory | Approved pharmacogenomic labeling |
| ClinicalTrials.gov | Trial registry | Ongoing/completed safety trials |
| PubMed | Literature | Published safety studies, case reports |

Phase 0: Drug Identity & Context
Resolve drug name, get class, mechanism, indications
|
Phase 1: FDA Label Extraction
Boxed warnings, contraindications, adverse reactions, interactions
|
Phase 2: FAERS Signal Detection
Top adverse events, disproportionality (PRR/ROR), demographics
|
Phase 3: Pharmacogenomics
CPIC guidelines, FDA PGx biomarkers, genotype-specific risks
|
Phase 4: Clinical Trials
Safety-focused trials, risk evaluation programs
|
Phase 5: Literature Evidence
PubMed safety studies, case reports, meta-analyses
|
Phase 6: Integrated Safety Report
Synthesize all sources into a cohesive safety profile
Objective: Unambiguously identify the drug and establish baseline context.
Tools:
- DailyMed_search_spls -- search Structured Product Labels
  - Parameters: query (drug name)
- OpenFDA_get_approval_history -- get approval dates and supplements
  - Parameters: drug_name (generic or brand name)

Workflow:
Tip: FAERS uses medicinalproduct which can be brand or generic. Try both forms in Phase 2.
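One way to follow this tip is to query each name form separately and compare the results. The sketch below assumes a hypothetical `fetch` callable that takes an arguments dict and returns the `[{term, count}]` list described for FAERS_count_reactions_by_drug_event; how the tool is actually invoked is left to the caller.

```python
def faers_counts_for_names(names, fetch):
    """Query FAERS once per name form and keep the results keyed by name.

    `fetch` is a hypothetical callable: arguments dict -> list of {term, count}.
    """
    # Deduplicate while preserving order, and skip empty names.
    unique_names = dict.fromkeys(n.strip() for n in names if n and n.strip())
    return {name: fetch({"medicinalproduct": name}) for name in unique_names}

# Stand-in for a real tool call, used only to make the sketch runnable.
def fake_fetch(arguments):
    canned = {
        "BRANDNAME": [{"term": "NAUSEA", "count": 12}],
        "genericname": [{"term": "NAUSEA", "count": 30}],
    }
    return canned.get(arguments["medicinalproduct"], [])

results = faers_counts_for_names(["BRANDNAME", "genericname", "BRANDNAME"], fake_fetch)
for name, rows in results.items():
    print(name, rows)
```

The per-name results are kept separate rather than summed, because the source does not say whether the brand and generic forms can match the same reports.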
Objective: Extract all safety-relevant sections from the FDA-approved label.
Tools:
- FDA_get_boxed_warning_info_by_drug_name -- boxed (black box) warnings
  - Parameters: drug_name
  - Returns: {error: {code: "NOT_FOUND"}} if none exists (normal)
- FDA_get_warnings_and_cautions_by_drug_name -- warnings and precautions section
  - Parameters: drug_name
- DailyMed_parse_adverse_reactions -- adverse reactions from label
  - Parameters: setid (NOT set_id; from Phase 0 DailyMed search)
- DailyMed_parse_drug_interactions -- drug interaction section
  - Parameters: setid (NOT set_id)

Workflow:
- A NOT_FOUND response for boxed warnings is normal and means no boxed warning exists.

Label section priority: Boxed Warning > Contraindications > Warnings/Precautions > Adverse Reactions > Drug Interactions
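The NOT_FOUND handling and the section priority can be sketched as below. This assumes the boxed-warning tool returns a dict shaped like `{error: {code: "NOT_FOUND"}}` when no warning exists, and that any other non-error response carries a warning; the helper names are hypothetical.

```python
# Section order taken from the label section priority stated above.
LABEL_PRIORITY = [
    "Boxed Warning",
    "Contraindications",
    "Warnings/Precautions",
    "Adverse Reactions",
    "Drug Interactions",
]

def has_boxed_warning(response):
    """Treat NOT_FOUND as 'no boxed warning'; raise on any other error."""
    error = response.get("error") if isinstance(response, dict) else None
    if isinstance(error, dict) and error.get("code") == "NOT_FOUND":
        return False
    if error:
        raise RuntimeError(f"Unexpected tool error: {error}")
    # Assumption: a non-error response means a boxed warning was returned.
    return True

def order_sections(findings):
    """Sort {section: text} pairs by label section priority; unknown sections go last."""
    rank = {name: i for i, name in enumerate(LABEL_PRIORITY)}
    return sorted(findings.items(), key=lambda kv: rank.get(kv[0], len(rank)))

print(has_boxed_warning({"error": {"code": "NOT_FOUND"}}))  # False
print([s for s, _ in order_sections({"Drug Interactions": "...", "Boxed Warning": "..."})])
```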
Objective: Identify post-market safety signals from spontaneous reports.
Tools:
- FAERS_count_reactions_by_drug_event -- top adverse events by frequency
  - Parameters: medicinalproduct (drug name, NOT drug_name)
  - Returns: [{term, count}]
- FAERS_calculate_disproportionality -- PRR, ROR, IC for drug-event pair
  - Parameters: drug_name, adverse_event
  - Returns: {metrics: {PRR: {value, ci_95_lower, ci_95_upper}, ROR: {...}, IC: {...}}, signal_detection: {signal_detected, signal_strength}}
- FAERS_filter_serious_events -- filter by seriousness type
  - Parameters: drug_name, seriousness_type (all/death/hospitalization/disability/life_threatening)
- FAERS_stratify_by_demographics -- age/sex/country stratification
  - Parameters: drug_name, adverse_event (optional), stratify_by (sex/age/country)

Workflow:

Important notes:
- FAERS_count_reactions_by_drug_event uses the medicinalproduct param, not drug_name.
- FAERS_calculate_disproportionality uses the drug_name param.

FAERS signal interpretation (what the numbers mean):

| Metric | Value | Interpretation |
|--------|-------|----------------|
| PRR (Proportional Reporting Ratio) | < 1.0 | Event reported LESS than expected (possible protective effect or underreporting) |
| | 1.0-2.0 | No signal or weak signal |
| | 2.0-5.0 | Moderate signal: warrants investigation |
| | > 5.0 | Strong signal: likely real association (but still not proof of causation) |
| ROR (Reporting Odds Ratio) | Similar to PRR but accounts for all other drugs | Same thresholds as PRR; slightly more robust |
| IC (Information Component) | < 0 | No signal |
| | 0-2 | Weak signal |
| | > 2 | Strong signal |

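
For readers who want to see where PRR and ROR come from, here is a minimal sketch using the standard 2x2 contingency-table formulas with log-scale 95% confidence intervals. It is not the implementation behind FAERS_calculate_disproportionality, the counts are made up, and assigning the shared boundary values (2.0 and 5.0) to the "moderate" band is an assumption, since the table's ranges touch.

```python
import math

def disproportionality(a, b, c, d):
    """PRR and ROR with 95% CIs from a 2x2 table of report counts.

    a: target drug, target event     b: target drug, other events
    c: other drugs, target event     d: other drugs, other events
    Returns {"PRR": (value, lower, upper), "ROR": (value, lower, upper)}.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this sketch")
    prr = (a / (a + b)) / (c / (c + d))
    ror = (a * d) / (b * c)
    # Standard errors of ln(PRR) and ln(ROR).
    se_prr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    se_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    def ci(estimate, se):
        return (estimate * math.exp(-1.96 * se), estimate * math.exp(1.96 * se))

    return {"PRR": (prr, *ci(prr, se_prr)), "ROR": (ror, *ci(ror, se_ror))}

def interpret_prr(value):
    """Map a PRR (or ROR) value onto the bands in the table above."""
    if value < 1.0:
        return "reported less than expected"
    if value < 2.0:
        return "no signal or weak signal"
    if value <= 5.0:
        return "moderate signal"
    return "strong signal"

# Made-up counts, for illustration only.
result = disproportionality(a=20, b=980, c=100, d=98900)
print(result["PRR"], interpret_prr(result["PRR"][0]))
```

A lower confidence bound above 1 is a common extra check before treating a band as meaningful, which is why the interval is returned alongside the point estimate.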
Signal ≠ Causation: A strong FAERS signal means the drug-event pair is reported more often than expected. This could be due to:
How to assess signal credibility:
Objective: Identify genetic factors that modify drug safety.
Tools:
- CPIC_list_guidelines -- get CPIC pharmacogenomic guidelines
  - Parameters: gene, drug filters
- fda_pharmacogenomic_biomarkers -- FDA-approved PGx biomarkers
  - Parameters: drug_name, biomarker, limit (default 10; use limit=1000 for comprehensive results)
  - Returns: {count, shown, results} with biomarker, drug, therapeutic area

Workflow:
Tip: Use limit=1000 with fda_pharmacogenomic_biomarkers to avoid missing entries (default limit is only 10).
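A small guard can catch truncated results. The sketch below assumes the `{count, shown, results}` response shape described above, with `shown` smaller than `count` when entries were cut off; the helper names are hypothetical.

```python
def pgx_arguments(drug_name=None, biomarker=None, limit=1000):
    """Build arguments for fda_pharmacogenomic_biomarkers with a high default limit."""
    arguments = {"limit": limit}
    if drug_name:
        arguments["drug_name"] = drug_name
    if biomarker:
        arguments["biomarker"] = biomarker
    return arguments

def is_truncated(response):
    """True when fewer entries were shown than matched (assumed meaning of shown/count)."""
    return response.get("shown", 0) < response.get("count", 0)

print(pgx_arguments(drug_name="warfarin"))
print(is_truncated({"count": 25, "shown": 10, "results": []}))  # True
```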
Objective: Find ongoing or completed trials studying drug safety.
Tools:
- search_clinical_trials -- search ClinicalTrials.gov
  - Parameters: query_term (required), optional condition, intervention, pageSize
  - Returns: {studies, nextPageToken, total_count} or string if no results

Workflow:
Query tip: Simple queries work best. Complex multi-word queries often return no results. Search "[drug name]" first, then filter by safety-related keywords in the results.
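The "search simply, then filter" approach might look like the sketch below. It assumes the response is either a dict with a `studies` list or a plain string when nothing matched, as described above. Because the per-study fields are not documented here, the filter searches the serialized study text instead of specific keys, and the keyword list is illustrative.

```python
import json

# Illustrative safety-related keywords; adjust to the review at hand.
SAFETY_KEYWORDS = ("safety", "adverse", "toxicity", "tolerability")

def normalize_trials(response):
    """Return a list of studies; a string response (no results) becomes an empty list."""
    if isinstance(response, str):
        return []
    return response.get("studies", [])

def filter_safety(studies, keywords=SAFETY_KEYWORDS):
    """Keep studies whose serialized content mentions any safety keyword."""
    hits = []
    for study in studies:
        text = json.dumps(study).lower()
        if any(keyword in text for keyword in keywords):
            hits.append(study)
    return hits

# Hypothetical response with made-up study records.
response = {
    "studies": [
        {"title": "Long-term Safety of Drug X"},
        {"title": "Pharmacokinetics of Drug X in Healthy Volunteers"},
    ],
    "total_count": 2,
}
print(filter_safety(normalize_trials(response)))
print(normalize_trials("No studies found"))  # []
```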
Objective: Find published safety studies, case reports, and meta-analyses.
Tools:
- PubMed_search_articles -- search biomedical literature
  - Parameters: query (search term), optional limit
  - Returns: {articles: [...]}

Workflow:
Synthesize all phases into a cohesive report:
Evidence grading:

| Pattern | Description | Key Phases |
|---------|-------------|------------|
| Full Safety Review | Comprehensive regulatory-style review | All (0-6) |
| Label vs Real-World | Compare FDA label to FAERS signals | 0, 1, 2, 6 |
| PGx Safety Assessment | Focus on pharmacogenomic risk factors | 0, 1, 3, 5 |
| Signal Investigation | Deep-dive into a specific adverse event | 0, 1, 2, 5, 6 |
| Drug Comparison | Head-to-head safety comparison of two drugs | Run phases 0-2 for each, compare in Phase 6 |

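
The pattern table can be encoded as a simple lookup so a run lists only the phases it needs. This is a sketch; the `plan` helper is hypothetical, and Drug Comparison is represented as phases 0-2 (run once per drug) followed by Phase 6.

```python
# Phase names as given in the workflow overview.
PHASES = {
    0: "Drug Identity & Context",
    1: "FDA Label Extraction",
    2: "FAERS Signal Detection",
    3: "Pharmacogenomics",
    4: "Clinical Trials",
    5: "Literature Evidence",
    6: "Integrated Safety Report",
}

# Key phases per usage pattern, copied from the table above.
PATTERNS = {
    "Full Safety Review": [0, 1, 2, 3, 4, 5, 6],
    "Label vs Real-World": [0, 1, 2, 6],
    "PGx Safety Assessment": [0, 1, 3, 5],
    "Signal Investigation": [0, 1, 2, 5, 6],
    "Drug Comparison": [0, 1, 2, 6],  # phases 0-2 once per drug, compared in Phase 6
}

def plan(pattern):
    """Return the ordered phase labels for a named usage pattern."""
    return [f"Phase {n}: {PHASES[n]}" for n in PATTERNS[pattern]]

for step in plan("Label vs Real-World"):
    print(step)
```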