skills/tooluniverse-literature-deep-research/SKILL.md
Deep literature review — PubMed, EuropePMC, bioRxiv preprints, citation networks, evidence synthesis. Disambiguates queries, runs collision-aware searches, grades evidence T1-T4, and produces structured reports. Use for systematic literature review, meta-analysis evidence collection, and detailed answer-with-citations workflows.
npx skillsauth add mims-harvard/tooluniverse tooluniverse-literature-deep-research
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Systematic literature research: disambiguate, search with collision-aware queries, grade evidence, produce structured reports.
KEY PRINCIPLES: (1) Disambiguate first (2) Right-size deliverable (3) Grade every claim T1-T4 (4) All sections mandatory even if "limited evidence" (5) Source attribution for every claim (6) English-first queries, respond in user's language (7) Report = deliverable, not search log
Search PubMed/EuropePMC FIRST before reasoning. A published paper beats memory.
Factoid search strategy:
EuropePMC_search_articles(query="term1 term2 term3", limit=5)
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do; execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Phase 0: Clarify + Mode Select → Phase 1: Disambiguate + Profile → Phase 2: Literature Search → Phase 3: Report
| Mode | When | Deliverable |
|------|------|-------------|
| Factoid | Single concrete question | 1-page fact-check report + bibliography |
| Mini-review | Narrow topic | 1-3 page narrative |
| Full Deep-Research | Comprehensive overview | 15-section report + bibliography |
# [TOPIC]: Fact-check Report
## Question / ## Answer (with evidence rating) / ## Source(s) / ## Verification Notes / ## Limitations
| Pattern | Domain | Action |
|---------|--------|--------|
| Gene/protein symbol | Biological target | Full bio disambiguation |
| Drug name | Drug | Drug disambiguation (1.5) |
| Disease name | Disease | Disease disambiguation (1.6) |
| CS/ML topic | General academic | Skip bio tools, literature-only |
| Cross-domain | Interdisciplinary | Resolve each entity in its domain |
tooluniverse-target-research, tooluniverse-drug-research, tooluniverse-disease-research
Use this skill for literature synthesis. Use specialized skills for entity profiling. For max depth, run both.
UniProt_search → UniProt_get_entry_by_accession → UniProt_id_mapping
ensembl_lookup_gene → MyGene_get_gene_annotation
Check first 20 results. If >20% off-topic, build negative filter: NOT [collision1] NOT [collision2].
Gene family: "ADAR" NOT "ADAR2" NOT "ADARB1". Cross-domain: add context terms.
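The collision check above can be sketched in a few lines. This is only an illustration: the keyword-match test for "off-topic" is an assumption (the skill does not say how relevance is judged), the function names are hypothetical, and the sample titles are made-up placeholders rather than real papers.

```python
def off_topic_fraction(titles, context_terms):
    """Fraction of titles that mention none of the on-topic context terms."""
    if not titles:
        return 0.0
    terms = [t.lower() for t in context_terms]
    misses = sum(
        1 for title in titles
        if not any(term in title.lower() for term in terms)
    )
    return misses / len(titles)


def negative_filter(titles, context_terms, collisions, sample=20, threshold=0.20):
    """Return a NOT-clause string only when the first `sample` hits are too noisy."""
    if off_topic_fraction(titles[:sample], context_terms) <= threshold:
        return ""
    return " ".join(f'NOT "{c}"' for c in collisions)


# Placeholder titles for illustration only (not real search results).
sample_titles = [
    "RNA editing by ADAR (placeholder)",
    "ADAR and RNA editing in cells (placeholder)",
    "RNA editing overview (placeholder)",
    "ADAR2 substrate study (placeholder)",
    "ADARB1 expression study (placeholder)",
]
print(negative_filter(sample_titles, ["RNA editing"], ["ADAR2", "ADARB1"]))
# prints: NOT "ADAR2" NOT "ADARB1"
```

In practice the relevance judgment would likely come from reading the titles and abstracts rather than from substring matching; the threshold logic stays the same either way.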
InterPro_get_protein_domains, UniProt_get_ptm_processing_by_accession, HPA_get_subcellular_location,
GTEx_get_median_gene_expression, GO_get_annotations_for_gene, Reactome_map_uniprot_to_pathways,
STRING_get_protein_interactions, intact_get_interactions, OpenTargets_get_target_tractability_by_ensemblID
GPCR targets: delegate to tooluniverse-target-research.
Identity: OpenTargets_get_drug_chembId_by_generic_name, ChEMBL_get_drug, PubChem_get_CID_by_compound_name, drugbank_get_drug_basic_info_by_drug_name_or_id
Targets: ChEMBL_get_drug_mechanisms, OpenTargets_get_associated_targets_by_drug_chemblId, DGIdb_get_drug_gene_interactions
Safety: OpenTargets_get_drug_adverse_events_by_chemblId, OpenTargets_get_drug_indications_by_chemblId, search_clinical_trials
OpenTargets disease search → EFO/MONDO IDs
DisGeNET_get_disease_genes, DisGeNET_search_disease
CTD_get_disease_chemicals
Resolve both entities, then cross-reference via CTD_get_chemical_gene_interactions, CTD_get_chemical_diseases, OpenTargets drug-target/drug-disease tools. Intersect shared targets/pathways.
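Once both entities are resolved, the "intersect shared targets/pathways" step reduces to a set intersection. A minimal sketch, assuming each tool response has already been reduced to a list of gene symbols; the function name and the `GENE_*` symbols are hypothetical placeholders, not real tool output.

```python
def shared_entities(drug_targets, disease_genes):
    """Return symbols present in both lists, case-normalized and sorted."""
    def normalize(symbols):
        return {s.strip().upper() for s in symbols if s.strip()}

    return sorted(normalize(drug_targets) & normalize(disease_genes))


# Placeholder symbols standing in for parsed tool responses.
drug_side = ["GENE_A", "gene_b", "GENE_C"]
disease_side = ["GENE_B", "GENE_C ", "GENE_D"]
print(shared_entities(drug_side, disease_side))
# prints: ['GENE_B', 'GENE_C']
```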
Non-bio: skip bio tools, use ArXiv/DBLP/OSF. Cross-domain: resolve bio entities with 1.1-1.3, search CS/general in parallel, merge and cross-reference.
Methodology stays internal. Report shows findings, not process.
Step 1: Seeds (15-30 core papers): domain-specific title searches with date/sort filters.
Step 2: Citation expansion: PubMed_get_cited_by, EuropePMC_get_citations/references, PubMed_get_related, SemanticScholar_get_recommendations, OpenCitations_get_citations
Step 3: Collision-filtered broader queries: "[TERM]" AND ([context]) NOT [collision]
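The Step 3 template can be filled in programmatically. The sketch below is one possible rendering: joining context terms with OR and quoting every term are assumptions, and `build_collision_query` is a hypothetical helper, not a ToolUniverse function.

```python
def build_collision_query(term, context_terms=(), collisions=()):
    """Render '"TERM" AND (context) NOT collision' as a single query string."""
    query = f'"{term}"'
    if context_terms:
        joined = " OR ".join(f'"{c}"' for c in context_terms)
        query += f" AND ({joined})"
    for collision in collisions:
        query += f' NOT "{collision}"'
    return query


print(build_collision_query("ADAR", ["RNA editing", "deaminase"], ["ADAR2", "ADARB1"]))
# prints: "ADAR" AND ("RNA editing" OR "deaminase") NOT "ADAR2" NOT "ADARB1"
```

Exact boolean syntax varies by index, so the string may need adjusting before it is passed to a given search tool.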
Run the core multi-field set on every review (catches what any single index misses), then add the domain rows that match the subject. Don't fire every source blindly — 6–10 well-chosen indexes beat 20 noisy ones.
ALWAYS run (core, all disciplines): PubMed_search_articles, EuropePMC_search_articles, SemanticScholar_search_papers, and one OpenAlex tool: either openalex_search_works (query param search/query) or openalex_literature_search (query param search_keywords). Pick one OpenAlex tool and match its param; mixing them silently returns off-topic results.
Then add by domain:
| Domain | Add these | Notes |
|--------|-----------|-------|
| Biomedical / clinical | PMC_search_papers (full text), PubTator3_LiteratureSearch (entity & relations: queries), PubMed_Guidelines_Search (clinical guidelines) | PubTator normalizes gene/drug/disease entities |
| Biology (ecology/evolution/plant) | EuropePMC as PRIMARY + OpenAlex | PubMed returns 0–1 for non-clinical biology |
| CS / ML / AI | ArXiv_search_papers, DBLP_search_publications | arXiv + CS bibliography |
| Physics / HEP / astro | InspireHEP_search_papers | 1.6M+ particle/astro records |
| Broad / hard-to-find / OA | Crossref_search_works, CORE_search_papers, DOAJ_search_articles, Fatcat_search_scholar | DOI registry + OA aggregators + Internet Archive Scholar |
| Regional / EU-funded | OpenAIRE_search_publications, HAL_search_archive | EU open science + French national archive |
| Datasets / software / outputs | Figshare_search_articles, Zenodo_search_records | Citable DOIs for data & code |
| Preprints (latest) | EuropePMC_search_articles(source='PPR'), OSF_search_preprints, BioRxiv_get_preprint/MedRxiv_get_preprint (DOI lookup) | bioRxiv/medRxiv/PsyArXiv etc. |
Multi-source: advanced_literature_search_agent (12+ DBs; needs Azure key -- fallback: query the core set individually).
Citation impact: iCite_search_publications (RCR/APT), iCite_get_publications (by PMID), scite_get_tallies (support/contradict). PubMed-only; for CS use SemanticScholar.
A domain-specific index returning 0 (e.g. ArXiv on a pure-clinical topic) is normal — only worry if the whole core set is empty.
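The source-selection rule (core set always, domain rows as needed, roughly 6 to 10 indexes in total) can be written as a small lookup. The tool names come from the table above; the domain keys, the `select_sources` helper, and the hard cap of 10 are illustrative assumptions, and only a subset of the rows is shown.

```python
# Core set: run on every review. One OpenAlex tool is chosen here (assumption).
CORE = [
    "PubMed_search_articles",
    "EuropePMC_search_articles",
    "openalex_search_works",
    "SemanticScholar_search_papers",
]

# Hypothetical domain keys mapped to tool names from the table.
DOMAIN_SOURCES = {
    "biomedical": ["PMC_search_papers", "PubTator3_LiteratureSearch", "PubMed_Guidelines_Search"],
    "cs_ml": ["ArXiv_search_papers", "DBLP_search_publications"],
    "physics": ["InspireHEP_search_papers"],
    "broad_oa": ["Crossref_search_works", "CORE_search_papers", "DOAJ_search_articles", "Fatcat_search_scholar"],
    "datasets": ["Figshare_search_articles", "Zenodo_search_records"],
}


def select_sources(domains, max_sources=10):
    """Core set first, then domain tools in order, without duplicates, up to the cap."""
    selected = list(CORE)
    for domain in domains:
        for tool in DOMAIN_SOURCES.get(domain, []):
            if tool not in selected and len(selected) < max_sources:
                selected.append(tool)
    return selected


print(select_sources(["cs_ml"]))
```

Listing domains in priority order matters with this design, because later domains are the ones dropped when the cap is reached.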
Full-text: see FULLTEXT_STRATEGY.md for three-tier strategy.
CRITICAL: PubMed returns 0 for ~30% of valid queries. Always retry with EuropePMC when PubMed returns empty. This is not optional.
Retry once -> fallback tool. Key fallbacks: PubMed_get_cited_by -> EuropePMC_get_citations -> OpenCitations. OA: Unpaywall if configured, else Europe PMC/PMC/OpenAlex flags.
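The retry-then-fallback rule can be sketched as a loop over an ordered list of tools. The callables below are stubs standing in for real tool calls, and treating an empty result as a reason to move to the next tool (rather than retrying the same one) is an interpretation of the guidance above, not something the skill specifies in code.

```python
def search_with_fallback(tools, query, retries=1):
    """Try each (name, callable) in order.

    An exception triggers up to `retries` extra attempts on the same tool;
    an empty result moves straight on to the next tool.
    """
    for name, call in tools:
        for _attempt in range(retries + 1):
            try:
                results = call(query)
            except Exception:
                continue  # retry the same tool, then fall through
            if results:
                return name, results
            break  # empty result: retrying the same query will not help
    return None, []


# Stub tools for illustration only.
def empty_pubmed(query):
    return []


def stub_europepmc(query):
    return [f"placeholder hit for {query}"]


print(search_with_fallback(
    [("PubMed_search_articles", empty_pubmed), ("EuropePMC_search_articles", stub_europepmc)],
    "example query",
))
# prints: ('EuropePMC_search_articles', ['placeholder hit for example query'])
```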
| Tier | Label | Bio Example | CS/ML Example |
|------|-------|-------------|---------------|
| T1 | Mechanistic | CRISPR KO + rescue, RCT | Formal proof, controlled ablation |
| T2 | Functional | siRNA knockdown phenotype | Benchmark with baselines |
| T3 | Association | GWAS, screen hit | Observational, case study |
| T4 | Mention | Review article | Survey, workshop abstract |
Inline: Target X regulates Y [T1: PMID:12345678]. Per theme: summarize evidence distribution.
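A small helper can keep inline tags and per-theme summaries consistent. The tier labels and the `[T1: PMID:12345678]` format come from the text above; the function names and the claim dictionaries are hypothetical.

```python
from collections import Counter

TIER_LABELS = {
    "T1": "Mechanistic",
    "T2": "Functional",
    "T3": "Association",
    "T4": "Mention",
}


def inline_citation(tier, pmid):
    """Format an inline evidence tag such as [T1: PMID:12345678]."""
    if tier not in TIER_LABELS:
        raise ValueError(f"unknown tier: {tier}")
    return f"[{tier}: PMID:{pmid}]"


def tier_distribution(claims):
    """Summarize how many claims in a theme fall into each tier."""
    counts = Counter(claim["tier"] for claim in claims)
    return ", ".join(
        f"{tier} {label}: {counts.get(tier, 0)}"
        for tier, label in TIER_LABELS.items()
    )


theme_claims = [{"tier": "T1"}, {"tier": "T3"}, {"tier": "T3"}]
print(inline_citation("T1", "12345678"))  # prints: [T1: PMID:12345678]
print(tier_distribution(theme_claims))
# prints: T1 Mechanistic: 1, T2 Functional: 0, T3 Association: 2, T4 Mention: 0
```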
| File | Mode |
|------|------|
| [topic]_report.md | Full |
| [topic]_factcheck_report.md | Factoid |
| [topic]_bibliography.json + .csv | All |
Progressive update: create report with all section headers immediately. Fill after each phase. Write Executive Summary LAST.
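The progressive-update pattern can be sketched as "write every header first, then replace placeholders". The example uses the five fact-check headings listed earlier because the 15-section list lives in REPORT_TEMPLATE.md and is not reproduced here; the placeholder text and both function names are assumptions.

```python
FACTOID_SECTIONS = [
    "Question",
    "Answer (with evidence rating)",
    "Source(s)",
    "Verification Notes",
    "Limitations",
]
PLACEHOLDER = "_Pending._"  # hypothetical marker for unfilled sections


def report_skeleton(topic, sections=FACTOID_SECTIONS):
    """Create the report with every section header present from the start."""
    lines = [f"# {topic}: Fact-check Report", ""]
    for section in sections:
        lines += [f"## {section}", "", PLACEHOLDER, ""]
    return "\n".join(lines)


def fill_section(report, section, body):
    """Replace one section's placeholder with its content."""
    marker = f"## {section}\n\n{PLACEHOLDER}"
    if marker not in report:
        raise KeyError(f"section missing or already filled: {section}")
    return report.replace(marker, f"## {section}\n\n{body}", 1)


draft = report_skeleton("ADAR")
draft = fill_section(draft, "Limitations", "Limited evidence.")
print(draft)
```

Because every header exists from the first write, a section with nothing to report still appears in the deliverable and can be filled with "limited evidence" rather than silently dropped.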
Use 15-section template from REPORT_TEMPLATE.md. Domain adaptations: bio (architecture/expression/GO/disease), drug (properties/MOA/PK/safety), disease (epi/patho/genes/treatments), general (history/theories/evidence/applications).
Brief progress updates only: "Resolving identifiers...", "Building paper set...", "Grading evidence..." Do NOT expose: raw tool outputs, dedup counts, search round details.
TOOL_NAMES_REFERENCE.md -- 123 tools with parameters
REPORT_TEMPLATE.md -- template, domain adaptations, bibliography, completeness checklist
FULLTEXT_STRATEGY.md -- three-tier full-text verification
WORKFLOW.md -- compact cheat-sheet
EXAMPLES.md -- worked examples