Survey of LLM agents for biomedical scientific discovery
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
`npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research llm-scientific-discovery-guide`
A curated survey of how LLM-based agents are being applied to scientific discovery, with a focus on biomedical research. Covers hypothesis generation, experiment design, lab automation, literature synthesis, and multi-agent scientific collaboration. Tracks papers, tools, and frameworks across the spectrum from fully autonomous to human-in-the-loop systems.
```
LLM Agents for Scientific Discovery
├── Hypothesis Generation
│   ├── Literature-based (gap identification)
│   ├── Data-driven (pattern discovery)
│   └── Analogy-based (cross-domain transfer)
├── Experiment Design
│   ├── Protocol generation
│   ├── Parameter optimization
│   └── Control selection
├── Lab Automation
│   ├── Robot control (self-driving labs)
│   ├── Equipment programming
│   └── Data collection orchestration
├── Analysis & Interpretation
│   ├── Statistical analysis
│   ├── Visualization
│   └── Result interpretation
└── Communication
    ├── Paper writing
    ├── Presentation generation
    └── Peer review simulation
```
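
If the taxonomy above needs to be used programmatically (for example, to tag papers or tools by stage), one option is to hold it as a nested mapping. The sketch below is illustrative only: `TAXONOMY`, `leaf_paths`, and `find_stages` are hypothetical names introduced here, not part of any tool listed in this survey.

```python
# Illustrative sketch: the taxonomy tree as a plain nested mapping.
# TAXONOMY, leaf_paths and find_stages are hypothetical helper names.
TAXONOMY = {
    "Hypothesis Generation": [
        "Literature-based (gap identification)",
        "Data-driven (pattern discovery)",
        "Analogy-based (cross-domain transfer)",
    ],
    "Experiment Design": [
        "Protocol generation",
        "Parameter optimization",
        "Control selection",
    ],
    "Lab Automation": [
        "Robot control (self-driving labs)",
        "Equipment programming",
        "Data collection orchestration",
    ],
    "Analysis & Interpretation": [
        "Statistical analysis",
        "Visualization",
        "Result interpretation",
    ],
    "Communication": [
        "Paper writing",
        "Presentation generation",
        "Peer review simulation",
    ],
}


def leaf_paths(taxonomy):
    """Return 'Stage > Task' strings in the order the tree lists them."""
    return [f"{stage} > {task}" for stage, tasks in taxonomy.items() for task in tasks]


def find_stages(taxonomy, keyword):
    """Return the stages that have at least one task mentioning `keyword`."""
    keyword = keyword.lower()
    return [
        stage
        for stage, tasks in taxonomy.items()
        if any(keyword in task.lower() for task in tasks)
    ]


if __name__ == "__main__":
    for path in leaf_paths(TAXONOMY):
        print(path)
```

A flat list of `Stage > Task` strings is convenient as a label set, while `find_stages` gives a quick keyword lookup back to the top-level stage.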

| System | Domain | Capability |
|--------|--------|------------|
| AI Scientist | ML/AI | Full paper generation pipeline |
| ChemCrow | Chemistry | Tool-augmented chemical reasoning |
| Coscientist | Chemistry | Autonomous experiment execution |
| BioPlanner | Biology | Experiment protocol generation |
| MedAgent | Medicine | Clinical trial analysis |
| GenAgent | Genomics | Gene expression analysis |
| DrugAgent | Pharma | Drug interaction prediction |

```python
# LLM-based hypothesis generation pattern
from scientific_agent import HypothesisGenerator

generator = HypothesisGenerator(
    llm_provider="anthropic",
    knowledge_sources=["pubmed", "openalex"],
)

hypotheses = generator.generate(
    domain="oncology",
    context="Recent findings show that gut microbiome "
            "composition correlates with immunotherapy response",
    constraints=[
        "Must be testable in vitro",
        "Should involve specific bacterial species",
        "Must have measurable endpoints",
    ],
    num_hypotheses=5,
)

for h in hypotheses:
    print(f"\nHypothesis: {h.statement}")
    print(f"  Rationale: {h.rationale}")
    print(f"  Supporting evidence: {len(h.evidence)} papers")
    print(f"  Novelty score: {h.novelty_score:.2f}")
    print(f"  Feasibility: {h.feasibility}")
```
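
Once hypotheses come back, they usually need triage before anyone commits bench time. The following is a minimal sketch of one possible ranking step, assuming each hypothesis carries the fields printed above. The `Hypothesis` dataclass, the `"high"`/`"medium"`/`"low"` feasibility labels, and the `triage` function are assumptions made for this example, and the sample values are placeholders, not real findings.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """Hypothetical record mirroring the fields printed in the pattern above."""
    statement: str
    rationale: str
    evidence: list = field(default_factory=list)
    novelty_score: float = 0.0
    feasibility: str = "medium"  # assumed labels: "high", "medium", "low"


# Lower rank sorts first; unknown labels sort last.
FEASIBILITY_RANK = {"high": 0, "medium": 1, "low": 2}


def triage(hypotheses, min_evidence=2):
    """Drop weakly supported hypotheses, then sort by feasibility and novelty."""
    supported = [h for h in hypotheses if len(h.evidence) >= min_evidence]
    return sorted(
        supported,
        key=lambda h: (
            FEASIBILITY_RANK.get(h.feasibility, len(FEASIBILITY_RANK)),
            -h.novelty_score,
        ),
    )


# Placeholder inputs for illustration only.
candidates = [
    Hypothesis("placeholder A", "n/a", ["p1", "p2", "p3"], 0.40, "high"),
    Hypothesis("placeholder B", "n/a", ["p1"], 0.90, "high"),
    Hypothesis("placeholder C", "n/a", ["p1", "p2"], 0.80, "medium"),
    Hypothesis("placeholder D", "n/a", ["p1", "p2"], 0.70, "high"),
]

ranked = triage(candidates)

if __name__ == "__main__":
    for h in ranked:
        print(h.statement, h.feasibility, f"{h.novelty_score:.2f}")
```

Sorting on feasibility first and novelty second is a design choice for this sketch; a human-in-the-loop workflow might instead surface the full list and let a domain expert reorder it.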
```python
# Agent controlling automated experiments
from scientific_agent import LabAgent

agent = LabAgent(
    llm_provider="anthropic",
    equipment=["plate_reader", "liquid_handler", "incubator"],
    safety_constraints=["bsl2", "max_volume_1ml"],
)

# Design and run experiment
result = agent.run_experiment(
    objective="Determine IC50 of compound X against cell line Y",
    protocol_type="dose_response",
    parameters={
        "compound": "Compound_X",
        "cell_line": "HeLa",
        "concentrations": "serial_dilution",
        "replicates": 3,
        "readout": "cell_viability",
    },
)

print(f"IC50: {result.ic50:.2f} uM")
print(f"R-squared: {result.r_squared:.3f}")
result.plot_dose_response("dose_response.pdf")
```
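
To make the dose-response step concrete, here is a rough sketch of what sits underneath an IC50 readout: building a serial dilution, averaging replicates, and estimating the 50% point. It uses log-linear interpolation between the two bracketing doses, which is a deliberately simpler stand-in for the curve fit (typically a four-parameter logistic) that a real pipeline would most likely use. The function names are hypothetical and the viability numbers are synthetic placeholders.

```python
import math
from statistics import fmean


def serial_dilution(top_conc_um, fold, n_points):
    """Concentrations (uM) of a serial dilution, highest first."""
    return [top_conc_um / fold**i for i in range(n_points)]


def mean_per_dose(replicate_readings):
    """Average the replicate viability readings recorded at each dose."""
    return [fmean(readings) for readings in replicate_readings]


def ic50_by_interpolation(concs_um, viability_pct):
    """Estimate IC50 by interpolating on log10(concentration).

    Finds the first pair of adjacent doses whose viabilities bracket 50%
    and interpolates linearly between them. Returns None when the curve
    never crosses 50%.
    """
    pairs = sorted(zip(concs_um, viability_pct))  # ascending concentration
    for (c_lo, v_lo), (c_hi, v_hi) in zip(pairs, pairs[1:]):
        if (v_lo - 50.0) * (v_hi - 50.0) <= 0 and v_lo != v_hi:
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10**log_ic50
    return None


# Synthetic placeholder data: 4 doses, 3 replicates each (percent viability).
concs = serial_dilution(top_conc_um=100.0, fold=10, n_points=4)
replicates = [
    [4.0, 5.0, 6.0],     # 100 uM
    [19.0, 20.0, 21.0],  # 10 uM
    [79.0, 80.0, 81.0],  # 1 uM
    [94.0, 95.0, 96.0],  # 0.1 uM
]
viability = mean_per_dose(replicates)
ic50 = ic50_by_interpolation(concs, viability)

if __name__ == "__main__":
    print(f"IC50 estimate: {ic50:.2f} uM")
```

Interpolation gives no goodness-of-fit figure, so it cannot reproduce the `r_squared` value shown above; that would require fitting an explicit model.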
```python
# Agents with different scientific roles
from scientific_agent import ScientificTeam

team = ScientificTeam(
    agents={
        "PI": {"role": "research_director",
               "expertise": "oncology"},
        "Experimentalist": {"role": "experiment_design",
                            "expertise": "cell_biology"},
        "Analyst": {"role": "data_analysis",
                    "expertise": "biostatistics"},
        "Writer": {"role": "manuscript_writing",
                   "expertise": "scientific_communication"},
    },
)

# Collaborative research cycle
project = team.start_project(
    title="Microbiome-immunotherapy interaction study",
    timeline_weeks=12,
)

# Agents collaborate: PI directs → Experimentalist designs →
# Analyst processes → Writer documents
```
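
The handoff order described above (PI, then Experimentalist, then Analyst, then Writer) can be sketched as a simple sequential pipeline. This is only an illustration of the control flow: `Project`, `make_stage`, and `run_cycle` are hypothetical names, and each stage is a stub that records what it did where a real system would call an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical shared state handed from one agent to the next."""
    title: str
    log: list = field(default_factory=list)


def make_stage(agent_name, action):
    """Build a stub stage; a real agent would call an LLM at this point."""
    def stage(project):
        project.log.append(f"{agent_name}: {action}")
        return project
    return stage


# Order follows the handoff described above.
PIPELINE = [
    make_stage("PI", "directs the research question"),
    make_stage("Experimentalist", "designs the experiment"),
    make_stage("Analyst", "processes the data"),
    make_stage("Writer", "documents the results"),
]


def run_cycle(project, pipeline=PIPELINE):
    """Pass the project through every stage in order and return it."""
    for stage in pipeline:
        project = stage(project)
    return project


finished = run_cycle(Project(title="Microbiome-immunotherapy interaction study"))

if __name__ == "__main__":
    for entry in finished.log:
        print(entry)
```

A strictly linear pipeline is the simplest arrangement; systems with feedback (for example, the Analyst sending a design back to the Experimentalist) would need a loop or a message queue instead of a fixed list.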
### Foundational Papers
1. "The AI Scientist" (Lu et al., 2024): fully automated ML research
2. "ChemCrow" (Bran et al., 2023): chemistry tool-use agent
3. "Coscientist" (Boiko et al., 2023): autonomous chemical research
4. "BioPlanner" (O'Donoghue et al., 2023): biology protocol generation
### Surveys
5. "Scientific Discovery in the Age of AI" (Wang et al., 2023)
6. "Foundation Models for Science" (Bommasani et al., 2022)
7. "LLM Agents: A Survey" (multiple, 2024)
### Ethics & Limitations
8. "Dual-use concerns of AI in biology" (Sandbrink, 2023)
9. "Can LLMs Generate Novel Research Ideas?" (Si et al., 2024)