Version Compatibility

Reference examples tested with: R RBesT 1.7+ (Novartis), OncoBayes2 0.8+ (Novartis), BOIN 2.7+, dfcrm 0.2-2+, escalation 0.1+, trialr 0.1.6+, bayesDP, psborrow2 (FDA-supported), rstan / cmdstanr, brms. Legacy: JAGS, WinBUGS.

Before using code patterns, verify installed versions match. If versions differ:

R: packageVersion('<pkg>') then ?function_name
Confirmatory regulatory work: validate against pinned package versions in submission

If code throws an error, introspect the installed package and adapt the example to match the actual API rather than retrying.
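
The version check can be scripted. The sketch below is a minimal example in base R; `check_pkg` is a hypothetical helper name, and the minimum versions are simply the ones listed in the "tested with" line, not requirements published by the package authors.

```r
# Minimum versions copied from the "tested with" list (not official requirements)
tested_min <- c(RBesT = "1.7", OncoBayes2 = "0.8", BOIN = "2.7",
                dfcrm = "0.2-2", escalation = "0.1", trialr = "0.1.6")

# Hypothetical helper: report the install status of one package
check_pkg <- function(pkg, min_version) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        return("not installed")
    }
    if (packageVersion(pkg) >= min_version) "ok" else "older than tested"
}

# Named character vector: one status per package
status <- mapply(check_pkg, names(tested_min), tested_min)
print(status)
```

Any package reported as "older than tested" is a prompt to read `?function_name` before reusing an example.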

Bayesian Clinical Trials

"Design a Bayesian clinical trial" -> Specify a prior, likelihood, and decision rule with frequentist operating characteristics demonstrated via simulation; for dose-finding use FDA-endorsed BOIN; for borrowing use robust MAP priors; for adaptive platforms use posterior probability of efficacy stopping with simulation-calibrated thresholds.

Regulatory Status -- The 2024-2026 Bayesian Pivot

FDA 2010 CDRH Bayesian Devices Guidance (Feb 5 2010): the only Bayesian-specific FDA guidance until January 2026. Why devices were ahead: CDRH's PMA pathway permits one pivotal trial and accepts borrowing from prior/OUS data more readily than CDER. Example: Edwards SAPIEN (PARTNER B, PMA P100041, Nov 2011) was approved on a single randomized pivotal trial (TAVR vs standard therapy in inoperable patients); the later SAPIEN 3 intermediate-risk PMA used a propensity-score comparison of a single-arm cohort against PARTNER IIA surgical controls -- illustrating CDRH's acceptance of non-randomized/borrowed comparisons.

FDA January 2026 CDER Bayesian Methodology Draft (FDA-2025-D-3217; comment period closed March 13 2026): first-ever drug-side Bayesian guidance. It states explicitly that Bayesian primary inference in pivotal trials is acceptable provided the sponsor supplies:

Prospective specification
Simulation-based operating characteristics (including frequentist Type-I error under null scenarios — agency still wants calibration)
Justified priors
Code/data sufficient for FDA replication

Project Optimus (FDA OCE, launched 2021; final dose-optimisation guidance Aug 2024): rewrites Phase I/II oncology by requiring randomised dose comparison before registration. Has made multi-arm randomised dose-finding (BOIN-12, gBOIN-ET) much more important than classic MTD-finding.

FDA BOIN Fit-for-Purpose qualification (December 2021): formal FDA endorsement of a specific phase I dose-finding design under the Drug Development Tools program, and the first for a dose-escalation design of this kind.

ICH E20 (Step 2b/3 draft June 2025; NOT final) treats Bayesian as a legitimate analytic framework but requires demonstration of acceptable frequentist operating characteristics (Type-I, power) over a pre-specified parameter space.

Algorithmic Taxonomy

| Method | Use case | Software | Strength | Fails when |
|--------|----------|----------|----------|------------|
| BOIN | Phase I MTD | R BOIN (Yuan) | FDA Fit-for-Purpose 2021; pre-tabulated decisions; no bedside Bayesian software | Statistically less efficient than CRM under correct skeleton |
| mTPI-2 / Keyboard | Phase I MTD | R escalation; R Keyboard | Default replacement for mTPI; fixes Ockham bias | Tabulated; transparency |
| CRM | Phase I MTD | R dfcrm, trialr | Most efficient under correct skeleton | Skeleton mis-specification biases MTD |
| EWOC | Phase I MTD | R ewoc | Explicit overdose-control constraint (P(dose>MTD) <= 0.25) | More conservative than CRM in small trials |
| BOIN-12 / gBOIN-ET | Phase 1b dose-optimisation (Project Optimus) | R BOIN extensions | Multi-arm randomised dose comparison | Requires explicit efficacy + toxicity scoring |
| MAP prior | Borrowing from historical control arms | R RBesT::gMAP | Industry-standard borrowing | Sample-size of MAP prior must be calibrated (Schmidli 2014) |
| Robust MAP | Borrowing with prior-data conflict protection | R RBesT::robustify | Adds vague component (weight 0.1-0.3) to detach if conflict | Mixture weight choice affects borrowing |
| EXNEX | Basket trial across rare-disease strata | R bhmbasket; OncoBayes2 | Avoids HM catastrophic borrowing; mixture 0.5/0.5 default (Neuenschwander 2016) | Default weights may over-borrow |
| Dixon-Simon shrinkage | Subgroup analysis | Custom Stan/brms | Honest about no qualitative interaction prior | Prior on tau drives results |
| Berry-Berry 3-level hierarchical | AE multiplicity (AE within PT within SOC) | R c212; JMP Clinical | Tames safety multiplicity | Spike-and-slab tuning matters |
| Posterior probability stopping | Adaptive sequential | Custom; FACTS commercial | Bayesian likelihood-principle compatible | Threshold calibration via simulation |
| Predictive probability of success | End-of-Phase-2 go/no-go | Custom Stan | Decision-theoretic; integrates over posterior | Requires Phase 3 design specified |
| Spiegelhalter skeptical/enthusiastic prior | Sensitivity for regulatory pivotals | Custom | Frames regulator-vs-sponsor evidence | Prior elicitation effort |
| Power prior | Pediatric extrapolation borrowing from adults | R bayesDP, psborrow2 | Partial borrowing with discount gamma | gamma choice (working convention 0.3-0.6; the FDA Bayesian Jan 2026 draft does not prescribe a specific range) |

Postdoc reading list:

FDA 2010 Guidance for Industry: Use of Bayesian Statistics in Medical Device Clinical Trials (Feb 5 2010)
FDA 2026 Draft Use of Bayesian Methodology in Clinical Trials (FDA-2025-D-3217, Jan 2026)
Berry SM, Carlin BP, Lee JJ, Müller P 2010 Bayesian Adaptive Methods for Clinical Trials (CRC)
Schmidli H, Gsteiger S, Roychoudhury S, O'Hagan A, Spiegelhalter D, Neuenschwander B 2014 Biometrics 70:1023 (MAP + robust MAP)
Weber S, Li Y, Seaman J, Kakizume T, Schmidli H 2021 J Stat Softw 100:19 (RBesT)
Neuenschwander B, Wandel S, Roychoudhury S, Bailey S 2016 Pharm Stat 15:123 (EXNEX)
Liu S, Yuan Y 2015 J R Stat Soc C 64:507 (BOIN)
O'Quigley J, Pepe M, Fisher L 1990 Biometrics 46:33 (CRM)
Babb J, Rogatko A, Zacks S 1998 Stat Med 17:1103 (EWOC)
Ji Y, Liu P, Li Y, Bekele BN 2010 Clin Trials 7:653 (mTPI)
Guo W, Wang SJ, Yang S, Lynn H, Ji Y 2017 Contemp Clin Trials 58:23 (mTPI-2 / Keyboard)
Berry SM, Broglio KR, Groshen S, Berry DA 2013 Clin Trials 10:720 (basket trial hierarchical)
Berry SM, Berry DA 2004 Biometrics 60:418 (three-level AE hierarchical)
Spiegelhalter DJ, Freedman LS, Parmar MKB 1994 JRSS-A 157:357 (skeptical/enthusiastic prior framework)
Rugo HS et al 2016 NEJM 375:23 (I-SPY 2 veliparib-carboplatin)
Angus DC et al 2020 JAMA (REMAP-CAP COVID rationale)

Decision Tree by Scenario

| Scenario | Recommended approach | Why |
|----------|---------------------|-----|
| Phase 1 oncology, single-agent MTD | BOIN with target DLT 30%; cohort size 3 | FDA Fit-for-Purpose 2021; tabulated escalation |
| Phase 1 oncology, combination (2 agents) | BLRM with EXNEX in OncoBayes2 | Multi-dimensional dose; industry standard at Novartis/Roche |
| Phase 1b/2 dose-optimisation (Project Optimus) | BOIN-12 or gBOIN-ET; randomised 2-dose comparison | Aug 2024 FDA dose-optimisation guidance |
| Phase 3 with historical control arms available | Robust MAP via RBesT; gMAP() + robustify() | Industry standard borrowing with prior-data conflict protection |
| Basket trial across rare-disease strata | EXNEX (0.5 EX / 0.5 NEX mixture) via OncoBayes2 | Avoids HM catastrophic borrowing |
| Pediatric extrapolation from adult data | Power prior with discount gamma 0.3-0.6 | working convention; the FDA Bayesian Jan 2026 draft does not prescribe a specific gamma range -- check the draft for the current language before quoting |
| Phase 3 trial with single arm + RWE comparator | Propensity-score-integrated power prior via psborrow2 | FDA-supported package for external controls |
| Adaptive trial wanting posterior-probability stopping | Custom Stan model + simulation-calibrated threshold | Bayesian likelihood-principle compatible; posterior inference is unaffected by repeated looks, but frequentist Type-I error still requires simulation calibration |
| End-of-Phase-2 go/no-go | Predictive probability of success in Phase 3 | Integrates posterior over Phase 3 design |
| Hypothesis-generating safety AE analysis (>100 PTs) | Berry-Berry 3-level hierarchical (AE within PT within SOC) | Tames multiplicity; spike-and-slab on log OR |
| Subgroup analysis post-signal | Bayesian shrinkage (Dixon-Simon, RBesT) | Hemmings-Koch 2019: shrinkage for replication planning, NOT signal generation |
| Regulatory pivotal sensitivity | Spiegelhalter skeptical-prior framework | Frames "evidence for regulators" vs "evidence for sponsor" |

Phase I Dose-Finding -- BOIN, CRM, mTPI-2

BOIN (FDA-preferred operational)

library(BOIN)

# Generate escalation table for protocol
boundary_table <- get.boundary(
    target = 0.30,           # target DLT rate
    ncohort = 10,            # 10 cohorts -> max 30 patients with size 3
    cohortsize = 3,
    n.earlystop = 12,        # stop the trial once 12 patients have been treated at the current dose
    p.saf = 0.6 * 0.30,      # highest DLT rate deemed subtherapeutic (package default: 0.6 * target)
    p.tox = 1.4 * 0.30       # lowest DLT rate deemed overly toxic (package default: 1.4 * target)
)
print(boundary_table)
# Pre-printed at investigator desk; no bedside Bayesian software

# Operating characteristics simulation
oc_boin <- get.oc(
    target = 0.30,
    p.true = c(0.05, 0.10, 0.20, 0.30, 0.40, 0.55),  # true DLT per dose
    ncohort = 10,
    cohortsize = 3,
    ntrial = 1000
)
print(oc_boin)
# Reports: MTD selection accuracy, overdose risk, average sample size

BOIN's transparency-over-modelling philosophy: unlike CRM, BOIN bases each escalation decision only on the DLT data observed at the current dose and does not pool information across dose levels through a dose-toxicity model. The Jin-Yuan vs Neuenschwander/Mozgunov debate (Stat Med, Pharm Stat, since ~2018) turns on this trade-off: BLRM/CRM are statistically more efficient when the model is correct; BOIN is operationally simpler and more transparent.
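
To show why the escalation table can be pre-printed, the sketch below reproduces the closed-form escalation and de-escalation boundaries from Liu and Yuan 2015 in base R. The function names `boin_boundaries` and `boin_decision` are hypothetical, and the dose-elimination rule that the BOIN package also applies is deliberately left out.

```r
# Closed-form BOIN boundaries (Liu and Yuan 2015); base R only
boin_boundaries <- function(target, p_saf = 0.6 * target, p_tox = 1.4 * target) {
    lambda_e <- log((1 - p_saf) / (1 - target)) /
        log(target * (1 - p_saf) / (p_saf * (1 - target)))
    lambda_d <- log((1 - target) / (1 - p_tox)) /
        log(p_tox * (1 - target) / (target * (1 - p_tox)))
    c(lambda_e = lambda_e, lambda_d = lambda_d)
}

# Decision uses only the observed DLT rate at the current dose
# (the package's safety elimination rule is omitted in this sketch)
boin_decision <- function(n_dlt, n_treated, bounds) {
    p_hat <- n_dlt / n_treated
    if (p_hat <= bounds[["lambda_e"]]) {
        "escalate"
    } else if (p_hat >= bounds[["lambda_d"]]) {
        "de-escalate"
    } else {
        "stay"
    }
}

bounds <- boin_boundaries(0.30)
print(round(bounds, 3))

# One cohort of three patients at the current dose
decisions <- sapply(0:3, boin_decision, n_treated = 3, bounds = bounds)
names(decisions) <- paste0(0:3, "/3 DLT")
print(decisions)
```

For a 30% target the boundaries come out near 0.236 and 0.358, so a cohort of three escalates on 0/3, stays on 1/3, and de-escalates on 2/3 or 3/3.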

CRM with calibrated skeleton

library(dfcrm)

prior_skeleton <- getprior(halfwidth = 0.05, target = 0.30, nu = 3, nlevel = 6,
                           model = 'logistic')   # calibrate under the same model used in crmsim()
# Lee-Cheung 2009 indifference-interval calibration

crm_sim <- crmsim(
    PI = c(0.05, 0.10, 0.20, 0.30, 0.40, 0.55),
    prior = prior_skeleton,
    target = 0.30,
    n = 30,
    x0 = 1,                  # starting dose
    nsim = 1000,
    method = 'bayes',
    model = 'logistic'
)
print(crm_sim)

Skeleton mis-specification is the canonical CRM failure mode. Lee-Cheung 2009 indifference-interval method gives a systematic calibration approach.
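
To make the role of the skeleton concrete, the sketch below performs a single CRM update by numerical integration in base R. It assumes the one-parameter power (empiric) model with a N(0, 1.34) prior on the model parameter; the skeleton and patient data are hypothetical, and for real trials `dfcrm::crm` should be used instead of this hand-rolled version.

```r
# Hypothetical skeleton and accrued data (9 patients, 3 per dose level)
skeleton   <- c(0.05, 0.12, 0.20, 0.30, 0.40, 0.50)
target     <- 0.30
dose_given <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
dlt        <- c(0, 0, 0, 0, 0, 0, 0, 1, 1)

# Unnormalised posterior of beta under the power model p_i = skeleton_i ^ exp(beta)
post_kernel <- function(beta) {
    sapply(beta, function(b) {
        p <- skeleton[dose_given]^exp(b)
        prod(p^dlt * (1 - p)^(1 - dlt)) * dnorm(b, mean = 0, sd = sqrt(1.34))
    })
}

# Posterior mean of beta by one-dimensional numerical integration
norm_const <- integrate(post_kernel, -10, 10)$value
beta_hat <- integrate(function(b) b * post_kernel(b), -10, 10)$value / norm_const

# Plug-in DLT estimates and the dose closest to the target
p_hat <- skeleton^exp(beta_hat)
next_dose <- which.min(abs(p_hat - target))
print(round(p_hat, 3))
print(next_dose)
```

Because every estimated DLT rate is the skeleton value raised to a common power, the skeleton fixes the shape of the curve and the data only stretch it, which is why a poorly chosen skeleton biases the recommended dose.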

EWOC (overdose control)

# Babb-Rogatko-Zacks 1998: explicit P(dose > MTD) <= alpha (default 0.25)
# Implementation: the `ewoc` package (dfcrm covers CRM / TITE-CRM, not EWOC)

MAP Priors and RBesT

Schmidli et al 2014 Biometrics 70:1023: Meta-Analytic-Predictive prior. Fit random-effects meta-analysis of historical control arms; derive predictive distribution for new control arm; use as informative prior. Effective sample size from history typically 20-80% of new control arm.

library(RBesT)

# Historical control data (4 prior studies)
historical_data <- data.frame(
    study = c('s1', 's2', 's3', 's4'),
    n = c(40, 35, 50, 45),
    r = c(8, 6, 12, 9)         # responders
)

# Fit MAP via gMAP (Stan-based random-effects meta-analysis)
map_prior <- gMAP(
    cbind(r, n - r) ~ 1 | study,
    data = historical_data,
    family = binomial,
    tau.dist = 'HalfNormal',
    tau.prior = 0.5,           # between-study SD prior
    beta.prior = cbind(0, 2)    # weakly informative on logit response
)
print(map_prior)

# Approximate posterior with mixture for downstream computation
map_mix <- automixfit(map_prior, Nc = 2)
print(map_mix)

# Effective sample size
ess(map_mix)

# Robust MAP: add vague mixture component (weight 0.1-0.3) to guard against prior-data conflict
robust_map <- robustify(map_mix, weight = 0.2, mean = 0.5, n = 1)
print(robust_map)
ess(robust_map)

Robust MAP rationale: if the new data disagree with the historical data (prior-data conflict), the posterior mixture weights shift toward the vague component, so the informative component is down-weighted automatically. The prior weight given to the vague component (the weight argument of robustify(), 0.2 above) is a tuning choice and should be varied in a pre-specified sensitivity analysis.
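
The automatic down-weighting can be illustrated without Stan. The sketch below is a base-R approximation of the mechanism, assuming a hypothetical two-component beta mixture prior (it is not the fitted `map_mix` object from above): posterior component weights are proportional to the prior weight times each component's beta-binomial marginal likelihood.

```r
# Hypothetical robust prior on a response rate:
# informative Beta(9, 31) with weight 0.8, vague Beta(1, 1) with weight 0.2
prior_w <- c(informative = 0.8, vague = 0.2)
shape_a <- c(9, 1)
shape_b <- c(31, 1)

# Posterior mixture weights after observing r responders in n patients
post_weights <- function(r, n) {
    log_ml <- lchoose(n, r) + lbeta(shape_a + r, shape_b + n - r) -
        lbeta(shape_a, shape_b)
    w <- prior_w * exp(log_ml)
    w / sum(w)
}

w_agree    <- post_weights(r = 9,  n = 40)   # new control rate close to history
w_conflict <- post_weights(r = 24, n = 40)   # new control rate far from history
print(round(rbind(agree = w_agree, conflict = w_conflict), 3))
```

When the new data agree with history the informative component gains weight; under conflict nearly all the weight moves to the vague component, which is the behaviour `robustify()` is meant to buy.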

EXNEX for Basket Trials

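Neuenschwander, Wandel, Roychoudhury, Bailey 2016 Pharm Stat 15:123: each basket's parameter is modelled as a mixture of an exchangeable component (shared mean and variance) and a non-exchangeable component (independent per-basket prior), typically weighted 0.5/0.5. This avoids the catastrophic over-borrowing of a fully hierarchical model (HM) when one basket truly differs from the rest.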

library(OncoBayes2)  # Novartis-developed; blrm_exnex() fits dose-toxicity BLRMs with EXNEX across strata

# Or, for basket-trial response rates, via bhmbasket
library(bhmbasket)

# Conceptual: each basket has its own posterior, with shrinkage governed by exchangeability mixture
# Default weights 0.5 EX / 0.5 NEX
# Sensitivity over weights (0.1, 0.3, 0.5, 0.7, 0.9) is essential

Bayesian Platform Trials

I-SPY 2 (Rugo et al 2016 NEJM 375:23)

Neoadjuvant breast cancer; 10 biomarker-defined subtypes × multiple arms; Bayesian response-adaptive randomisation (RAR); graduation criterion = posterior predictive probability of success in a 300-patient Phase 3 trial ≥ 0.85. Berry Consultants designed the statistical engine.

# Conceptual implementation requires custom Stan or FACTS (Berry Consultants commercial)

# Pseudocode:
# 1. Fit hierarchical model to platform data: response ~ arm + biomarker_subtype + arm:subtype
# 2. Posterior draws of treatment effect by subtype
# 3. For each draw, simulate Phase 3 trial: n=300, treatment vs control, observed effect
# 4. Compute proportion of draws meeting Phase 3 success criterion
# 5. If proportion >= 0.85, arm graduates
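
Steps 2 to 5 of the pseudocode can be sketched in base R. This is a simplified illustration, not the I-SPY 2 engine: it assumes hypothetical pCR counts for a single arm and subtype, replaces the hierarchical model of step 1 with independent Beta(1, 1) priors, and defines Phase 3 success as a one-sided two-proportion z-test at 0.025 with 1:1 randomisation.

```r
set.seed(2016)

# Hypothetical platform data for one experimental arm vs control (pCR counts)
y_trt <- 18; n_trt <- 40
y_ctl <- 9;  n_ctl <- 40

n_draws   <- 4000
n_per_arm <- 300 / 2      # 300-patient Phase 3, randomised 1:1 (assumption)

# Step 2: posterior draws of the pCR rates under Beta(1, 1) priors
p_trt <- rbeta(n_draws, 1 + y_trt, 1 + n_trt - y_trt)
p_ctl <- rbeta(n_draws, 1 + y_ctl, 1 + n_ctl - y_ctl)

# Step 3: simulate one future Phase 3 trial per posterior draw
x_trt <- rbinom(n_draws, n_per_arm, p_trt)
x_ctl <- rbinom(n_draws, n_per_arm, p_ctl)

# Step 4: success if the pooled two-proportion z statistic exceeds 1.96
pooled <- (x_trt + x_ctl) / (2 * n_per_arm)
z <- (x_trt - x_ctl) / n_per_arm / sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
success <- !is.na(z) & z > qnorm(0.975)

# Step 5: predictive probability of success and graduation decision
pp_success <- mean(success)
graduates  <- pp_success >= 0.85
print(c(pp_success = pp_success, graduates = graduates))
```

Because the simulation averages over posterior uncertainty as well as Phase 3 sampling noise, the predictive probability is lower than the power computed at the point estimate.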

REMAP-CAP (Angus 2020 JAMA)

Severe pneumonia, repurposed for COVID-19; Bayesian factorial multi-domain design. Generated corticosteroid signal independently of RECOVERY.

Drop-the-loser vs promising-the-winner

Adaptive arm-dropping (futility): posterior P(beating control) drops below threshold -> close. Mathematically straightforward.
"Promising-the-winner": selection bias. Bias-adjusted estimators (Robertson 2023; conditional MLE) standard in I-SPY 2 reports.

Hierarchical Models for Safety Multiplicity (Berry-Berry 2004)

Berry SM, Berry DA 2004 Biometrics 60:418: three-level hierarchical model for AE multiplicity (AE within MedDRA PT within SOC); spike-and-slab on the log OR. Tames the FDA-feared multiplicity in safety summaries.

library(c212)  # Berry-Berry implementation

# Conceptual: each AE has log OR drawn from spike-and-slab prior
# Spike at 0 (no effect); slab as N(mu_SOC, sigma_SOC)
# SOC-level parameters from N(mu_overall, sigma_overall)
# Borrowing within SOC; shrinkage toward 0 if no evidence

# JMP Clinical also implements this for industry use

Power Priors for Borrowing

library(bayesDP)
library(psborrow2)  # FDA-supported package

# Power prior: combines current data L(theta | D_current) with historical L(theta | D_hist)^gamma
# gamma in [0, 1]; gamma = 0 = no borrowing; gamma = 1 = full pooling

# Typical pediatric extrapolation: gamma = 0.3 to 0.6 (working convention; the FDA Bayesian Jan 2026 draft does not prescribe a specific range)
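
For a binary endpoint with a beta initial prior, the fixed-gamma power prior is conjugate, so the effect of the discount can be tabulated directly. The sketch below uses base R and hypothetical adult and pediatric counts; `power_prior_posterior` is an illustrative helper, not a function from bayesDP or psborrow2.

```r
# Hypothetical data: adult (historical) and pediatric (current) responders
r_hist <- 60; n_hist <- 200
r_cur  <- 12; n_cur  <- 30

# Beta(a0, b0) initial prior; historical likelihood raised to the power gamma
power_prior_posterior <- function(gamma, a0 = 1, b0 = 1) {
    a <- a0 + gamma * r_hist + r_cur
    b <- b0 + gamma * (n_hist - r_hist) + (n_cur - r_cur)
    c(gamma = gamma,
      borrowed_n = gamma * n_hist,          # historical patients effectively borrowed
      mean = a / (a + b),
      lower = qbeta(0.025, a, b),
      upper = qbeta(0.975, a, b))
}

# Sensitivity over the discount, from no borrowing to full pooling
gammas <- c(0, 0.3, 0.45, 0.6, 1)
sens <- t(sapply(gammas, power_prior_posterior))
print(round(sens, 3))
```

Reporting the whole table, rather than a single gamma, is what makes the borrowing choice auditable: the reader sees how far the posterior mean moves toward the adult rate and how much the interval narrows at each discount.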

External Control Arms and Real-World Evidence (RWE)

The 2024-2026 regulatory shift: FDA has materially expanded acceptance of external/historical/synthetic control arms in rare disease, paediatric, and accelerated-approval settings. Key documents: FDA 2018 RWE Framework (and 2024 enhancements), FDA 2023 Considerations for Use of RWE/RWD for Regulatory Decisions, EMA Reflection Paper on Use of RWE in Regulatory Decision-Making (effective 2024). Bayesian methods are the natural fit because historical data become prior information rather than concurrent control.

Methodology taxonomy

| Method | Borrowing mechanism | Discount control | When to use |
|--------|---------------------|------------------|-------------|
| Power prior (Ibrahim-Chen 2000) | Likelihood of historical data raised to power gamma | gamma in [0, 1] fixed or modelled | When historical data is single source; gamma ~ Beta in adaptive power prior |
| Robust MAP (Schmidli 2014) | Meta-analytic-predictive prior + vague mixture | Mixture weight (typ 0.1-0.3) | Multiple historical control arms; standard for borrowing |
| Commensurate prior (Hobbs 2011) | Conditional model on agreement parameter | Tau estimated from data | When agreement between historical and current is data-determined |
| Propensity-integrated power prior | Power prior weighted by PS overlap | gamma * (PS-trimmed overlap) | RWE comparator with covariate imbalance |
| Doubly robust ATT via causal inference | IPW + outcome regression | n/a | RWE comparator; identifies marginal ATT |

psborrow2 — the FDA-supported RWE framework

The psborrow2 package (Genentech / Bayer / FDA-Janssen collaboration; CRAN 2024+) is the canonical R implementation for propensity-score-integrated Bayesian Dynamic Borrowing. The skeleton below illustrates the workflow conceptually; verify exact function names and arguments against the current psborrow2 vignette before use (the package API has evolved through 2024-2026).

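library(psborrow2)

# Function names below follow the psborrow2 reference as best understood at the time of writing;
# confirm each with ?create_analysis_obj and the package vignette before use.

# Internal (trial) and external data; `...` stands for study-specific columns
# ext_data: usubjid, trt = 0 for every row, outcome, covariates
# int_data: usubjid, trt = 0 or 1, outcome, covariates
# Stack both sources into one numeric matrix `borrow_obj` holding an external flag `ext`
# (1 = external, 0 = internal) plus `trt`, `time`, `cens` and the covariates named below

# Borrowing design: Bayesian Dynamic Borrowing via a hierarchical commensurate prior
borrowing_design <- borrowing_hierarchical_commensurate(
    ext_flag_col = "ext",
    tau_prior = prior_gamma(0.001, 0.001)  # vague prior on the commensurability precision
)

# Outcome model: exponential time-to-event (the package also provides binary-outcome models)
outcome_model <- outcome_surv_exponential(
    time_var = "time",
    cens_var = "cens",
    baseline_prior = prior_normal(0, 100)
)

# Treatment effect and covariate adjustment are specified as separate objects
treatment <- treatment_details(trt_flag_col = "trt", trt_prior = prior_normal(0, 100))
covariate_adj <- add_covariates(
    covariates = c("age", "ecog", "baseline_severity"),
    priors = prior_normal(0, 100)
)

# Assemble the analysis object and sample
result <- create_analysis_obj(
    data_matrix = borrow_obj,
    outcome = outcome_model,
    borrowing = borrowing_design,
    treatment = treatment,
    covariates = covariate_adj
)
mcmc_result <- mcmc_sample(result, iter_warmup = 1000, iter_sampling = 4000, chains = 4)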

Operational rules (FDA 2024-2025 RWE practice)

Pre-specify the RWE source and document acquisition (registry, EHR, claims, RWD vendor)
Demonstrate comparability via propensity-score overlap (standardised mean differences <0.25 for key prognostic factors)
Apply discount priors: full pooling (gamma = 1) is unlikely to be accepted by regulators; typical discount gamma 0.3-0.6 (working convention)
Sensitivity over borrowing strength: report results at multiple gamma or mixture weights
Tipping-point analysis on prior-data agreement: at what discount does the conclusion flip?
E-value or bound for unmeasured confounding (VanderWeele-Ding 2017): commonly expected in FDA submissions; reports the minimum strength of unmeasured confounding that could explain away the result
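
The E-value in the last rule has a closed form on the risk-ratio scale. The sketch below implements the VanderWeele-Ding 2017 formula in base R; `e_value` and `e_value_limit` are hypothetical helper names, and applying the formula to hazard or odds ratios requires the approximations discussed in that paper, which are not shown here.

```r
# E-value for a risk ratio: RR + sqrt(RR * (RR - 1)), computed on the RR >= 1 scale
e_value <- function(rr) {
    rr <- ifelse(rr < 1, 1 / rr, rr)
    rr + sqrt(rr * (rr - 1))
}

# E-value for the interval limit closest to the null (1 if the interval covers RR = 1)
e_value_limit <- function(lower, upper) {
    if (lower <= 1 && upper >= 1) {
        return(1)
    }
    e_value(if (lower > 1) lower else upper)
}

# Hypothetical external-control comparison: RR 0.60 with 95% interval 0.40 to 0.80
ev_point <- e_value(0.60)
ev_limit <- e_value_limit(0.40, 0.80)
print(round(c(point = ev_point, limit = ev_limit), 2))
```

The limit E-value is the one to quote alongside the point E-value: it gives the confounder strength needed to move the interval, not just the estimate, to the null.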

When RWE is NOT acceptable

Trial sponsor and RWE source have meaningful incentive misalignment (e.g., RWE from non-disinterested source)
RWE captured before standard-of-care evolved (constancy violation, similar to NI biocreep)
Outcome definitions differ between RWE and current trial (variable harmonisation impossible)
Censoring patterns in RWE differ structurally from trial (administrative vs disease-driven)
Highly variable baseline characteristics impossible to balance via propensity weighting

Recent decisive cases (2022-2026)

Zynteglo (FDA 2022, ongoing post-market): beta-thalassemia gene therapy; single-arm trial vs natural history RWE comparator
Skysona (FDA 2022): cerebral adrenoleukodystrophy; RWE natural-history comparator
Multiple ultra-rare disease accelerated approvals 2024-2025: RWE/external control increasingly accepted in <100-patient trials

Spiegelhalter Skeptical/Enthusiastic Priors

Spiegelhalter, Freedman, Parmar 1994 JRSS-A 157:357: the trip-wire / skeptical-prior framework. Pre-specify a skeptical prior centred at the null and an enthusiastic prior centred at the alternative; stopping requires the skeptic to be convinced (posterior under skeptical prior exceeds threshold).

Frames "evidence for regulators" vs "evidence for sponsor" in Bayesian language; still cited in modern Bayesian-trial protocols.

# Skeptical prior: N(0, sd_sk) — centred at null
# Enthusiastic prior: N(delta_alt, sd_en) — centred at clinically meaningful effect
# Decision: stop for efficacy if P(theta > 0 | skeptical posterior) > 0.975
#           stop for futility if P(theta < delta_alt | enthusiastic posterior) > 0.80
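
With a normal approximation to the treatment-effect estimate, both posteriors are available in closed form. The sketch below is a base-R illustration with a hypothetical interim estimate; it assumes the common convention of scaling the skeptical prior so that it gives 5% probability to effects beyond delta_alt, and gives the enthusiastic prior the same standard deviation.

```r
delta_alt <- 0.40                      # hypothetical clinically meaningful effect (e.g. -log HR)
sd_sk <- delta_alt / qnorm(0.95)       # skeptic: P(theta > delta_alt) = 0.05 a priori
sd_en <- sd_sk                         # enthusiast: P(theta < 0) = 0.05 a priori

theta_hat <- 0.45                      # hypothetical interim estimate
se <- 0.15                             # and its standard error

# Conjugate normal-normal update
normal_posterior <- function(prior_mean, prior_sd, est, se) {
    prec <- 1 / prior_sd^2 + 1 / se^2
    list(mean = (prior_mean / prior_sd^2 + est / se^2) / prec,
         sd = sqrt(1 / prec))
}

post_sk <- normal_posterior(0, sd_sk, theta_hat, se)
post_en <- normal_posterior(delta_alt, sd_en, theta_hat, se)

# Decision rules from the comments above
p_eff <- 1 - pnorm(0, post_sk$mean, post_sk$sd)        # P(theta > 0 | skeptical posterior)
p_fut <- pnorm(delta_alt, post_en$mean, post_en$sd)    # P(theta < delta_alt | enthusiastic posterior)
stop_efficacy <- p_eff > 0.975
stop_futility <- p_fut > 0.80
print(round(c(p_eff = p_eff, p_fut = p_fut), 3))
```

The skeptical posterior mean is pulled from the estimate toward zero, so an efficacy stop under this rule means the data were strong enough to overcome that pull.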

Per-Method Failure Modes

CRM with mis-calibrated skeleton

Trigger: Default or arbitrary skeleton without indifference-interval calibration.
Mechanism: Skeleton dictates target dose; mis-calibration biases MTD.
Symptom: MTD selection differs systematically from clinical expectation.
Fix: Calibrate via Lee-Cheung 2009; or switch to BOIN.

MAP prior with prior-data conflict

Trigger: Historical control rate differs substantially from observed current control.
Mechanism: Informative MAP prior pulls toward historical; current data poorly fit.
Symptom: Posterior dominated by prior; current data evidence under-weighted.
Fix: Robust MAP with mixture weight 0.2-0.3; verify prior-data conflict via posterior predictive checks.

EXNEX with default 0.5/0.5 weights

Trigger: Default mixture weights without sensitivity.
Mechanism: 50% EX weight allows substantial borrowing even when basket differs.
Symptom: Detected differential basket "softened" by borrowing.
Fix: Sensitivity analysis over weights (0.1, 0.3, 0.5, 0.7, 0.9); report range.

Posterior probability stopping without simulation-calibrated threshold

Trigger: Stopping rule P(theta > 0 | data) > 0.975 applied without Type-I simulation.
Mechanism: Bayesian rule may not control frequentist Type-I in regulatory sense.
Symptom: FDA review flags lack of Type-I demonstration.
Fix: Simulate under null; calibrate threshold so frequentist Type-I = nominal.
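
The calibration step can be sketched for a simple case. The example below assumes a hypothetical single-arm binary design (null response rate 0.20, looks at 20, 40 and 60 patients, Beta(1, 1) prior) and searches a small grid of thresholds; a real protocol would use the actual design, more simulations, and a finer grid.

```r
set.seed(42)

p0     <- 0.20               # null response rate (hypothetical)
looks  <- c(20, 40, 60)      # cumulative sample size at each analysis
n_sim  <- 5000

# Simulate responders per stage under the null, then accumulate across looks
stage_n    <- diff(c(0, looks))
stage_resp <- sapply(stage_n, function(m) rbinom(n_sim, m, p0))
cum_resp   <- t(apply(stage_resp, 1, cumsum))            # n_sim x number of looks

# Posterior P(p > p0 | data) at each look under a Beta(1, 1) prior
n_mat <- matrix(looks, nrow = n_sim, ncol = length(looks), byrow = TRUE)
post_prob <- matrix(1 - pbeta(p0, 1 + cum_resp, 1 + n_mat - cum_resp), nrow = n_sim)

# A trial is a false positive if any look crosses the threshold
max_pp <- apply(post_prob, 1, max)
thresholds <- c(0.95, 0.975, 0.99, 0.995, 0.999)
type1 <- sapply(thresholds, function(thr) mean(max_pp > thr))

# Smallest grid threshold whose simulated Type-I error is at most 0.025
calibrated <- min(thresholds[type1 <= 0.025])
print(data.frame(threshold = thresholds, type1 = type1))
print(calibrated)
```

The simulated error at the naive threshold shows how far an uncalibrated multi-look rule can sit above the nominal level, which is the demonstration reviewers ask for.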

I-SPY 2 graduation criterion without bias correction

Trigger: Graduated arm's effect estimate reported uncorrected.
Mechanism: Selection on PP > 0.85 inflates estimate.
Symptom: Phase 3 confirmation finds smaller effect than platform suggested.
Fix: Bias-correction via conditional MLE or hierarchical Bayesian; cite Robertson 2023.

Bayesian shrinkage for signal discovery (Dane vs Hemmings)

Trigger: Hierarchical model fit during signal discovery rather than replication planning.
Mechanism: Shrinkage pre-emptively damps heterogeneity being searched for.
Symptom: Signal detected by causal forest gets shrunken to null in shrinkage analysis.
Fix: Hemmings-Koch 2019 position — shrinkage for replication planning, not signal generation; cite Dane et al 2019 EFSPI white paper + critique.

Power prior with gamma = 1 (full pooling)

Trigger: Full pooling of historical and current data.
Mechanism: Ignores between-study heterogeneity; biases estimate.
Symptom: Overconfident posterior; cross-validation reveals poor fit.
Fix: Working-convention discount gamma 0.3-0.6 (the FDA Bayesian Jan 2026 draft does not prescribe a specific range); sensitivity over gamma.

WinBUGS reproducibility

Trigger: Submission contains WinBUGS code without containerised environment.
Mechanism: Older Windows-only software; reproducibility fragile.
Symptom: Reviewer cannot replicate analysis.
Fix: Migrate to Stan (rstan/cmdstanr); Docker/renv-pinned environment; include seeds + posterior diagnostics (R-hat <1.01, ESS >1000 per chain).

Quantitative Thresholds

| Threshold | Source | Rationale |
|-----------|--------|-----------|
| FDA BOIN Fit-for-Purpose qualification (Dec 2021) | FDA Drug Development Tools program | Formal FDA endorsement of a phase I dose-finding design |
| Target DLT rate 30% (Phase 1 oncology) | Standard convention | Modal target across oncology Phase 1 |
| MAP prior effective sample size 20-80% of new control | Schmidli 2014 | Borrowing strength typical range |
| Robust MAP mixture weight 0.1-0.3 | Schmidli 2014 | Guards against prior-data conflict |
| EXNEX default 0.5 EX / 0.5 NEX | Neuenschwander 2016 | Standard starting weight; sensitivity required |
| I-SPY 2 graduation PP >= 0.85 | I-SPY 2 operational reports (Rugo/Park 2016) | Bayesian platform standard |
| Power prior gamma 0.3-0.6 for pediatric extrapolation | working convention; the FDA Bayesian Jan 2026 draft does not prescribe a specific range | Partial borrowing default |
| Stan R-hat <1.01, ESS >1000 per chain | R-hat: Vehtari 2021 Bayesian Analysis; ESS: working convention, stricter than the minimum suggested there | Posterior convergence criteria |
| EWOC overdose constraint P(dose > MTD) <= 0.25 | Babb-Rogatko-Zacks 1998 | Safety floor |

Common Errors

| Error / symptom | Cause | Solution |
|-----------------|-------|----------|
| CRM with arbitrary skeleton | No calibration | Lee-Cheung 2009 indifference-interval; or BOIN |
| MAP without prior-data conflict check | Posterior dominated by prior | Robust MAP; PP-check; sensitivity over mixture weight |
| EXNEX with single weight scheme | No sensitivity | Weights 0.1, 0.3, 0.5, 0.7, 0.9; report range |
| Posterior probability stopping without Type-I sim | Regulatory rejection | Simulate under null; calibrate threshold |
| I-SPY 2 graduated arm reported uncorrected | Selection bias | Conditional MLE; cite Robertson 2023 |
| Bayesian shrinkage for signal discovery | Hemmings-Koch critique | Shrinkage for replication only |
| Power prior gamma = 1 | Full pooling | Discount 0.3-0.6 (working convention; not prescribed by the FDA 2026 draft); sensitivity over gamma |
| WinBUGS without containerisation | Reproducibility | Stan + Docker/renv-pinned |
| BOIN vs CRM comparison without simulation OCs | Apples-to-oranges | Compare OCs over same true DLT rates |
| FDA cited for Bayesian drugs guidance pre-2026 | Confusion | FDA 2010 is DEVICES; FDA 2026 (draft) is drugs |

Anticipated Reviewer Pushback

| Pushback | Response |
|----------|----------|
| "Type-I error control?" | Simulation under null demonstrates frequentist Type-I = nominal at threshold chosen; documented in SAP appendix |
| "Prior justification?" | MAP from historical control arms via gMAP; robust mixture weight 0.2 for prior-data conflict; sensitivity over prior provided |
| "Why BOIN over CRM?" | BOIN Fit-for-Purpose qualified Dec 2021; pre-tabulated escalation; no bedside Bayesian software; OCs comparable to CRM in simulation |
| "EXNEX weight sensitivity?" | Reported over weights 0.1, 0.3, 0.5, 0.7, 0.9; results stable; primary at 0.5/0.5 per Neuenschwander 2016 |
| "Power prior gamma?" | Discount 0.5 (working convention; the FDA Bayesian Jan 2026 draft does not prescribe a specific range); sensitivity over 0.3-0.7 provided |
| "Posterior probability threshold?" | Calibrated via simulation to frequentist Type-I 0.025 one-sided; cite Berry 2010 |
| "Stan reproducibility?" | Docker container + renv-pinned R + Stan version; seeds provided; R-hat <1.01, ESS >1000 per chain |
| "Bias correction on platform graduation?" | Conditional MLE applied to estimate Phase 3 effect; cite Robertson 2023 |
| "Why not frequentist instead?" | Bayesian framework permits borrowing (rare disease, pediatric); the FDA Bayesian Jan 2026 draft accepts Bayesian primary inference with simulation calibration |

References

Babb J, Rogatko A, Zacks S. 1998. Cancer Phase I clinical trials: efficient dose escalation with overdose control. Stat Med 17:1103-1120.
Berry SM, Berry DA. 2004. Accounting for multiplicities in assessing drug safety: a three-level hierarchical mixture model. Biometrics 60:418-426.
Berry SM, Broglio KR, Groshen S, Berry DA. 2013. Bayesian hierarchical modeling of patient subpopulations: efficient designs of Phase II oncology clinical trials. Clin Trials 10:720-734.
Berry SM, Carlin BP, Lee JJ, Müller P. 2010. Bayesian Adaptive Methods for Clinical Trials. CRC.
Dane A, Spencer A, Rosenkranz G, Lipkovich I, Parke T. 2019. Subgroup analysis and interpretation for phase 3 confirmatory trials: EFSPI/PSI white paper. Pharm Stat 18:126-139.
FDA. 2010. Guidance for Industry: Use of Bayesian Statistics in Medical Device Clinical Trials.
FDA. 2021. BOIN Drug Development Tool Fit-for-Purpose Qualification.
FDA. 2026. Use of Bayesian Methodology in Clinical Trials. Draft Guidance (FDA-2025-D-3217).
Guo W, Wang SJ, Yang S, Lynn H, Ji Y. 2017. A Bayesian interval dose-finding design addressing Ockham's razor: mTPI-2. Contemp Clin Trials 58:23-33.
Hemmings R, Koch A. 2019. Commentary on Dane et al. Pharm Stat 18:140-144.
Ji Y, Liu P, Li Y, Bekele BN. 2010. A modified toxicity probability interval method for dose-finding trials. Clin Trials 7:653-663.
Liu S, Yuan Y. 2015. Bayesian optimal interval designs for phase I clinical trials. JRSS-C 64:507-523.
Neuenschwander B, Wandel S, Roychoudhury S, Bailey S. 2016. Robust exchangeability designs for early phase clinical trials with multiple strata. Pharm Stat 15:123-134.
O'Quigley J, Pepe M, Fisher L. 1990. Continual reassessment method: a practical design for phase 1 clinical trials in cancer. Biometrics 46:33-48.
Robertson DS, Lee KM, López-Kolkovska BC, Villar SS. 2023. Response-adaptive randomization in clinical trials: from myths to practical considerations. Stat Sci 38:185-208.
Rugo HS et al. 2016. Adaptive randomization of veliparib-carboplatin treatment in breast cancer. NEJM 375:23-34.
Schmidli H, Gsteiger S, Roychoudhury S, O'Hagan A, Spiegelhalter D, Neuenschwander B. 2014. Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics 70:1023-1032.
Spiegelhalter DJ, Freedman LS, Parmar MKB. 1994. Bayesian approaches to randomized trials. JRSS-A 157:357-387.
Vehtari A et al. 2021. Rank-normalization, folding, and localization: an improved R-hat for assessing convergence. Bayesian Analysis.
Weber S, Li Y, Seaman J, Kakizume T, Schmidli H. 2021. Applying meta-analytic-predictive priors with the R Bayesian evidence synthesis tools. J Stat Softw 100:19.

Related Skills

clinical-biostatistics/adaptive-designs - Group-sequential, SSR, platform trials
clinical-biostatistics/subgroup-analysis - Bayesian shrinkage for HTE (Dixon-Simon, Berry)
clinical-biostatistics/power-and-sample-size - Bayesian SS via predictive probability of success
clinical-biostatistics/multiplicity-graphical - Berry-Berry AE hierarchical
clinical-biostatistics/trial-reporting - Bayesian inference reporting per CONSORT 2025
clinical-biostatistics/missing-data-sensitivity - Bayesian rbmi imputation
machine-learning/biomarker-discovery - Bayesian HTE for biomarker subgroups
experimental-design/sample-size - General methods

Version Compatibility

Before using code patterns, verify installed versions match. If versions differ:

R: packageVersion('<pkg>') then ?function_name
Confirmatory regulatory work: validate against pinned package versions in submission

If code throws an error, introspect the installed package and adapt the example to match the actual API rather than retrying.

Bayesian Clinical Trials

Regulatory Status -- The 2024-2026 Bayesian Pivot

Prospective specification
Simulation-based operating characteristics (including frequentist Type-I error under null scenarios — agency still wants calibration)
Justified priors
Code/data sufficient for FDA replication

FDA BOIN Fit-for-Purpose qualification (December 2021): first formal FDA endorsement of a specific dose-finding design under the Drug Development Tools program.

Algorithmic Taxonomy

Postdoc reading list:

FDA 2010 Guidance for Industry: Use of Bayesian Statistics in Medical Device Clinical Trials (Feb 5 2010)
FDA 2026 Draft Use of Bayesian Methodology in Clinical Trials (FDA-2025-D-3217, Jan 2026)
Berry SM, Carlin BP, Lee JJ, Müller P 2010 Bayesian Adaptive Methods for Clinical Trials (CRC)
Schmidli H, Gsteiger S, Roychoudhury S, O'Hagan A, Spiegelhalter D, Neuenschwander B 2014 Biometrics 70:1023 (MAP + robust MAP)
Weber S, Li Y, Seaman J, Kakizume T, Schmidli H 2021 J Stat Softw 100:19 (RBesT)
Neuenschwander B, Wandel S, Roychoudhury S, Bailey S 2016 Pharm Stat 15:123 (EXNEX)
Liu S, Yuan Y 2015 J R Stat Soc C 64:507 (BOIN)
O'Quigley J, Pepe M, Fisher L 1990 Biometrics 46:33 (CRM)
Babb J, Rogatko A, Zacks S 1998 Stat Med 17:1103 (EWOC)
Ji Y, Liu P, Li Y, Bekele BN 2010 Clin Trials 7:653 (mTPI)
Guo W, Wang SJ, Yang S, Lynn H, Ji Y 2017 Contemp Clin Trials 58:23 (mTPI-2 / Keyboard)
Berry SM, Broglio KR, Groshen S, Berry DA 2013 Clin Trials 10:720 (basket trial hierarchical)
Berry SM, Berry DA 2004 Biometrics 60:418 (three-level AE hierarchical)
Spiegelhalter DJ, Freedman LS, Parmar MKB 1994 JRSS-A 157:357 (skeptical/enthusiastic prior framework)
Rugo HS et al 2016 NEJM 375:23 (I-SPY 2 veliparib-carboplatin)
Angus DC et al 2020 JAMA (REMAP-CAP COVID rationale)

Decision Tree by Scenario

Phase I Dose-Finding -- BOIN, CRM, mTPI-2

BOIN (FDA-preferred operational)

library(BOIN)

# Generate the escalation/de-escalation table for the protocol
boundary_table <- get.boundary(
    target = 0.30,           # target DLT rate
    ncohort = 10,            # 10 cohorts -> max 30 patients with size 3
    cohortsize = 3,
    n.earlystop = 12,        # stop once 12 patients have been treated at the current dose
    p.saf = 0.6 * 0.30,      # highest DLT rate deemed sub-therapeutic (package default)
    p.tox = 1.4 * 0.30       # lowest DLT rate deemed overly toxic (package default)
)
print(boundary_table)
# Print the table for the investigator site; no Bayesian software is needed at the bedside

# Operating characteristics simulation (same design settings as the table above)
oc_boin <- get.oc(
    target = 0.30,
    p.true = c(0.05, 0.10, 0.20, 0.30, 0.40, 0.55),  # true DLT per dose
    ncohort = 10,
    cohortsize = 3,
    n.earlystop = 12,        # keep consistent with get.boundary()
    ntrial = 1000
)
print(oc_boin)
# Reports: MTD selection accuracy, overdose risk, average sample size
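The boundaries that get.boundary tabulates have a closed form in Liu and Yuan (2015). The following is a minimal base-R sketch, assuming the package defaults p.saf = 0.6 * target and p.tox = 1.4 * target. The helper names boin_boundaries and boin_decision are hypothetical and not part of the BOIN package, and the dose-elimination rule is deliberately left out.

```r
# Hypothetical helper (not part of the BOIN package): closed-form
# escalation / de-escalation boundaries from Liu & Yuan (2015).
boin_boundaries <- function(target, p_saf = 0.6 * target, p_tox = 1.4 * target) {
    lambda_e <- log((1 - p_saf) / (1 - target)) /
        log(target * (1 - p_saf) / (p_saf * (1 - target)))
    lambda_d <- log((1 - target) / (1 - p_tox)) /
        log(p_tox * (1 - target) / (target * (1 - p_tox)))
    c(lambda_e = lambda_e, lambda_d = lambda_d)
}

# Hypothetical helper: decision at the current dose from the observed DLT rate.
# The BOIN overdose-elimination rule is NOT implemented here.
boin_decision <- function(n_dlt, n_treated, bounds) {
    rate <- n_dlt / n_treated
    if (rate <= bounds[["lambda_e"]]) {
        "escalate"
    } else if (rate >= bounds[["lambda_d"]]) {
        "de-escalate"
    } else {
        "stay"
    }
}

bounds <- boin_boundaries(0.30)
print(round(bounds, 3))

# Decisions for a first cohort of 3 patients
decisions <- vapply(0:3, boin_decision, character(1), n_treated = 3, bounds = bounds)
print(setNames(decisions, paste0(0:3, "/3 DLT")))
```

For a 0.30 target the boundaries sit near 0.24 and 0.36, so one DLT in three patients keeps the dose unchanged. Comparing these values with the get.boundary output is a quick check that the protocol table was generated with the intended settings.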

CRM with calibrated skeleton

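library(dfcrm)

# Lee-Cheung 2009 indifference-interval calibration.
# Calibrate under the same working model that crmsim() uses below;
# getprior() defaults to model = 'empiric'.
prior_skeleton <- getprior(
    halfwidth = 0.05,        # half-width of the indifference interval
    target = 0.30,
    nu = 3,                  # prior guess of the MTD (dose level 3)
    nlevel = 6,
    model = 'logistic'
)

crm_sim <- crmsim(
    PI = c(0.05, 0.10, 0.20, 0.30, 0.40, 0.55),  # true DLT per dose
    prior = prior_skeleton,
    target = 0.30,
    n = 30,
    x0 = 1,                  # starting dose
    nsim = 1000,
    method = 'bayes',
    model = 'logistic'
)
print(crm_sim)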

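Skeleton mis-specification is the canonical CRM failure mode. The Lee-Cheung (2009) indifference-interval method calibrates the skeleton systematically: the halfwidth argument sets how far a dose's true DLT rate may sit from the target while the CRM can still settle on that dose.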
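To show where the skeleton enters the dose recommendation, here is a minimal base-R sketch of a single CRM update under the one-parameter empiric (power) model, p_i(a) = skeleton_i^exp(a), with a normal prior on a. The skeleton values, the patient data, and the prior variance of 1.34 are illustrative assumptions, and no restriction on dose skipping is applied.

```r
# Hypothetical skeleton and accrued data (illustrative only)
skeleton   <- c(0.05, 0.12, 0.20, 0.30, 0.42, 0.55)
target     <- 0.30
dose_given <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)   # dose level per patient
dlt        <- c(0, 0, 0, 0, 0, 0, 0, 1, 0)   # 1 = DLT observed

# Likelihood of the empiric model p_i(a) = skeleton_i^exp(a), vectorised over a
lik <- function(a) {
    sapply(a, function(ai) {
        p <- skeleton[dose_given]^exp(ai)
        prod(p^dlt * (1 - p)^(1 - dlt))
    })
}
# Assumed prior: a ~ N(0, 1.34)
prior <- function(a) dnorm(a, mean = 0, sd = sqrt(1.34))

# Posterior mean of a by numerical integration over a wide finite range
norm_const  <- integrate(function(a) lik(a) * prior(a), -10, 10)$value
post_mean_a <- integrate(function(a) a * lik(a) * prior(a), -10, 10)$value / norm_const

# Plug-in DLT estimates and the dose closest to the target
p_hat     <- skeleton^exp(post_mean_a)
next_dose <- which.min(abs(p_hat - target))

print(round(p_hat, 3))
print(next_dose)
```

Because p_hat is a monotone transformation of the skeleton, the spacing between skeleton entries determines which dose ends up closest to the target, which is why an arbitrary skeleton can bias MTD selection.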

EWOC (overdose control)

# Babb-Rogatko-Zacks 1998: choose each dose so that posterior P(dose > MTD) <= alpha
# (feasibility bound alpha = 0.25 is the common choice)
# dfcrm does not provide an EWOC function; see the `ewoc` package

MAP Priors and RBesT

library(RBesT)

# Historical control data (4 prior studies)
historical_data <- data.frame(
    study = c('s1', 's2', 's3', 's4'),
    n = c(40, 35, 50, 45),
    r = c(8, 6, 12, 9)         # responders
)

# Fit MAP via gMAP (Stan-based random-effects meta-analysis)
map_prior <- gMAP(
    cbind(r, n - r) ~ 1 | study,
    data = historical_data,
    family = binomial,
    tau.dist = 'HalfNormal',
    tau.prior = 0.5,           # between-study SD prior
    beta.prior = cbind(0, 2)    # weakly informative on logit response
)
print(map_prior)

# Approximate the MCMC-based MAP prior with a parametric mixture for downstream computation
map_mix <- automixfit(map_prior, Nc = 2)
print(map_mix)

# Effective sample size
ess(map_mix)

# Robust MAP: add vague mixture component (weight 0.1-0.3) to guard against prior-data conflict
robust_map <- robustify(map_mix, weight = 0.2, mean = 0.5, n = 1)
print(robust_map)
ess(robust_map)
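The protective effect of the vague component can be seen without Stan. The sketch below uses base R only and a hypothetical two-component beta mixture (80% informative Beta(15, 60), 20% vague Beta(1, 1)) standing in for a robustified MAP prior. It updates the mixture weights with binomial data via the beta-binomial marginal likelihood of each component. All numbers are illustrative assumptions.

```r
# Hypothetical robust prior: informative component plus vague component
w <- c(informative = 0.8, vague = 0.2)
a <- c(15, 1)
b <- c(60, 1)

# Posterior mixture weights after observing r responders out of n.
# Each weight is proportional to prior weight times the beta-binomial
# marginal likelihood of the data under that component.
posterior_weights <- function(r, n) {
    log_marg <- lchoose(n, r) + lbeta(a + r, b + n - r) - lbeta(a, b)
    unnorm <- w * exp(log_marg - max(log_marg))
    unnorm / sum(unnorm)
}

consistent <- posterior_weights(r = 8, n = 40)    # 20% observed, matches history
conflict   <- posterior_weights(r = 24, n = 40)   # 60% observed, conflicts with history

print(round(consistent, 3))
print(round(conflict, 3))
```

When the current control rate agrees with history the informative component keeps most of the weight; under conflict the weight shifts to the vague component, so borrowing is discounted automatically.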

EXNEX for Basket Trials

library(OncoBayes2)  # Novartis-developed; BLRM dose-escalation with EXNEX borrowing across strata

# For basket-trial response rates, bhmbasket provides BHM and EXNEX analyses
library(bhmbasket)

# Conceptual: each basket has its own posterior, with shrinkage governed by an exchangeability mixture
# Commonly used starting weights: 0.5 EX / 0.5 NEX
# Sensitivity over weights (0.1, 0.3, 0.5, 0.7, 0.9) is essential

Bayesian Platform Trials

I-SPY 2 (Rugo et al 2016 NEJM 375:23)

# Conceptual implementation requires custom Stan or FACTS (Berry Consultants commercial)

# Pseudocode:
# 1. Fit hierarchical model to platform data: response ~ arm + biomarker_subtype + arm:subtype
# 2. Posterior draws of treatment effect by subtype
# 3. For each draw, simulate Phase 3 trial: n=300, treatment vs control, observed effect
# 4. Compute proportion of draws meeting Phase 3 success criterion
# 5. If proportion >= 0.85, arm graduates
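Steps 2 to 5 of the pseudocode can be sketched in base R under strong simplifications: independent Beta(1, 1) priors on each arm's response rate instead of the hierarchical subtype model, a two-arm Phase 3 with 150 patients per arm, and success defined as a one-sided pooled z-test at 0.025. The Phase 2 counts are hypothetical; this is not the I-SPY 2 model itself.

```r
set.seed(2016)
n_draws <- 4000

# Step 2 (simplified): posterior draws of response rates from hypothetical data
post_trt <- rbeta(n_draws, 1 + 18, 1 + 22)   # 18/40 responders on treatment
post_ctl <- rbeta(n_draws, 1 + 10, 1 + 30)   # 10/40 responders on control

# Steps 3-4: for each draw, simulate a Phase 3 trial (n = 300) and test it
n_per_arm <- 150
success <- vapply(seq_len(n_draws), function(i) {
    x_trt  <- rbinom(1, n_per_arm, post_trt[i])
    x_ctl  <- rbinom(1, n_per_arm, post_ctl[i])
    p_pool <- (x_trt + x_ctl) / (2 * n_per_arm)
    se     <- sqrt(2 * p_pool * (1 - p_pool) / n_per_arm)
    z      <- (x_trt - x_ctl) / n_per_arm / se
    isTRUE(z > qnorm(0.975))   # NaN (se = 0) counts as failure
}, logical(1))

# Step 5: predictive probability of Phase 3 success and graduation decision
pred_prob <- mean(success)
graduates <- pred_prob >= 0.85

print(pred_prob)
print(graduates)
```

The predictive probability averages Phase 3 power over the posterior, so it is lower than the power evaluated at the posterior mean whenever the posterior is wide.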

REMAP-CAP (Angus 2020 JAMA)

A platform trial in severe community-acquired pneumonia that was adapted for COVID-19; it uses a Bayesian factorial multi-domain design. It generated a corticosteroid signal independently of RECOVERY.

Drop-the-loser vs promising-the-winner

Adaptive arm-dropping (futility): when the posterior P(beating control) drops below a pre-specified threshold, the arm closes. Mathematically straightforward.
"Promising-the-winner": carrying forward the best-performing arm induces selection bias in its effect estimate. Bias-adjusted estimators (Robertson 2023; conditional MLE) correct for this.

Hierarchical Models for Safety Multiplicity (Berry-Berry 2004)

library(c212)  # Berry-Berry implementation

# Conceptual: each AE has log OR drawn from spike-and-slab prior
# Spike at 0 (no effect); slab as N(mu_SOC, sigma_SOC)
# SOC-level parameters from N(mu_overall, sigma_overall)
# Borrowing within SOC; shrinkage toward 0 if no evidence

# JMP Clinical also implements this for industry use

Power Priors for Borrowing

library(bayesDP)
library(psborrow2)  # FDA-supported package

# Power prior: posterior is proportional to L(theta | D_current) * L(theta | D_hist)^gamma * pi_0(theta)
# gamma in [0, 1]; gamma = 0 = no borrowing; gamma = 1 = full pooling

# Working convention for pediatric extrapolation: gamma = 0.3 to 0.6
# (the FDA Bayesian Jan 2026 draft does not prescribe a specific range)
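For a binomial endpoint with a beta initial prior the power prior is conjugate, which makes the effect of gamma easy to inspect. The sketch below is base R only; the function name power_prior_posterior and all counts are hypothetical, and a fixed (not estimated) gamma is assumed.

```r
# Hypothetical helper: posterior Beta parameters under a fixed-gamma power prior.
# Historical data (y0 of n0) are down-weighted by gamma; current data (y of n)
# enter with full weight; Beta(a0, b0) is the initial prior.
power_prior_posterior <- function(y, n, y0, n0, gamma, a0 = 1, b0 = 1) {
    stopifnot(gamma >= 0, gamma <= 1)
    c(shape1 = a0 + gamma * y0 + y,
      shape2 = b0 + gamma * (n0 - y0) + (n - y))
}

# Sensitivity over borrowing strength with illustrative counts
gammas <- c(0, 0.3, 0.6, 1)
sens <- do.call(rbind, lapply(gammas, function(g) {
    post <- power_prior_posterior(y = 12, n = 40, y0 = 45, n0 = 100, gamma = g)
    data.frame(
        gamma     = g,
        post_mean = post[["shape1"]] / sum(post),
        lower     = qbeta(0.025, post[["shape1"]], post[["shape2"]]),
        upper     = qbeta(0.975, post[["shape1"]], post[["shape2"]])
    )
}))
print(round(sens, 3))
```

Reading the table row by row shows both effects of borrowing at once: the posterior mean moves toward the historical rate and the credible interval narrows as gamma grows, which is why results are reported across a range of gamma values.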

External Control Arms and Real-World Evidence (RWE)

Methodology taxonomy

psborrow2 — the FDA-supported RWE framework

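library(psborrow2)

# Define external and internal data (placeholders; fill in with real columns)
ext_data <- data.frame(usubjid = ..., trt = 0, outcome = ..., covariates = ...)
int_data <- data.frame(usubjid = ..., trt = 0 | 1, outcome = ..., covariates = ...)

# Stack both sources; `ext` flags external rows (1 = external, 0 = trial)
all_data <- rbind(cbind(int_data, ext = 0), cbind(ext_data, ext = 1))

# Build the model matrix. Argument names follow the psborrow2 documentation;
# verify against the installed version with ?create_data_matrix
borrow_obj <- create_data_matrix(
    all_data,
    outcome = c("time", "cens"),
    trt_flag_col = "trt",
    ext_flag_col = "ext",
    covariates = ~ age + ecog + baseline_severity
)

# Bayesian dynamic borrowing via a hierarchical commensurate prior
borrowing_design <- borrowing_hierarchical_commensurate(
    ext_flag_col = "ext",
    tau_prior = prior_gamma(0.001, 0.001)  # diffuse prior on the commensurability precision
)

# Outcome model: exponential survival for TTE (outcome_bin_logistic() for binary)
outcome_model <- outcome_surv_exponential(
    time_var = "time",
    cens_var = "cens",
    baseline_prior = prior_normal(0, 100)
)

# The treatment-effect prior is specified separately from the outcome model
treatment <- treatment_details(trt_flag_col = "trt", trt_prior = prior_normal(0, 100))

# Run Bayesian analysis with covariate adjustment + borrowing
result <- create_analysis_obj(
    data_matrix = borrow_obj,
    outcome = outcome_model,
    borrowing = borrowing_design,
    treatment = treatment,
    covariates = add_covariates(
        covariates = c("age", "ecog", "baseline_severity"),
        priors = prior_normal(0, 100)
    )
)
mcmc_result <- mcmc_sample(result, iter_warmup = 2000, iter_sampling = 4000, chains = 4)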

Operational rules (FDA 2024-2025 RWE practice)

Pre-specify the RWE source and document acquisition (registry, EHR, claims, RWD vendor)
Demonstrate comparability via propensity-score overlap (standardised mean differences <0.25 for key prognostic factors)
Apply discount priors — full pooling (gamma=1) is regulatory-rejected; typical discount gamma 0.3-0.6
Sensitivity over borrowing strength — report results at multiple gamma or mixture weights
Tipping-point analysis on prior-data agreement — at what discount does the conclusion flip?
E-value or bias bound for unmeasured confounding (VanderWeele-Ding 2017): a standard sensitivity analysis that reports the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both treatment and outcome to explain away the result
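The E-value in the last rule is a closed-form transformation of the risk ratio (VanderWeele-Ding 2017). A minimal base-R sketch follows; the function names e_value and e_value_ci are hypothetical, and the inputs are assumed to be on the risk-ratio scale already (other effect measures need conversion first).

```r
# Hypothetical helper: E-value for a point estimate on the risk-ratio scale.
# Protective estimates (RR < 1) are inverted first.
e_value <- function(rr) {
    rr <- ifelse(rr < 1, 1 / rr, rr)
    rr + sqrt(rr * (rr - 1))
}

# Hypothetical helper: E-value for the confidence limit closest to the null.
# If the interval already contains 1, no confounding is needed: E-value = 1.
e_value_ci <- function(lower, upper) {
    if (lower <= 1 && upper >= 1) {
        return(1)
    }
    e_value(if (lower > 1) lower else upper)
}

# Illustrative estimate: RR = 2.0 with 95% CI 1.3 to 3.1
print(e_value(2.0))
print(e_value_ci(1.3, 3.1))
```

Reporting the E-value for the confidence limit alongside the point estimate shows how much weaker a confounder would need to be merely to make the interval include the null.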

When RWE is NOT acceptable

Trial sponsor and RWE source have meaningful incentive misalignment (e.g., RWE supplied by a party with a stake in the outcome)
RWE captured before standard-of-care evolved (constancy violation, similar to NI biocreep)
Outcome definitions differ between RWE and current trial (variable harmonisation impossible)
Censoring patterns in RWE differ structurally from trial (administrative vs disease-driven)
Highly variable baseline characteristics impossible to balance via propensity weighting

Recent decisive cases (2024-2026)

Zynteglo (FDA 2022, ongoing post-market): beta-thalassemia gene therapy; single-arm trial vs natural history RWE comparator
Skysona (FDA 2022): cerebral adrenoleukodystrophy; RWE natural-history comparator
Multiple ultra-rare disease accelerated approvals 2024-2025: RWE/external control increasingly accepted in <100-patient trials

Spiegelhalter Skeptical/Enthusiastic Priors

Frames "evidence for regulators" vs "evidence for sponsor" in Bayesian language; still cited in modern Bayesian-trial protocols.

# Skeptical prior: N(0, sd_sk) — centred at null
# Enthusiastic prior: N(delta_alt, sd_en) — centred at clinically meaningful effect
# Decision: stop for efficacy if P(theta > 0 | skeptical posterior) > 0.975
#           stop for futility if P(theta < delta_alt | enthusiastic posterior) > 0.80
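With a normal likelihood both posteriors are available in closed form, so the two decision rules can be checked by hand. The sketch below is base R and assumes theta is coded so that positive values favour treatment. The interim estimate, its standard error, delta_alt, and the choice of prior SDs (5% prior mass beyond the opposing hypothesis) are illustrative assumptions.

```r
# Conjugate normal-normal update: prior N(prior_mean, prior_sd^2),
# data summarised as an estimate with standard error se
normal_posterior <- function(prior_mean, prior_sd, est, se) {
    prec <- 1 / prior_sd^2 + 1 / se^2
    list(mean = (prior_mean / prior_sd^2 + est / se^2) / prec,
         sd   = sqrt(1 / prec))
}

delta_alt <- 0.30                      # hypothetical clinically meaningful effect
sd_sk <- delta_alt / qnorm(0.95)       # skeptical: P(theta > delta_alt) = 0.05 a priori
sd_en <- delta_alt / qnorm(0.95)       # enthusiastic: P(theta < 0) = 0.05 a priori

est <- 0.25                            # hypothetical interim estimate
se  <- 0.10                            # and its standard error

post_sk <- normal_posterior(0, sd_sk, est, se)
post_en <- normal_posterior(delta_alt, sd_en, est, se)

# Efficacy is judged under the skeptical prior, futility under the enthusiastic one
p_eff <- pnorm(0, post_sk$mean, post_sk$sd, lower.tail = FALSE)
p_fut <- pnorm(delta_alt, post_en$mean, post_en$sd)

stop_efficacy <- p_eff > 0.975
stop_futility <- p_fut > 0.80
print(c(p_eff = p_eff, p_fut = p_fut))
```

Judging efficacy under the skeptical prior means the data must convince someone who started out doubtful; judging futility under the enthusiastic prior means the trial stops only when even an optimist would give up.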

Per-Method Failure Modes

CRM with mis-calibrated skeleton

Trigger: Default or arbitrary skeleton without indifference-interval calibration.
Mechanism: Skeleton dictates target dose; mis-calibration biases MTD.
Symptom: MTD selection differs systematically from clinical expectation.
Fix: Calibrate via Lee-Cheung 2009; or switch to BOIN.

MAP prior with prior-data conflict

Trigger: Historical control rate differs substantially from observed current control.
Mechanism: Informative MAP prior pulls toward historical; current data poorly fit.
Symptom: Posterior dominated by prior; current data evidence under-weighted.
Fix: Robust MAP with mixture weight 0.2-0.3; check for prior-data conflict by comparing the observed current control data against the prior predictive distribution.

EXNEX with default 0.5/0.5 weights

Trigger: Default mixture weights without sensitivity.
Mechanism: 50% EX weight allows substantial borrowing even when basket differs.
Symptom: Detected differential basket "softened" by borrowing.
Fix: Sensitivity analysis over weights (0.1, 0.3, 0.5, 0.7, 0.9); report range.

Posterior probability stopping without simulation-calibrated threshold

Trigger: Stopping rule P(theta > 0 | data) > 0.975 applied without Type-I simulation.
Mechanism: Bayesian rule may not control frequentist Type-I in regulatory sense.
Symptom: FDA review flags lack of Type-I demonstration.
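Fix: Simulate under the null scenarios; calibrate the threshold so that frequentist Type-I error does not exceed the nominal level (e.g., one-sided 0.025), accounting for every interim look.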
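The calibration step can be sketched in base R for a simple case: a single-arm binary endpoint with a Beta(1, 1) prior, a null response rate of 0.20, and efficacy looks at 20, 40, and 60 patients. The design, the threshold grid, and the nominal one-sided 0.025 level are illustrative assumptions.

```r
set.seed(42)
p0    <- 0.20            # null response rate
looks <- c(20, 40, 60)   # cumulative sample size at each analysis
n_sim <- 5000

# For each simulated null trial, record the largest posterior probability
# P(p > p0 | data) seen at any look; the trial "succeeds" if it ever
# exceeds the stopping threshold.
max_post_prob <- replicate(n_sim, {
    y <- cumsum(rbinom(max(looks), 1, p0))[looks]
    max(pbeta(p0, 1 + y, 1 + looks - y, lower.tail = FALSE))
})

# Simulated Type-I error as a function of the threshold
type1 <- function(threshold) mean(max_post_prob > threshold)

thresholds <- seq(0.975, 0.999, by = 0.001)
err <- vapply(thresholds, type1, numeric(1))

# Smallest threshold on the grid that keeps Type-I error at or below 0.025
calibrated <- thresholds[which(err <= 0.025)[1]]

print(c(uncalibrated_type1 = type1(0.975)))
print(c(calibrated_threshold = calibrated, type1 = type1(calibrated)))
```

The same loop extends to other null scenarios and to two-arm designs; the protocol should report the simulated Type-I error for each scenario together with the Monte Carlo standard error.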

I-SPY 2 graduation criterion without bias correction

Trigger: Graduated arm's effect estimate reported uncorrected.
Mechanism: Selection on PP > 0.85 inflates estimate.
Symptom: Phase 3 confirmation finds smaller effect than platform suggested.
Fix: Bias-correction via conditional MLE or hierarchical Bayesian; cite Robertson 2023.

Bayesian shrinkage for signal discovery (Dane vs Hemmings)

Trigger: Hierarchical model fit during signal discovery rather than replication planning.
Mechanism: Shrinkage pre-emptively damps heterogeneity being searched for.
Symptom: Signal detected by causal forest gets shrunken to null in shrinkage analysis.
Fix: Hemmings-Koch 2019 position — shrinkage for replication planning, not signal generation; cite Dane et al 2019 EFSPI white paper + critique.

Power prior with gamma = 1 (full pooling)

Trigger: Full pooling of historical and current data.
Mechanism: Ignores between-study heterogeneity; biases estimate.
Symptom: Overconfident posterior; cross-validation reveals poor fit.
Fix: Working-convention discount gamma 0.3-0.6 (the FDA Bayesian Jan 2026 draft does not prescribe a specific range); sensitivity over gamma.

WinBUGS reproducibility

Trigger: Submission contains WinBUGS code without containerised environment.
Mechanism: Older Windows-only software; reproducibility fragile.
Symptom: Reviewer cannot replicate analysis.
Fix: Migrate to Stan (rstan/cmdstanr); Docker/renv-pinned environment; include seeds + posterior diagnostics (R-hat <1.01, ESS >1000 per chain).

Quantitative Thresholds

Common Errors

Anticipated Reviewer Pushback

References

Babb J, Rogatko A, Zacks S. 1998. Cancer Phase I clinical trials: efficient dose escalation with overdose control. Stat Med 17:1103-1120.
Berry SM, Berry DA. 2004. Accounting for multiplicities in assessing drug safety: a three-level hierarchical mixture model. Biometrics 60:418-426.
Berry SM, Broglio KR, Groshen S, Berry DA. 2013. Bayesian hierarchical modeling of patient subpopulations: efficient designs of Phase II oncology clinical trials. Clin Trials 10:720-734.
Berry SM, Carlin BP, Lee JJ, Müller P. 2010. Bayesian Adaptive Methods for Clinical Trials. CRC.
Dane A, Spencer A, Rosenkranz G, Lipkovich I, Parke T. 2019. Subgroup analysis and interpretation for phase 3 confirmatory trials: EFSPI/PSI white paper. Pharm Stat 18:126-139.
FDA. 2010. Guidance for Industry: Use of Bayesian Statistics in Medical Device Clinical Trials.
FDA. 2021. BOIN Drug Development Tool Fit-for-Purpose Qualification.
FDA. 2026. Use of Bayesian Methodology in Clinical Trials. Draft Guidance (FDA-2025-D-3217).
Guo W, Wang SJ, Yang S, Lynn H, Ji Y. 2017. A Bayesian interval dose-finding design addressing Ockham's razor: mTPI-2. Contemp Clin Trials 58:23-33.
Hemmings R, Koch A. 2019. Commentary on Dane et al. Pharm Stat 18:140-144.
Ji Y, Liu P, Li Y, Bekele BN. 2010. A modified toxicity probability interval method for dose-finding trials. Clin Trials 7:653-663.
Liu S, Yuan Y. 2015. Bayesian optimal interval designs for phase I clinical trials. JRSS-C 64:507-523.
Neuenschwander B, Wandel S, Roychoudhury S, Bailey S. 2016. Robust exchangeability designs for early phase clinical trials with multiple strata. Pharm Stat 15:123-134.
O'Quigley J, Pepe M, Fisher L. 1990. Continual reassessment method: a practical design for phase 1 clinical trials in cancer. Biometrics 46:33-48.
Rugo HS et al. 2016. Adaptive randomization of veliparib-carboplatin treatment in breast cancer. NEJM 375:23-34.
Robertson DS, Lee KM, López-Kolkovska BC, Villar SS. 2023. Response-adaptive randomization in clinical trials: from myths to practical considerations. Stat Sci 38:185-208.
Schmidli H, Gsteiger S, Roychoudhury S, O'Hagan A, Spiegelhalter D, Neuenschwander B. 2014. Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics 70:1023-1032.
Spiegelhalter DJ, Freedman LS, Parmar MKB. 1994. Bayesian approaches to randomized trials. JRSS-A 157:357-387.
Vehtari A et al. 2021. Rank-normalization, folding, and localization: an improved R-hat for assessing convergence. Bayesian Analysis.
Weber S, Li Y, Seaman J, Kakizume T, Schmidli H. 2021. Applying meta-analytic-predictive priors with the R Bayesian evidence synthesis tools. J Stat Softw 100:19.

Related Skills

clinical-biostatistics/adaptive-designs - Group-sequential, SSR, platform trials
clinical-biostatistics/subgroup-analysis - Bayesian shrinkage for HTE (Dixon-Simon, Berry)
clinical-biostatistics/power-and-sample-size - Bayesian SS via predictive probability of success
clinical-biostatistics/multiplicity-graphical - Berry-Berry AE hierarchical
clinical-biostatistics/trial-reporting - Bayesian inference reporting per CONSORT 2025
clinical-biostatistics/missing-data-sensitivity - Bayesian rbmi imputation
machine-learning/biomarker-discovery - Bayesian HTE for biomarker subgroups
experimental-design/sample-size - General methods

