clinical-biostatistics/trial-reporting/SKILL.md
Prepares statistical reports for clinical trials following CONSORT 2025, SPIRIT 2025, ICH E9(R1) estimands, and FDA 2023 covariate adjustment guidance. Covers Table 1 generation, analysis populations (ITT/FAS/PP/Safety), the 5 ICH E9(R1) intercurrent-event strategies, MMRM under MAR (mmrm), reference-based MI (rbmi J2R/CR/CIR), Permutt tipping-point sensitivity, and Rubin's-rules vs frequentist variance debate. Use when preparing regulatory submissions, defining estimands, or implementing missing-data sensitivity analyses.
Reference examples tested with: tableone 0.9+, statsmodels 0.14+, scikit-learn 1.4+, pandas 2.1+, numpy 1.26+. R packages cited (essential for current regulatory work): mmrm 0.3+ (Roche/openpharma), rbmi 1.5+ (Roche/Bayer via insightsengineering), gMCP, RBesT.
Before using code patterns, verify installed versions match. If versions differ:

- Python: `pip show <package>` then `help(module.function)` to check signatures
- R: `packageVersion('<pkg>')` then `?function_name`

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
"Prepare a clinical trial statistical report" -> Define the estimand explicitly per ICH E9(R1); execute a covariate-adjusted primary analysis targeting the right summary measure; pre-specify the missing-data strategy and run regulatory-grade sensitivity analyses; structure the output per CONSORT 2025 and the new SPIRIT 2025 alignment.
Kahan, Cro, Li, Harhay 2023 Am J Epidemiol 192:987 ("Eliminating Ambiguous Treatment Effects Using Estimands"): 98% of published trial reports do not describe what the reported treatment effect represents. 54% of trials: impossible to deduce the estimand from reported methods. In 74% of trials submitted for regulatory approval 1996-2017, "what-if" hypothetical effects were used but only 2 trials explained this.
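The framework: ICH E9(R1) Addendum (November 2019, EMA effective Feb 2020, FDA May 2021) defines an estimand as the precise specification of what is being estimated, via five attributes:

- Treatment: the treatment condition of interest and the alternative it is compared with
- Population: the patients targeted by the clinical question
- Variable (endpoint): the measurement obtained for each patient
- Intercurrent events (ICEs): the post-randomisation events and the strategy used to handle each
- Population-level summary: the measure used to compare treatment conditions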
The order is non-negotiable: specify the estimand BEFORE choosing the statistical method. Choosing MMRM and retrofitting the estimand to match is the canonical error.
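One way to enforce "estimand before method" is to write the five attributes down as a structured object before any model is fitted. The sketch below is illustrative only: the class name `Estimand`, its field names, and the example trial wording are assumptions made for this example, not part of ICH E9(R1) or of any package.

```python
from dataclasses import dataclass, field

# Strategy labels follow the five ICH E9(R1) strategies; spelling here is illustrative.
ICE_STRATEGIES = {
    "treatment policy", "hypothetical", "composite",
    "while on treatment", "principal stratum",
}


@dataclass(frozen=True)
class Estimand:
    treatment: str
    population: str
    variable: str
    summary_measure: str
    # Maps each intercurrent event to the strategy used to handle it.
    intercurrent_events: dict = field(default_factory=dict)

    def __post_init__(self):
        if not self.intercurrent_events:
            raise ValueError("at least one intercurrent event must be addressed")
        for ice, strategy in self.intercurrent_events.items():
            if strategy not in ICE_STRATEGIES:
                raise ValueError(f"unknown strategy {strategy!r} for ICE {ice!r}")

    def statement(self) -> str:
        ices = "; ".join(f"{ice}: {s}" for ice, s in self.intercurrent_events.items())
        return (f"{self.summary_measure} in {self.variable} "
                f"for {self.treatment} in {self.population} "
                f"(intercurrent events -> {ices})")


# Hypothetical trial wording, for illustration only.
primary = Estimand(
    treatment="Drug vs Placebo, added to standard of care",
    population="adults meeting the protocol eligibility criteria",
    variable="change from baseline in score at week 24",
    summary_measure="difference in means",
    intercurrent_events={
        "treatment discontinuation": "treatment policy",
        "rescue medication": "hypothetical",
    },
)
print(primary.statement())
```

Because the object refuses to exist without an ICE strategy, the statistical method can only be chosen after the estimand has been stated.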
| Strategy | What it does | Typical implementation | Identification cost | Regulatory pattern |
|----------|--------------|------------------------|---------------------|---------------------|
| Treatment policy | Include all data regardless of ICE | ANCOVA on observed value (retrieved-dropout data); ITT-like | Trivially identified; needs full follow-up regardless of ICE | FDA-preferred default for cardio/HF/CV-safety; Fleming 2025 endorsement |
| Hypothetical | What would have been observed had ICE not occurred | MMRM under MAR; g-computation; IPCW; reference-based MI under MAR | Sequential ignorability (causal); MAR (missing-data shorthand) | Heavy use in CNS, diabetes, respiratory; EMA more accepting than FDA |
| Composite | Incorporate ICE into endpoint | Death = non-responder; PFS = composite of progression OR death; MACE | Identified from observed data; embeds a ranking choice | Standard in oncology PFS; acceptable when ICE has clinical signal |
| While-on-treatment | Use only pre-ICE values | Censor at discontinuation (for TTE); analyse last pre-ICE value (for repeated measures) | Estimates conditional quantity | Safety endpoints (AE rate per time on drug); FDA cautious for efficacy |
| Principal stratum | Confine to latent stratum (e.g. tolerators) | Bayesian estimation under monotonicity/principal ignorability | Latent membership; unverifiable assumptions | Rare as primary; some oncology/vaccine acceptance |
Postdoc reading list:
| Trial scenario | Recommended estimand strategy | Why |
|----------------|-------------------------------|-----|
| Continuous endpoint, monotone missingness, MAR plausible | Hypothetical via MMRM (mmrm + KR) | Standard FDA-favoured MAR analysis; cite Mallinckrodt 2008/2014 |
| Continuous endpoint, ICE = treatment discontinuation, sponsor wants effectiveness | Treatment policy via retrieved-dropout MI | If post-ICE data available; ITT-respecting |
| Continuous endpoint, treatment-policy primary with ICE-related missingness | Hybrid: J2R for discontinuation ICEs, MMRM-MAR for other missingness | Aprocitentan precedent; de facto FDA standard 2024-2025 |
| Binary endpoint, RCT, FDA 2023-compliant | Marginal RD via g-computation; conditional OR supportive | See clinical-biostatistics/logistic-regression for g-computation |
| Oncology OS with crossover | Treatment policy as primary; hypothetical (RPSFT/IPCW) as sensitivity | Sotorasib CodeBreaK 200 precedent |
| Oncology PFS | Treatment policy with composite for subsequent therapy ICE | Lewis 2023 framework; Fleming 2025 |
| Weight management / chronic disease | Retrieved-dropout MI (Wegovy STEP precedent); J2R supportive | FDA 2025 obesity draft guidance explicitly endorses |
| Long-term safety endpoint | While-on-treatment for rate; treatment policy for cumulative incidence | Standard ICH E2A practice |
| AlloSCT in hematologic oncology | Composite "event-free survival" treating alloSCT as event | Rufibach 2020; "no alloSCT" hypothetical is clinically meaningless |
| Symptomatic palliative endpoint with high dropout | Composite with worst-rank for dropouts (Permutt trimmed means) | Permutt 2017 Pharm Stat 16:20 |
Mallinckrodt 2008/2014, codified in DIA Scientific Working Group "three pillars" doctrine: for continuous longitudinal endpoints under monotone (or near-monotone) MAR, an MMRM with treatment + visit + treatment-by-visit + baseline + baseline-by-visit, unstructured (UN) within-subject covariance, REML, contrast at the primary timepoint -- is the consistent and FDA-preferred analysis. LOCF is biased even under MCAR because it discards imputation uncertainty and assumes a flat post-withdrawal trajectory.
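```r
library(mmrm)

# Fixed effects: baseline, arm, visit, arm-by-visit and baseline-by-visit;
# us() requests an unstructured within-subject covariance.
fit <- mmrm(
  formula = change_from_baseline ~ baseline * visit + arm * visit + us(visit | subject),
  data = trial_data,
  method = "Kenward-Roger",  # or "Satterthwaite", "Kenward-Roger-Linear"
  reml = TRUE
)
summary(fit)  # coefficient table; the primary-timepoint treatment contrast is built from the arm-by-visit terms
```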
The Kenward-Roger flavour question: method = "Kenward-Roger" uses full second-order Kenward-Roger (Kenward-Roger 1997 Biometrics 53:983), which inflates SE for fixed-effect contrasts using an adjusted covariance estimator with second-order Taylor terms. method = "Kenward-Roger-Linear" drops the second-order Cholesky-derivative term to match SAS PROC MIXED bit-for-bit. Most submissions use Kenward-Roger-Linear to maintain SAS-R reproducibility.
Unstructured (UN) covariance has p(p+1)/2 parameters for p visits. With ~30-50 patients per arm by week 12 and 6+ visits, UN can fail to converge. The industry-standard fallback hierarchy per pre-specified SAP:
Each step down imposes more structure and the structure can be wrong — biasing both SEs and point estimates. CS imposes equal correlation across time which is rarely true for treatment-ramp-up endpoints (HbA1c, BP). Pre-specify the fallback in the SAP, not at analysis time.
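To see why UN struggles with many visits, it helps to count covariance parameters. The sketch below uses the standard parameter counts for common within-subject structures; the function name and the ordering of the structures are illustrative assumptions and do not reproduce the fallback hierarchy of any particular SAP.

```python
def covariance_parameter_counts(n_visits: int) -> dict:
    """Number of covariance parameters per structure for p = n_visits."""
    p = n_visits
    return {
        "unstructured (UN)": p * (p + 1) // 2,    # p variances + p(p-1)/2 covariances
        "heterogeneous Toeplitz": 2 * p - 1,      # p variances + (p-1) lag correlations
        "heterogeneous AR(1)": p + 1,             # p variances + 1 correlation
        "Toeplitz": p,                            # 1 variance + (p-1) lag correlations
        "AR(1)": 2,                               # 1 variance + 1 correlation
        "compound symmetry (CS)": 2,              # 1 variance + 1 common correlation
    }


counts = covariance_parameter_counts(6)
for name, k in counts.items():
    print(f"{name:26s} {k:3d} parameters")
```

With 6 visits UN needs 21 parameters while CS needs 2, which is the trade-off between convergence and a possibly wrong structure described above.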
Olarte Parra, Bartlett, Daniel 2022 Stat Biopharm Res: under specific identifying assumptions, MMRM under MAR IS a causal hypothetical estimand via g-formula equivalence. The "issue" is articulation, not statistical machinery — MMRM-MAR implicitly answers a hypothetical estimand whose hypothetical scenario must be made explicit in the SAP (e.g., "what would the mean response at week 24 be had all patients continued randomised treatment and remained observable?").
Carpenter, Roger, Kenward 2013 J Biopharm Stat 23:1352 — the canonical paper. Reference-based MI operationalises MNAR sensitivity not as a numeric delta but as a clinical narrative:
rbmi R package (Wolbers et al 2022 Pharm Stat 21(6):1246-1257; CRAN; insightsengineering):
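```r
library(rbmi)

# Draws -> Impute -> Analyse -> Pool pipeline
# Column names and ice_data are placeholders; ice_data lists each subject's first
# post-ICE visit and the imputation strategy (e.g. "JR" for jump-to-reference).
vars <- set_vars(outcome = "change", visit = "avisit", subjid = "subject",
                 group = "arm", covariates = c("baseline*avisit", "arm*avisit"))
draw_obj <- draws(data = trial_data, data_ice = ice_data, vars = vars,
                  method = method_bayes(n_samples = 100))
imputed <- impute(draw_obj, references = c("Active" = "Placebo", "Placebo" = "Placebo"))
analyses <- analyse(imputed, fun = ancova,
                    vars = set_vars(subjid = "subject", outcome = "change", visit = "avisit",
                                    group = "arm", covariates = "baseline"))
result <- pool(analyses)  # Rubin's rules pooling
```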
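To make the jump-to-reference (J2R) assumption concrete, the following minimal sketch computes the conditional mean used to impute one post-ICE visit from one observed pre-ICE visit under a bivariate normal model. It assumes two visits, a shared covariance, and made-up means; a real reference-based MI also draws from the conditional distribution and from the parameter posterior, which this sketch omits. All names and numbers are illustrative.

```python
def conditional_mean(y1, mu1, mu2, var1, cov12):
    """E[Y2 | Y1 = y1] for a bivariate normal with means (mu1, mu2)."""
    return mu2 + (cov12 / var1) * (y1 - mu1)


def mar_mean(y1, mu_active, var1, cov12):
    # MAR: the dropout keeps following the active-arm mean trajectory.
    return conditional_mean(y1, mu_active[0], mu_active[1], var1, cov12)


def j2r_mean(y1, mu_active, mu_ref, var1, cov12):
    # J2R: pre-ICE mean from the subject's own arm, post-ICE mean from the reference arm.
    return conditional_mean(y1, mu_active[0], mu_ref[1], var1, cov12)


# Illustrative values: change from baseline at visits 1 and 2 (lower is better).
mu_active = (-2.0, -6.0)
mu_ref = (-1.0, -2.0)
var1, cov12 = 4.0, 2.0
y1 = -3.0  # observed visit-1 value for an active-arm subject who then discontinues

print(mar_mean(y1, mu_active, var1, cov12))          # -6.5
print(j2r_mean(y1, mu_active, mu_ref, var1, cov12))  # -2.5
```

The same subject is imputed near the active-arm mean under MAR and near the reference-arm mean under J2R, which is why J2R pulls the estimated treatment effect toward zero.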
Four inference engines:
The single most active methodological argument in current biostatistics.
Cro/Carpenter/Kenward 2019 JRSS-A 182:623 ("Information-Anchored Sensitivity Analysis"): proved that Rubin's-rules variance applied to J2R/CR/CIR is approximately information-anchored — the relative loss of information from missingness in the sensitivity analysis matches the relative loss in the MAR primary analysis. True repeated-sampling variance is "information positive" because reference-based imputation borrows from the reference arm and reduces the marginal variance of the active arm BELOW what an MAR analysis with the same missingness would give.
Philosophical position: a sensitivity analysis should not import information the primary analysis did not have; if borrowing from placebo makes the active-arm CI narrower, the analysis is no longer "anchored" to the same information state.
Bartlett 2021 Stat Biopharm Res 15(1):178 + Wolbers 2022 Pharm Stat counter: if J2R is the actual sampling model under which inference is made, then the correct frequentist variance is the one that delivers nominal Type-I error and CI coverage under that model -- the jackknife/bootstrap variance, NOT Rubin's. Simulations in rbmi vignettes: Bayesian MI with Rubin's gives Type-I error 0.9-2.5% (over-conservative); CMI+jackknife gives 4.84-4.96% (nominal) under J2R; Bayesian MI loses real power.
Regulatory practice 2024-2025 is bifurcating: EMA tolerates either; FDA reviewers increasingly flag Rubin's-rules variance under reference-based MI as needing a frequentist sensitivity analysis in addition. What postdocs argue about: whether Type-I inflation under bootstrap is the price of correct inference, or evidence J2R was never coherent as a true sampling model.
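The frequentist side of this debate relies on resampling variances. As a minimal sketch of the idea (not rbmi's implementation), the leave-one-subject-out jackknife below estimates the standard error of any estimator; in a reference-based MI setting the estimator would rerun imputation and analysis on each leave-one-out dataset. The function names and the toy data are assumptions for illustration.

```python
import math
from statistics import mean


def jackknife_se(subjects, estimator):
    """Leave-one-out jackknife standard error of estimator(subjects)."""
    n = len(subjects)
    leave_one_out = [estimator(subjects[:i] + subjects[i + 1:]) for i in range(n)]
    centre = mean(leave_one_out)
    return math.sqrt((n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out))


def diff_in_means(subjects):
    """Toy estimator: active-arm mean minus placebo-arm mean."""
    active = [y for arm, y in subjects if arm == "Active"]
    placebo = [y for arm, y in subjects if arm == "Placebo"]
    return mean(active) - mean(placebo)


# Toy subject-level data: (arm, outcome).
subjects = [
    ("Active", 9.0), ("Active", 11.0), ("Active", 10.0), ("Active", 12.0),
    ("Placebo", 6.0), ("Placebo", 7.0), ("Placebo", 5.0), ("Placebo", 8.0),
]
se = jackknife_se(subjects, diff_in_means)
print(f"estimate = {diff_in_means(subjects):.2f}, jackknife SE = {se:.3f}")
```

For a simple sample mean the jackknife SE reduces to the usual s / sqrt(n), which is a convenient sanity check before applying it to a full imputation pipeline.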
Permutt 2016 Stat Med 35:2876 (Permutt was head of FDA Division of Biometrics IV): the regulator's question is not "what is a reasonable MNAR adjustment?" but "how bad would the missing data have to be in the active arm to overturn the significant primary result?"
Delta-adjustment patterns:
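```r
# rbmi delta adjustment: re-analyse over a grid of deltas (tipping-point scan)
# delta_template() gives one row per subject-visit; group column assumed to be "arm"
tipping <- lapply(c(0, 5, 10, 15, 20), function(d) {
  dat_delta <- delta_template(imputed)
  dat_delta$delta <- dat_delta$is_missing * (dat_delta$arm == "Active") * d
  pool(analyse(imputed, delta = dat_delta, ...))  # ... = same fun/vars as primary
})
# Report: minimum delta at which the pooled p-value rises above 0.05
```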
The regulator then judges whether the tipping delta is clinically plausible — larger than the active-arm treatment effect itself? Larger than the MCID? FDA-preferred report: tipping delta in units of residual SD (for cross-trial comparison), not raw outcome units.
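The logic of a tipping-point scan can be sketched on a single completed dataset: penalise the imputed active-arm values by increasing deltas until significance is lost, then express that delta in SD units. This is a simplified illustration under stated assumptions: one imputed dataset rather than pooled multiple imputations, a normal-approximation two-sample test, higher outcome values meaning benefit, and the pooled within-arm SD standing in for the model residual SD. All data and names are made up.

```python
import math
from statistics import mean, variance


def two_sided_p(active, placebo):
    """Two-sided p-value from an unequal-variance z-test (normal approximation)."""
    se = math.sqrt(variance(active) / len(active) + variance(placebo) / len(placebo))
    z = (mean(active) - mean(placebo)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def tipping_delta(active_obs, active_imp, placebo, deltas):
    """Smallest delta (subtracted from imputed active values) giving p > 0.05."""
    for d in deltas:
        shifted = active_obs + [y - d for y in active_imp]
        if two_sided_p(shifted, placebo) > 0.05:
            return d
    return None


placebo = [3.0, 4.0, 5.0, 6.0, 6.0, 7.0, 8.0, 9.0, 2.0, 10.0, 6.0, 6.0]
active_obs = [7.0, 8.0, 9.0, 10.0, 10.0, 11.0, 12.0, 13.0]
active_imp = [9.0, 10.0, 11.0, 10.0]  # values imputed for active-arm dropouts

tip = tipping_delta(active_obs, active_imp, placebo, deltas=range(0, 11))

# Pooled within-arm SD at delta = 0, used here as a stand-in for the residual SD.
active_all = active_obs + active_imp
pooled_sd = math.sqrt(
    ((len(active_all) - 1) * variance(active_all) + (len(placebo) - 1) * variance(placebo))
    / (len(active_all) + len(placebo) - 2)
)
print(f"tipping delta = {tip} raw units = {tip / pooled_sd:.2f} SD units")
```

Reporting both the raw and the SD-scaled delta lets a reviewer judge plausibility against the MCID and against other trials.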
Aducanumab (Biogen BLA 761178, 2021): EMERGE and ENGAGE studies both stopped early for futility; EMERGE high-dose positive, ENGAGE negative. MMRM-MAR primary. FDA Office of Biostatistics (Tristan Massie review) argued futility-stop-induced missingness was not MAR (differential ARIA-driven unblinding); the November 2020 AdCom voted against the evidence of effectiveness (10 no, 0 yes, 1 uncertain on the key question). Textbook case showing that an MAR-based primary in a trial with high differential missingness is regulator-divisive.
Aprocitentan (Idorsia PRECISION trial, FDA approval 2024): documented in Mathur 2025 Pharm Stat (PMC12753554). FDA pushed back on sponsor's MMRM-MAR primary; MAR was not credible for treatment-discontinuers. Accepted compromise: stratified imputation — J2R for treatment-discontinuation ICEs, MAR-MMRM for other missingness. This hybrid is now de facto FDA standard for treatment-policy estimand.
Wegovy STEP trials (semaglutide 2.4 mg; Wilding 2021 NEJM; NDA 215256): retrieved-dropout MI as primary for treatment-policy. Missing body weight at week 68 imputed by sampling from observed week-68 measurements among "retrieved dropouts" (patients who discontinued semaglutide but remained in follow-up). J2R-MI as supportive. RD-MI now standard for chronic weight management. FDA 2025 obesity guidance explicitly endorses MI as primary.
CONSORT 2010 discouraged baseline significance tests because randomisation is a known mechanism, not a hypothesis. Many journals still require them.
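```python
from tableone import TableOne

# df: one row per randomised subject, with an 'ARM' column for the treatment group
columns = ['age', 'sex', 'race', 'bmi', 'baseline_score', 'disease_stage']
categorical = ['sex', 'race', 'disease_stage']

# pval=True only where a journal insists on baseline tests; smd=True gives the preferred balance metric
table1 = TableOne(df, columns=columns, categorical=categorical,
                  groupby='ARM', pval=True, smd=True,
                  missing=True, overall=True)
print(table1.tabulate(tablefmt='github'))
table1.to_excel('table1.xlsx')
```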
Use standardised mean differences (SMD) rather than p-values: SMD > 0.1 suggests meaningful imbalance regardless of statistical significance.
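For reference, the SMD itself is a short formula. The sketch below follows the usual definitions for continuous and binary covariates (difference divided by the square root of the average of the two group variances); it is an independent illustration with made-up data and hypothetical function names, not a description of how tableone computes its SMD column.

```python
import math
from statistics import mean, variance


def smd_continuous(x_treat, x_ctrl):
    """Standardised mean difference for a continuous covariate."""
    pooled_sd = math.sqrt((variance(x_treat) + variance(x_ctrl)) / 2)
    return (mean(x_treat) - mean(x_ctrl)) / pooled_sd


def smd_binary(p_treat, p_ctrl):
    """Standardised difference for a binary covariate given the two proportions."""
    pooled_sd = math.sqrt((p_treat * (1 - p_treat) + p_ctrl * (1 - p_ctrl)) / 2)
    return (p_treat - p_ctrl) / pooled_sd


age_drug = [60.0, 62.0, 64.0, 66.0, 68.0]
age_placebo = [59.0, 61.0, 63.0, 65.0, 67.0]
smd_age = smd_continuous(age_drug, age_placebo)
smd_female = smd_binary(0.50, 0.40)

for name, value in [("age", smd_age), ("female", smd_female)]:
    flag = "imbalance" if abs(value) > 0.1 else "ok"
    print(f"{name}: SMD = {value:.3f} ({flag})")
```

Unlike a p-value, the SMD does not shrink or grow with sample size, so the 0.1 threshold means the same thing in a 60-patient and a 6000-patient trial.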
Senn's "balance testing is incoherent" (1994 Stat Med 13:1715; Altman 1985): balancing via randomisation, testing balance, then adjusting only when the test fails is a selection rule that destroys nominal Type-I error. Pre-specify covariates in the SAP; do not condition adjustment on observed imbalance.
| Population | Definition | Bias direction | Primary use |
|-----------|------------|----------------|-------------|
| ITT | All randomised, as randomised | Conservative (toward null) | Primary efficacy per ICH E9 |
| FAS (Full Analysis Set) | ITT excluding eligibility failures + subjects with no post-baseline data | Middle ground; close to ITT | Common practical primary; ICH E9 "as complete as possible while remaining unbiased" |
| Per-Protocol | Completed treatment per protocol without major violations | Anti-conservative (inflates effect) | Sensitivity analysis only |
| Safety | All received at least one dose | n/a | AE analysis |
| mITT | Sponsor-defined modified ITT | Variable | Pre-specify and justify |
FAS vs ITT distinction is critical for regulatory submissions — FAS may exclude post-randomisation subjects (ineligibility, no post-baseline efficacy); ITT cannot. Sponsors often equate them on the SAP only to discover at submission that FDA expected stricter ITT. Pre-specification in protocol is essential.
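```python
# dm = demographics (one row per randomised subject), ex = exposure records
itt = dm.copy()
# FAS: eligibility_failures and has_post_baseline are assumed, pre-specified ID lists
fas = dm[~dm['USUBJID'].isin(eligibility_failures) & dm['USUBJID'].isin(has_post_baseline)]
pp = dm[dm['USUBJID'].isin(completers) & ~dm['USUBJID'].isin(protocol_violators)]
dosed = ex[ex['EXDOSE'] > 0]['USUBJID'].unique()
safety = dm[dm['USUBJID'].isin(dosed)]
```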
| Mechanism | Definition | Testable? | Valid method |
|-----------|------------|-----------|--------------|
| MCAR | Independent of all data | Partially (Little's test) | Complete-case unbiased but loses power |
| MAR | Depends on observed data only | NO (assumption) | MMRM under MAR; MI under MAR |
| MNAR | Depends on unobserved values | NO | Requires sensitivity analysis (J2R, CR, CIR, tipping point) |
MAR vs MNAR cannot be distinguished from observed data alone — this is a fundamental limitation. Pre-specify the assumed mechanism in the SAP; pre-specify the sensitivity analysis under MNAR (NRC 2010 Recommendation 15: "examining sensitivity to assumptions about the missing-data mechanism should be a mandatory component of reporting").
Clinical reasoning beyond the abstraction: examine the DS (Disposition) domain to tabulate reasons for discontinuation by treatment arm. If discontinuation rates or reasons differ between arms, missing data is likely informative and MNAR sensitivity analyses are mandatory.
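A tabulation of discontinuation reasons by arm can be sketched in a few lines. The example assumes disposition records already merged with the treatment arm, uses the SDTM-style column `DSDECOD` for the standardised reason, and takes made-up records and randomisation counts; the function name is hypothetical.

```python
from collections import Counter, defaultdict


def discontinuation_by_arm(ds_records, randomised):
    """Discontinuation count, rate and reason breakdown per arm."""
    reasons = defaultdict(Counter)
    for rec in ds_records:
        reasons[rec["ARM"]][rec["DSDECOD"]] += 1
    summary = {}
    for arm, n_randomised in randomised.items():
        n_disc = sum(reasons[arm].values())
        summary[arm] = {
            "n_discontinued": n_disc,
            "rate": n_disc / n_randomised,
            "reasons": dict(reasons[arm]),
        }
    return summary


# Illustrative discontinuation records (one per subject who discontinued).
ds_records = [
    {"USUBJID": "001", "ARM": "Drug", "DSDECOD": "ADVERSE EVENT"},
    {"USUBJID": "004", "ARM": "Drug", "DSDECOD": "ADVERSE EVENT"},
    {"USUBJID": "007", "ARM": "Drug", "DSDECOD": "WITHDRAWAL BY SUBJECT"},
    {"USUBJID": "010", "ARM": "Placebo", "DSDECOD": "LACK OF EFFICACY"},
]
randomised = {"Drug": 10, "Placebo": 10}

summary = discontinuation_by_arm(ds_records, randomised)
for arm, row in summary.items():
    print(arm, f"{row['rate']:.0%}", row["reasons"])
```

Here the drug arm loses patients mainly to adverse events and the placebo arm to lack of efficacy, the kind of differential pattern that makes MAR hard to defend.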
Goal: Execute the FDA-preferred primary analysis for a continuous longitudinal endpoint under MAR with valid Type-I in small/moderate trials.
Approach: Fit MMRM with unstructured covariance + Kenward-Roger via the Roche/openpharma mmrm R package; in Python, use statsmodels.mixedlm as exploratory-only (lacks KR).
```python
# Python is weak for MMRM; current state of the art is R `mmrm`
# For Python users, statsmodels.mixedlm is the closest alternative but lacks
# Kenward-Roger; consider rpy2 to call R from Python for confirmatory work.
import statsmodels.formula.api as smf
import pandas as pd

# Random intercept LMM (suboptimal vs MMRM but Python-native)
model = smf.mixedlm(
    'change ~ baseline + C(ARM) * C(VISIT)',
    data=df_long,
    groups=df_long['USUBJID']
).fit(reml=True)

# WARNING: this is NOT FDA-equivalent to MMRM with UN+KR
# For confirmatory work, use R `mmrm` package via rpy2 or fit in R directly
```
```python
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (side-effect import)
from sklearn.impute import IterativeImputer
import statsmodels.formula.api as smf
import numpy as np
import pandas as pd

n_imputations = 20  # rule: m >= 100 * FMI (fraction of missing info)
imputer = IterativeImputer(max_iter=10, random_state=0, sample_posterior=True)

# numeric_cols is assumed to include 'outcome' and 'age'. Imputed values are continuous,
# so a binary outcome with missing values needs a different imputation model.
results = []
for i in range(n_imputations):
    imputer.set_params(random_state=i)
    imputed = pd.DataFrame(imputer.fit_transform(df[numeric_cols]), columns=numeric_cols)
    for col in ['ARM', 'sex']:
        imputed[col] = df[col].values
    model = smf.logit(
        'outcome ~ C(ARM, Treatment(reference="Placebo")) + age', data=imputed
    ).fit(disp=0)
    # Position 1 is the ARM term (position 0 is the intercept) in a two-arm trial
    results.append({'coef': model.params.iloc[1], 'se': model.bse.iloc[1]})

# Rubin's rules
pooled_coef = np.mean([r['coef'] for r in results])
within_var = np.mean([r['se']**2 for r in results])
between_var = np.var([r['coef'] for r in results], ddof=1)
total_var = within_var + (1 + 1/n_imputations) * between_var
pooled_se = np.sqrt(total_var)
pooled_or = np.exp(pooled_coef)
```
Critical sklearn caveats:
- `sample_posterior=True` is essential: without it all m imputations are nearly identical.
- `sample_posterior=True` requires an estimator whose `predict` supports `return_std`; BayesianRidge (the default estimator) does. With an estimator that lacks it (e.g., RandomForestRegressor) the posterior draw cannot be made and fitting fails; switching `sample_posterior` off to get it running makes MI degenerate to single imputation.
- For mixed data types, use miceforest or move to R mice/rbmi.

Congeniality (Meng 1994): imputation model must be at least as flexible as analysis model. If analysis includes treatment-by-covariate interactions, imputation model should include them. Uncongenial imputation biases estimates and invalidates variance pooling.
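The "m >= 100 * FMI" rule needs an FMI estimate, which falls out of the same quantities used for Rubin's rules. The sketch below is a standalone, dependency-free version with Rubin's (1987) degrees of freedom and fraction of missing information; the function name and the example estimates are illustrative assumptions, and the small-sample Barnard-Rubin adjustment is not included.

```python
import math
from statistics import mean, variance


def rubin_pool(estimates, std_errors):
    """Pool m point estimates and standard errors; report FMI and suggested m."""
    m = len(estimates)
    q_bar = mean(estimates)
    within = mean([se ** 2 for se in std_errors])
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    if between == 0:
        df, fmi = math.inf, 0.0  # no between-imputation variability
    else:
        r = (1 + 1 / m) * between / within          # relative increase in variance
        df = (m - 1) * (1 + 1 / r) ** 2             # Rubin (1987) degrees of freedom
        fmi = (r + 2 / (df + 3)) / (r + 1)          # fraction of missing information
    return {
        "estimate": q_bar,
        "se": math.sqrt(total),
        "df": df,
        "fmi": fmi,
        "suggested_m": math.ceil(100 * fmi),
    }


# Illustrative log-odds-ratio estimates from m = 5 imputed datasets.
pooled = rubin_pool([0.50, 0.60, 0.40, 0.55, 0.45], [0.2] * 5)
print(pooled)
```

If `suggested_m` exceeds the number of imputations actually run, the usual remedy is to rerun with more imputations rather than to report the unstable pooled SE.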
| Method | Approach | Conservatism |
|--------|----------|--------------|
| Bonferroni | alpha / m | Most conservative |
| Hierarchical (gatekeeping) | Pre-specified order; proceed only if previous rejects | Moderate; full alpha for first |
| Graphical procedure (Bretz-Maurer) | Directed graph; alpha propagates on rejection | Flexible; standard in modern SAPs |
| Hochberg / Hommel step-up | Ordered p-values vs alpha/(m-k+1) | Less conservative than Bonferroni; requires PRDS |
See clinical-biostatistics/multiplicity-graphical for the full Bretz-Maurer-Hommel treatment with gMCP.
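For intuition, three of the simpler procedures can be written directly. This is a minimal sketch with hypothetical function names and made-up p-values; it does not cover the graphical procedure, for which gMCP remains the reference tool.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject H_i when p_i <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def fixed_sequence(pvals, alpha=0.05):
    """Test in the pre-specified order at full alpha; stop at the first non-rejection."""
    rejected, still_open = [], True
    for p in pvals:
        still_open = still_open and p <= alpha
        rejected.append(still_open)
    return rejected


def hochberg(pvals, alpha=0.05):
    """Step-up: find the largest k with p_(k) <= alpha / (m - k + 1), reject H_(1..k)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p-value
    n_reject = 0
    for k in range(m, 0, -1):
        if pvals[order[k - 1]] <= alpha / (m - k + 1):
            n_reject = k
            break
    rejected = [False] * m
    for i in order[:n_reject]:
        rejected[i] = True
    return rejected


pvals = [0.012, 0.030, 0.040]  # primary, key secondary 1, key secondary 2
print("Bonferroni:    ", bonferroni(pvals))      # [True, False, False]
print("Fixed sequence:", fixed_sequence(pvals))  # [True, True, True]
print("Hochberg:      ", hochberg(pvals))        # [True, True, True]
```

The same three p-values lead to one or three rejections depending on the procedure, which is why the choice has to be fixed in the SAP before unblinding.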
| Pattern | Likely cause | Action |
|---------|--------------|--------|
| MMRM-MAR primary p < 0.05; reference-based MI sensitivity p > 0.05 | MNAR mechanism after treatment discontinuation; J2R imputes active arm toward placebo | Decide which estimand is regulatory primary; if treatment policy, switch to reference-based MI as primary (Aprocitentan precedent) |
| Retrieved-dropout MI vs J2R-MI differ | RD-MI uses observed post-ICE data; J2R uses reference-arm-based assumption | If post-ICE data available, RD-MI is empirically grounded (preferred); J2R as supportive |
| Cro information-anchored variance vs Wolbers frequentist variance differ | Information-anchored Rubin's pools within+between; frequentist jackknife/bootstrap captures borrowing-induced variance reduction | Report both; cite Cro 2019 and Wolbers 2022 (Bartlett 2021 frequentist critique); 2024-2025 FDA practice accepts both with one as supportive |
| ITT primary p < 0.05; PP secondary p > 0.05 | Per-protocol excludes non-completers (often differentially); PP-only effect inflated when significant; non-significance in PP is signal of fragility | ITT remains primary (ICH E9); PP as sensitivity flag; investigate completer pattern |
| Statistical significance achieved but effect estimate below MCID | Powered for δ << MCID; or δ = MCID with no precision margin | Pre-specify δ >= 1.5 × MCID in SAP (postdoc rule); report against pre-specified MCID, not 0 |
| Estimand strategies (hypothetical vs treatment-policy) give different effect sizes | Different ICE-handling strategies target different parameters | Report all pre-specified estimands; pick PRIMARY for the regulatory question (not the "winner"); cite Kahan 2023 |
| Subgroup effect estimate >2x main effect | Winner's curse (Sun 2010 documented median 2.4x inflation) | Apply Bayesian shrinkage (Dixon-Simon, RBesT) for corrected estimate; cite as discovery, not confirmatory |
Citation: Hopewell, Chan, Collins et al 2025. CONSORT 2025 statement. Lancet 405:1633-1640. E&E in BMJ 2025;389:e081124.
30-item checklist (was 25), 7 new items, 3 substantially revised, 1 deleted. Key new items relevant to statistical reporting:
Estimands did NOT make consensus for mandatory inclusion — they appear in Box 1 (terminology only). For regulatory submissions, ICH E9(R1) is the operative estimand standard, NOT CONSORT 2025. Sponsors should follow ICH E9(R1) directly for the 5-attribute estimand statement.
No DOORS framework in CONSORT 2025 — diversity/equity additions are happening via a separate SAGER-SPIRIT-CONSORT alignment workstream (GENDRO/EASE).
```python
flow = {
    'screened': len(screening_log),
    'eligible': len(screening_log[screening_log['eligible']]),
    'randomized': len(dm),
    'allocated_drug': len(dm[dm['ARM'] == 'Drug']),
    'allocated_placebo': len(dm[dm['ARM'] == 'Placebo']),
    'completed_drug': len(dm[(dm['ARM'] == 'Drug') & dm['USUBJID'].isin(completers)]),
    'completed_placebo': len(dm[(dm['ARM'] == 'Placebo') & dm['USUBJID'].isin(completers)]),
    'analyzed_itt': len(itt),
    'analyzed_fas': len(fas),
    'analyzed_pp': len(pp),
    'analyzed_safety': len(safety),
}
```
CONSORT 2025 templates on consort-spirit.org now explicitly accommodate non-1:1 allocation, cluster, multi-arm, and crossover variants.
sample_posterior=True; verify estimator is BayesianRidge; consider miceforest or R mice for mixed types.

| Threshold | Source | Rationale |
|-----------|--------|-----------|
| m >= 100 * FMI imputations | White, Royston, Wood 2011 Stat Med 30:377 (linear rule); von Hippel 2020 Sociol Methods Res refines it with a quadratic rule | Adequate for stable pooled SE; with 40% missingness and FMI ~0.3, m=30 needed |
| SMD > 0.1 = meaningful imbalance | Austin 2009 Stat Med 28:3083 | Beyond what randomisation would normally produce |
| Missing data > 40% on key variable | NRC 2010 | Above this, MI under MAR is unreliable; treat as hypothesis-generating |
| Kenward-Roger DF correction for MMRM with UN | Kenward-Roger 1997 | Without it, MMRM-REML under-covers in small/moderate trials; Type-I inflates 1-2 pp |
| Information-anchored vs frequentist variance for reference-based MI | Cro 2019 vs Wolbers 2022 | Active regulatory debate; report both for safety |
| Tipping delta in residual SD units, not raw | FDA Division of Biometrics preference | Cross-trial comparison; report adjacent to raw |
| Treatment policy default for cardio/HF/CV | Fleming 2025 Stat Med | Only strategy preserving randomisation; FDA-favoured |
| Error / symptom | Cause | Solution |
|-----------------|-------|----------|
| Reviewer: "what is the estimand?" with no answer | Method-before-estimand | Pre-specify 5 attributes in protocol; cite Kahan 2023 finding 98% don't articulate |
| LOCF used as primary | Inertia or "conservative" misconception | LOCF is biased even under MCAR (Mallinckrodt 2008); switch to MMRM or MI |
| MMRM with CS covariance treated as equivalent to UN | Convergence forced fallback without pre-specification | Pre-specify fallback hierarchy; document deviation if invoked |
| Rubin's variance for J2R with no sensitivity | Cro 2019 information-anchored argument applied without acknowledgement | Cite Bartlett 2021 + Wolbers 2022; report frequentist variance via CMI+jackknife |
| Tipping delta in raw outcome units only | Hard to compare cross-trial | Also report in residual SD units (FDA preference) |
| ITT and FAS conflated | SAP vague on distinction | Pre-specify both with explicit criteria for FAS exclusion |
| Per-protocol significant, ITT not — sponsor highlights PP | Post-randomisation bias inflation | ITT as primary; PP as sensitivity with explicit caveat |
| Imputation only of outcome | Throws away covariates | Joint imputation of outcome + covariates; cite Carpenter-Roger 2013 |
| sample_posterior=True fails or gets switched off | Non-default estimator whose predict lacks return_std | Keep the default estimator (estimator=None, i.e. BayesianRidge) with sample_posterior=True, or use rbmi/mice |

| Pushback | Response |
|----------|----------|
| "What is the estimand?" | Articulate 5 ICH E9(R1) attributes; cite Kahan 2023; state ICE strategy with mechanism (not just label). |
| "Is MAR plausible?" | Examine DS for differential discontinuation patterns; if differential, switch to treatment-policy with retrieved-dropout MI or J2R sensitivity. |
| "Why MMRM not LOCF?" | Mallinckrodt 2008/2014 case for MMRM; LOCF is biased even under MCAR. Cite DIASWG three-pillars doctrine. |
| "Why Rubin's not frequentist variance for J2R?" | Cite Cro 2019 information-anchored argument as rationale; report frequentist (CMI+jackknife) as supportive per Wolbers 2022. |
| "Tipping point analysis result?" | Minimum delta in active arm that flips p > 0.05; report in residual SD units; judge plausibility against MCID and treatment effect. |
| "FAS vs ITT?" | Pre-specified in protocol; FAS excludes [explicit criteria]; both populations analysed; reconciliation table provided. |
| "Has the SAP been registered?" | Yes: clinicaltrials.gov NCTxxxxx with full SAP appended; EU CTR EUCTxxxx; SPIRIT 2025 compliant. |
| "How is subsequent therapy handled?" | Pre-specified per ICH E9(R1); composite strategy (e.g., subsequent therapy = treatment failure) or treatment-policy (include all data). |
| "Where is the multiplicity adjustment for co-primary / key secondary?" | Bretz-Maurer graphical procedure via gMCP pre-specified in SAP; alpha allocation diagram in CSR appendix; cite CONSORT 2025 item 20 + FDA Multiple Endpoints Final Oct 2022; see clinical-biostatistics/multiplicity-graphical. |
| "How are missing baseline covariates handled in ANCOVA?" | Complete-case ANCOVA is unbiased under MCAR baseline missingness (Kahan-Morris 2012); proportion with complete baseline reported in flow diagram; if >5% missing, multiple imputation of baseline as sensitivity analysis. |