data-visualization/forest-funnel-plots/SKILL.md
Build forest plots (HR, OR, RR, beta-coefficient summaries with CIs) and funnel plots (meta-analysis publication-bias diagnostics) using forestplot, metafor, ggforest, and MendelianRandomization with proper axis-scaling, summary-diamond placement, subgroup nesting, and Egger / trim-and-fill asymmetry tests. Use when summarizing effects across subgroups, trials, or instruments — meta-analysis, Mendelian randomization, subgroup HRs.
npx skillsauth add GPTomics/bioSkills bio-data-visualization-forest-funnel-plotsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Reference examples tested with: metafor 4.4+, forestplot 3.1+, ggforestplot 0.1+ (subgroup forests), ggforest from survminer 0.4.9+, MendelianRandomization 0.10+.
Before using code patterns, verify installed versions match. If versions differ:
packageVersion('<pkg>') then ?function_nameIf code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
"Summarize effects across studies / subgroups" -> Render each effect estimate (HR, OR, RR, β) as a square (size = inverse variance / weight), horizontal bar (95% CI), and label, with an optional summary diamond at the bottom from a meta-analysis pool (fixed-effect or random-effects). The funnel plot diagnoses publication bias by plotting effect size vs precision; asymmetry indicates missing small-study-with-null-result publications (Egger 1997).
metafor::forest, metafor::funnel, forestplot::forestplot, survminer::ggforest (Cox HR forests), MendelianRandomization::mr_forestA pooled effect estimate is meaningless if the underlying studies are heterogeneous. The Higgins I² statistic (Higgins-Thompson 2002 Stat Med 21:1539) quantifies between-study variance; the conventional 25/50/75% interpretation tiers come from the Cochrane Handbook §10.10.2 (Higgins et al editors), NOT the original Higgins-Thompson paper which cautioned against rigid cutoffs. A meta-analysis with I² > 75% and a pooled effect must explain the heterogeneity (subgroup analysis, meta-regression) — pooling without explanation is statistically defensible but biologically unhelpful.
A forest plot's bottom must report: pooled estimate + 95% CI + I² + τ² (between-study variance) + Q-test p-value. Without these, the plot is a list of effects, not a meta-analysis.
| Analysis | Tool | Pooling model | Forest method | |----------|------|---------------|---------------| | Single-trial subgroup HRs | survminer::ggforest | None (subgroup display) | Coxph object | | Meta-analysis of binary outcomes | metafor::rma -> forest() | DerSimonian-Laird or REML random-effects | Standard forest | | Meta-analysis of continuous outcomes | metafor::rma(yi, vi) | REML random-effects | Standard forest | | Mendelian randomization | MendelianRandomization::mr_forest | Multiple MR methods | MR-specific forest | | Subgroup forest with interaction p | metafor::rma + addpoly + interaction model | Subgroup REML | Nested forest | | Network meta-analysis | netmeta::forest.netmeta | Bayesian or frequentist | Network forest | | Cumulative meta-analysis (over time) | metafor::cumul + forest | – | Cumulative forest |
| Model | Assumption | When appropriate | Pooled estimate weight | |-------|------------|-------------------|------------------------| | Fixed-effect (Mantel-Haenszel, IVS) | All studies estimate the same true effect | Single mechanism, homogeneous design | 1 / within-study variance | | Random-effects (DerSimonian-Laird, REML) | Studies estimate distinct true effects from a common distribution | Heterogeneous designs / populations | 1 / (within + between variance) |
Use random-effects by default. Fixed-effect assumes all studies estimate the same parameter, which is almost never true across multi-center trials with different populations. REML is the modern default (Viechtbauer 2005); DerSimonian-Laird is the older default still commonly seen.
Goal: Pool study-level effect estimates with random-effects meta-analysis; render a forest plot with study weights, individual effect+CI, and pooled summary diamond.
Approach: Compute per-study yi (effect) and vi (sampling variance); fit REML random-effects model; pass to forest() with prediction interval if heterogeneity is non-trivial.
library(metafor)
# Input: per-study effect (yi) and variance (vi)
# For OR: yi = log(OR), vi = SE(log(OR))^2
# For HR: yi = log(HR), vi = SE(log(HR))^2
res <- rma(yi = log_or, vi = log_or_se^2,
data = studies, slab = paste(author, year),
method = 'REML')
# I^2 and tau^2 in the summary
summary(res)
# I^2 (residual heterogeneity)
# tau^2 (estimated amount of (residual) heterogeneity)
# Q-test for heterogeneity
forest(res,
atransf = exp, # display OR on natural scale
at = log(c(0.25, 0.5, 1, 2, 4)), # ticks at meaningful OR values
refline = 0, # log(1) for OR/HR/RR
xlab = 'Odds Ratio (95% CI)',
header = c('Study', 'OR [95% CI]'),
mlab = bquote(paste('RE Model (Q = ', .(round(res$QE, 2)),
', df = ', .(res$k - 1),
', p = ', .(format.pval(res$QEp, digits = 2)),
'; ', I^2, ' = ', .(round(res$I2, 1)), '%)')),
addpred = TRUE) # prediction interval per Higgins 2009
addpred = TRUE adds a 95% prediction interval — where a new study's effect is expected to fall (Higgins-Thompson-Spiegelhalter 2009 JRSS-A). This is the most honest summary when I² > 30%.
library(survminer)
fit <- coxph(Surv(time, status) ~ treatment + age + sex + stage, data = df)
ggforest(fit,
data = df,
main = 'Subgroup HRs',
cpositions = c(0.02, 0.22, 0.4),
fontsize = 0.7,
refLabel = 'Reference',
noDigits = 2)
ggforest produces a publication-ready subgroup forest from a coxph object. For pre-specified subgroup analyses (treatment × subgroup interaction), test interaction explicitly and annotate the p-value.
For k < 5 studies, the REML-based 95% CI from metafor::rma() is severely anti-conservative — it relies on chi-square asymptotics that fail with few studies. Use Hartung-Knapp-Sidik-Jonkman (HKSJ) adjustment (test = 'knha'):
res_hksj <- rma(yi = log_or, vi = log_or_se^2, data = studies,
method = 'REML', test = 'knha') # HKSJ for k<5
HKSJ uses a t-distribution with k-1 degrees of freedom and adjusts SE via the Q-statistic — well-calibrated even at k=3. For k < 3 a meta-analysis is not advisable; report individual study effects in a forest plot without a pooled summary.
Also: I² is uninterpretable below k = 5 (Borenstein 2017 Res Synth Methods 8:5); the point estimate has wide CI dominated by k itself, not heterogeneity. Do not report I² for k < 5.
Goal: Diagnose publication bias by visual asymmetry of effect size vs precision.
Approach: Plot effect (x) vs SE (inverted y); under no bias, points form a symmetric inverted funnel with the pooled estimate at the apex. Asymmetry suggests missing small-N null-result studies. Egger's regression test (Egger 1997 BMJ 315:629) formalizes the asymmetry.
funnel(res,
xlab = 'log(OR)',
refline = res$b)
# Egger's test
regtest(res, model = 'lm', predictor = 'sei')
# significant p indicates asymmetry; suggests publication bias
# Trim-and-fill (Duval-Tweedie 2000) -- adjusts for asymmetry
res_tf <- trimfill(res)
forest(res_tf)
funnel(res_tf)
Contour-enhanced funnel plot (Peters 2008 J Clin Epidemiol 61:991) overlays significance contours (p < 0.10, < 0.05, < 0.01); asymmetry concentrated in "non-significant" regions indicates publication bias more specifically than generic asymmetry.
funnel(res, level = c(90, 95, 99), shade = c('white', 'gray55', 'gray75'),
refline = 0, legend = TRUE)
Trigger: Random-effects meta-analysis pooled with I² > 75%; no subgroup or meta-regression.
Mechanism: Pooled estimate is a weighted average across substantively different effects; biologically meaningless.
Symptom: Pooled OR = 1.5 with 95% CI (1.2-1.8) but per-study effects range 0.3-5.0.
Fix: Run subgroup analysis or meta-regression to explain heterogeneity; report I², τ², and prediction interval; do NOT report a single pooled effect as the answer.
Trigger: Default fixed-effect model on multi-population data.
Mechanism: Fixed-effect weights = 1/within-study variance, ignoring between-study variance.
Symptom: CI is misleadingly narrow; reviewer asks "why fixed effect with high I²?"
Fix: Switch to REML random-effects (method = 'REML'); document the choice.
Trigger: k < 10 studies; significant Egger p taken as definitive publication bias.
Mechanism: Egger's test is underpowered with few studies; sensitive to single outliers.
Symptom: Conclusion "publication bias" from k=6 trials.
Fix: Egger requires k ≥ 10 (Sterne 2011 BMJ 343:d4002); for fewer studies, visual funnel + contour-enhanced funnel is more reliable.
Trigger: Reporting trim-and-fill adjusted estimate as "the answer."
Mechanism: Trim-and-fill is a sensitivity analysis; imputed studies are hypothetical.
Symptom: Original pooled OR = 2.0; trim-and-fill adjusted to 1.5; report says "adjusted estimate is 1.5."
Fix: Present original AND trim-and-fill side-by-side; trim-and-fill is sensitivity, not primary.
Trigger: Subgroup HRs plotted; conclusion "treatment works in subgroup X."
Mechanism: Visual differences across subgroups don't establish significant interaction.
Symptom: Subgroup forest shows HR=0.5 in subgroup A, HR=1.0 in subgroup B; no formal test.
Fix: Add treatment × subgroup interaction term to the model; report interaction p; cite Brookes 2001 / Wang 2007 for subgroup analysis caveats.
Trigger: OR/HR/RR plotted with linear x-axis.
Mechanism: Ratios are multiplicatively symmetric; linear axis compresses < 1 effects.
Symptom: OR = 0.5 (halving) appears smaller than OR = 2 (doubling) on a linear scale, even though they are biologically equivalent.
Fix: Always log-scale the x-axis for ratios. metafor's atransf = exp + at = log(c(0.25, 0.5, 1, 2, 4)) is the canonical pattern.
Trigger: Default forestplot package without weight encoding.
Mechanism: Reader cannot tell study influence on pool.
Symptom: A 5-patient pilot looks visually equivalent to a 5000-patient trial.
Fix: Use metafor forest() which auto-encodes weight via box size. forestplot package needs boxsize = argument.
| Pattern | Cause | Action | |---------|-------|--------| | Fixed-effect significant; random-effects n.s. | High heterogeneity inflates RE variance | Trust random-effects when I² > 30% | | Egger n.s. but funnel looks asymmetric | k < 10 -> Egger underpowered | Trust visual; report contour-enhanced funnel | | Trim-and-fill imputes many studies | Severe asymmetry | Caution; sensitivity, not primary | | Subgroup forest suggests effect modification; interaction test n.s. | Visual difference does not establish formal interaction | Trust interaction test | | MR forest shows divergent estimates across methods | Pleiotropy or weak instruments | Run MR-Egger, weighted median, mode-based (sensitivity); cite Bowden 2015 |
| Threshold | Value | Source | |-----------|-------|--------| | I² substantial heterogeneity | > 50% | Cochrane Handbook §10.10.2 (Higgins et al editors) | | I² considerable heterogeneity | > 75% | Cochrane Handbook §10.10.2 (Higgins et al editors) | | Egger test min k | ≥ 10 | Sterne 2011 BMJ | | Prediction interval (where new study lands) | report when I² > 30% | Higgins-Thompson-Spiegelhalter 2009 | | Forest x-axis | log scale for ratios | Convention |
| Error / symptom | Cause | Solution |
|-----------------|-------|----------|
| Pooled estimate with I² = 90% | No heterogeneity exploration | Subgroup / meta-regression |
| Linear x-axis for OR forest | Ratios should be log-symmetric | atransf = exp, at = log(...) |
| Uniform point sizes | Weights not encoded | metafor::forest auto-encodes; forestplot needs boxsize |
| Egger from k=5 | Underpowered | k ≥ 10 for Egger |
| Trim-and-fill as primary | Sensitivity, not primary | Present both; document |
| Subgroup effect without interaction test | Visual ≠ test | Add interaction term |
| MR forest with single method | Pleiotropy risk | Triangulate methods |
tools
--- name: bio-phasing-imputation-foundations description: Frames the phasing/imputation pipeline before any tool runs: phasing and imputation are one Li-Stephens copying HMM (recombination is the transition, mutation the emission, the genetic map and Ne set the rates), imputation's honest output is a dosage with a self-estimated quality (INFO/R2/DR2) not a hard genotype, and the stages are ordered and each fails silently (QC, align build and strand to the panel, phase, impute per chromosome, fil
tools
Chooses the enrichment generation before any tool runs, mapping the input shape to a method class - a pre-selected gene list plus a background to over-representation analysis (ORA, hypergeometric), a ranked statistic for all genes to gene set enrichment (GSEA), a signed signaling topology to pathway-topology (SPIA) - then making the null explicit (competitive vs self-contained, gene vs subject sampling) and running a trustworthiness checklist (testable-gene universe, FDR, redundancy collapse, leading-edge check, version reporting). Covers why every clusterProfiler GSEA is the inter-gene-correlation-uncorrected competitive null, why the background not the gene list decides ORA significance, and why no method is universally best. Use when deciding ORA vs GSEA vs topology, which gene-set DB, whether a result is trustworthy, or which null a tool computes. For ORA see go-enrichment, GSEA see gsea, databases kegg-pathways/reactome-pathways/wikipathways; the ranking comes from differential-expression/de-results.
testing
End-to-end GWAS workflow from VCF to association results. Covers PLINK QC, population structure correction, and association testing for case-control or quantitative traits. Use when running genome-wide association studies.
development
Orchestrates the full path from differential expression results to redundancy-collapsed functional enrichment: choose ORA vs GSEA, convert gene IDs per method, run enrichGO/enrichKEGG/enrichPathway/enrichWP or gseGO/gseKEGG (clusterProfiler, ReactomePA, rWikiPathways), and visualize. Routes the ORA-vs-GSEA generation fork and the null/universe/reproducibility theory to pathway-analysis/enrichment-foundations. Use when a DESeq2/edgeR/limma result must become enriched GO terms, KEGG/Reactome/WikiPathways pathways, or a GSEA leading edge; when deciding whether a ranking exists for all genes (GSEA, named decreasing vector) or only a pre-selected list (ORA plus a defensible background universe); or when assembling DE-to-pathway end to end. The DE list and ranking statistic come from differential-expression/de-results; per-method nuance lives in the pathway-analysis skills.