plugin/skills/tooluniverse-meta-analysis/SKILL.md
Meta-analysis / evidence synthesis — pool effect sizes across studies (odds ratios, risk ratios, hazard ratios, mean differences, correlations, GWAS betas) with fixed- or random-effects models, quantify heterogeneity (Q, I², τ²), and build a forest plot. Use when you have results from MULTIPLE studies and need a single pooled estimate, or to synthesize evidence from a systematic review / multiple GWAS / replicated experiments. Handles the error-prone effect-size + standard-error preparation (converting OR/HR/CI, two-group means±SD, proportions, and correlations into the (effect, SE) the pooling step needs).
npx skillsauth add mims-harvard/tooluniverse tooluniverse-meta-analysisInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Pool quantitative results from multiple studies into one estimate, and judge how consistent the studies are. This is the statistical half of a systematic review (the literature-collection half is tooluniverse-literature-deep-research).
Do NOT use it to find the studies — use tooluniverse-literature-deep-research / the literature tools for that, then bring the extracted numbers here. Before trusting any input study, consider checking it with Crossref_check_retraction.
1. Extract (effect, SE) per study ← THE ERROR-PRONE STEP
2. Pick fixed vs random effects
3. Pool: MetaAnalysis_run
4. Read heterogeneity (I², Q, τ²)
5. Forest plot + interpret
The pooling step needs an effect size on an additive scale and its standard error. Ratio measures (OR/RR/HR) must be log-transformed first. Most reported numbers give you a CI, not an SE — derive the SE from the CI.
| What the paper reports | effect_size | se |
|---|---|---|
| OR / RR / HR with 95% CI [L, U] | ln(point) | (ln(U) − ln(L)) / (2 × 1.96) |
| OR / RR / HR with a p-value (no CI) | ln(point) | |ln(point)| / z_from_p (two-sided z) |
| GWAS / regression β with SE | β (as reported) | the reported SE |
| Two groups, means + SDs + n₁,n₂ | Hedges' g (see script) | SE of g (see script) |
| Single proportion p, n | logit(p)=ln(p/(1−p)) | sqrt(1/(np) + 1/(n(1−p))) |
| Pearson correlation r, n | Fisher z = atanh(r) | 1 / sqrt(n − 3) |
Critical rules
ln(OR) and exponentiate the pooled result back. The script does this for you.2 × 1.96 assumes a 95% CI; use 2 × 1.645 for 90%, 2 × 2.576 for 99%.The helper script does these conversions — prefer it over hand math:
python skills/tooluniverse-meta-analysis/scripts/meta_analysis.py --input studies.csv
# studies.csv columns (use the set that matches your data):
# name, or, ci_low, ci_high (ratio + CI)
# name, beta, se (already on log/linear scale)
# name, mean1, sd1, n1, mean2, sd2, n2 (two-group means -> Hedges' g)
# name, r, n (correlation -> Fisher z)
| Use fixed-effects when | Use random-effects when | |---|---| | Studies estimate the same true effect (e.g. exact replications, one trial split by site) | Studies differ in population/design/dose (the usual real-world case) | | I² is low (<25%) | I² is moderate–high, or studies are clinically heterogeneous |
When unsure, report random-effects (DerSimonian–Laird) as primary — it is the conservative default and widens the CI to reflect between-study variance.
tu run MetaAnalysis_run '{"method":"random","studies":[
{"name":"Smith 2019","effect_size":0.41,"se":0.12},
{"name":"Lee 2021","effect_size":0.67,"se":0.18},
{"name":"Garcia 2023","effect_size":0.33,"se":0.10}]}'
Returns pooled_effect, pooled_se, pooled_ci_lower/upper, pooled_z, pooled_p_value, a heterogeneity block (Q, Q_df, Q_p_value, I_squared, tau_squared), and per_study weights + CIs.
Scale foot-gun — read this.
MetaAnalysis_runpools whatever scale you hand it and has no idea your inputs were ratios. For an OR/RR/HR you MUST pass the log-transformedeffect_size+sefrom Step 1 (e.g.ln(1.42)=0.351, not1.42) — feeding raw ratios silently produces a wrong pooled value with no error. And the values it returns — including its proseinterpretationstring — are on that same log scale. So: ignore the tool'sinterpretationfield for ratios, andexp()thepooled_effectand CI bounds back to the OR/RR/HR scale yourself before reporting. The helper script avoids all of this — it takes raw ORs, tracks the scale, and prints results already back-transformed.
| I² | Heterogeneity | What it means | |---|---|---| | 0–25% | Low | Studies largely agree; fixed-effects is defensible | | 25–50% | Moderate | Prefer random-effects; note the variability | | 50–75% | Substantial | Random-effects; investigate sources (subgroup / meta-regression) | | >75% | Considerable | Pooling may be inappropriate — explain why studies differ instead |
Q_p_value < 0.10 → statistically significant heterogeneity (Q is low-powered, so 0.10 not 0.05).tau_squared is the between-study variance on the effect scale; > 0 is what random-effects adds over fixed.Q_p_value ≫ 0.10), especially with few studies: fixed and random-effects converge — report random-effects as primary and note that fixed-effects agrees. Don't agonize over the model choice when both give essentially the same pooled estimate.The script prints a text forest plot (per-study effect, CI, weight%, and the pooled diamond). Report, in order:
Example: "Across 3 cohorts (N=4,210), the pooled OR was 1.51 (95% CI 1.33–1.72, p=3.4×10⁻⁶), random-effects. Heterogeneity was substantial (I²=55%, Q p=0.11), so the random-effects model is reported; all three studies showed the same direction of effect."
Crossref_check_retraction) first.tooluniverse-literature-deep-research — find and grade the studies to feed in.tooluniverse-statistical-modeling — single-study regression, Cox, ORs (see references/cox_regression.md for HR extraction).tooluniverse-gwas-study-explorer / tooluniverse-gwas-finemapping — GWAS-specific multi-cohort analysis.tools
PCR / qPCR primer and oligo design — design forward/reverse primers for a target region (SantaLucia nearest-neighbor thermodynamics), compute melting temperature (Tm) and annealing temperature (Ta), check GC content, and screen an oligo for hairpins and primer-dimers. Use when you need primers for a sequence, want to QC an existing primer pair, or need the Tm of an oligo. Covers the primer-design rules (Tm matching, GC clamp, 3'-end, length) and the tools' constraint quirks.
tools
Pharmacokinetic (PK) analysis of concentration-time data — non-compartmental analysis (NCA) for Cmax, Tmax, AUC (0-t and 0-∞), terminal half-life, clearance (CL), volume of distribution (Vd), MRT, and absolute bioavailability (F). Also one-compartment fitting. Use when you have plasma/serum drug concentrations over time after a dose and need PK parameters, or to compute bioavailability from IV + oral AUCs. NOT for ADMET property prediction from structure (use tooluniverse-admet-prediction).
tools
Molecular cloning assembly design — Gibson Assembly (overlap design for seamless multi-fragment joining) and Golden Gate Assembly (Type IIS / BsaI / BbsI design with unique 4-bp fusion overhangs). Use when you need to plan how to join DNA fragments into a construct, design assembly overlaps/overhangs, or decide between cloning methods. Covers the domestication (internal-site removal), overhang-uniqueness, and overlap-Tm rules. For PCR primers to generate the fragments, see tooluniverse-primer-design.
tools
Meta-analysis / evidence synthesis — pool effect sizes across studies (odds ratios, risk ratios, hazard ratios, mean differences, correlations, GWAS betas) with fixed- or random-effects models, quantify heterogeneity (Q, I², τ²), and build a forest plot. Use when you have results from MULTIPLE studies and need a single pooled estimate, or to synthesize evidence from a systematic review / multiple GWAS / replicated experiments. Handles the error-prone effect-size + standard-error preparation (converting OR/HR/CI, two-group means±SD, proportions, and correlations into the (effect, SE) the pooling step needs).