Version Compatibility

Reference examples tested with:

LDSC v1.0.1+ (Python 3): prefer abdenlab/ldsc-python3 v2.0.0. The belowlab/ldsc v3.0.1 README states that the --h2 / --rg / --h2-cts CLI is broken; use Docker jtb114/ldsc:latest for the belowlab fallback. The original bulik/ldsc is Python 2.7 and has been unmaintained since 2019.
HDL 1.4.0+ (R; GitHub zhenin/HDL)
LAVA 0.1.0+ (R; GitHub josefin-werme/LAVA)
HESS 0.5.4+ (Python; huwenboshi/hess)
Popcorn 1.0+ (Python; brielin/Popcorn)
GCTA 1.94+ (GREML-bivariate)
LD-score panels: baselineLD_v2.2 / eur_w_ld_chr from alkesgroup.broadinstitute.org/LDSCORE
HDL reference: UKB-array SVD eigen reference

Before using code patterns, verify installed versions match. If versions differ:

Python: pip show <package> then python -c 'import <module>; help(<module>)'
R: packageVersion('<pkg>') then ?function_name
CLI: <tool> --version then <tool> --help
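
For the Python-based tools, the version check above can be scripted. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and whether a given tool is registered under a pip distribution name in your environment is an assumption to verify (tools run from a git clone will not show up here).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a pip distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.94.1' into (1, 94, 1); stops at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(package, minimum):
    """True only if the package is installed and at least the minimum version."""
    found = installed_version(package)
    return found is not None and version_tuple(found) >= version_tuple(minimum)
```

A call such as meets_minimum('some-package', '1.0') returns False both when the package is too old and when it is not installed, so a False result should be followed by the manual checks listed above.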

If code throws an LD-score "category not found" error, an HDL reference-panel mismatch, or a LAVA locus-ID lookup failure, introspect the installed LD-score column headers and the supplied partitioning file rather than retrying with default flags.

Genetic Correlation

"Estimate the genetic correlation between two traits from GWAS summary statistics" -> Decompose the bivariate genetic architecture into a single global rg (cross-trait LDSC, HDL), per-locus local rg (LAVA, rho-HESS, HDL-L), or cross-population rg (Popcorn). Genetic correlation is the central cross-trait statistic in causal genomics: it quantifies shared etiology, motivates CHP-aware MR sensitivity when high, gates LCV's gcp partial-causation parameter, and feeds into multi-trait analysis frameworks (MTAG, GenomicSEM). Tool choice is a decision about the regime (sumstats vs individual-level; global vs local; same-ancestry vs trans-ancestry) and the sample-overlap structure between input GWAS.

CLI (LDSC, robust to overlap): ldsc.py --rg trait1.sumstats.gz,trait2.sumstats.gz --ref-ld-chr eur_w_ld_chr/ --w-ld-chr eur_w_ld_chr/ --out rg
R (HDL, lower variance, requires independent samples): HDL.rg(gwas1.df, gwas2.df, LD.path = 'UKB_array_SVD_eigen90_extraction', N0 = 0)
R (LAVA, local rg per locus): process.input() -> run.univ() -> run.bivar(input, locus_id) over ~2495 LDetect-derived loci
CLI (rho-HESS, locus-level): hess.py --local-rhog t1.sumstats.gz t2.sumstats.gz --bfile <ref> --partition <part>.bed --chrom <chr>
CLI (Popcorn, trans-ancestry): popcorn fit -v 1 --cfile cross_pop_scores.txt --sfile1 pop1.txt --sfile2 pop2.txt out

Algorithmic Taxonomy

| Method | Model | Input | Output | Strength | Fails when |
|--------|-------|-------|--------|----------|------------|
| Cross-trait LDSC (Bulik-Sullivan 2015 Nat Genet 47:1236) | Bivariate LD-score regression; off-diagonal absorbs rg, intercept absorbs sample overlap | Sumstats + ancestry-matched LD scores | rg, SE, intercept (overlap proxy) | Robust to sample overlap (intercept absorbs it without biasing rg); fast; calibrated EUR | Mean chi-square < 1.02 in either trait (underpowered); non-EUR sumstats with EUR LD scores |
| HDL (Ning 2020 Nat Genet 52:859) | High-Definition Likelihood; eigen-decomposition of full LD with closed-form variance | Sumstats + UKB-array SVD eigen reference (EUR N=336k) | rg, SE | ~60% lower variance than LDSC; equivalent to ~2.5x sample size; preferred when both GWAS truly independent | Sample overlap > 5% biases likelihood; only public reference panel is EUR UKB-array |
| LAVA (Werme 2022 Nat Genet 54:274) | Semi-parametric local genetic correlation per locus; PC-projected SNP effects under a local null | Sumstats + LDetect partitioning (~2495 loci) | Per-locus univariate h2 + bivariate rg + p-value | Detects heterogeneous rg masked by global cancellation; conditional + partial rg supported | Locus has too few SNPs (< 50) or low local h2 (univariate p > 0.05 in either trait); LD reference mismatch |
| rho-HESS (Shi 2017 AJHG 101:737) | Quadratic form on LD-projected effect estimates per locus | Sumstats + LDetect partition + LD reference | Per-locus rho_g + bivariate local rg | Earliest locus-level rg method; complements LAVA | Locus < 1000 SNPs; LD ref must match in-sample structure |
| HDL-L (Li Y et al 2025 Nat Genet) | HDL likelihood applied to local windows | Sumstats + windowed LD reference | Per-window local rg | Lower variance than rho-HESS at locus level | Same sample-overlap caveat as HDL; reference-panel coverage limited |
| GREML-bivariate (Lee 2012 Bioinformatics 28:2540) | Joint REML on bivariate GRM | Individual-level genotypes + both phenotypes | rg + SE | Gold standard at individual level; better precision than sumstats methods | Needs individual-level data on overlapping individuals OR carefully matched two-cohort; population stratification leaks |
| Popcorn (Brown 2016 AJHG 99:76) | Cross-population genetic effect (rho_ge) and impact (rho_gi) correlation under MAF-LD model | Sumstats per population + cross-population LD scores | Trans-ancestry rg + per-pop h2 | Quantifies shared causal architecture across ancestries | Effective N per population < 5000; cross-population LD score reference mismatched to GWAS ancestry |
| Cross-pop causal-effect rg (Galinsky KJ et al 2019 Genet Epidemiol 43:180) | Cross-population genetic correlation of causal effect sizes | Sumstats per population + cross-population reference | Trans-ancestry causal-effect rg + SE | Estimates cross-population correlation of causal effects; complements Popcorn | Same data-volume limit as Popcorn |
| GenomicSEM ldsc() (Grotzinger 2019 Nat Hum Behav 3:513) | LDSC wrapper feeding into SEM | Multiple sumstats | Genetic covariance matrix + multivariable SEM | Multi-trait extension of LDSC; common-factor and bifactor models | Same per-pair limits as LDSC; SEM identification problems |
| SUPERGNOVA (Zhang Y et al 2021 Genome Biol 22:262) | LD-block local rg via eigen-decomposition of the LD matrix | Sumstats + LD-block partition | Per-locus rg + p; orthogonal philosophy from LAVA | Different LD partitioning than LDetect; useful as triangulation against LAVA | Same chi-square floor as LDSC; non-EUR coverage limited |
| KGGSEE gene-based conditional heritability (Miao L et al 2023 AJHG) | Gene-based conditional heritability via effective heritability estimation (EHE) | Sumstats + gene annotation | Per-gene conditional h2 | Java pipeline; gene-level conditional heritability | NOT a local-rg method -- answers a different question than LAVA |

Methodology evolves; benchmark consensus shifts. Verify against the alkesgroup LDSC tutorial (current as of release), Werme 2022 LAVA paper + GitHub, and Speed 2020 Nat Genet model-comparison work before locking a primary method. When a claim depends on the model assumption (e.g. enrichment in shared loci), report at least two methods (e.g. LDSC global + LAVA local).

Cross-Trait LDSC Intercept: Sample Overlap is Absorbed, Not a Bias

The most common postdoc-level misreading: Treating a non-zero cross-trait LDSC intercept as evidence of bias in the rg estimate.

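The bivariate LDSC regression has the form: E[Z1j Z2j] = (sqrt(N1 N2) * rho_g / M) * l_j + rho_overlap, where rho_g = rg * sqrt(h2_1 * h2_2) is the genetic covariance, l_j is the LD score of SNP j, M is the number of SNPs, and rho_overlap = rho_phenotypic * N_shared / sqrt(N1 N2) is the sample-overlap term.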

The intercept (rho_overlap) ABSORBS the contribution of sample overlap (correlated trait residuals on shared individuals). The slope (which carries rg) is unbiased even when overlap is non-zero. A non-zero intercept is the expected signature of sample overlap and is informative (it estimates phenotypic correlation among overlapping individuals), but it does NOT indicate that rg is contaminated.
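
The separation of slope and intercept can be seen in a toy simulation. This is a sketch only: it draws Z-score products directly from a linear model in the LD score and fits plain unweighted least squares, whereas LDSC itself uses a weighted regression with block-jackknife standard errors. All function names, the LD-score range, and the noise level are illustrative assumptions.

```python
import random


def simulate_products(n_snps, slope, intercept, seed=0):
    """Draw toy Z1*Z2 products as slope * LD score + intercept + unit-variance noise."""
    rng = random.Random(seed)
    ld_scores = [rng.uniform(1.0, 300.0) for _ in range(n_snps)]
    products = [slope * l + intercept + rng.gauss(0.0, 1.0) for l in ld_scores]
    return ld_scores, products


def ols(x, y):
    """Ordinary least squares for y = slope * x + intercept; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Same genetic signal (slope), without and with a sample-overlap intercept.
for overlap_intercept in (0.0, 0.15):
    ld, zz = simulate_products(20000, slope=0.002, intercept=overlap_intercept)
    est_slope, est_intercept = ols(ld, zz)
    print(f"true intercept={overlap_intercept:.2f}  "
          f"slope={est_slope:.5f}  intercept={est_intercept:.3f}")
```

Because the two runs share a seed, the only difference between them is the constant overlap term, and the fitted slope is the same in both; only the fitted intercept moves.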

Operational rule: Report the intercept alongside rg. When the intercept is non-zero, document the overlap it implies (intercept = rho_phenotypic * N_shared / sqrt(N1 * N2), approximately) but do not discount rg. HDL, in contrast, assumes truly independent samples and does become biased above ~5% overlap; switch to LDSC under any non-trivial overlap.
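
As a worked illustration of documenting the implied overlap, the sketch below assumes the cross-trait intercept takes the form rho_phenotypic * N_shared / sqrt(N1 * N2), as in the Bulik-Sullivan 2015 cross-trait model. The function names are hypothetical, and the inversion is only as good as the phenotypic correlation supplied.

```python
import math


def expected_gcov_intercept(rho_pheno, n_shared, n1, n2):
    """Intercept expected from n_shared overlapping individuals with phenotypic correlation rho_pheno."""
    return rho_pheno * n_shared / math.sqrt(n1 * n2)


def implied_n_shared(gcov_int, rho_pheno, n1, n2):
    """Invert the same relation: overlap count implied by an observed intercept."""
    if rho_pheno == 0:
        raise ValueError("rho_pheno = 0 makes the overlap count unidentifiable from the intercept")
    return gcov_int * math.sqrt(n1 * n2) / rho_pheno


# Example with the sample sizes used in the LDSC workflow in this document.
intercept = expected_gcov_intercept(rho_pheno=0.4, n_shared=50000, n1=250000, n2=180000)
print(f"expected intercept: {intercept:.4f}")
print(f"implied N_shared:   {implied_n_shared(intercept, 0.4, 250000, 180000):.0f}")
```

With complete overlap (N_shared = N1 = N2) the expression reduces to the phenotypic correlation itself, which is a quick sanity check on any reported intercept.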

HDL Bias Under Sample Overlap

The mirror-image trap: running HDL on two GWAS that share controls or come from the same biobank.

HDL maximizes a likelihood that assumes the two trait residuals are independent after marginalizing genetics. With sample overlap, the residual correlation is non-zero and the likelihood is misspecified; the rg estimate is typically pulled toward the phenotypic correlation in the overlapping subset.

Operational rule: Use HDL only when sample overlap < 5%. When in doubt about overlap (e.g. two UKB-derived GWAS), compute LDSC intercept first; if intercept is materially non-zero, switch to LDSC for the primary rg estimate.

Relationship to MR Causal Inference

Genetic correlation between an exposure and an outcome is a screening statistic, not a causal claim. High |rg| has three biological explanations:

| Explanation | Manifestation |
|-------------|----------------|
| Direct causation X -> Y | All causal SNPs of X feed through to Y; rg reflects mediated covariance |
| Shared heritable confounder (CHP) | A latent factor causes both; rg captures the shared variance with no direct edge |
| Reverse causation Y -> X | Symmetric structure; rg cannot resolve direction |

LCV's gcp parameter (O'Connor & Price 2018 Nat Genet 50:1728) attempts to distinguish partial from full causation. It ranges from -1 to 1: gcp = 0 is pure correlation (no causation in either direction), |gcp| = 1 is full causation, and 0 < |gcp| < 1 is partial causation, with the sign giving the direction (positive means trait 1 is partially causal for trait 2). LCV uses the LDSC-style bivariate moments and is complementary to instrument-based MR.

Operational rule for any MR analysis where |rg| > 0.3: The IVW + Egger + MR-PRESSO triple is insufficient because all three are blind to CHP (Morrison 2020 Nat Genet 52:740). Add CAUSE (if sig SNPs >= 100) or LHC-MR. See causal-genomics/pleiotropy-detection for the full CHP-aware battery.

Decision Tree by Scenario

| Scenario | Recommended | Why |
|----------|-------------|-----|
| Standard EUR-EUR rg from sumstats | Cross-trait LDSC | Robust to sample overlap via intercept; the de facto community default |
| Truly independent EUR samples, want maximum precision | HDL | ~60% lower variance than LDSC; equivalent to 2.5x sample size |
| Suspect heterogeneous rg across genome (e.g. neuropsychiatric pair with weak global rg) | LAVA local rg | Detects loci of shared etiology hidden by global cancellation |
| Locus-level rg with explicit LD partitioning | rho-HESS (or LAVA) | LAVA is newer and better-supported; rho-HESS remains the original framework |
| Cross-population (trans-ancestry) rg | Popcorn | Within-population LDSC fails cross-pop; Popcorn models ancestry-specific causal architecture |
| Individual-level genotypes available | GREML-bivariate (GCTA) | Better precision than sumstats; gold standard at individual level |
| MR validity check before running MR | LDSC rg + LCV gcp | If \|rg\| > 0.3 add CHP-aware MR sensitivity (CAUSE / LHC-MR) |
| Multi-trait modeling (many traits jointly) | GenomicSEM ldsc() | SEM extension of LDSC; common-factor, bifactor, and network models |
| Sumstats with mean chi-square < 1.02 | Defer; meta-analyze to ~50k effective N first | LDSC variance explodes below this; nothing else fixes underpower |
| Confirmatory after a single LAVA hit | LDSC global + bidirectional MR + colocalization | Triangulate; LAVA flags shared etiology but does not establish causation |

Per-Method Failure Modes

Cross-trait LDSC intercept misread as bias

Trigger: Reader (collaborator, reviewer) sees a non-zero LDSC intercept and reports the rg as "biased by sample overlap".

Mechanism: The bivariate LDSC model partitions covariance between traits into a slope (rg-driven, scales with LD score) and an intercept (overlap-driven, constant in LD score). The slope is what carries rg, and it remains unbiased regardless of intercept value (Bulik-Sullivan 2015 Nat Genet 47:1236, Methods). Confusing the intercept with bias on the slope is a routine misinterpretation.

Symptom: Reviewer comment requesting "correction for sample overlap" when LDSC was already used; collaborator suggesting switching to HDL because intercept is non-zero.

Fix: Report rg with SE and the intercept as a separate statistic; cite Bulik-Sullivan 2015 Methods explicitly; do NOT switch to HDL (which is the wrong direction since HDL is the method that breaks under overlap, not LDSC).

HDL bias with sample overlap

Trigger: Running HDL on two GWAS that share > 5% of individuals (e.g. two UKB-derived GWAS, two MVP-derived GWAS, GWAS reusing controls).

Mechanism: HDL likelihood assumes independent trait residuals after marginalizing genetics. Overlap induces non-zero residual correlation; the likelihood is misspecified and the estimate is pulled toward phenotypic correlation in the shared subset (Ning 2020 Nat Genet 52:859 Supplement).

Symptom: HDL rg differs substantially from cross-trait LDSC rg; HDL CI is narrower than expected from N alone; LDSC intercept (which absorbs overlap) is materially non-zero.

Fix: Compute LDSC intercept first as an overlap diagnostic; if non-zero, switch to LDSC as primary. HDL remains valid only when the two GWAS draw from non-overlapping cohorts (verify by cohort identifier, not just by file source).

Non-EUR ancestry mismatch with EUR LD scores

Trigger: Running LDSC on a non-EUR GWAS (or admixed sample) with the default eur_w_ld_chr/ reference.

Mechanism: LD scores are ancestry-specific; mean LD per SNP differs across populations and bivariate moments use the wrong null. h2 and rg estimates are systematically biased; intercept can be inflated.

Symptom: LDSC ratio is unusually high (>0.2); per-chromosome estimates wildly heterogeneous; total h2 mismatches independent estimates from the same cohort.

Fix: Use ancestry-matched LD scores from alkesgroup (eas_w_ld_chr, afr_w_ld_chr) or compute custom LD scores from in-sample LD reference. For trans-ancestry rg, switch to Popcorn.

Global rg masks local rg variation

Trigger: Two traits with biologically plausible shared etiology return global rg ~ 0 in cross-trait LDSC.

Mechanism: Global rg averages over the genome; loci with positive local rg can cancel against loci with negative local rg, particularly for traits with antagonistic pleiotropy (e.g. autoimmune vs infectious-disease susceptibility), or when shared etiology is confined to a small fraction of the genome.

Symptom: Global LDSC rg from well-powered GWAS (mean chi-square >> 1.02) is near zero with a CI overlapping zero, while domain biology, prior co-occurrence studies, or shared-pathway analyses strongly suggest shared etiology.

Fix: Run LAVA over the standard ~2495 LDetect loci; per-locus Bonferroni-significant rg at any locus is evidence of localized shared etiology. Annotate hit loci with overlapping GWAS catalog signals and pathway/tissue enrichment.

Cross-population rg below 1 even at causal level

Trigger: Computing rg between same-trait GWAS in two ancestries (e.g. EUR T2D vs EAS T2D).

Mechanism: Even when the trait has the same biological definition, causal variant identity and effect sizes can differ across populations due to gene-environment interaction, allele-frequency divergence, and population-specific epistasis. Popcorn 2016 demonstrated that rg(cross-pop) < 1 is common and biologically real, not a methodological artifact.

Symptom: Trans-ancestry rg point estimate around 0.6-0.9 with CI excluding 1 for a trait expected to be "the same disease".

Fix: Use Popcorn (within-pop LDSC is invalid for cross-pop rg); interpret rg(cross-pop) < 1 as quantifying population-specific architecture rather than as bias; report rho_ge (effect correlation) and rho_gi (impact correlation) separately.

Same-Trait Cross-Cohort rg as Consistency Check

Use case: Meta-analysis QC -- two same-trait GWAS (e.g. UKB IBD vs FinnGen IBD) yield rg < 1 with CI excluding 1. This is distinct from the cross-population analog above (handled by Popcorn); here both cohorts are same-ancestry but different studies.

Interpretation: rg < 1 between same-trait cohorts can reflect (a) population-substructure differences, (b) phenotype-definition heterogeneity (e.g. different ICD coding, self-report vs registry), or (c) genuine biology (founder effects in isolates like FinnGen).

Decision rule: rg ~ 0.9-1.0 with CI overlapping 1 -> consistent enough to meta-analyze; rg ~ 0.7-0.9 -> moderate heterogeneity, consider sensitivity meta with random effects; rg < 0.7 -> re-examine phenotype definitions before meta-analyzing.
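
The decision rule can be written down as a small classifier. This is a sketch under stated assumptions: the CI is a Wald interval (rg +/- 1.96 * SE), the function names and return labels are hypothetical, and the case the rule leaves open (rg >= 0.9 with a CI that excludes 1) is treated here as moderate heterogeneity.

```python
def wald_ci(rg, se, z=1.96):
    """Two-sided Wald confidence interval for rg."""
    return rg - z * se, rg + z * se


def meta_analysis_readiness(rg, se):
    """Map a same-trait cross-cohort rg onto the three actions of the decision rule."""
    _, upper = wald_ci(rg, se)
    if rg < 0.7:
        return "re-examine phenotypes"
    if rg < 0.9:
        return "random-effects sensitivity"
    if upper >= 1.0:
        return "meta-analyze"
    # ASSUMPTION: rg >= 0.9 but CI excludes 1 is not covered by the rule;
    # treated as moderate heterogeneity in this sketch.
    return "random-effects sensitivity"


for rg, se in [(0.95, 0.04), (0.80, 0.03), (0.60, 0.05), (0.93, 0.01)]:
    print(rg, se, "->", meta_analysis_readiness(rg, se))
```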

Low chi-square mean (underpowered GWAS)

Trigger: Mean chi-square in either input GWAS is below 1.02 (heuristic LDSC threshold).

Mechanism: LDSC, HDL, and LAVA all depend on bivariate moments of Z-scores against LD score; weak signal means the slope is dominated by noise.

Symptom: LDSC rg SE > 0.2; intercept estimates fluctuate across chromosomes; HDL convergence warnings; LAVA returns p > 0.05 at most loci.

Fix: Meta-analyze contributing cohorts to push effective N to ~50k or higher before running rg; if meta-analysis is not feasible, report rg as exploratory; do NOT switch methods to "salvage" power, the problem is upstream of method choice.
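
The mean chi-square check can be run before any rg call. The sketch below assumes a munged LDSC file that is gzip-compressed, tab-separated, and has a Z column; rows with a missing Z (as left by --merge-alleles) are skipped. The function names are hypothetical, and LDSC prints its own mean chi-square in the munge and h2 logs, which remains the authoritative value.

```python
import csv
import gzip


def mean_chi_square(path):
    """Mean of Z^2 over SNPs with a non-missing Z in a munged .sumstats.gz file."""
    total = 0.0
    count = 0
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            z = (row.get("Z") or "").strip()
            if z in ("", "NA", "nan", "NaN"):
                continue  # SNP present in the merge list but absent from the GWAS
            total += float(z) ** 2
            count += 1
    if count == 0:
        raise ValueError("no usable Z-scores found in " + str(path))
    return total / count


def is_underpowered(mean_chi2, floor=1.02):
    """Apply the heuristic LDSC floor quoted in this document."""
    return mean_chi2 < floor
```

A typical use is is_underpowered(mean_chi_square('trait1.munged.sumstats.gz')) for each input, deferring the rg analysis if either returns True.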

LAVA univariate filter ignored

Trigger: Reporting LAVA bivariate rg at a locus where the univariate local h2 is non-significant in one or both traits.

Mechanism: LAVA's bivariate test is only valid at loci with detectable local heritability in BOTH traits. If either trait lacks local h2 signal, the bivariate test is unidentified and can return spurious significant rg.

Symptom: LAVA bivariate p < 0.05 at loci where univariate h2 p > 0.05 for one trait; rg estimates near +/- 1 (boundary cases).

Fix: Filter loci on univ.p < 0.05 / N_loci (Bonferroni for ~2495 loci) in BOTH traits before running run.bivar(); report only at filtered loci. This is the documented LAVA workflow in Werme 2022 Supplement and the GitHub vignette.

Quantitative Thresholds

| Threshold | Source | Rationale |
|-----------|--------|-----------|
| \|rg\| > 0.7 | Operational high correlation | Suggests strong shared genetic architecture; typical of closely related pairs such as MDD-anxiety |
| \|rg\| 0.3-0.7 | Operational moderate correlation | Common in psychiatric / cardiometabolic trait pairs; routine to flag for joint analysis |
| \|rg\| < 0.3 | Operational low correlation | Globally weak; may still harbor biologically meaningful local rg via LAVA |
| rg SE < 0.05 | Operational reliable estimate | Above this SE, point estimate is uncertain to 1 decimal place |
| HDL sample overlap < 5% | Ning 2020 Nat Genet (Supplement) | Above this, HDL likelihood is misspecified and biased |
| LDSC mean chi-square > 1.02 | LDSC documentation (Bulik-Sullivan 2015 tutorial) | Below this, LD-score regression is severely underpowered for h2 / rg |
| LAVA local p < 0.05 / N_loci | Werme 2022 Nat Genet 54:274 | Bonferroni for ~2495 LDetect loci; standard genome-wide local-rg correction |
| Popcorn rho_ge CI excludes 1 | Brown 2016 AJHG 99:76 | Evidence of population-specific causal architecture |
| LCV gcp != 0 (two-sided p < 0.05) | O'Connor & Price 2018 Nat Genet 50:1728 | Directional evidence of (partial) causation given non-zero rg |
| LDSC ratio < 0.2 | Bulik-Sullivan 2015 Methods | High ratio, (intercept - 1) / (mean chi-square - 1), indicates population stratification or model misfit |
| MR + rg sensitivity trigger | Operational | \|rg\| > 0.3 with a significant IVW estimate REQUIRES CHP-aware sensitivity (CAUSE / LHC-MR) |
| Conditional-rg LAVA covariate set | Werme 2022 | Up to 4 conditioning traits per run.pcor() call before identification fails |
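
The LDSC ratio threshold is simple arithmetic on two numbers from the univariate h2 log. The sketch below assumes the usual definition, ratio = (h2 intercept - 1) / (mean chi-square - 1); the function names are hypothetical.

```python
def ldsc_ratio(h2_intercept, mean_chi2):
    """Share of the chi-square inflation attributed to the intercept rather than polygenicity."""
    if mean_chi2 <= 1.0:
        raise ValueError("ratio is undefined when mean chi-square <= 1")
    return (h2_intercept - 1.0) / (mean_chi2 - 1.0)


def ratio_flag(h2_intercept, mean_chi2, ceiling=0.2):
    """True when the ratio exceeds the operational ceiling quoted in the table above."""
    return ldsc_ratio(h2_intercept, mean_chi2) > ceiling


# Intercept 1.03 with mean chi-square 1.30: about a tenth of the inflation sits in the intercept.
print(round(ldsc_ratio(1.03, 1.30), 3))
```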

Cross-Trait LDSC: Standard Workflow

Goal: Estimate global rg from two GWAS summary statistics, robust to any sample overlap.

Approach: Munge each sumstats file (column harmonization + filters), supply ancestry-matched LD scores, run --rg mode; interpret slope (rg) and intercept (overlap proxy) separately.

# Step 1: munge each GWAS to LDSC format (harmonize columns, filter on MAF and INFO, restrict to HapMap3)
munge_sumstats.py \
    --sumstats trait1.tsv.gz \
    --N 250000 \
    --merge-alleles w_hm3.snplist \
    --out trait1.munged

munge_sumstats.py \
    --sumstats trait2.tsv.gz \
    --N 180000 \
    --merge-alleles w_hm3.snplist \
    --out trait2.munged

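# Step 2: cross-trait LDSC (the slope carries rg; the intercept absorbs sample overlap)
ldsc.py \
    --rg trait1.munged.sumstats.gz,trait2.munged.sumstats.gz \
    --ref-ld-chr eur_w_ld_chr/ \
    --w-ld-chr eur_w_ld_chr/ \
    --out rg_t1_t2

# Step 3: pull the rg summary table out of the log
grep -A 11 'Summary of Genetic Correlation Results' rg_t1_t2.log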

The log block reports rg, SE, p-value, h2 per trait, and gcov_int (genetic covariance intercept = phenotypic-correlation overlap proxy). When running rg of one base trait against many others, use comma-separated lists: --rg base.sumstats.gz,t1.gz,t2.gz,t3.gz.
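
For batch runs it can be easier to parse the summary block than to grep it. The sketch below assumes the block consists of the title line, a whitespace-separated header row, and one row per trait pair, ending at the first line with a different number of fields. The column names in the example (p1, p2, rg, se, gcov_int, and so on) are assumed from typical LDSC logs and should be checked against your own output; the function names are hypothetical.

```python
def _maybe_float(text):
    """Convert numeric fields to float; leave file names and 'NA' as strings."""
    try:
        return float(text)
    except ValueError:
        return text


def parse_rg_summary(log_text):
    """Return one dict per trait pair from the rg summary table of an LDSC log."""
    title = "Summary of Genetic Correlation Results"
    lines = log_text.splitlines()
    starts = [i for i, line in enumerate(lines) if title in line]
    if not starts:
        raise ValueError("rg summary block not found in log")
    body = lines[starts[-1] + 1:]
    idx = 0
    while idx < len(body) and not body[idx].strip():
        idx += 1  # tolerate blank lines between the title and the header
    if idx == len(body):
        raise ValueError("rg summary block has no header row")
    header = body[idx].split()
    rows = []
    for line in body[idx + 1:]:
        fields = line.split()
        if len(fields) != len(header):
            break  # blank line or trailing log text ends the table
        rows.append({key: _maybe_float(val) for key, val in zip(header, fields)})
    return rows
```

Reading the log with open('rg_t1_t2.log').read() and passing the text to parse_rg_summary() gives rows whose rg, se, and gcov_int fields can be reported side by side, as the operational rule requires.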

HDL: Lower-Variance rg for Independent Samples

Goal: Estimate global rg with ~60% lower variance than LDSC when the two GWAS draw from non-overlapping samples.

Approach: Format each GWAS as an HDL data frame; supply the UKB-array SVD eigen reference path; pass N0 (overlapping sample count, 0 for independent).

# remotes::install_github('zhenin/HDL/HDL')
library(HDL)

gwas1 <- data.frame(
    SNP = trait1$rsid,
    A1 = trait1$effect_allele,
    A2 = trait1$other_allele,
    N = trait1$N,
    Z = trait1$beta / trait1$se,
    b = trait1$beta,
    se = trait1$se
)

gwas2 <- data.frame(
    SNP = trait2$rsid,
    A1 = trait2$effect_allele,
    A2 = trait2$other_allele,
    N = trait2$N,
    Z = trait2$beta / trait2$se,
    b = trait2$beta,
    se = trait2$se
)

res <- HDL.rg(
    gwas1.df = gwas1,
    gwas2.df = gwas2,
    LD.path = 'UKB_array_SVD_eigen90_extraction',
    N0 = 0,  # number of individuals present in both GWAS; 0 for independent cohorts. HDL's documentation describes rg as usually robust to a misspecified N0, but this guide runs HDL only on verified low-overlap pairs
    output.file = 'hdl_rg.txt'
)

print(res$rg)
print(res$rg.se)
print(res$P)

Pre-download the UKB SVD reference (HDL_documentation.html -> "How to obtain LD reference panel" link); eigen90 is the UKB-array (genotyped-SNP) panel used here, while eigen99 exists only for the imputed-variant panel -- match the panel to the SNP coverage of the input GWAS, not to a speed/precision setting. Do NOT run HDL when sample overlap is unknown or non-trivial; the wrapper does not warn.

LAVA: Local Genetic Correlation Per Locus

Goal: Identify loci where two traits share genetic etiology, including loci hidden by global rg cancellation.

Approach: Process inputs once -> filter to loci with detectable univariate local h2 in BOTH traits -> run bivariate per-locus rg; apply Bonferroni for ~2495 loci.

# remotes::install_github('josefin-werme/LAVA')
library(LAVA)

input <- process.input(
    input.info.file = 'input.info.txt',
    sample.overlap.file = 'sample.overlap.txt',
    ref.prefix = '1kg_EUR_chr',
    phenos = c('trait1', 'trait2')
)

loci <- read.loci('blocks_s2500_m25_f1_w200.GRCh37_hg19.locfile')

univ_results <- list()
biv_results <- list()
N_loci <- nrow(loci)

for (i in seq_len(N_loci)) {
    locus <- process.locus(loci[i, ], input)
    if (is.null(locus)) next  # no SNPs / no h2 / monomorphic
    univ <- run.univ(locus)
    univ_results[[i]] <- cbind(locus = locus$id, univ)  # keep the locus ID with each row
    # Bonferroni on both traits; isTRUE() treats an NA p-value as a failed filter
    pass_univ <- isTRUE(all(univ$p < 0.05 / N_loci))
    if (!pass_univ) next
    biv_results[[i]] <- cbind(locus = locus$id, run.bivar(locus))
}

univ_df <- do.call(rbind, univ_results)
biv_df <- do.call(rbind, biv_results)  # NULL if no locus passed the univariate filter
if (!is.null(biv_df)) {
    biv_df$padj <- p.adjust(biv_df$p, method = 'bonferroni', n = N_loci)
    sig_loci <- subset(biv_df, padj < 0.05)
}

The standard LDetect partitioning files (~2495 EUR loci, ~1700 EAS, ~2700 AFR) are at the LAVA GitHub. Sample-overlap file (a per-pair phenotypic-correlation matrix; LDSC intercept is the standard proxy) protects LAVA from the same overlap bias that LDSC's intercept absorbs. For partial / conditional local rg, use run.pcor(locus, target = c('phenoA', 'phenoB'), phenos = c('cond1', 'cond2', ...)) with up to 4 conditioning traits; the canonical multi-predictor regression alternative is run.multireg().

rho-HESS: Alternative Local rg

Goal: Per-locus bivariate rg using the HESS quadratic-form estimator; complementary to LAVA.

Approach: Estimate local h2 per trait first; then bivariate cross-trait estimator using LD-projected effect estimates per locus.

# Step 1: eigenvalues + projections for both traits (per chromosome; two sumstats SPACE-separated).
# --local-rhog auto-writes per-trait files (step1_trait1_*, step1_trait2_*) AND the covariance
# intermediates (step1_chrN.eig.gz, step1_chrN.prjprod.gz) under the shared --out prefix.
for chr in {1..22}; do
    hess.py \
        --local-rhog trait1.sumstats.gz trait2.sumstats.gz \
        --chrom $chr \
        --bfile 1kg_EUR_chr${chr} \
        --partition fourier_ls-chr${chr}.bed \
        --out step1
done

# Step 2: per-trait local h2 from the Step-1 outputs (MUST run before Step 3)
hess.py --prefix step1_trait1 --out step2_trait1
hess.py --prefix step1_trait2 --out step2_trait2

# Step 3: local genetic covariance / rg. --local-hsqg-est passes the Step-2 per-trait local h2,
# --num-shared is the overlap count, --pheno-cor the phenotypic correlation (any value when
# --num-shared 0). Auto-aggregates across all chromosomes; no per-chromosome loop here.
hess.py \
    --prefix step1 \
    --local-hsqg-est step2_trait1.txt step2_trait2.txt \
    --num-shared 0 \
    --pheno-cor 0 \
    --out step3

--num-shared is the number of overlapping individuals; set 0 only if truly independent. HESS partition files use the Berisa & Pickrell 2016 LD-block boundaries (fourier_ls-*.bed). LAVA has largely superseded HESS for new analyses, but rho-HESS remains in active use for replication / triangulation.

Popcorn: Trans-Ancestry rg

Goal: Quantify shared causal architecture between two ancestries (e.g. EUR T2D vs EAS T2D) under a MAF-LD-aware model.

Approach: Compute cross-population LD scores once per ancestry pair; fit rg using sumstats from each population.

# Step 1: cross-population LD scores (one-time per ancestry pair)
popcorn compute -v 1 \
    --bfile1 1kg_EUR \
    --bfile2 1kg_EAS \
    --SNPs_to_store 20000 \
    --gen_effect \
    eur_eas_scores.txt

# Step 2: fit cross-population rg
popcorn fit -v 1 \
    --cfile eur_eas_scores.txt \
    --gen_effect \
    --sfile1 t2d_eur.sumstats.txt \
    --sfile2 t2d_eas.sumstats.txt \
    t2d_eur_eas_rg.txt

--gen_effect (which must be passed at BOTH compute and fit) selects the genetic-effect model and reports rho_ge (correlation of causal effect sizes); omitting it at both steps yields rho_gi (correlation of variant-level impacts, MAF-weighted). A single run reports one or the other, so run both modes to report both. When MAFs differ markedly across populations, rho_ge and rho_gi diverge; both are biologically meaningful and report-worthy. Effective N per population must be > ~5000 for stable estimates.
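
The effective-N floor can be checked before fitting. The text does not define effective N; the sketch below assumes the common case-control convention Neff = 4 / (1/N_cases + 1/N_controls), with the total N used as-is for quantitative traits. The function names are hypothetical.

```python
def effective_n(n_cases, n_controls):
    """Effective sample size of a case-control GWAS under the 4 / (1/cases + 1/controls) convention."""
    if n_cases <= 0 or n_controls <= 0:
        raise ValueError("case and control counts must be positive")
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def popcorn_ready(neff_pop1, neff_pop2, floor=5000.0):
    """True when both populations clear the approximate floor quoted above."""
    return neff_pop1 > floor and neff_pop2 > floor


# A balanced design keeps its full size; an unbalanced one is capped near 4x the case count.
print(effective_n(10000, 10000))
print(round(effective_n(2000, 100000)))
```

Unbalanced designs are the usual reason a large-looking cohort falls under the floor: 1,000 cases against 200,000 controls gives an effective N just under 4,000.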

Reconciliation Across Methods

| Pattern | Likely cause | Action |
|---------|--------------|--------|
| LDSC rg and HDL rg agree (independent samples) | Both methods converging on true value | Report HDL as primary (lower SE); LDSC as sensitivity |
| LDSC rg substantially below HDL rg, LDSC intercept large | Sample overlap; HDL is biased toward phenotypic correlation | Report LDSC as primary; flag overlap; do not report HDL |
| Global LDSC rg ~ 0 but LAVA shows multiple Bonferroni-significant local rg | Locus-level cancellation in global average | Report both; the biology is "shared at specific loci, divergent overall" |
| LAVA significant but univariate h2 non-significant at hit locus | Spurious bivariate without identified local h2 signal | Filter univariate first; do NOT report bivariate at unidentified loci |
| Popcorn rho_ge << 1 across many trait pairs | Population-specific causal architecture | Real finding; report rho_ge alongside within-pop h2 |
| LDSC ratio > 0.2 | Population stratification or model misfit | Re-check ancestry; consider LD-score reference mismatch; report with caveat |
| LCV gcp ~ 0 with large rg | Genetic correlation without (partial) causation in either direction | Shared confounder hypothesis is preferred; do NOT report as causal |
| LCV gcp > 0 (significant) with large rg | Partial-to-full causation in tested direction | Combine with bidirectional MR + CHP-aware sensitivity; this is supportive but not sufficient |

Operational rule for publication: Report LDSC rg + intercept as primary global statistic; report HDL only if overlap is verified < 5%; complement with LAVA local rg whenever global rg is near zero or biology suggests heterogeneity; report LCV gcp when downstream MR is planned; trans-ancestry analyses require Popcorn (not within-population LDSC).

Common Errors

| Error / symptom | Cause | Solution |
|-----------------|-------|----------|
| LDSC category not found after --rg | LD score column header mismatch (custom reference) | Inspect M_5_50 and .l2.ldscore headers; align with --ref-ld-chr prefix |
| LDSC ratio > 1 (negative h2 z-score) | Severe stratification or wrong LD-score ancestry | Switch to ancestry-matched reference; check for population structure |
| HDL convergence warning / NA SE | Reference panel mismatch or extreme overlap | Verify SVD eigen reference path; switch to LDSC when overlap suspected |
| LAVA Insufficient SNPs at locus for most loci | LD reference and partition file from different builds | Match GRCh37 vs GRCh38; align LD reference to partition file |
| LAVA bivariate rg = +/- 1 at boundary | Univariate filter not applied; locus is unidentified | Apply univ$p < 0.05/N_loci filter to BOTH traits before run.bivar() |
| Popcorn complains about MAF format | sumstats EAF column missing or NA | Provide EAF; do not impute from external reference (creates miscalibration) |
| HESS --num-shared defaulting to wrong value | Forgot to set explicitly; default 0 is independent | Always set explicitly; if unknown, use LDSC intercept to infer overlap |
| GenomicSEM ldsc() returns non-positive-definite covariance | Numerical instability with many traits | Inspect per-pair LDSC results; drop low-h2 traits; regularize |
| rg point estimate > 1 with CI overlapping 1 | Sampling variance; same-trait pair near identity | Report as "rg not distinguishable from 1"; constrained likelihood at the rg=1 boundary gives different SE -- LRT against H0: rg=1 is more precise than Wald CI |
| Binary-vs-continuous trait pair scale concern | Reviewer asks about liability-vs-observed scale propagation | LDSC rg is scale-invariant -- case-control h2 liability vs observed scale propagates equivalently into rg; no correction needed |

Required Reporting for rg Analyses

| Component | Required |
|-----------|----------|
| Per-trait h2 + SE + intercept + mean chi-square | Yes |
| Bivariate rg + SE + p | Yes |
| gcov_int (cross-trait intercept) | Yes; non-zero under known overlap is expected, not bias |
| LD reference panel + ancestry | Yes |
| Method used (LDSC / HDL / LAVA / Popcorn) | Yes; rationale per Decision Tree |
| Local rg supplementary (LAVA) | If global rg null but biology suggests sharing |
| Sample-size: Neff per trait | Yes |

Anticipated Reviewer Pushback

| Pushback | Standard response |
|----------|-------------------|
| "Sample overlap?" | LDSC: gcov_int reported; non-zero under known overlap is expected, NOT bias. HDL: only valid if independent (<5% overlap) |
| "Why LDSC not HDL?" | HDL gives lower variance but is biased > 5% overlap; LDSC is the conservative default |
| "Local vs global rg?" | If global rg modest but biology suggests sharing, LAVA (Werme 2022) reported as supplementary |
| "Cross-ancestry?" | Popcorn for trans-ancestry; rg < 1 in trans is real biology, not noise |
| "Does rg motivate CHP-MR?" | If \|rg\| > 0.3, CAUSE / LHC-MR sensitivity reported (cross-ref pleiotropy-detection) |
| "rg = 1 boundary?" | If CI includes 1, reported as "not distinguishable from rg=1"; constrained LRT alternative provided |

Tool Installation Notes

# LDSC Python 3 fork (original bulik/ldsc is Python 2.7 unmaintained since 2019).
# belowlab/ldsc v3.0.1 broke the --h2/--rg/--h2-cts CLI per its README;
# abdenlab/ldsc-python3 (v2.0.0) retains the working CLI. Docker
# `jtb114/ldsc:latest` is the recommended belowlab fallback.
git clone https://github.com/abdenlab/ldsc-python3.git
cd ldsc-python3 && pip install .   # Poetry project (pyproject.toml); no environment.yml
# pre-computed EUR / EAS / AFR LD scores at alkesgroup.broadinstitute.org/LDSCORE

# HESS
git clone https://github.com/huwenboshi/hess.git
# Berisa-Pickrell LDetect partition files bundled in repo

# Popcorn
git clone https://github.com/brielin/Popcorn.git
cd Popcorn && python setup.py install

# HDL (run inside R, not the shell)
remotes::install_github('zhenin/HDL/HDL')
# UKB-array SVD eigen reference: HDL GitHub README has download link

# LAVA (run inside R)
remotes::install_github('josefin-werme/LAVA')
# Pre-computed LDetect partitioning at LAVA GitHub (s2500_m25_f1_w200 is GRCh37/hg19; lift over for GRCh38)

# GenomicSEM (run inside R)
remotes::install_github('GenomicSEM/GenomicSEM')

References

Bulik-Sullivan B et al 2015 Nat Genet 47:1236 (cross-trait LDSC; intercept absorbs sample overlap)
Bulik-Sullivan B et al 2015 Nat Genet 47:291 (univariate LDSC h2; companion paper)
Ning Z et al 2020 Nat Genet 52:859 (HDL; high-definition likelihood; ~60% lower variance than LDSC)
Werme J et al 2022 Nat Genet 54:274 (LAVA; local genetic correlation via per-locus PC projection)
Shi H et al 2017 AJHG 101:737 (rho-HESS; locus-level bivariate)
Shi H et al 2016 AJHG 99:139 (HESS univariate; companion)
Lee SH et al 2012 Bioinformatics 28:2540 (GREML-bivariate)
Brown BC et al 2016 AJHG 99:76 (Popcorn; trans-ancestry rg)
Galinsky KJ et al 2019 Genet Epidemiol 43:180 (cross-population genetic correlation of causal effect sizes)
Grotzinger AD et al 2019 Nat Hum Behav 3:513 (GenomicSEM)
O'Connor LJ & Price AL 2018 Nat Genet 50:1728 (LCV; gcp parameter)
Morrison J et al 2020 Nat Genet 52:740 (CAUSE; CHP-aware MR motivated by high rg)
Berisa T & Pickrell JK 2016 Bioinformatics 32:283 (LDetect LD blocks underpinning LAVA / HESS)
Speed D, Holmes J & Balding DJ 2020 Nat Genet 52:458 (model comparison of heritability frameworks)
Bulik-Sullivan B 2015 bioRxiv 018283 (relationship between LD Score regression and Haseman-Elston regression)
Skrivankova VW et al 2021 JAMA 326:1614 (STROBE-MR; rg reporting in MR context)

Related Skills

causal-genomics/mendelian-randomization - Primary causal estimation; |rg| > 0.3 motivates CHP-aware sensitivity
causal-genomics/pleiotropy-detection - CAUSE, LHC-MR, LCV; CHP-aware MR battery triggered by high rg
causal-genomics/heritability-partitioning - Partner method; LDSC stack for univariate h2 and partitioned enrichment
causal-genomics/genomic-sem - GenomicSEM ldsc() is the multivariate extension of bivariate LDSC
causal-genomics/colocalization-analysis - Locus-level shared causal variant; complements LAVA hits
causal-genomics/fine-mapping - Credible-set construction at LAVA-significant loci
causal-genomics/mediation-analysis - MVMR for X -> M -> Y after rg motivates causal hypothesis
population-genetics/association-testing - GWAS summary statistics underlying all rg methods
clinical-biostatistics/effect-measures - Translate genetic-architecture findings to clinical effect measures

Genetic Correlation

CLI (LDSC, robust to overlap): ldsc.py --rg trait1.sumstats.gz,trait2.sumstats.gz --ref-ld-chr eur_w_ld_chr/ --w-ld-chr eur_w_ld_chr/ --out rg
R (HDL, lower variance, requires independent samples): HDL.rg(gwas1.df, gwas2.df, LD.path = 'UKB_array_SVD_eigen90_extraction', N0 = 0)
R (LAVA, local rg per locus): process.input() -> process.locus() -> run.univ(locus) -> run.bivar(locus) over ~2495 LDetect-derived loci
CLI (rho-HESS, locus-level): hess.py --local-rhog t1.sumstats.gz t2.sumstats.gz --bfile <ref> --partition <part>.bed --chrom <chr>
CLI (Popcorn, trans-ancestry): popcorn fit -v 1 --cfile cross_pop_scores.txt --sfile1 pop1.txt --sfile2 pop2.txt out

Algorithmic Taxonomy

Cross-Trait LDSC Intercept: Sample Overlap is Absorbed, Not a Bias

The most common postdoc-level misreading: Treating a non-zero cross-trait LDSC intercept as evidence of bias in the rg estimate.

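The bivariate LDSC regression has the form: E[Z1 Z2] = sqrt(N1 N2) * rho_g / M * LD_score + rho * N_s / sqrt(N1 N2), where rho_g = rg * sqrt(h2_1 * h2_2) is the genetic covariance, M is the number of SNPs, N_s is the number of individuals shared between the two GWAS, and rho is the phenotypic correlation among them. Sample overlap enters only through the second (intercept) term, so the slope, and therefore rg, is not biased by it.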
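
To make the role of the overlap term concrete, here is a small Python sketch that evaluates the regression expectation with and without shared samples. It assumes the intercept form from Bulik-Sullivan et al 2015 (phenotypic correlation times N_s / sqrt(N1 N2)); the function names, the genetic covariance, the SNP count, and the overlap figures are all hypothetical:

```python
import math


def expected_cross_trait_intercept(n1, n2, n_shared, pheno_cor):
    """Expected cross-trait intercept: pheno_cor * n_shared / sqrt(n1 * n2)."""
    return pheno_cor * n_shared / math.sqrt(n1 * n2)


def expected_z_product(n1, n2, gcov, m_snps, ld_score, intercept):
    """Expected Z1*Z2 at one SNP: sqrt(n1 * n2) * gcov / M * ld_score + intercept."""
    return math.sqrt(n1 * n2) * gcov / m_snps * ld_score + intercept


# Hypothetical design: 50,000 shared individuals, phenotypic correlation 0.3.
intercept = expected_cross_trait_intercept(250_000, 180_000, 50_000, 0.3)
print(f"expected intercept: {intercept:.4f}")

# Same SNP evaluated with and without overlap: only the constant term moves.
no_overlap = expected_z_product(250_000, 180_000, 0.05, 1_000_000, 100.0, 0.0)
with_overlap = expected_z_product(250_000, 180_000, 0.05, 1_000_000, 100.0, intercept)
print(f"shift in E[Z1 Z2] from overlap: {with_overlap - no_overlap:.4f}")
```

The shift is identical at every LD score, which is why a free intercept absorbs it without touching the slope.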

HDL Bias Under Sample Overlap

The mirror image trap: Running HDL on two GWAS that share controls or come from the same biobank.

Relationship to MR Causal Inference

Genetic correlation between an exposure and an outcome is a screening statistic, not a causal claim. High |rg| has three biological explanations:

Decision Tree by Scenario

Per-Method Failure Modes

Cross-trait LDSC intercept misread as bias

Trigger: Reader (collaborator, reviewer) sees a non-zero LDSC intercept and reports the rg as "biased by sample overlap".

Symptom: Reviewer comment requesting "correction for sample overlap" when LDSC was already used; collaborator suggesting switching to HDL because intercept is non-zero.

HDL bias with sample overlap

Trigger: Running HDL on two GWAS that share > 5% of individuals (e.g. two UKB-derived GWAS, two MVP-derived GWAS, GWAS reusing controls).

Symptom: HDL rg differs substantially from cross-trait LDSC rg; HDL CI is narrower than expected from N alone; LDSC intercept (which absorbs overlap) is materially non-zero.
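
When the overlap between two GWAS is undocumented, one rough screen is to invert the intercept relation (intercept ~ phenotypic correlation * N_s / sqrt(N1 N2)) and compare the implied N_s with the 5% trigger. A Python sketch; the helper names are hypothetical, the phenotypic correlation has to come from an external source, and measuring the overlap fraction against the smaller GWAS is an assumption of this sketch:

```python
import math


def implied_shared_n(gcov_int, n1, n2, pheno_cor):
    """Invert gcov_int = pheno_cor * n_shared / sqrt(n1 * n2) for n_shared."""
    if pheno_cor == 0:
        raise ValueError("overlap is not identifiable when pheno_cor is 0")
    return gcov_int * math.sqrt(n1 * n2) / pheno_cor


def choose_global_method(gcov_int, n1, n2, pheno_cor, max_overlap=0.05):
    """Return 'HDL' when the implied overlap fraction is below max_overlap, else 'LDSC'."""
    n_shared = abs(implied_shared_n(gcov_int, n1, n2, pheno_cor))
    overlap_fraction = n_shared / min(n1, n2)
    return "HDL" if overlap_fraction < max_overlap else "LDSC"


# Hypothetical inputs: two intercept values at a phenotypic correlation of 0.3.
print(round(implied_shared_n(0.07, 250_000, 180_000, 0.3)))
print(choose_global_method(0.07, 250_000, 180_000, 0.3))
print(choose_global_method(0.001, 250_000, 180_000, 0.3))
```

The intercept is itself estimated with error, so an implied N_s should be read against gcov_int's standard error rather than as an exact count.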

Non-EUR ancestry mismatch with EUR LD scores

Trigger: Running LDSC on a non-EUR GWAS (or admixed sample) with the default eur_w_ld_chr/ reference.

Symptom: LDSC ratio is unusually high (>0.2); per-chromosome estimates wildly heterogeneous; total h2 mismatches independent estimates from the same cohort.

Fix: Use ancestry-matched LD scores from alkesgroup (eas_w_ld_chr, afr_w_ld_chr) or compute custom LD scores from in-sample LD reference. For trans-ancestry rg, switch to Popcorn.
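
The ratio in the symptom above is (intercept - 1) / (mean chi-square - 1), read from the univariate h2 block of the LDSC log. A minimal Python sketch of the check; the function names are hypothetical and the 0.2 cut-off is the heuristic quoted in this section, not a hard rule:

```python
def ldsc_ratio(intercept, mean_chi2):
    """LDSC ratio: (intercept - 1) / (mean chi-square - 1)."""
    if mean_chi2 <= 1.0:
        raise ValueError("ratio is undefined when mean chi-square <= 1")
    return (intercept - 1.0) / (mean_chi2 - 1.0)


def flag_reference_mismatch(intercept, mean_chi2, max_ratio=0.2):
    """True when the ratio exceeds the heuristic threshold."""
    return ldsc_ratio(intercept, mean_chi2) > max_ratio


# Illustrative values only.
print(flag_reference_mismatch(1.08, 1.20))  # ratio about 0.40
print(flag_reference_mismatch(1.02, 1.40))  # ratio about 0.05
```

A flagged ratio points to inflation not explained by polygenicity; it does not by itself distinguish a mismatched LD panel from residual stratification.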

Global rg masks local rg variation

Trigger: Two traits with biologically plausible shared etiology return global rg ~ 0 in cross-trait LDSC.

Cross-population rg below 1 even at causal level

Trigger: Computing rg between same-trait GWAS in two ancestries (e.g. EUR T2D vs EAS T2D).

Symptom: Trans-ancestry rg point estimate around 0.6-0.9 with CI excluding 1 for a trait expected to be "the same disease".

Same-Trait Cross-Cohort rg as Consistency Check

Low chi-square mean (underpowered GWAS)

Trigger: Mean chi-square in either input GWAS is below 1.02 (heuristic LDSC threshold).

Mechanism: LDSC, HDL, and LAVA all depend on bivariate moments of Z-scores against LD score; weak signal means the slope is dominated by noise.

Symptom: LDSC rg SE > 0.2; intercept estimates fluctuate across chromosomes; HDL convergence warnings; LAVA returns p > 0.05 at most loci.
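
The trigger can be checked before any rg run by averaging the squared Z-scores of each GWAS. A short Python sketch, assuming the Z-scores are already available as a plain list (in practice they would come from the munged sumstats); the function names and toy values are illustrative:

```python
def mean_chi_square(z_scores):
    """Mean of squared Z-scores (each Z^2 is a 1-df chi-square statistic)."""
    z_scores = list(z_scores)
    if not z_scores:
        raise ValueError("no Z-scores supplied")
    return sum(z * z for z in z_scores) / len(z_scores)


def is_underpowered(z_scores, min_mean_chi2=1.02):
    """True when the mean chi-square falls below the heuristic LDSC threshold."""
    return mean_chi_square(z_scores) < min_mean_chi2


# Toy Z-scores, far too few for a real check; for illustration only.
weak = [0.5, -1.2, 0.3, 1.1, -0.8]
strong = [2.0, -1.5, 1.0, 0.5]
print(f"{mean_chi_square(weak):.3f}", is_underpowered(weak))
print(f"{mean_chi_square(strong):.3f}", is_underpowered(strong))
```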

LAVA univariate filter ignored

Trigger: Reporting LAVA bivariate rg at a locus where the univariate local h2 is non-significant in one or both traits.

Symptom: LAVA bivariate p < 0.05 at loci where univariate h2 p > 0.05 for one trait; rg estimates near +/- 1 (boundary cases).

Quantitative Thresholds

Cross-Trait LDSC: Standard Workflow

Goal: Estimate global rg from two GWAS summary statistics, robust to any sample overlap.

Approach: Munge each sumstats file (column harmonization + filters), supply ancestry-matched LD scores, run --rg mode; interpret slope (rg) and intercept (overlap proxy) separately.

# Step 1: munge each GWAS to LDSC format (harmonize columns, filter on MAF and INFO, restrict to HapMap3)
munge_sumstats.py \
    --sumstats trait1.tsv.gz \
    --N 250000 \
    --merge-alleles w_hm3.snplist \
    --out trait1.munged

munge_sumstats.py \
    --sumstats trait2.tsv.gz \
    --N 180000 \
    --merge-alleles w_hm3.snplist \
    --out trait2.munged

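# Step 2: cross-trait LDSC; the LD-score panel must match the ancestry of both GWAS
ldsc.py \
    --rg trait1.munged.sumstats.gz,trait2.munged.sumstats.gz \
    --ref-ld-chr eur_w_ld_chr/ \
    --w-ld-chr eur_w_ld_chr/ \
    --out rg_t1_t2

# Step 3: print the summary table (rg, SE, z, p, per-trait h2 and intercepts, gcov_int)
grep -A 11 'Summary of Genetic Correlation Results' rg_t1_t2.log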
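
For downstream reporting it can help to pull the summary table into a structured form rather than reading the grep output by eye. A hedged Python sketch: it assumes the table is whitespace-delimited with a header row directly under the 'Summary of Genetic Correlation Results' line, and the log text below is a synthetic stand-in with made-up numbers and assumed column names, not real LDSC output. Check the header of your own log before relying on any column name.

```python
def parse_rg_summary(log_text):
    """Return the rows of the rg summary table as a list of dicts of strings."""
    marker = "Summary of Genetic Correlation Results"
    lines = log_text.splitlines()
    start = next((i for i, line in enumerate(lines) if marker in line), None)
    if start is None or start + 1 >= len(lines):
        raise ValueError("summary table not found in log text")
    header = lines[start + 1].split()
    rows = []
    for line in lines[start + 2:]:
        fields = line.split()
        if len(fields) != len(header):
            break  # a blank line or trailing log text ends the table
        rows.append(dict(zip(header, fields)))
    return rows


# Synthetic stand-in for a log file; column names are assumed, values are made up.
example_log = """Summary of Genetic Correlation Results
p1 p2 rg se z p h2_obs h2_obs_se h2_int h2_int_se gcov_int gcov_int_se
trait1.munged.sumstats.gz trait2.munged.sumstats.gz 0.45 0.05 9.0 2.3e-19 0.12 0.01 1.02 0.008 0.07 0.006

Analysis finished
"""

for row in parse_rg_summary(example_log):
    print(row["p1"], row["p2"], "rg =", row["rg"], "gcov_int =", row["gcov_int"])
```

The parser splits on whitespace, so file paths containing spaces would break it; values are kept as strings so that entries such as NA survive unchanged.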

HDL: Lower-Variance rg for Independent Samples

Goal: Estimate global rg with ~60% lower variance than LDSC when the two GWAS draw from non-overlapping samples.

Approach: Format each GWAS as an HDL data frame; supply the UKB-array SVD eigen reference path; pass N0 (overlapping sample count, 0 for independent).

# remotes::install_github('zhenin/HDL/HDL')
library(HDL)

gwas1 <- data.frame(
    SNP = trait1$rsid,
    A1 = trait1$effect_allele,
    A2 = trait1$other_allele,
    N = trait1$N,
    Z = trait1$beta / trait1$se,
    b = trait1$beta,
    se = trait1$se
)

gwas2 <- data.frame(
    SNP = trait2$rsid,
    A1 = trait2$effect_allele,
    A2 = trait2$other_allele,
    N = trait2$N,
    Z = trait2$beta / trait2$se,
    b = trait2$beta,
    se = trait2$se
)

res <- HDL.rg(
    gwas1.df = gwas1,
    gwas2.df = gwas2,
    LD.path = 'UKB_array_SVD_eigen90_extraction',
    N0 = 0,  # number of overlapping individuals; 0 for independent cohorts. With non-negligible overlap, use cross-trait LDSC instead (see "HDL Bias Under Sample Overlap")
    output.file = 'hdl_rg.txt'
)

print(res$rg)
print(res$rg.se)
print(res$P)

LAVA: Local Genetic Correlation Per Locus

Goal: Identify loci where two traits share genetic etiology, including loci hidden by global rg cancellation.

Approach: Process inputs once -> filter to loci with detectable univariate local h2 in BOTH traits -> run bivariate per-locus rg; apply Bonferroni for ~2495 loci.

# remotes::install_github('josefin-werme/LAVA')
library(LAVA)

input <- process.input(
    input.info.file = 'input.info.txt',
    sample.overlap.file = 'sample.overlap.txt',
    ref.prefix = '1kg_EUR_chr',
    phenos = c('trait1', 'trait2')
)

loci <- read.loci('blocks_s2500_m25_f1_w200.GRCh37_hg19.locfile')

univ_results <- list()
biv_results <- list()
N_loci <- nrow(loci)

for (i in seq_len(N_loci)) {
    locus <- process.locus(loci[i, ], input)
    if (is.null(locus)) next  # no SNPs / no h2 / monomorphic
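    univ <- run.univ(locus)
    univ_results[[i]] <- cbind(locus = locus$id, univ)  # keep the locus ID with each row
    pass_univ <- all(univ$p < 0.05 / N_loci)  # Bonferroni on both traits
    if (!pass_univ) next
    biv <- run.bivar(locus)
    if (!is.null(biv)) biv_results[[i]] <- cbind(locus = locus$id, biv)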
}

univ_df <- do.call(rbind, univ_results)
biv_df <- do.call(rbind, biv_results)
biv_df$padj <- p.adjust(biv_df$p, method = 'bonferroni', n = N_loci)
sig_loci <- subset(biv_df, padj < 0.05)

rho-HESS: Alternative Local rg

Goal: Per-locus bivariate rg using the HESS quadratic-form estimator; complementary to LAVA.

Approach: Estimate local h2 per trait first; then bivariate cross-trait estimator using LD-projected effect estimates per locus.

# Step 1: eigenvalues + projections for both traits (per chromosome; two sumstats SPACE-separated).
# --local-rhog auto-writes per-trait files (step1_trait1_*, step1_trait2_*) AND the covariance
# intermediates (step1_chrN.eig.gz, step1_chrN.prjprod.gz) under the shared --out prefix.
for chr in {1..22}; do
    hess.py \
        --local-rhog trait1.sumstats.gz trait2.sumstats.gz \
        --chrom $chr \
        --bfile 1kg_EUR_chr${chr} \
        --partition fourier_ls-chr${chr}.bed \
        --out step1
done

# Step 2: per-trait local h2 from the Step-1 outputs (MUST run before Step 3)
hess.py --prefix step1_trait1 --out step2_trait1
hess.py --prefix step1_trait2 --out step2_trait2

# Step 3: local genetic covariance / rg. --local-hsqg-est passes the Step-2 per-trait local h2,
# --num-shared is the overlap count, --pheno-cor the phenotypic correlation (any value when
# --num-shared 0). Auto-aggregates across all chromosomes; no per-chromosome loop here.
hess.py \
    --prefix step1 \
    --local-hsqg-est step2_trait1.txt step2_trait2.txt \
    --num-shared 0 \
    --pheno-cor 0 \
    --out step3

Popcorn: Trans-Ancestry rg

Goal: Quantify shared causal architecture between two ancestries (e.g. EUR T2D vs EAS T2D) under a MAF-LD-aware model.

Approach: Compute cross-population LD scores once per ancestry pair; fit rg using sumstats from each population.

# Step 1: cross-population LD scores (one-time per ancestry pair)
popcorn compute -v 1 \
    --bfile1 1kg_EUR \
    --bfile2 1kg_EAS \
    --SNPs_to_store 20000 \
    --gen_effect \
    eur_eas_scores.txt

# Step 2: fit cross-population rg
popcorn fit -v 1 \
    --cfile eur_eas_scores.txt \
    --gen_effect \
    --sfile1 t2d_eur.sumstats.txt \
    --sfile2 t2d_eas.sumstats.txt \
    t2d_eur_eas_rg.txt

Reconciliation Across Methods

