Identification Theory

Comprehensive framework for causal identification in statistical methodology

Use this skill when working on: causal identification, mediation analysis identification, DAG-based reasoning, potential outcomes, identification assumptions, partial identification, sensitivity analysis, or deriving identification formulas.

Core Concepts

What is Identification?

A causal parameter $\psi$ is identified if it can be uniquely determined from the observed data distribution $P(O)$.

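Formally: within the class of causal models that satisfy the stated assumptions, $\psi$ is identified if any two models with the same observed data distribution also share the same parameter value, i.e. $P_1(O) = P_2(O) \Rightarrow \psi_1 = \psi_2$.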
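To see what the definition rules out, the sketch below builds two hypothetical causal models (illustrative assumptions, not taken from the source) that imply the same observed law of $(A, Y)$ but have different average treatment effects. With no further assumptions, the ATE is therefore not identified from $P(A, Y)$ alone.

```r
# Two hypothetical causal models over binary A and Y, written as full
# population tables of potential outcomes (illustrative assumptions).

# Model 1: pure confounding. U ~ Bernoulli(0.5), A = U, Y(0) = Y(1) = U.
model1 <- data.frame(A = c(0, 1), Y0 = c(0, 1), Y1 = c(0, 1), prob = c(0.5, 0.5))

# Model 2: randomized A with a real effect. Y(a) = a for every unit.
model2 <- data.frame(A = c(0, 1), Y0 = c(0, 0), Y1 = c(1, 1), prob = c(0.5, 0.5))

# Observed-data law P(A, Y): apply consistency, then sum probabilities
observed_law <- function(model) {
  obs <- data.frame(
    A = model$A,
    Y = ifelse(model$A == 1, model$Y1, model$Y0),
    prob = model$prob
  )
  aggregate(prob ~ A + Y, data = obs, FUN = sum)
}

# Causal parameter: ATE = E[Y(1) - Y(0)]
ate <- function(model) sum((model$Y1 - model$Y0) * model$prob)

same_observed_law <- isTRUE(all.equal(observed_law(model1), observed_law(model2)))
ate_values <- c(model1 = ate(model1), model2 = ate(model2))
```

Both models put probability 0.5 on $(A, Y) = (0, 0)$ and 0.5 on $(1, 1)$, yet the ATE is 0 in the first and 1 in the second. Adding an assumption such as exchangeability excludes Model 1 and restores identification.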

Why Identification Matters

Causal Question → Target Estimand → Identification → Estimation → Inference
     ↓                  ↓                ↓               ↓            ↓
  "Does A           E[Y(1)-Y(0)]     Express in      Statistical   Confidence
   cause Y?"                         terms of P(O)    methods      intervals

Without identification, no amount of data can answer causal questions.

Two Frameworks

1. Potential Outcomes (Rubin/Neyman)

Primitives:

$Y(a)$ = potential outcome under treatment $a$
Only $Y = Y(A)$ is observed (consistency)
Fundamental problem: never observe both $Y(0)$ and $Y(1)$ for same unit

Advantages:

Clear definition of causal effects
Natural for experimental reasoning
Connects to missing data theory

2. Structural Causal Models (Pearl)

Primitives:

Directed Acyclic Graph (DAG) encoding causal structure
Structural equations: $Y := f_Y(PA_Y, U_Y)$
Interventions via do-operator: $P(Y | do(A=a))$

Advantages:

Visual representation of assumptions
Systematic identification algorithms
Clear separation of statistical and causal assumptions

DAG Framework

Directed Acyclic Graphs (DAGs)

A DAG $\mathcal{G} = (V, E)$ consists of:

Vertices $V$: Random variables
Directed edges $E$: Direct causal relationships
Acyclic: No directed cycles

Key DAG Terminology

| Term | Definition | Notation |
|------|------------|----------|
| Parents | Direct causes | $PA_Y$ |
| Children | Direct effects | $CH_Y$ |
| Ancestors | All causes | $AN_Y$ |
| Descendants | All effects | $DE_Y$ |
| Collider | Node with two incoming arrows | $A \to C \leftarrow B$ |
| Mediator | Node on causal path | $A \to M \to Y$ |
| Confounder | Common cause | $A \leftarrow C \to Y$ |

# DAG specification and visualization using dagitty
library(dagitty)

# Define mediation DAG
mediation_dag <- dagitty('
  dag {
    A [exposure]
    M [mediator]
    Y [outcome]
    X [confounder]

    X -> A
    X -> M
    X -> Y
    A -> M
    A -> Y
    M -> Y
  }
')

# Visualize
plot(mediation_dag)

# Find adjustment sets
adjustmentSets(mediation_dag, exposure = "A", outcome = "Y")

# Check implied conditional independencies
impliedConditionalIndependencies(mediation_dag)

D-Separation

The Core Concept

Two nodes $A$ and $B$ are d-separated by a set $Z$ if $Z$ blocks every path between them, where a path may follow edges in either direction.

Path Blocking Rules

| Path Type | Effect of conditioning on the middle node |
|-----------|-------------------------------------------|
| Chain: $A \to M \to B$ | Conditioning on $M$ blocks the path |
| Fork: $A \leftarrow C \to B$ | Conditioning on $C$ blocks the path |
| Collider: $A \to C \leftarrow B$ | Blocked by default; conditioning on $C$ (or a descendant of $C$) opens it |
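The blocking rules can be checked numerically. The following is a minimal simulation sketch under assumed linear Gaussian models (the coefficients and variable names are illustrative, not from the source): conditioning on a collider creates an association between independent causes, while conditioning on the middle node of a chain removes one.

```r
set.seed(1)
n <- 20000

# Collider: A -> C <- B, with A and B generated independently
a <- rnorm(n)
b <- rnorm(n)
c_col <- a + b + rnorm(n)

# Chain: A -> M -> B2
m <- a + rnorm(n)
b2 <- m + rnorm(n)

# Helper: coefficient of one term in a fitted linear model
coef_of <- function(fit, term) unname(coef(fit)[term])

collider_marginal    <- coef_of(lm(b ~ a), "a")          # path blocked
collider_conditional <- coef_of(lm(b ~ a + c_col), "a")  # path opened
chain_marginal       <- coef_of(lm(b2 ~ a), "a")         # path open
chain_conditional    <- coef_of(lm(b2 ~ a + m), "a")     # path blocked
```

In this setup the population values are 0 and -0.5 for the collider (marginal and conditional) and 1 and 0 for the chain, so the estimates should land close to those numbers.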

D-separation Formula

$$A \perp\!\!\!\perp_{\mathcal{G}} B \mid Z \iff \text{every path } A \text{---} B \text{ is blocked by } Z$$

# Check d-separation using dagitty
check_dseparation <- function(dag, x, y, z = NULL) {
  if (is.null(z)) {
    dseparated(dag, x, y)
  } else {
    dseparated(dag, x, y, z)
  }
}

# Find adjustment sets for the total effect of x on y
find_dsep_sets <- function(dag, x, y) {
  # Note: these sets block the non-causal (backdoor) paths only;
  # they do not d-separate x and y when x has a causal effect on y
  adjustmentSets(dag, exposure = x, outcome = y, effect = "total")
}

# Verify conditional independence implications against data
verify_ci_implications <- function(dag, data) {
  # localTests() tests every conditional independence implied by the DAG;
  # type = "cis" uses linear regression, so it assumes roughly linear,
  # continuous variables
  results <- localTests(dag, data = data, type = "cis")

  data.frame(
    statement = rownames(results),
    estimate = results$estimate,
    p_value = results$p.value,
    row.names = NULL
  )
}

Backdoor Criterion

Definition

A set $Z$ satisfies the backdoor criterion relative to $(A, Y)$ if:

No node in $Z$ is a descendant of $A$
$Z$ blocks every path between $A$ and $Y$ that contains an arrow into $A$

Backdoor Adjustment Formula

If $Z$ satisfies the backdoor criterion: $$P(Y | do(A = a)) = \sum_z P(Y | A = a, Z = z) P(Z = z)$$

or equivalently: $$E[Y(a)] = E_Z[E[Y | A = a, Z]]$$
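As a worked illustration of the adjustment formula, the sketch below simulates a single binary confounder $Z$ and evaluates $\sum_z E[Y \mid A=a, Z=z] P(Z=z)$ directly from cell means. The data-generating values (a true effect of 2, the treatment probabilities) are assumptions chosen for the example.

```r
set.seed(42)
n <- 50000

# Hypothetical data-generating process: Z confounds A and Y
z <- rbinom(n, 1, 0.5)
a <- rbinom(n, 1, 0.2 + 0.6 * z)
y <- 1 + 2 * a + 3 * z + rnorm(n)   # true ATE = 2

# Unadjusted contrast: mixes the effect of A with the effect of Z
naive <- mean(y[a == 1]) - mean(y[a == 0])

# Backdoor adjustment: sum_z E[Y | A = a, Z = z] P(Z = z)
adjusted_mean <- function(a_val) {
  sum(sapply(c(0, 1), function(z_val) {
    mean(y[a == a_val & z == z_val]) * mean(z == z_val)
  }))
}
adjusted <- adjusted_mean(1) - adjusted_mean(0)
```

Here the unadjusted contrast has population value 3.8, because $P(Z=1 \mid A=1) = 0.8$ and $P(Z=1 \mid A=0) = 0.2$, while the adjusted contrast should be close to 2.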

Front-Door Criterion

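When no observed set satisfies the backdoor criterion, but a mediator $M$ intercepts every directed path from $A$ to $Y$ and is itself unconfounded: $$P(Y | do(A = a)) = \sum_m P(M = m | A = a) \sum_{a'} P(Y | M = m, A = a') P(A = a')$$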

# Check whether a proposed set is a valid adjustment set
check_backdoor <- function(dag, exposure, outcome, adjustment_set) {
  # Minimal valid sets according to dagitty, which applies the generalized
  # adjustment criterion (every backdoor set satisfies it)
  minimal_sets <- adjustmentSets(dag, exposure = exposure,
                                 outcome = outcome, type = "minimal")

  # Test the proposed set directly so valid non-minimal sets are accepted too
  is_valid <- isAdjustmentSet(dag, adjustment_set,
                              exposure = exposure, outcome = outcome)

  list(
    is_valid = is_valid,
    minimal_sets = minimal_sets,
    proposed = adjustment_set
  )
}

# Compute backdoor-adjusted estimate (assumes a numeric 0/1 exposure)
backdoor_adjustment <- function(data, outcome, exposure, adjustment,
                                n_boot = 500) {
  formula_obj <- reformulate(c(exposure, adjustment), response = outcome)

  # Standardization: predict for everyone under a = 1 and a = 0, then average
  standardize <- function(d) {
    model <- lm(formula_obj, data = d)
    d1 <- d; d1[[exposure]] <- 1
    d0 <- d; d0[[exposure]] <- 0
    mean(predict(model, newdata = d1) - predict(model, newdata = d0))
  }

  # Nonparametric bootstrap SE: refits the outcome model on each resample,
  # so it reflects uncertainty in the fitted coefficients
  boot_ates <- replicate(n_boot, {
    standardize(data[sample(nrow(data), replace = TRUE), , drop = FALSE])
  })

  list(ate = standardize(data), se = sd(boot_ates))
}

# Full identification analysis
analyze_identification <- function(dag, exposure, outcome) {
  list(
    adjustment_sets = adjustmentSets(dag, exposure, outcome),
    instrumental_sets = instrumentalVariables(dag, exposure, outcome),
    direct_effects = adjustmentSets(dag, exposure, outcome, effect = "direct"),
    implied_independencies = impliedConditionalIndependencies(dag)
  )
}

Framework Equivalence

For most problems, both frameworks give equivalent results: $$E[Y(a)] = E[Y | do(A=a)]$$

Choose based on context and audience.

Key Identification Assumptions

For Treatment Effects

| Assumption | Formal Statement | Interpretation |
|------------|------------------|----------------|
| Consistency | $Y = Y(A)$ | Observed outcome equals potential outcome for received treatment |
| Positivity | $P(A=a \mid X=x) > 0$ for all $x$ with $P(X=x) > 0$ | Every covariate stratum has both treated and untreated |
| Exchangeability | $Y(a) \perp\!\!\!\perp A \mid X$ | No unmeasured confounding given $X$ |
| SUTVA | No interference, single version of treatment | Units don't affect each other |

For Mediation Effects

Additional assumptions required:

| Assumption | Formal Statement | Interpretation |
|------------|------------------|----------------|
| Cross-world exchangeability | $Y(a,m) \perp\!\!\!\perp M(a^*) \mid X$ | Counterfactual mediator independent of counterfactual outcome |
| No $A$-$M$ interaction (optional) | $Y(a,m) - Y(a',m)$ constant in $m$ | Simplifies identification |
| Compositional | $Y(a) = Y(a, M(a))$ | Potential outcome composition |

Standard Identification Results

1. Average Treatment Effect (ATE)

Target: $\psi = E[Y(1) - Y(0)]$

Under exchangeability (A1), consistency (A2), positivity (A3):

$$\psi = E\left[E[Y | A=1, X] - E[Y | A=0, X]\right]$$

Proof sketch:
\begin{align}
E[Y(a)] &= E[E[Y(a) | X]] && \text{(iterated expectations)} \\
&= E[E[Y(a) | A=a, X]] && \text{(A1: exchangeability)} \\
&= E[E[Y | A=a, X]] && \text{(A2: consistency)}
\end{align}

2. Average Treatment Effect on Treated (ATT)

Target: $\psi_{ATT} = E[Y(1) - Y(0) | A=1]$

Under weaker exchangeability $Y(0) \perp\!\!\!\perp A \mid X$:

$$\psi_{ATT} = E\left[E[Y | A=1, X] - E[Y | A=0, X] \mid A=1\right]$$

3. Natural Direct and Indirect Effects (Mediation)

Target:

NDE: $E[Y(1, M(0)) - Y(0, M(0))]$
NIE: $E[Y(1, M(1)) - Y(1, M(0))]$

Under mediation assumptions (see VanderWeele, 2015):

$$NDE = \int\int \{E[Y|A=1,M=m,X=x] - E[Y|A=0,M=m,X=x]\} \, dP(m|A=0,X=x) \, dP(x)$$

$$NIE = \int\int E[Y|A=1,M=m,X=x] \, \{dP(m|A=1,X=x) - dP(m|A=0,X=x)\} \, dP(x)$$
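The mediation formulas can be evaluated by plugging in fitted regressions. The sketch below is one simple way to do this under assumed linear models with no $A$-$M$ interaction; the simulated coefficients and the helper `mean_y()` are hypothetical and chosen so that the true NDE and NIE both equal 1.

```r
set.seed(11)
n <- 50000

# Hypothetical data-generating process with one measured confounder X
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))
m <- 0.5 * a + 0.3 * x + rnorm(n)
y <- 1.0 * a + 2.0 * m + 0.4 * x + rnorm(n)   # NDE = 1, NIE = 2 * 0.5 = 1
d <- data.frame(x = x, a = a, m = m, y = y)

med_fit <- lm(m ~ a + x, data = d)       # mediator model
out_fit <- lm(y ~ a + m + x, data = d)   # outcome model

# Average of E[Y | A = a_out, M, X] with M set to its mean under A = a_med.
# Plugging in the mediator mean is exact only because out_fit is linear in M.
mean_y <- function(a_out, a_med) {
  d_med <- d
  d_med$a <- a_med
  m_hat <- predict(med_fit, newdata = d_med)
  d_out <- data.frame(x = d$x, a = a_out, m = m_hat)
  mean(predict(out_fit, newdata = d_out))
}

nde <- mean_y(1, 0) - mean_y(0, 0)
nie <- mean_y(1, 1) - mean_y(1, 0)
```

With a nonlinear outcome model, the inner integral over the mediator distribution would need to be computed by simulation or numerical integration rather than by plugging in a mean.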

4. Controlled Direct Effect (CDE)

Target: $CDE(m) = E[Y(1,m) - Y(0,m)]$

Simpler identification (no cross-world assumption):

$$CDE(m) = E[E[Y|A=1,M=m,X] - E[Y|A=0,M=m,X]]$$

DAG-Based Identification

The Back-Door Criterion

A set $X$ satisfies the back-door criterion relative to $(A, Y)$ if:

No node in $X$ is a descendant of $A$
$X$ blocks every path between $A$ and $Y$ that contains an arrow into $A$

If satisfied: $$P(Y | do(A=a)) = \sum_x P(Y | A=a, X=x) P(X=x)$$

The Front-Door Criterion

When an unmeasured confounder $U$ affects both $A$ and $Y$, but $M$ mediates all of $A$'s effect on $Y$ and $U$ does not affect $M$ directly:

      U
    ↙   ↘
  A → M → Y

Identification: $$P(Y | do(A=a)) = \sum_m P(M=m | A=a) \sum_{a'} P(Y | M=m, A=a') P(A=a')$$
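A small simulation can show the front-door formula recovering an effect that the raw contrast gets wrong. This is a sketch under an assumed data-generating process (binary $A$ and $M$, a true effect of $2 \times 0.6 = 1.2$); `front_door_mean()` is a hypothetical helper that evaluates the formula from cell means.

```r
set.seed(7)
n <- 100000

u <- rnorm(n)                       # unmeasured confounder of A and Y
a <- rbinom(n, 1, plogis(u))
m <- rbinom(n, 1, 0.2 + 0.6 * a)    # M depends on A only
y <- 2 * m + u + rnorm(n)           # A affects Y only through M

# sum_m P(M = m | A = a) * sum_a' E[Y | M = m, A = a'] P(A = a')
front_door_mean <- function(a_val) {
  sum(sapply(c(0, 1), function(m_val) {
    p_m_given_a <- mean(m[a == a_val] == m_val)
    inner <- sum(sapply(c(0, 1), function(a_prime) {
      mean(y[m == m_val & a == a_prime]) * mean(a == a_prime)
    }))
    p_m_given_a * inner
  }))
}

front_door <- front_door_mean(1) - front_door_mean(0)
naive <- mean(y[a == 1]) - mean(y[a == 0])   # confounded by U
```

The front-door estimate never uses $U$, which is the point: identification runs entirely through $P(M \mid A)$ and $P(Y \mid M, A)$.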

Instrumental Variables

When $Z$ affects $A$, affects $Y$ only through $A$, and is independent of the unmeasured confounder $U$ of $A$ and $Y$:

      U
     ↙ ↘
Z → A → Y

Local ATE identification (with monotonicity): $$LATE = \frac{E[Y | Z=1] - E[Y | Z=0]}{E[A | Z=1] - E[A | Z=0]}$$
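The ratio above (the Wald estimator) is straightforward to compute. The sketch below uses an assumed threshold model for treatment uptake so that monotonicity holds by construction, with a constant treatment effect of 2 so that the LATE is also 2; all numeric values are illustrative.

```r
set.seed(3)
n <- 100000

z <- rbinom(n, 1, 0.5)          # instrument, randomized
u <- rnorm(n)                   # unmeasured confounder of A and Y
a <- as.integer(u + z > 0.5)    # monotone: Z can only move A from 0 to 1
y <- 2 * a + u + rnorm(n)       # constant effect of 2, so LATE = 2

# Wald estimator: reduced-form contrast divided by first-stage contrast
wald <- (mean(y[z == 1]) - mean(y[z == 0])) /
        (mean(a[z == 1]) - mean(a[z == 0]))

# Treated-vs-untreated contrast, biased because U drives both A and Y
naive <- mean(y[a == 1]) - mean(y[a == 0])
```

The Wald estimate becomes unstable as the first-stage contrast in the denominator approaches zero, which is the weak-instrument problem.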

Sequential Identification (Multiple Mediators)

Sequential Mediation (A → M1 → M2 → Y)

Identifying the product of the three path coefficients ($A \to M_1$, $M_1 \to M_2$, $M_2 \to Y$) requires:

Standard confounding control for each arrow
No intermediate confounders affected by treatment
Sequential ignorability assumptions

Path-specific effects:

Direct: $A \to Y$
Through $M_1$ only: $A \to M_1 \to Y$
Through $M_2$ only: $A \to M_2 \to Y$
Through both: $A \to M_1 \to M_2 \to Y$

Identification Formula (No Intermediate Confounding)

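Under linear models with no interactions: $$\text{Effect through } M_1 \to M_2 = \frac{\partial E[M_1 \mid A, X]}{\partial a} \cdot \frac{\partial E[M_2 \mid A, M_1, X]}{\partial m_1} \cdot \frac{\partial E[Y \mid A, M_1, M_2, X]}{\partial m_2}$$

Expressed as product of coefficients: $\hat{\alpha}_1 \cdot \hat{\beta}_1 \cdot \hat{\gamma}_2$, reading $\hat{\alpha}_1$ as the coefficient of $A$ in the $M_1$ model, $\hat{\beta}_1$ as the coefficient of $M_1$ in the $M_2$ model, and $\hat{\gamma}_2$ as the coefficient of $M_2$ in the outcome model.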
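A minimal sketch of the product-of-coefficients calculation follows. It assumes linear models without interactions, and it assumes that $\alpha_1$, $\beta_1$, and $\gamma_2$ denote the $A \to M_1$, $M_1 \to M_2$, and $M_2 \to Y$ coefficients respectively; the simulated values ($0.5 \times 0.8 \times 1.5 = 0.6$) are illustrative.

```r
set.seed(5)
n <- 50000

# Hypothetical linear system: A -> M1 -> M2 -> Y plus other direct paths
x  <- rnorm(n)
a  <- rbinom(n, 1, 0.5)
m1 <- 0.5 * a + 0.3 * x + rnorm(n)                          # alpha_1 = 0.5
m2 <- 0.8 * m1 + 0.2 * a + 0.3 * x + rnorm(n)               # beta_1  = 0.8
y  <- 1.5 * m2 + 0.4 * m1 + 1.0 * a + 0.3 * x + rnorm(n)    # gamma_2 = 1.5

# One regression per arrow on the path, each adjusting for earlier variables
alpha_1 <- coef(lm(m1 ~ a + x))[["a"]]
beta_1  <- coef(lm(m2 ~ a + m1 + x))[["m1"]]
gamma_2 <- coef(lm(y ~ a + m1 + m2 + x))[["m2"]]

effect_via_both <- alpha_1 * beta_1 * gamma_2
```

Each regression conditions on every variable upstream of its response, which is what the "no intermediate confounding" requirement is relying on here.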

Partial Identification

When point identification fails, we can still bound the parameter.

Manski Bounds (No Assumptions)

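For the treated-arm mean with an outcome bounded in $[y_{min}, y_{max}]$, write $E[Y(1)] = E[Y \mid A=1]P(A=1) + E[Y(1) \mid A=0]P(A=0)$. Only $E[Y(1) \mid A=0]$ is unobserved, so replacing it with its extreme values gives: $$E[Y(1)] \in [E[Y \mid A=1]P(A=1) + y_{min}P(A=0),\; E[Y \mid A=1]P(A=1) + y_{max}P(A=0)]$$

The bounds for $E[Y(0)]$ are analogous, and bounds on the ATE follow by differencing the extremes.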
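The bound follows from the decomposition $E[Y(1)] = E[Y \mid A=1]P(A=1) + E[Y(1) \mid A=0]P(A=0)$, where only the last conditional mean is unobserved. The sketch below computes the resulting worst-case bounds for both arms and for the ATE; `manski_bounds()` is a hypothetical helper and the eight-observation data set is made up for illustration.

```r
# Worst-case bounds for an outcome known to lie in [y_min, y_max]
manski_bounds <- function(y, a, y_min, y_max) {
  p1 <- mean(a == 1)
  p0 <- 1 - p1
  ey1_obs <- mean(y[a == 1]) * p1   # E[Y | A = 1] P(A = 1)
  ey0_obs <- mean(y[a == 0]) * p0   # E[Y | A = 0] P(A = 0)

  # Fill in the unobserved arm with the smallest and largest possible values
  ey1 <- c(lower = ey1_obs + y_min * p0, upper = ey1_obs + y_max * p0)
  ey0 <- c(lower = ey0_obs + y_min * p1, upper = ey0_obs + y_max * p1)
  ate <- c(lower = ey1[["lower"]] - ey0[["upper"]],
           upper = ey1[["upper"]] - ey0[["lower"]])

  list(ey1 = ey1, ey0 = ey0, ate = ate)
}

# Toy binary-outcome data: 4 treated, 4 untreated
y <- c(1, 1, 0, 1, 0, 0, 1, 0)
a <- c(1, 1, 1, 1, 0, 0, 0, 0)
bounds <- manski_bounds(y, a, y_min = 0, y_max = 1)
```

For these data the ATE interval is $[-0.25, 0.75]$. Its width equals $y_{max} - y_{min}$, so without further assumptions the bounds always include zero.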

Sensitivity Analysis

When exchangeability is uncertain, parameterize violation:

Unmeasured confounding parameter $\Gamma$: $$\frac{1}{\Gamma} \leq \frac{P(A=1|X,U=1)/P(A=0|X,U=1)}{P(A=1|X,U=0)/P(A=0|X,U=0)} \leq \Gamma$$

Compute bounds as function of $\Gamma$ (Rosenbaum bounds).

E-Value

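Minimum strength of association (on the risk ratio scale) that an unmeasured confounder would need with both the treatment and the outcome, conditional on measured covariates, to fully explain away an observed risk ratio $RR \geq 1$:

$$E\text{-value} = RR + \sqrt{RR \times (RR-1)}$$

For $RR < 1$, apply the formula to $1/RR$.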
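The formula is a one-liner in code. The sketch below uses a hypothetical helper `e_value()` and handles protective effects by inverting risk ratios below 1 before applying the formula.

```r
# E-value for an observed risk ratio (point estimate)
e_value <- function(rr) {
  stopifnot(is.numeric(rr), all(rr > 0))
  rr <- ifelse(rr < 1, 1 / rr, rr)   # work on the scale where RR >= 1
  rr + sqrt(rr * (rr - 1))
}

ev_harmful    <- e_value(2)     # 2 + sqrt(2)
ev_protective <- e_value(0.5)   # same as e_value(2) after inversion
ev_null       <- e_value(1)     # no association to explain away
```

An observed risk ratio of 2 gives an E-value of $2 + \sqrt{2} \approx 3.41$, and a null risk ratio gives 1.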

Identification Strategies by Design

Randomized Controlled Trials (RCTs)

Treatment assignment random → exchangeability holds by design
Still need SUTVA, consistency
For mediation: randomize $M$ as well, or use sequential ignorability

Observational Studies

| Strategy | Key Assumption | Best For |
|----------|----------------|----------|
| Regression adjustment | All confounders measured | Rich covariate data |
| Propensity score | Correct PS model | High-dimensional confounders |
| Instrumental variables | Valid instrument exists | Unmeasured confounding |
| Regression discontinuity | Continuity at threshold | Sharp treatment rules |
| Difference-in-differences | Parallel trends | Panel data |

Natural Experiments

Exploit exogenous variation (policy changes, geographic variation)
Requires careful argument for why variation is "as-if random"

Identification in the MediationVerse

medfit: Foundation

Implements standard mediation identification
VanderWeele regression-based approach
Supports binary/continuous treatments and mediators

probmed: Effect Size

$P_M$ identification requires identified NDE/NIE
Handles case when NDE and NIE have opposite signs

RMediation: Confidence Intervals

Takes identified effects as input
Distribution of product of coefficients (PRODCLIN)
Monte Carlo intervals

medrobust: Sensitivity

When identification assumptions are uncertain
Bounds on effects under confounding
E-values for unmeasured confounding

medsim: Validation

Simulate data where truth is known
Verify identification formulas recover true effects
Test estimator properties

Identification Proof Template

\begin{theorem}[Identification of $\psi$]
Under Assumptions:
\begin{enumerate}[label=A\arabic*.]
\item (Consistency) $Y = Y(A)$, $M = M(A)$
\item (Positivity) $P(A=a|X) > \epsilon > 0$ for all $a \in \mathcal{A}$
\item (Exchangeability) $Y(a) \perp\!\!\!\perp A \mid X$
\end{enumerate}
the causal estimand $\psi = E[g(Y(a))]$ is identified by
\[
\psi = E_X\left[E[g(Y) \mid A=a, X]\right].
\]
\end{theorem}

\begin{proof}
\begin{align}
E[g(Y(a))] &= E\left[E[g(Y(a)) \mid X]\right]
    && \text{(law of total expectation)} \\
&= E\left[E[g(Y(a)) \mid A=a, X]\right]
    && \text{(by A3: exchangeability)} \\
&= E\left[E[g(Y) \mid A=a, X]\right]
    && \text{(by A1: consistency)}
\end{align}
The RHS depends only on the observed data distribution $P(Y,A,X)$.
\end{proof}

Common Identification Pitfalls

1. Conditioning on Colliders

A → C ← Y

Conditioning on $C$ opens a path between $A$ and $Y$.

2. Conditioning on Mediators

A → M → Y

Conditioning on $M$ blocks the indirect effect; it does not control confounding.

3. Overcontrol Bias

Conditioning on descendants of treatment can bias estimates.

4. M-Bias

U1 → X ← U2
↓         ↓
A ——————→ Y

Conditioning on $X$ opens path $A \leftarrow U_1 \rightarrow X \leftarrow U_2 \rightarrow Y$.
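M-bias can be reproduced in a few lines. The sketch below simulates the M-structure with assumed unit coefficients and standard normal noise (illustrative values only) and compares the coefficient of $A$ with and without adjustment for the collider $X$.

```r
set.seed(9)
n <- 50000

# M-structure: U1 -> A, U1 -> X <- U2, U2 -> Y, and A -> Y with effect 1
u1 <- rnorm(n)
u2 <- rnorm(n)
x <- u1 + u2 + rnorm(n)       # pre-treatment collider
a <- u1 + rnorm(n)
y <- 1 * a + u2 + rnorm(n)    # true effect of A on Y is 1

unadjusted <- coef(lm(y ~ a))[["a"]]       # no open backdoor path
adjusted   <- coef(lm(y ~ a + x))[["a"]]   # conditioning on X opens one
```

In this setup the unadjusted coefficient has population value 1 and the adjusted coefficient has population value 0.8, so adjusting for a pre-treatment covariate introduces bias rather than removing it.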

5. Table 2 Fallacy

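Interpreting the coefficients of adjustment covariates as causal effects. In a model built to estimate the effect of the exposure, a covariate's coefficient is typically adjusted for variables that lie on its own causal path, so it does not estimate that covariate's total effect.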

Verification Questions

When reviewing identification arguments, ask:

Is the target estimand clearly defined?
Are all assumptions explicitly stated?
Is each step in the derivation justified?
Are the assumptions plausible in this context?
What if an assumption is violated?
Is there a DAG that encodes the assumptions?
Are there alternative identification strategies?

Integration with Other Skills

This skill works with:

proof-architect - For writing identification proofs
asymptotic-theory - For inference after identification
methods-paper-writer - For presenting identification in manuscripts
simulation-architect - For validating identification

Key References

Imai
Hernan
Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.)
VanderWeele, T.J. (2015). Explanation in Causal Inference
Hernán, M.A. & Robins, J.M. (2020). Causal Inference: What If
Imbens, G.W. & Rubin, D.B. (2015). Causal Inference for Statistics

Version: 1.0
Created: 2025-12-08
Domain: Causal Inference, Mediation Analysis

