Proof Architect

Structured methodology for constructing and verifying mathematical proofs in statistical research

Use this skill when working on: mathematical proofs, theorem development, derivations, consistency proofs, asymptotic arguments, identification proofs, or verifying proof correctness.

Proof Structure Framework

Standard Proof Components

Every rigorous statistical proof should contain:

Claim Statement - Precise mathematical statement of what is being proved
Assumptions - All conditions required (clearly enumerated A1, A2, ...)
Notation - Define all symbols before use
Proof Body - Logical sequence of justified steps
Conclusion - Explicit statement that claim is established

Proof Skeleton Template

\begin{theorem}[Name]
\label{thm:name}
Under Assumptions \ref{A1}--\ref{An}, [precise claim].
\end{theorem}

\begin{proof}
The proof proceeds in [n] steps.

\textbf{Step 1: [Description]}
[Content with justification for each transition]

\textbf{Step 2: [Description]}
[Content]

\vdots

\textbf{Step n: Conclusion}
Combining Steps 1--[n-1], we obtain [result], completing the proof.
\end{proof}

Proof Types in Statistical Methodology

1. Identification Proofs

Goal: Show that a causal/statistical quantity is uniquely determined from observed data distribution.

Standard Structure:

Define target estimand (e.g., $\psi = E[Y(a)]$)
State identifying assumptions (consistency, positivity, exchangeability)
Apply identification formula derivation
Show formula depends only on observable quantities

Template:

\begin{theorem}[Identification of $\psi$]
Under Assumptions \ref{A:consistency}--\ref{A:positivity}, the causal effect
$\psi = E[Y(a)]$ is identified by
\[
\psi = \int E[Y \mid A=a, X=x] \, dP(x).
\]
\end{theorem}

\begin{proof}
\begin{align}
E[Y(a)] &= E[E[Y(a) \mid X]] && \text{(law of iterated expectations)} \\
        &= E[E[Y(a) \mid A=a, X]] && \text{(A\ref{A:exchangeability}: $Y(a) \indep A \mid X$)} \\
        &= E[E[Y \mid A=a, X]] && \text{(A\ref{A:consistency}: $Y = Y(A)$)} \\
        &= \int E[Y \mid A=a, X=x] \, dP(x) && \text{(definition)}
\end{align}
which depends only on the observed data distribution.
\end{proof}

2. Consistency Proofs

Goal: Show that an estimator converges to the true parameter value.

Standard Structure:

Define estimator $\hat{\theta}_n$
Define target parameter $\theta_0$
Establish convergence: $\hat{\theta}_n \xrightarrow{p} \theta_0$

Key Tools:

Law of Large Numbers (LLN)
Continuous Mapping Theorem
Slutsky's Theorem
M-estimation theory

Template:

\begin{theorem}[Consistency]
Under Assumptions \ref{A1}--\ref{An}, $\hat{\theta}_n \xrightarrow{p} \theta_0$.
\end{theorem}

\begin{proof}
Define $M_n(\theta) = n^{-1} \sum_{i=1}^n m(O_i; \theta)$ and
$M(\theta) = E[m(O; \theta)]$.

\textbf{Step 1: Uniform convergence}
By [ULLN conditions], $\sup_{\theta \in \Theta} |M_n(\theta) - M(\theta)| \xrightarrow{p} 0$.

\textbf{Step 2: Unique maximum}
$M(\theta)$ is uniquely maximized at $\theta_0$ (by identifiability).

\textbf{Step 3: Conclusion}
By standard M-estimation theory, Steps 1--2 imply $\hat{\theta}_n \xrightarrow{p} \theta_0$.
\end{proof}

3. Asymptotic Normality Proofs

Goal: Establish $\sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, V)$.

Standard Structure:

Taylor expansion around true value
Apply CLT to score/influence function
Invert Hessian/information matrix
State limiting distribution

Key Tools:

Central Limit Theorem (CLT)
Delta Method
Influence Function Theory
Semiparametric Efficiency Theory

Template:

\begin{theorem}[Asymptotic Normality]
Under Assumptions \ref{A1}--\ref{An},
\[
\sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, V)
\]
where $V = E[\phi(O)\phi(O)^\top]$ and $\phi$ is the influence function.
\end{theorem}

\begin{proof}
\textbf{Step 1: Score equation}
$\hat{\theta}_n$ solves $\mathbb{P}_n[\psi(O; \theta)] = 0$ where $\psi = \partial_\theta m$.

\textbf{Step 2: Taylor expansion}
\[
0 = \mathbb{P}_n[\psi(O; \hat{\theta}_n)] = \mathbb{P}_n[\psi(O; \theta_0)]
    + \mathbb{P}_n[\dot{\psi}(O; \tilde{\theta})](\hat{\theta}_n - \theta_0)
\]

\textbf{Step 3: Rearrangement}
\[
\sqrt{n}(\hat{\theta}_n - \theta_0) = -\left(\mathbb{P}_n[\dot{\psi}]\right)^{-1}
    \sqrt{n} \mathbb{P}_n[\psi(O; \theta_0)]
\]

\textbf{Step 4: Apply CLT}
$\sqrt{n} \mathbb{P}_n[\psi(O; \theta_0)] \xrightarrow{d} N(0, \text{Var}(\psi))$ by CLT.

\textbf{Step 5: Slutsky}
$\mathbb{P}_n[\dot{\psi}] \xrightarrow{p} E[\dot{\psi}]$ by WLLN. Apply Slutsky's theorem.
\end{proof}

4. Efficiency Proofs

Goal: Show estimator achieves semiparametric efficiency bound.

Standard Structure:

Characterize the tangent space
Derive efficient influence function (EIF)
Show estimator's influence function equals EIF
Conclude variance achieves bound

Template:

\begin{theorem}[Semiparametric Efficiency]
$\hat{\theta}_n$ is semiparametrically efficient with influence function
\[
\phi(O) = [optimal formula]
\]
achieving the efficiency bound $V_{\text{eff}} = E[\phi(O)^2]$.
\end{theorem}

5. Double Robustness Proofs

Goal: Show estimator is consistent if either nuisance model is correctly specified.

Standard Structure:

Write estimating equation with both nuisance functions
Show bias term is product of two errors
Conclude: if either error is zero, estimator is consistent

Template:

\begin{theorem}[Double Robustness]
The estimator $\hat{\psi}_{DR}$ is consistent if either:
\begin{enumerate}
\item The outcome model $\mu(a,x) = E[Y \mid A=a, X=x]$ is correctly specified, or
\item The propensity score $\pi(x) = P(A=1 \mid X=x)$ is correctly specified.
\end{enumerate}
\end{theorem}

\begin{proof}
The estimating equation has the form:
\[
\psi - \hat{\psi}_{DR} = E\left[\frac{(A-\pi)(Y-\mu)}{\pi(1-\pi)}\right] + o_p(1)
\]
The bias term $(A-\pi)(Y-\mu)$ is zero in expectation if either:
\begin{itemize}
\item $E[A-\pi \mid X] = 0$ (propensity correctly specified), or
\item $E[Y-\mu \mid A, X] = 0$ (outcome correctly specified).
\end{itemize}
\end{proof}

Proof Verification Checklist

Level 1: Structure Check

[ ] Claim clearly stated with all conditions
[ ] All notation defined before use
[ ] Logical flow apparent (steps labeled)
[ ] Each step has explicit justification
[ ] Conclusion explicitly stated

Level 2: Step Validation

For each step, verify:

[ ] Mathematical operation is valid
[ ] Cited results apply (check conditions)
[ ] Inequalities have correct direction
[ ] Limits/integrals converge
[ ] Dimensions/types match

Level 3: Edge Cases

[ ] Boundary cases handled (n=1, p=0, etc.)
[ ] Degenerate cases addressed
[ ] Assumptions actually used (not vacuous)
[ ] What happens at assumption boundaries?

Level 4: Consistency

[ ] Result matches intuition
[ ] Special cases recover known results
[ ] Numerical verification possible?
[ ] Consistent with simulation evidence?

Common Proof Errors

Technical Errors

| Error | Example | Fix | |-------|---------|-----| | Interchanging limits | $\lim \sum \neq \sum \lim$ | Verify DCT/MCT conditions | | Division by zero | $1/\pi(x)$ when $\pi(x)=0$ | State positivity assumption | | Incorrect conditioning | $E[Y \mid A,X] \neq E[Y \mid X]$ | Check independence structure | | Wrong norm | $|f|2$ vs $|f|\infty$ | Verify which space | | Missing measurability | Random variable not measurable | State measurability |

Logical Errors

| Error | Example | Fix | |-------|---------|-----| | Circular reasoning | Using result to prove itself | Check logical dependency | | Unstated assumption | "Clearly, X holds" | Make all assumptions explicit | | Incorrect quantifier | $\exists$ vs $\forall$ | Be precise about scope | | Missing case | Not handling $\theta = 0$ | Enumerate all cases |

Statistical Errors

| Error | Example | Fix | |-------|---------|-----| | Confusing $\xrightarrow{p}$ and $\xrightarrow{d}$ | Different convergence modes | State which mode | | Ignoring dependence | Applying iid CLT to dependent data | Check independence | | Wrong variance | Using population variance for sample | Distinguish estimator/parameter |

Notation Standards (VanderWeele Convention)

Causal Quantities

| Symbol | Meaning | |--------|---------| | $Y(a)$ | Potential outcome under treatment $a$ | | $Y(a,m)$ | Potential outcome under $A=a$, $M=m$ | | $M(a)$ | Potential mediator under treatment $a$ | | $NDE$ | Natural Direct Effect: $E[Y(1,M(0)) - Y(0,M(0))]$ | | $NIE$ | Natural Indirect Effect: $E[Y(1,M(1)) - Y(1,M(0))]$ | | $TE$ | Total Effect: $E[Y(1) - Y(0)] = NDE + NIE$ | | $P_M$ | Proportion Mediated: $NIE/TE$ |

Statistical Quantities

| Symbol | Meaning | |--------|---------| | $\theta_0$ | True parameter value | | $\hat{\theta}_n$ | Estimator based on $n$ observations | | $\phi(O)$ | Influence function | | $\mathbb{P}_n$ | Empirical measure | | $\mathbb{G}_n$ | Empirical process: $\sqrt{n}(\mathbb{P}_n - P)$ |

Convergence

| Symbol | Meaning | |--------|---------| | $\xrightarrow{p}$ | Convergence in probability | | $\xrightarrow{d}$ | Convergence in distribution | | $\xrightarrow{a.s.}$ | Almost sure convergence | | $O_p(1)$ | Bounded in probability | | $o_p(1)$ | Converges to zero in probability |

Proof Construction Workflow

Step 1: Understand the Goal

What exactly needs to be proved?
What type of proof is this? (identification, consistency, etc.)
What are the key challenges?

Step 2: Gather Tools

What theorems/lemmas are available?
What regularity conditions will be needed?
Are there similar proofs to reference?

Step 3: Outline Structure

Break into logical steps
Identify the key technical step
Plan how to handle edge cases

Step 4: Write First Draft

Fill in details for each step
Be explicit about every transition
Note where conditions are used

Step 5: Verify

Run through verification checklist
Check each step independently
Test special cases

Step 6: Polish

Improve notation consistency
Add intuitive explanations
Ensure assumptions are minimal

Integration with Other Skills

This skill works with:

identification-theory - For causal identification proofs
asymptotic-theory - For inference proofs
methods-paper-writer - For presenting proofs in manuscripts
proof-verifier - For systematic verification

Version: 1.0 Created: 2025-12-08 Domain: Mathematical Statistics, Causal Inference

Key References

van der Vaart
Lehmann
Casella
Bickel
Serfling

Proof Architect

Structured methodology for constructing and verifying mathematical proofs in statistical research

Use this skill when working on: mathematical proofs, theorem development, derivations, consistency proofs, asymptotic arguments, identification proofs, or verifying proof correctness.

Proof Structure Framework

Standard Proof Components

Every rigorous statistical proof should contain:

Claim Statement - Precise mathematical statement of what is being proved
Assumptions - All conditions required (clearly enumerated A1, A2, ...)
Notation - Define all symbols before use
Proof Body - Logical sequence of justified steps
Conclusion - Explicit statement that claim is established

Proof Skeleton Template

\begin{theorem}[Name]
\label{thm:name}
Under Assumptions \ref{A1}--\ref{An}, [precise claim].
\end{theorem}

\begin{proof}
The proof proceeds in [n] steps.

\textbf{Step 1: [Description]}
[Content with justification for each transition]

\textbf{Step 2: [Description]}
[Content]

\vdots

\textbf{Step n: Conclusion}
Combining Steps 1--[n-1], we obtain [result], completing the proof.
\end{proof}

Proof Types in Statistical Methodology

1. Identification Proofs

Goal: Show that a causal/statistical quantity is uniquely determined from observed data distribution.

Standard Structure:

Define target estimand (e.g., $\psi = E[Y(a)]$)
State identifying assumptions (consistency, positivity, exchangeability)
Apply identification formula derivation
Show formula depends only on observable quantities

Template:

\begin{theorem}[Identification of $\psi$]
Under Assumptions \ref{A:consistency}--\ref{A:positivity}, the causal effect
$\psi = E[Y(a)]$ is identified by
\[
\psi = \int E[Y \mid A=a, X=x] \, dP(x).
\]
\end{theorem}

\begin{proof}
\begin{align}
E[Y(a)] &= E[E[Y(a) \mid X]] && \text{(law of iterated expectations)} \\
        &= E[E[Y(a) \mid A=a, X]] && \text{(A\ref{A:exchangeability}: $Y(a) \indep A \mid X$)} \\
        &= E[E[Y \mid A=a, X]] && \text{(A\ref{A:consistency}: $Y = Y(A)$)} \\
        &= \int E[Y \mid A=a, X=x] \, dP(x) && \text{(definition)}
\end{align}
which depends only on the observed data distribution.
\end{proof}

2. Consistency Proofs

Goal: Show that an estimator converges to the true parameter value.

Standard Structure:

Define estimator $\hat{\theta}_n$
Define target parameter $\theta_0$
Establish convergence: $\hat{\theta}_n \xrightarrow{p} \theta_0$

Key Tools:

Law of Large Numbers (LLN)
Continuous Mapping Theorem
Slutsky's Theorem
M-estimation theory

Template:

\begin{theorem}[Consistency]
Under Assumptions \ref{A1}--\ref{An}, $\hat{\theta}_n \xrightarrow{p} \theta_0$.
\end{theorem}

\begin{proof}
Define $M_n(\theta) = n^{-1} \sum_{i=1}^n m(O_i; \theta)$ and
$M(\theta) = E[m(O; \theta)]$.

\textbf{Step 1: Uniform convergence}
By [ULLN conditions], $\sup_{\theta \in \Theta} |M_n(\theta) - M(\theta)| \xrightarrow{p} 0$.

\textbf{Step 2: Unique maximum}
$M(\theta)$ is uniquely maximized at $\theta_0$ (by identifiability).

\textbf{Step 3: Conclusion}
By standard M-estimation theory, Steps 1--2 imply $\hat{\theta}_n \xrightarrow{p} \theta_0$.
\end{proof}

3. Asymptotic Normality Proofs

Goal: Establish $\sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, V)$.

Standard Structure:

Taylor expansion around true value
Apply CLT to score/influence function
Invert Hessian/information matrix
State limiting distribution

Key Tools:

Central Limit Theorem (CLT)
Delta Method
Influence Function Theory
Semiparametric Efficiency Theory

Template:

\begin{theorem}[Asymptotic Normality]
Under Assumptions \ref{A1}--\ref{An},
\[
\sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, V)
\]
where $V = E[\phi(O)\phi(O)^\top]$ and $\phi$ is the influence function.
\end{theorem}

\begin{proof}
\textbf{Step 1: Score equation}
$\hat{\theta}_n$ solves $\mathbb{P}_n[\psi(O; \theta)] = 0$ where $\psi = \partial_\theta m$.

\textbf{Step 2: Taylor expansion}
\[
0 = \mathbb{P}_n[\psi(O; \hat{\theta}_n)] = \mathbb{P}_n[\psi(O; \theta_0)]
    + \mathbb{P}_n[\dot{\psi}(O; \tilde{\theta})](\hat{\theta}_n - \theta_0)
\]

\textbf{Step 3: Rearrangement}
\[
\sqrt{n}(\hat{\theta}_n - \theta_0) = -\left(\mathbb{P}_n[\dot{\psi}]\right)^{-1}
    \sqrt{n} \mathbb{P}_n[\psi(O; \theta_0)]
\]

\textbf{Step 4: Apply CLT}
$\sqrt{n} \mathbb{P}_n[\psi(O; \theta_0)] \xrightarrow{d} N(0, \text{Var}(\psi))$ by CLT.

\textbf{Step 5: Slutsky}
$\mathbb{P}_n[\dot{\psi}] \xrightarrow{p} E[\dot{\psi}]$ by WLLN. Apply Slutsky's theorem.
\end{proof}

4. Efficiency Proofs

Goal: Show estimator achieves semiparametric efficiency bound.

Standard Structure:

Characterize the tangent space
Derive efficient influence function (EIF)
Show estimator's influence function equals EIF
Conclude variance achieves bound

Template:

\begin{theorem}[Semiparametric Efficiency]
$\hat{\theta}_n$ is semiparametrically efficient with influence function
\[
\phi(O) = [optimal formula]
\]
achieving the efficiency bound $V_{\text{eff}} = E[\phi(O)^2]$.
\end{theorem}

5. Double Robustness Proofs

Goal: Show estimator is consistent if either nuisance model is correctly specified.

Standard Structure:

Write estimating equation with both nuisance functions
Show bias term is product of two errors
Conclude: if either error is zero, estimator is consistent

Template:

\begin{theorem}[Double Robustness]
The estimator $\hat{\psi}_{DR}$ is consistent if either:
\begin{enumerate}
\item The outcome model $\mu(a,x) = E[Y \mid A=a, X=x]$ is correctly specified, or
\item The propensity score $\pi(x) = P(A=1 \mid X=x)$ is correctly specified.
\end{enumerate}
\end{theorem}

\begin{proof}
The estimating equation has the form:
\[
\psi - \hat{\psi}_{DR} = E\left[\frac{(A-\pi)(Y-\mu)}{\pi(1-\pi)}\right] + o_p(1)
\]
The bias term $(A-\pi)(Y-\mu)$ is zero in expectation if either:
\begin{itemize}
\item $E[A-\pi \mid X] = 0$ (propensity correctly specified), or
\item $E[Y-\mu \mid A, X] = 0$ (outcome correctly specified).
\end{itemize}
\end{proof}

Proof Verification Checklist

Level 1: Structure Check

[ ] Claim clearly stated with all conditions
[ ] All notation defined before use
[ ] Logical flow apparent (steps labeled)
[ ] Each step has explicit justification
[ ] Conclusion explicitly stated

Level 2: Step Validation

For each step, verify:

[ ] Mathematical operation is valid
[ ] Cited results apply (check conditions)
[ ] Inequalities have correct direction
[ ] Limits/integrals converge
[ ] Dimensions/types match

Level 3: Edge Cases

[ ] Boundary cases handled (n=1, p=0, etc.)
[ ] Degenerate cases addressed
[ ] Assumptions actually used (not vacuous)
[ ] What happens at assumption boundaries?

Level 4: Consistency

[ ] Result matches intuition
[ ] Special cases recover known results
[ ] Numerical verification possible?
[ ] Consistent with simulation evidence?

Common Proof Errors

Technical Errors

Logical Errors

Statistical Errors

Notation Standards (VanderWeele Convention)

Causal Quantities

Statistical Quantities

Convergence

Proof Construction Workflow

Step 1: Understand the Goal

What exactly needs to be proved?
What type of proof is this? (identification, consistency, etc.)
What are the key challenges?

Step 2: Gather Tools

What theorems/lemmas are available?
What regularity conditions will be needed?
Are there similar proofs to reference?

Step 3: Outline Structure

Break into logical steps
Identify the key technical step
Plan how to handle edge cases

Step 4: Write First Draft

Fill in details for each step
Be explicit about every transition
Note where conditions are used

Step 5: Verify

Run through verification checklist
Check each step independently
Test special cases

Step 6: Polish

Improve notation consistency
Add intuitive explanations
Ensure assumptions are minimal

Integration with Other Skills

This skill works with:

identification-theory - For causal identification proofs
asymptotic-theory - For inference proofs
methods-paper-writer - For presenting proofs in manuscripts
proof-verifier - For systematic verification

Version: 1.0 Created: 2025-12-08 Domain: Mathematical Statistics, Causal Inference

Key References

van der Vaart
Lehmann
Casella
Bickel
Serfling

Adoption

brycewang-stanford/proof-architect

$ install --global

Security Scan Results

SKILL.md

Proof Architect

Proof Structure Framework

Standard Proof Components

Proof Skeleton Template

Proof Types in Statistical Methodology

1. Identification Proofs

2. Consistency Proofs

3. Asymptotic Normality Proofs

4. Efficiency Proofs

5. Double Robustness Proofs

Proof Verification Checklist

Level 1: Structure Check

Level 2: Step Validation

Level 3: Edge Cases

Level 4: Consistency

Common Proof Errors

Technical Errors

Logical Errors

Statistical Errors

Notation Standards (VanderWeele Convention)

Causal Quantities

Statistical Quantities

Convergence

Proof Construction Workflow

Step 1: Understand the Goal

Step 2: Gather Tools

Step 3: Outline Structure

Step 4: Write First Draft

Step 5: Verify

Step 6: Polish

Integration with Other Skills

Key References

Related Skills

brycewang-stanford/literature-review-tools

brycewang-stanford/auto-empirical-research-skills

brycewang-stanford/aer-preregistration

brycewang-stanford/economist-data-skill

brycewang-stanford/proof-architect

$ install --global

Security Scan Results

SKILL.md

Proof Architect

Proof Structure Framework

Standard Proof Components

Proof Skeleton Template

Proof Types in Statistical Methodology

1. Identification Proofs

2. Consistency Proofs

3. Asymptotic Normality Proofs

4. Efficiency Proofs

5. Double Robustness Proofs

Proof Verification Checklist

Level 1: Structure Check

Level 2: Step Validation

Level 3: Edge Cases

Level 4: Consistency

Common Proof Errors

Technical Errors

Logical Errors

Statistical Errors

Notation Standards (VanderWeele Convention)

Causal Quantities

Statistical Quantities

Convergence

Proof Construction Workflow

Step 1: Understand the Goal

Step 2: Gather Tools

Step 3: Outline Structure

Step 4: Write First Draft

Step 5: Verify

Step 6: Polish

Integration with Other Skills

Key References

Related Skills

brycewang-stanford/literature-review-tools