
brycewang-stanford/replicate-paper

Name: replicate-paper
Author: brycewang-stanford

skills/28-maxwell2732-paper-replicate-agent-demo/dot-claude/skills/replicate-paper/SKILL.md

npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research replicate-paper


Skill: /replicate-paper

Trigger: /replicate-paper [paper.pdf] [data.csv|dta] or "replicate this paper"

Purpose: Full 6-phase autonomous replication of a biomedical/epidemiology paper using UK Biobank or similar data. Produces Python and R scripts plus a polished validation report.

Invocation

/replicate-paper papers/AuthorYear.pdf data/ukb_extract.csv

Or simply say "replicate this paper"; Claude will ask for the paths if they are not provided.

The 6-Phase Pipeline

Phase 1: Intake

Goal: Understand exactly what needs to be replicated.

1. Read the paper PDF (all sections: Abstract, Methods, Results, Supplementary).
2. Identify every table and figure that presents empirical results.
3. For each, record the gold-standard values, SEs/CIs, sample sizes, and source location.
4. Save the targets to quality_reports/[paper_name]_replication_targets.md.
5. Summarize the original software, data source, sample N, key methods, and any available replication package.

Output: quality_reports/[paper_name]_replication_targets.md
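A minimal sketch of how the Phase 1 targets could be recorded and written out as a markdown table. The `Target` schema, the field names, and the `targets_to_markdown` helper are hypothetical illustrations, not part of the skill; the values in the usage example are placeholders, not results from any paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Target:
    """One gold-standard value taken from the paper (hypothetical schema)."""
    target_id: str                   # illustrative naming, e.g. "T2_model1_HR"
    source: str                      # table/figure and location in the paper
    value: float                     # reported point estimate or count
    se: Optional[float] = None
    ci_low: Optional[float] = None
    ci_high: Optional[float] = None
    n: Optional[int] = None


def _cell(x) -> str:
    """Render a missing value as an empty table cell."""
    return "" if x is None else str(x)


def targets_to_markdown(targets: list[Target]) -> str:
    """Build a markdown table with one row per replication target."""
    header = "| Target | Source | Value | SE | CI low | CI high | N |"
    rule = "|--------|--------|-------|----|--------|---------|---|"
    rows = [
        "| "
        + " | ".join(
            _cell(x)
            for x in (t.target_id, t.source, t.value, t.se, t.ci_low, t.ci_high, t.n)
        )
        + " |"
        for t in targets
    ]
    return "\n".join([header, rule, *rows])


if __name__ == "__main__":
    # Placeholder values for illustration only.
    demo = [Target("T2_model1_HR", "Table 2, row 1", 1.25, ci_low=1.10, ci_high=1.42, n=1000)]
    print(targets_to_markdown(demo))
```

Keeping one record per reported number makes it straightforward to reuse the same list later when comparing results in Phase 5.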

Phase 2: Data Audit

Goal: Confirm what we can and cannot replicate given the available data.

1. Load the provided dataset (data/[filename]).
2. Compare it to the paper's described sample:
   - Total N, exposed N, event counts
   - Key variable distributions
   - Missing data patterns
3. Apply the inclusion/exclusion criteria as stated in Methods; document each step's effect on N.
4. If variables are missing or differently named, document the gap and flag it as a known discrepancy.
5. Save the audit summary to quality_reports/[paper_name]_data_audit.md.

Output: quality_reports/[paper_name]_data_audit.md
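Documenting each exclusion step's effect on N can be sketched as a small attrition log. This is a standard-library illustration only; `apply_criteria`, the row layout, and the two example criteria are hypothetical, and a real audit would more likely operate on a pandas DataFrame.

```python
from typing import Callable

Row = dict
Criterion = tuple[str, Callable[[Row], bool]]


def apply_criteria(rows: list[Row], criteria: list[Criterion]):
    """Apply inclusion criteria in order.

    Returns the kept rows and a log of (step label, N remaining, N dropped).
    """
    log = [("Loaded dataset", len(rows), 0)]
    kept = rows
    for label, keep in criteria:
        before = len(kept)
        kept = [r for r in kept if keep(r)]
        log.append((label, len(kept), before - len(kept)))
    return kept, log


if __name__ == "__main__":
    # Toy rows and criteria, invented for illustration.
    rows = [
        {"age": 45, "prior_event": False},
        {"age": 38, "prior_event": False},
        {"age": 62, "prior_event": True},
        {"age": 55, "prior_event": False},
    ]
    criteria = [
        ("Age >= 40", lambda r: r["age"] >= 40),
        ("No prior event", lambda r: not r["prior_event"]),
    ]
    _, log = apply_criteria(rows, criteria)
    for label, remaining, dropped in log:
        print(f"{label}: N = {remaining} (dropped {dropped})")
```

Comparing the log line by line with the paper's flow diagram or Methods text shows at which step the sample first diverges.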

Phase 3: Code Analysis

Goal: Map the paper's methods to our dataset before writing a single line of code.

1. Read the original Stata/R code (if provided in the replication package).
2. Map each variable name in the original code → the corresponding variable in our dataset.
3. Identify the methodological steps: sample construction, covariate coding, model fitting, SE clustering.
4. Flag any steps where the original code differs from the Methods text (use the paper, not the code, as ground truth).
5. Document the mapping in quality_reports/[paper_name]_variable_map.md.

Output: quality_reports/[paper_name]_variable_map.md
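The variable mapping can be checked mechanically before any translation starts. The sketch below assumes the map is held as a plain dictionary from original names to dataset column names, with `None` for variables that have no counterpart; `audit_variable_map` and all variable names shown are hypothetical.

```python
from typing import Optional


def audit_variable_map(
    variable_map: dict[str, Optional[str]], columns: set[str]
) -> dict[str, list[str]]:
    """Split original variable names into those we can map and those we cannot."""
    mapped, gaps = [], []
    for original, ours in variable_map.items():
        if ours is not None and ours in columns:
            mapped.append(original)
        else:
            # Either no counterpart was identified or the column is absent.
            gaps.append(original)
    return {"mapped": mapped, "gaps": gaps}


if __name__ == "__main__":
    # Illustrative names only; not taken from any real codebook.
    variable_map = {"age_base": "age_at_recruitment", "bmi": "bmi", "smoke_cat": None}
    columns = {"age_at_recruitment", "bmi", "sex"}
    print(audit_variable_map(variable_map, columns))
```

Anything listed under `gaps` is a candidate for the "known discrepancy" flag described in Phase 2.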

Phase 4: Translation

Goal: Produce clean, reproducible Python and R scripts that implement the paper's analysis.

Rules:

- Line-by-line translation first; no improvements during replication
- Follow python-code-conventions.md and r-code-conventions.md exactly
- Set seed: random.seed(YYYYMMDD) + numpy.random.seed(YYYYMMDD) (Python); set.seed(YYYYMMDD) (R)
- Use pathlib.Path for all Python paths; here::here() for all R paths
- Comment every non-obvious Stata→Python or Stata→R translation decision
- Refer to replication-protocol.md translation pitfall tables

Python script: replications/[paper_name]/python/replicate.py

Structure:

# Replication: [Paper Author (Year)]
# Date: YYYY-MM-DD
# Original: Stata / R
# Python version: X.Y.Z
# Key packages: pandas X.X, statsmodels X.X, lifelines X.X

from pathlib import Path
import random
import numpy as np
import pandas as pd
# ... other imports

random.seed(YYYYMMDD)
np.random.seed(YYYYMMDD)

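# resolve() makes the paths absolute, so the script also works when it is
# launched from inside its own directory (a relative __file__ has too few parents).
SCRIPT_PATH = Path(__file__).resolve()
DATA_DIR = SCRIPT_PATH.parents[3] / "data"  # repository root / data
RESULTS_DIR = SCRIPT_PATH.parent / "results"
RESULTS_DIR.mkdir(parents=True, exist_ok=True)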

# --- 1. Load Data ---
# --- 2. Sample Construction ---
# --- 3. Model Fitting ---
# --- 4. Save Results ---

R script: replications/[paper_name]/R/replicate.R

Structure:

# Replication: [Paper Author (Year)]
# Date: YYYY-MM-DD
# Original: Stata / Python
# R version: X.Y.Z
# Key packages: survival X.X, fixest X.X

library(here)
library(tidyverse)
library(survival)
# ... other packages

set.seed(YYYYMMDD)

data_dir <- here("data")
results_dir <- here("replications", "[paper_name]", "R", "results")
dir.create(results_dir, recursive = TRUE, showWarnings = FALSE)

# --- 1. Load Data ---
# --- 2. Sample Construction ---
# --- 3. Model Fitting ---
# --- 4. Save Results ---

Outputs:

replications/[paper_name]/python/replicate.py
replications/[paper_name]/R/replicate.R
replications/[paper_name]/python/results/ (parquet/pkl files)
replications/[paper_name]/R/results/ (rds files)

Phase 5: Validation

Goal: Run both scripts and compare results to gold standard targets.

1. Execute the Python script: python replications/[paper_name]/python/replicate.py
2. Execute the R script: Rscript replications/[paper_name]/R/replicate.R
3. Load the results and compare them to the targets using the tolerance thresholds from replication-protocol.md:
   - Integers: exact
   - Point estimates: ±0.01
   - SEs: ±0.05
   - P-values: same significance bracket
   - Percentages: ±0.1pp
4. For each mismatch, investigate the root cause before proceeding.
5. Save replications/[paper_name]/validation_report.md.

Output: replications/[paper_name]/validation_report.md
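The tolerance thresholds listed above can be expressed as a single comparison function. This is a sketch, not the contents of replication-protocol.md: the function names are hypothetical, and the significance cutoffs (0.001, 0.01, 0.05) are an assumption, since the text only says "same significance bracket".

```python
# Absolute tolerances taken from the Phase 5 list.
TOLERANCES = {"point_estimate": 0.01, "se": 0.05, "percentage": 0.1}

# ASSUMPTION: conventional significance cutoffs; the protocol may define others.
P_BRACKETS = (0.001, 0.01, 0.05)


def p_bracket(p: float) -> int:
    """Index of the significance bracket that p falls into."""
    for i, cutoff in enumerate(P_BRACKETS):
        if p < cutoff:
            return i
    return len(P_BRACKETS)  # not significant at any cutoff


def matches(kind: str, paper: float, ours: float) -> bool:
    """Return True if our value agrees with the paper's value for this kind."""
    if kind == "integer":
        return paper == ours
    if kind == "p_value":
        return p_bracket(paper) == p_bracket(ours)
    # Tiny slack so floating-point rounding does not fail a borderline match.
    return abs(paper - ours) <= TOLERANCES[kind] + 1e-12


if __name__ == "__main__":
    print(matches("point_estimate", 1.23, 1.235))  # within ±0.01
    print(matches("p_value", 0.03, 0.06))          # different brackets
```

Passing an unknown `kind` raises a `KeyError`, which is deliberate in this sketch: an unclassified target should stop the comparison rather than pass silently.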

Phase 6: Report

Goal: Produce a polished, self-contained replication report.

Report structure:

# Replication Report: [Paper Author (Year)]
**Date:** [YYYY-MM-DD]
**Replicator:** Claude (domain-reviewer verified)

## Paper Summary
[1 paragraph: research question, population, exposure, outcome, key finding]

## Methods Summary
[Bullet list: sample, exclusions, covariates, model, SEs, software]

## Data
[Bullet list: our dataset, N after exclusions, any discrepancies vs. paper sample]

## Results Comparison

| Target | Table/Fig | Paper Value | Our Value (Python) | Our Value (R) | Diff | Status |
|--------|-----------|-------------|-------------------|---------------|------|--------|

## Discrepancies
[Each discrepancy: what, investigated how, resolved or not]

## Corrective Steps Taken
[Any adjustments made during validation and why]

## Verdict
**[REPLICATED / PARTIAL / FAILED]**
- Targets matched: N / Total
- Remaining discrepancies: [list or "none"]

## Reproducibility
- Python: X.Y.Z | pandas X.X | statsmodels X.X | lifelines X.X
- R: X.Y.Z | survival X.X | fixest X.X
- Data: [filename, UKB application ID if applicable]
- Seed: YYYYMMDD

Save to: reports/[paper_name]_replication_report.md

After saving: run domain-reviewer agent on the report.

Quality Gate

After Phase 6, score the output. Minimum 80/100 to commit.

Auto-commit if score >= 80:

git add replications/[paper_name]/ reports/[paper_name]_replication_report.md quality_reports/[paper_name]_*.md
git commit -m "Replicate [Paper Author (Year)] -- [VERDICT]: N/Total targets matched"
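The gate and the commit message format can be sketched as follows. The scoring rubric itself is not specified here, so the sketch takes the score as given. The rule that maps match counts to a verdict is an assumption (all matched means REPLICATED, none means FAILED, otherwise PARTIAL), and the function names are hypothetical.

```python
def verdict(matched: int, total: int) -> str:
    """ASSUMED rule for choosing among the three verdict labels."""
    if total <= 0 or not 0 <= matched <= total:
        raise ValueError("need 0 <= matched <= total and total > 0")
    if matched == total:
        return "REPLICATED"
    if matched == 0:
        return "FAILED"
    return "PARTIAL"


def may_commit(score: int, minimum: int = 80) -> bool:
    """Quality gate: commit only when the score reaches the minimum."""
    return score >= minimum


def commit_message(paper: str, matched: int, total: int) -> str:
    """Fill in the commit message template used by the auto-commit step."""
    return (
        f"Replicate {paper} -- {verdict(matched, total)}: "
        f"{matched}/{total} targets matched"
    )


if __name__ == "__main__":
    # "Smith (2020)" is a placeholder citation.
    if may_commit(score=85):
        print(commit_message("Smith (2020)", 9, 10))
```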

Failure Modes & Recovery

| Failure | Recovery |
|---------|----------|
| Script syntax error | Fix before proceeding |
| N mismatch > 5% | Stop, audit inclusion/exclusion criteria |
| All point estimates off by same factor | Check unit conversion (HR vs. log-HR, OR vs. log-OR) |
| SEs systematically too large | Check clustering level |
| Cannot install package | Document, note in report, use closest alternative |
| Data variable missing | Document gap; attempt proxy; flag as ASSUMED in report |
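Two of the failure modes in the table lend themselves to automatic checks: the 5% sample-size mismatch and the ratio versus log-ratio confusion. The helpers below are hypothetical sketches; the 0.01 tolerance in the log-scale check is borrowed from the point-estimate threshold in Phase 5 as an assumption.

```python
import math


def n_mismatch_exceeds(paper_n: int, our_n: int, threshold: float = 0.05) -> bool:
    """True if our sample size differs from the paper's by more than the threshold."""
    return abs(our_n - paper_n) / paper_n > threshold


def looks_like_log_scale(paper_values, our_values, tol: float = 0.01) -> bool:
    """True if exponentiating our estimates reproduces the paper's values.

    This suggests we saved log-HRs or log-ORs where the paper reports HRs or ORs.
    """
    pairs = list(zip(paper_values, our_values))
    return bool(pairs) and all(abs(math.exp(o) - p) <= tol for p, o in pairs)


if __name__ == "__main__":
    # Placeholder numbers for illustration.
    print(n_mismatch_exceeds(10_000, 9_400))
    print(looks_like_log_scale([1.5, 2.0], [math.log(1.5), math.log(2.0)]))
```

If `looks_like_log_scale` returns True, the fix is usually to exponentiate the coefficients when saving results rather than to change the model.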


Run a full 6-phase autonomous replication of a biomedical/epidemiology paper against UK Biobank or similar cohort data, producing Python and R scripts plus a validated replication report. Use when asked to replicate a paper end-to-end, or when invoked as /replicate-paper [paper.pdf] [data.csv|dta].

2,769 stars

development

Updated Jul 10, 2026


npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research replicate-paper

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy: Container and dependency vulnerability scanner. Clean, 95%
- Semgrep: Static code analysis for vulnerabilities. Clean, 95%
- mcp-scan (Snyk): Model Context Protocol security validation. Clean, 95%
- Snyk (dep): Open source security scanning. Skipped, 50%
- Socket.dev: Supply chain security analysis. Skipped, 50%
- VirusTotal: Multi-engine malware detection. Skipped, 50%
- CrowdStrike: Advanced threat intelligence. Skipped, 50%
- OSV-Scanner: Open Source Vulnerability database check. Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: Jul 10, 2026, 5:18 AM (138.7s, 1 file scanned)

SKILL.md

name: replicate-paper
description: Run a full 6-phase autonomous replication of a biomedical/epidemiology paper against UK Biobank or similar cohort data, producing Python and R scripts plus a validated replication report. Use when asked to replicate a paper end-to-end, or when invoked as /replicate-paper [paper.pdf] [data.csv|dta].
allowed-tools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"]


Related Skills

brycewang-stanford/literature-review-tools

tools · Verified · Trusted · Community

Recommend and run open-source AI tools, agents, Claude Code / Codex skills, and MCP servers for any stage of a literature review: searching, reading, extracting, synthesizing, screening, citation-checking, and paper writing. Use when the user asks "what tool should I use to..." or "install/run/use <tool> to ..." for research or literature-review work: automating a survey or related-work section, PDF→Markdown extraction for LLMs (MinerU/marker/docling), PRISMA / systematic review (ASReview), citation-backed Q&A over PDFs (PaperQA2), wiring papers into Claude/Cursor via MCP (arxiv/paper-search/zotero servers), or chatting with a Zotero library. Ships a launcher (scripts/litrun.py) that installs each tool in an isolated venv and runs it. Curated catalog of 70+ vetted projects. Supports Chinese and English (for literature-review tool selection and one-click install/run).

3,109 · SKILL.md · Updated Jul 28, 2026

brycewang-stanford/auto-empirical-research-skills

development · Verified · Trusted · Community

Route empirical-research requests through the Auto-Empirical Research Skills catalog when this whole repository is installed as one skill in Codex, CodeBuddy, Claude Code, or another IDE. Use to choose and load the right vendored AERS skill for causal inference, econometrics, replication, data acquisition, manuscript writing, peer review and referee responses, citation checking, de-AIGC editing, or full empirical-paper workflows without reading the entire repository at once.

3,109 · SKILL.md · Updated Jun 27, 2026

brycewang-stanford/aer-preregistration

documentation · Verified · Trusted · Community

Use when the project collects primary data or runs a field, lab, or survey experiment, before the intervention begins: write the pre-analysis plan, size the sample from a power calculation, and register with the AEA RCT Registry. Apply after the design is chosen in aer-identification and before any outcome data are seen.

3,021 · SKILL.md · Updated Jul 23, 2026

brycewang-stanford/economist-data-skill

tools · Verified · Trusted · Community

Guide economists to authoritative data sources with explicit, confirmed data specifications before retrieval; interfaces with Playwright MCP to navigate portals and extract real data, not articles about data.

3,021 · SKILL.md · Updated Jul 23, 2026


Install manually


# Clone the repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research.git

# Copy into Claude Code skills folder (global)
cp -r Awesome-Agent-Skills-for-Empirical-Research/skills/28-maxwell2732-paper-replicate-agent-demo/dot-claude/skills/replicate-paper ~/.claude/skills/


Repository

brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research

2,769 stars

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT