skills/27-dariia-m-my_claude_skills/paper_verification/SKILL.md
Thoroughly verify all code, tables, figures, modeling decisions, and quantitative claims in an academic paper against its source R scripts and output files. Use this skill whenever you need to audit, replicate, or verify an academic research paper - including cross-checking LaTeX tables against R output, validating econometric modeling choices, ensuring sample sizes are consistent, building a verification manifest, and running automated replication tests. Trigger this skill for any mention of: paper verification, replication check, table audit, code-paper consistency, reproducing results, verifying estimates, checking coefficients, or any variant of "does the paper match the code."
A systematic skill for verifying the integrity and replicability of an academic research paper. This covers everything from individual coefficient checks to full end-to-end replication.
Verification proceeds in six phases. Each phase produces structured output. Do not skip phases - earlier phases feed into later ones.
Phase 1: Discovery -> inventory of all project files, scripts, outputs, paper
Phase 2: Table Audit -> cross-check every number in every table
Phase 3: Inline Claims -> verify quantitative claims in paper body text
Phase 4: Code Review -> audit R scripts for correctness, modeling decisions, data pipeline
Phase 5: Manifest Build -> create verification_manifest.json linking claims to code
Phase 6: Replication -> write and run tests/verify_replication.R, fix failures
- Check for .Rproj files and a README, or ask the user.
- Read references/phase-details.md for the full procedure for each phase.
- Read references/common-pitfalls.md for known failure modes to watch for.

Scan the entire project and build an inventory. You need to know what you're working with before you can verify anything.
Find and catalog:
- Scripts: .R and .Rmd scripts (note execution order if a master script exists)
- Outputs: .csv, .rds, .tex, .txt, .log in results/, output/, tables/, etc.
- Paper: .tex in the root or paper/ or draft/ directory
- Data: .csv, .dta, .rds, .xlsx in data/ or similar

Produce: A file inventory printed to the console, organized by type, with notes on what each script appears to do (based on filename and a quick scan of its first ~30 lines).
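A minimal Python sketch of this inventory step, assuming the file types and directory names named in this phase. The names categorize and build_inventory and the bucket labels are illustrative, not part of the skill. Because .csv, .rds, and .tex can be either inputs or outputs, the sketch uses the top-level directory to disambiguate and sends anything it cannot place to "other" for manual review.

```python
from collections import defaultdict
from pathlib import Path, PurePosixPath

# Extensions and directories taken from the catalog list in this phase.
SCRIPT_EXTS = {".r", ".rmd"}
OUTPUT_DIRS = {"results", "output", "tables"}
OUTPUT_EXTS = {".csv", ".rds", ".tex", ".txt", ".log"}
PAPER_DIRS = {"paper", "draft"}
DATA_EXTS = {".csv", ".dta", ".rds", ".xlsx"}


def categorize(rel_path: str) -> str:
    """Assign a project-relative path (POSIX style) to an inventory bucket."""
    p = PurePosixPath(rel_path)
    ext = p.suffix.lower()
    # Top-level directory, or "" for files sitting in the project root.
    top = p.parts[0] if len(p.parts) > 1 else ""
    if ext in SCRIPT_EXTS:
        return "scripts"
    if top in OUTPUT_DIRS and ext in OUTPUT_EXTS:
        return "outputs"
    if ext == ".tex" and (top == "" or top in PAPER_DIRS):
        return "paper"
    if top == "data" and ext in DATA_EXTS:
        return "data"
    return "other"


def build_inventory(root: str) -> dict:
    """Walk the project and group relative file paths by bucket."""
    inventory = defaultdict(list)
    root_path = Path(root)
    for path in sorted(root_path.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root_path).as_posix()
            inventory[categorize(rel)].append(rel)
    return dict(inventory)
```

Calling build_inventory(".") from the project root returns a dictionary that can be printed bucket by bucket; the notes on what each script does still have to come from reading the scripts.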
Key questions to answer in this phase:
This is the most critical phase. Read references/phase-details.md Section 2 for the full
procedure.
For every table in the paper:
Locate the table in the LaTeX source. Extract every number: coefficients, standard errors, t-statistics, p-values, confidence intervals, sample sizes (N), R-squared, F-statistics, means, medians, percentages - everything.
Locate the corresponding R output file that produced this table. This might be a .tex file
generated by stargazer, modelsummary, xtable, kableExtra, huxtable, or similar.
It could also be a .csv, .rds, or text log.
Cross-check every single number. Compare to the R output with appropriate tolerance:
Check for rounding consistency - if a coefficient is 0.0347 in the R output and 0.035 in the paper, that is acceptable rounding. If it is 0.038, that is a discrepancy.
Verify that column headers, variable names, and panel labels in the paper match the specification in the code.
Check that the number of observations (N) is consistent across all tables that use the same sample. If Table 1 reports N=4,521 and Table 3 uses the same sample but reports N=4,519, that needs explanation.
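The rounding rule above can be made mechanical. The following Python sketch treats a printed value as acceptable when it lies within half a unit of its last printed digit from the R output; that half-unit criterion is an assumption chosen to be neutral between rounding conventions, and the function names are hypothetical. It reads values as strings so that the number of printed decimals is preserved.

```python
from decimal import Decimal


def decimals_shown(text: str) -> int:
    """Count the digits printed after the decimal point."""
    cleaned = text.replace(",", "").strip()
    return len(cleaned.split(".")[1]) if "." in cleaned else 0


def is_acceptable_rounding(paper_value: str, output_value: str) -> bool:
    """True if the paper's printed value is a plausible rounding of the output.

    Both arguments are strings; thousands separators in the paper value
    (e.g. "4,521") are stripped before comparison.
    """
    cleaned = paper_value.replace(",", "").strip()
    # Half of one unit in the last printed decimal place.
    half_unit = Decimal(10) ** -decimals_shown(cleaned) / 2
    return abs(Decimal(cleaned) - Decimal(output_value)) <= half_unit
```

With the examples from this section, is_acceptable_rounding("0.035", "0.0347") holds, while "0.038" against "0.0347" and N = "4,521" against "4519" are both flagged.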
Produce: A table-by-table verification report. For each table:
Read the paper body text (not just tables) and find every quantitative claim. These include:
For each claim, trace it back to a specific table cell, figure, or R output. Flag any claim that cannot be traced or that contradicts the evidence.
Produce: A claims checklist with claim text, source location in paper, evidence source, and VERIFIED/UNVERIFIED/DISCREPANCY status.
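One possible shape for a checklist row, sketched in Python. The InlineClaim fields mirror the four items named above; the class name, field names, and the numeric tolerance argument are assumptions made for illustration. A claim with no traced evidence stays UNVERIFIED rather than being guessed at.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InlineClaim:
    text: str                               # claim as worded in the paper
    location: str                           # where it appears in the paper
    claimed: float                          # claimed value on the evidence scale
    evidence_source: Optional[str] = None   # table cell, figure, or R output
    evidence_value: Optional[float] = None  # value found at that source


def claim_status(claim: InlineClaim, tolerance: float) -> str:
    """Return VERIFIED, UNVERIFIED, or DISCREPANCY for one claim."""
    if claim.evidence_source is None or claim.evidence_value is None:
        return "UNVERIFIED"
    if abs(claim.claimed - claim.evidence_value) <= tolerance:
        return "VERIFIED"
    return "DISCREPANCY"
```

Note that a claim stated in percentage points has to be converted to the scale of the underlying coefficient (3.2 percentage points becomes 0.032) before it is compared.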
Read every R script in the project, in execution order. This is not just a syntax check -
you are auditing the analytical pipeline. Read references/phase-details.md Section 4 and
references/common-pitfalls.md for what to look for.
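A starting point for this audit is a list of every line where the sample can change. The Python sketch below scans R source text for calls to merge, join (including prefixed variants such as left_join), filter, subset, and mutate. It is a naive regex pass under stated assumptions: it does not parse R, it treats any # as the start of a comment, and the function name find_pipeline_steps is hypothetical. It only tells you where to look; the before/after observation counts still have to come from running the code.

```python
import re

# Matches calls such as merge(, filter(, dplyr::mutate(, left_join(.
STEP_PATTERN = re.compile(r"\b(?:\w+_)?(merge|join|filter|subset|mutate)\s*\(")


def find_pipeline_steps(r_source: str) -> list:
    """Return (line_number, verb, stripped_line) for each transformation call."""
    steps = []
    for lineno, line in enumerate(r_source.splitlines(), start=1):
        # Drop trailing comments (naive: a '#' inside a string also cuts the line).
        code = line.split("#", 1)[0]
        for match in STEP_PATTERN.finditer(code):
            steps.append((lineno, match.group(1), line.strip()))
    return steps
```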
Data Pipeline Verification:
For every merge, join, filter, subset, or mutate step, check:
(a) How many observations before vs. after the transformation?
(b) Do all column names needed downstream still exist?
(c) Are key summary statistics (mean, min, max, N) reasonable after the step?
Modeling Decisions:
Robustness and Red Flags:
- Hard-coded choices the paper does not document (e.g., filter(year > 2005) when the paper says "post-treatment period" without defining the cutoff)

Produce: A script-by-script review with:
Create verification_manifest.json that maps every quantitative claim in the paper to
the code that produces it.
Structure:
{
  "paper_file": "paper/main.tex",
  "generated_at": "2026-02-08T12:00:00Z",
  "claims": [
    {
      "id": "T1_R2_C3",
      "type": "coefficient",
      "paper_location": {"file": "paper/main.tex", "line": 234, "context": "Table 1, Row 2, Col 3"},
      "paper_value": "0.035",
      "source_script": "code/02_main_regression.R",
      "source_line": 87,
      "output_file": "results/table1.tex",
      "output_location": {"line": 15, "context": "second coefficient in column 3"},
      "expected_value": "0.0347",
      "tolerance": 0.001,
      "status": "PASS",
      "notes": "Acceptable rounding from 0.0347 to 0.035"
    },
    {
      "id": "BODY_P12_S3",
      "type": "inline_claim",
      "paper_location": {"file": "paper/main.tex", "line": 412, "context": "paragraph 12, sentence 3"},
      "paper_value": "3.2 percentage points",
      "source_script": "code/02_main_regression.R",
      "source_line": 87,
      "output_file": "results/table1.tex",
      "output_location": {"line": 15},
      "expected_value": "0.032",
      "tolerance": 0.001,
      "status": "PASS",
      "notes": "Coefficient 0.0323 reported as 3.2pp"
    }
  ],
  "summary": {
    "total_claims": 142,
    "passed": 139,
    "failed": 2,
    "unverified": 1
  }
}
Every coefficient, standard error, sample size, p-value, summary statistic, and verbal claim should appear in this manifest. Be exhaustive.
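Since the manifest carries both per-claim statuses and a summary block, the two can drift apart. A small Python sketch of a self-consistency check follows. The structure shown above only contains the status "PASS", so the strings "FAIL" and "UNVERIFIED" are assumptions, as are the function names; any unrecognized status is counted as unverified.

```python
import json
from collections import Counter

# Assumed status vocabulary; only "PASS" appears in the manifest structure above.
STATUS_TO_KEY = {"PASS": "passed", "FAIL": "failed", "UNVERIFIED": "unverified"}


def summarize_claims(manifest: dict) -> dict:
    """Recompute the summary block from the individual claim statuses."""
    counts = Counter(
        STATUS_TO_KEY.get(claim.get("status"), "unverified")
        for claim in manifest["claims"]
    )
    return {
        "total_claims": len(manifest["claims"]),
        "passed": counts["passed"],
        "failed": counts["failed"],
        "unverified": counts["unverified"],
    }


def summary_mismatches(manifest: dict) -> dict:
    """Map each summary key that disagrees to (stated, recomputed)."""
    stated = manifest.get("summary", {})
    recomputed = summarize_claims(manifest)
    return {
        key: (stated.get(key), value)
        for key, value in recomputed.items()
        if stated.get(key) != value
    }


def load_manifest(path: str) -> dict:
    """Read verification_manifest.json from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

An empty result from summary_mismatches(load_manifest("verification_manifest.json")) means the summary agrees with the claims it describes.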
Write tests/verify_replication.R that programmatically reruns the analysis and checks
results against the manifest.
Read references/replication-script-template.md for the template and structure.
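The test script itself is written in R and follows the referenced template. The Python sketch below only illustrates the comparison logic such a script needs: each rerun value is checked against the manifest's expected_value within the claim's tolerance. The function name, the shape of rerun_values (claim id mapped to a number), the result fields, and the "FAIL" and "UNVERIFIED" labels are all assumptions, and the sketch covers numeric claims only.

```python
def compare_to_manifest(claims: list, rerun_values: dict) -> list:
    """Compare rerun values to manifest claims and return one record per claim."""
    results = []
    for claim in claims:
        actual = rerun_values.get(claim["id"])
        if actual is None:
            # The rerun produced nothing for this claim id.
            status = "UNVERIFIED"
        elif abs(actual - float(claim["expected_value"])) <= claim["tolerance"]:
            status = "PASS"
        else:
            status = "FAIL"
        results.append(
            {
                "id": claim["id"],
                "expected": claim["expected_value"],
                "actual": actual,
                "status": status,
            }
        )
    return results
```

A list of records in this form could be serialized as the structured test results, with failures investigated one by one rather than loosened by widening the tolerance.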
The test script must:
- Load verification_manifest.json

After writing the test script:
Produce:
- tests/verify_replication.R - the test script
- tests/replication_results.json - structured test results
- tests/replication_summary.md - human-readable summary of what passed, what failed, what was fixed, and what remains unresolved

At the end of the full verification, produce a consolidated report. Use this structure:
# Paper Verification Report
## Executive Summary
- Total quantitative claims checked: X
- Passed: Y
- Failed: Z
- Unverified: W
- Code issues found: N (M major, K minor)
## Table-by-Table Results
[from Phase 2]
## Inline Claims Results
[from Phase 3]
## Code Review Findings
[from Phase 4]
## Replication Test Results
[from Phase 6]
## Recommendations
[prioritized list of issues to address]
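The Executive Summary can be filled in directly from the manifest's summary block so that the report and the manifest cannot disagree. A minimal Python sketch, assuming the summary keys shown in the manifest structure; the function name and the major/minor issue arguments are hypothetical. It refuses to render if the three status counts do not add up to the total.

```python
def render_executive_summary(summary: dict, major_issues: int, minor_issues: int) -> str:
    """Render the Executive Summary section of the report as markdown text."""
    counted = summary["passed"] + summary["failed"] + summary["unverified"]
    if counted != summary["total_claims"]:
        raise ValueError(
            f"status counts sum to {counted}, expected {summary['total_claims']}"
        )
    total_issues = major_issues + minor_issues
    lines = [
        "## Executive Summary",
        f"- Total quantitative claims checked: {summary['total_claims']}",
        f"- Passed: {summary['passed']}",
        f"- Failed: {summary['failed']}",
        f"- Unverified: {summary['unverified']}",
        f"- Code issues found: {total_issues} ({major_issues} major, {minor_issues} minor)",
    ]
    return "\n".join(lines)
```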
If the project has .do files or Python scripts mixed in, verify those too using the same principles.