brycewang-stanford/panel-data-analyst

Name: panel-data-analyst
Author: brycewang-stanford

skills/43-wentorai-research-plugins/skills/analysis/econometrics/panel-data-analyst/SKILL.md

npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research panel-data-analyst

Panel Data Analyst

Perform expert-level panel data regression analysis including fixed effects, random effects, dynamic panel models (Arellano-Bond/Blundell-Bond GMM), and advanced diagnostic tests. This skill covers the full workflow from panel setup through model selection, estimation, and publication-ready reporting.

Overview

Panel data -- repeated observations on the same cross-sectional units over time -- is the workhorse of modern empirical economics, finance, political science, and management research. Panel methods exploit both cross-sectional and temporal variation, enabling researchers to control for unobserved heterogeneity that would bias ordinary cross-sectional estimates.

The choice between fixed effects, random effects, and dynamic panel estimators depends on the data structure, the nature of unobserved heterogeneity, and the identifying assumptions the researcher is willing to make. This skill provides a systematic decision framework and implementation in both Stata and R, with emphasis on the diagnostic tests that justify model selection.

Beyond basic FE/RE models, this skill covers the advanced techniques increasingly required by journal reviewers: instrumental variables within panel frameworks, Driscoll-Kraay standard errors for cross-sectional dependence, correlated random effects (Mundlak/Chamberlain), and system GMM for dynamic panels with endogenous regressors.
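
The unobserved heterogeneity described above can be written as a standard unobserved-effects model. The notation is an assumption, chosen to match the dynamic-panel equation that appears later in this skill (mu_i for the unit effect, epsilon_it for the idiosyncratic error).

```latex
\begin{align}
y_{it} &= X_{it}\beta + \mu_i + \varepsilon_{it},
  \qquad i = 1,\dots,N,\quad t = 1,\dots,T \\
\text{Random effects:}\quad & \operatorname{Cov}(\mu_i, X_{it}) = 0 \ \text{for all } t \\
\text{Fixed effects:}\quad & \operatorname{Cov}(\mu_i, X_{it}) \neq 0 \ \text{is allowed}
\end{align}
```

The FE/RE choice is therefore a statement about whether the unit effect is correlated with the regressors, which is what the Hausman-type tests examine.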

Panel Data Setup

Declaring Panel Structure

* Stata panel setup
xtset firm_id year
xtset  // Verify panel structure

* Check panel balance
xtdescribe
* Shows: min/max/avg observations per panel, gaps

* Summary statistics by panel dimension
xtsum revenue profit employees rnd_spending
* Reports overall, between, and within variation
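
To make the overall/between/within split concrete, here is a minimal pure-Python sketch of the decomposition for a balanced toy panel. The function name and the toy numbers are hypothetical, and the code is not a reimplementation of xtsum; it only follows the usual definitions (between = variation of unit means, within = deviations from unit means re-centred on the grand mean).

```python
from statistics import mean, stdev

def decompose_variation(panel):
    """Split a variable into overall, between, and within standard deviations.

    panel maps each unit id to its list of observations over time.
    """
    all_values = [v for series in panel.values() for v in series]
    grand_mean = mean(all_values)
    unit_means = {unit: mean(series) for unit, series in panel.items()}

    # Between: how much the unit means differ across units.
    between_values = list(unit_means.values())

    # Within: deviations from each unit's own mean, re-centred on the grand mean.
    within_values = [
        v - unit_means[unit] + grand_mean
        for unit, series in panel.items()
        for v in series
    ]
    return {
        "overall": stdev(all_values),
        "between": stdev(between_values),
        "within": stdev(within_values),
    }

# Hypothetical toy data: two firms, three years each.
revenue = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}
result = decompose_variation(revenue)
print(result)
```

A variable with almost no within variation is poorly identified under fixed effects, so this split is worth checking before estimation.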

Panel Diagnostics

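* Check for gaps in panel
xtset firm_id year
* l.year is missing whenever the previous year is absent, so compare
* each row with the previous observed row instead
bysort firm_id (year): gen gap = year - year[_n-1] if _n > 1
tab gap  // All 1's means no unit has a gap in an annual panel

* Create balanced subsample
bysort firm_id: gen T_i = _N
tab T_i
quietly summarize T_i
keep if T_i == r(max)  // Keep only units with the maximum number of periods

* Attrition analysis (run on the full unbalanced panel, not the balanced subsample)
gen in_panel = 1
xtset firm_id year
tsfill, full
replace in_panel = 0 if missing(in_panel)
reg in_panel l.revenue l.profit l.size, cluster(firm_id)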
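
The gap check compares each observation with the previous observed period for the same unit. A small Python sketch of that logic, with hypothetical firm ids and years, might look like this:

```python
def find_gaps(years_by_unit):
    """Return, per unit, the (previous, next) year pairs that skip a period."""
    gaps = {}
    for unit, years in years_by_unit.items():
        ordered = sorted(years)
        # A jump larger than one year between consecutive observed rows is a gap.
        jumps = [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > 1]
        if jumps:
            gaps[unit] = jumps
    return gaps

# Hypothetical panel: firm_2 is missing 2019.
observed = {
    "firm_1": [2018, 2019, 2020, 2021],
    "firm_2": [2018, 2020, 2021],
}
gaps = find_gaps(observed)
print(gaps)
```

Comparing against the previous observed row, rather than against a one-period lag, is what lets a missing year show up as a jump of 2.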

Fixed Effects vs. Random Effects

Fixed Effects Estimation

* Within estimator (entity fixed effects)
xtreg profit revenue rnd_spending employees i.year, fe robust
estimates store fe_model

* Entity and time fixed effects
reghdfe profit revenue rnd_spending employees, ///
    absorb(firm_id year) cluster(firm_id)
estimates store twoway_fe

* First-differences (alternative to within estimator)
reg d.profit d.revenue d.rnd_spending d.employees i.year, ///
    cluster(firm_id)
estimates store fd_model
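
As an illustration of why the within and first-difference estimators remove the unit effect, the sketch below uses a noiseless toy panel with one regressor. All names and numbers are hypothetical: y = 2*x + mu_i, where the unit effect mu_i is correlated with x, so pooled OLS is biased while both transformations recover the slope.

```python
def slope(x, y, intercept=True):
    """OLS slope of y on a single regressor x."""
    if intercept:
        mx, my = sum(x) / len(x), sum(y) / len(y)
        x = [xi - mx for xi in x]
        y = [yi - my for yi in y]
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def demean(series):
    """Subtract the unit's own time mean (the within transformation)."""
    m = sum(series) / len(series)
    return [v - m for v in series]

def difference(series):
    """First differences within a unit."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical toy panel: y = 2*x + mu_i with mu_A = 0 and mu_B = -9.
units = ["A", "B"]
x = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}
y = {"A": [2.0, 4.0, 6.0], "B": [-1.0, 1.0, 3.0]}

# Pooled OLS ignores the unit effect.
pooled = slope([v for u in units for v in x[u]],
               [v for u in units for v in y[u]])

# Within estimator: regress demeaned y on demeaned x.
within = slope([v for u in units for v in demean(x[u])],
               [v for u in units for v in demean(y[u])],
               intercept=False)

# First differences: regress the change in y on the change in x.
first_diff = slope([v for u in units for v in difference(x[u])],
                   [v for u in units for v in difference(y[u])],
                   intercept=False)

print(pooled, within, first_diff)
```

With real data the two transformations generally give different estimates when T > 2, and which is more efficient depends on the serial correlation of the errors.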

Random Effects Estimation

* GLS random effects
xtreg profit revenue rnd_spending employees i.year, re robust
estimates store re_model

Hausman Test for Model Selection

* Classic Hausman test
xtreg profit revenue rnd_spending employees, fe
estimates store fe_haus
xtreg profit revenue rnd_spending employees, re
estimates store re_haus
hausman fe_haus re_haus

* Robust Hausman test (preferred with heteroskedasticity)
* Mundlak (1978) approach: add group means to RE model
foreach var of varlist revenue rnd_spending employees {
    bysort firm_id: egen m_`var' = mean(`var')
}
xtreg profit revenue rnd_spending employees ///
    m_revenue m_rnd_spending m_employees i.year, re cluster(firm_id)
test m_revenue m_rnd_spending m_employees
* Rejection => FE preferred; failure to reject => RE acceptable
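
For reference, the classic Hausman statistic compares the two coefficient vectors directly. The notation below is an assumption, with K the number of time-varying coefficients being compared.

```latex
H = (\hat\beta_{FE} - \hat\beta_{RE})'
    \left[\widehat{V}(\hat\beta_{FE}) - \widehat{V}(\hat\beta_{RE})\right]^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE})
    \;\overset{a}{\sim}\; \chi^2_K
    \quad \text{under } H_0:\ \operatorname{Cov}(\mu_i, X_{it}) = 0
```

The variance difference in the middle is only guaranteed to be positive definite when RE is fully efficient, which fails under heteroskedasticity or serial correlation. That is one reason the regression-based Mundlak version with clustered standard errors is often preferred.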

Dynamic Panel Models

Arellano-Bond GMM (Difference GMM)

* When the lagged dependent variable is a regressor:
* y_it = alpha * y_{i,t-1} + X_it * beta + mu_i + epsilon_it

* Difference GMM (Arellano & Bond 1991)
* lags(1) already adds l.profit, so it is not listed as a regressor
xtabond profit revenue rnd_spending employees, ///
    lags(1) twostep vce(robust) artests(2)

* Diagnostics
estat abond  // AR(1) should be significant, AR(2) should NOT be significant
* Overidentification: xtabond offers estat sargan (not after vce(robust));
* the Hansen J test (p > 0.10 desired) is reported by xtabond2
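
The logic behind these diagnostics follows from the differenced equation and its moment conditions. This is a sketch in the notation of the model comment above, treating X_it as strictly exogenous for simplicity.

```latex
\begin{align}
\Delta y_{it} &= \alpha\,\Delta y_{i,t-1} + \Delta X_{it}\beta + \Delta\varepsilon_{it},
  \qquad t = 3,\dots,T \\
E\left[\,y_{i,t-s}\,\Delta\varepsilon_{it}\,\right] &= 0, \qquad s \ge 2
\end{align}
```

Differencing removes mu_i, but Delta y_{i,t-1} is correlated with Delta epsilon_it through epsilon_{i,t-1}, so lagged levels dated t-2 and earlier serve as instruments. Because Delta epsilon_it and Delta epsilon_{i,t-1} share epsilon_{i,t-1}, first-order autocorrelation in the differenced residuals is expected. Second-order autocorrelation would indicate that epsilon_it itself is serially correlated, which invalidates the t-2 instruments.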

System GMM (Blundell-Bond)

* System GMM (Blundell & Bond 1998)
* More efficient than difference GMM, especially with persistent series
* xtabond2 is community-contributed: ssc install xtabond2

xtabond2 profit l.profit revenue rnd_spending employees i.year, ///
    gmm(l.profit, lag(2 4) collapse) ///
    gmm(revenue rnd_spending, lag(2 3) collapse) ///
    iv(employees i.year) ///
    twostep robust orthogonal small
estimates store gmm_model

* Key diagnostics to report:
* 1. Number of instruments (should not exceed number of groups)
* 2. Hansen J test p-value (> 0.10, ideally below about 0.25; values near 1
*    suggest that too many instruments have weakened the test)
* 3. AR(2) test p-value (> 0.10 for valid instruments)
* 4. Difference-in-Hansen test for subset of instruments

GMM Diagnostic Checklist

| Test | Null Hypothesis | Desired Result | Stata Command |
|------|-----------------|----------------|---------------|
| AR(1) | No first-order autocorrelation | Reject (p < 0.05) | Reported automatically |
| AR(2) | No second-order autocorrelation | Fail to reject (p > 0.10) | Reported automatically |
| Hansen J | Instruments are valid | Fail to reject (p > 0.10) | Reported automatically |
| Diff-in-Hansen | Level instruments valid | Fail to reject (p > 0.10) | Reported automatically |
| Instrument count | -- | N_instruments < N_groups | Check output |
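
To see why lag limits and collapse matter for the instrument-count row, the sketch below counts GMM-style instruments generated by one variable for the differenced equation. It is a simplified, hypothetical helper: the number xtabond2 prints also includes level-equation instruments (in system GMM), iv() instruments, and the constant, so it will not match exactly.

```python
def count_gmm_instruments(n_periods, min_lag, max_lag=None,
                          collapse=False, first_equation=3):
    """Count GMM-style instruments from one variable (difference equation only).

    n_periods      -- number of time periods T
    min_lag        -- shortest lag used as an instrument
    max_lag        -- longest lag used (None means all available lags)
    collapse       -- one instrument per lag distance instead of per period-lag pair
    first_equation -- first period with a usable differenced equation
                      (3 when one lag of the dependent variable is a regressor)
    """
    if max_lag is None:
        max_lag = n_periods - 1
    if collapse:
        # One column per lag distance that is available in at least one period.
        longest = min(max_lag, n_periods - 1)
        return max(0, longest - min_lag + 1)
    total = 0
    for t in range(first_equation, n_periods + 1):
        # In period t, the level dated t - lag must exist, so lag <= t - 1.
        longest = min(max_lag, t - 1)
        total += max(0, longest - min_lag + 1)
    return total

full = count_gmm_instruments(10, 2)                         # all lags, uncollapsed
limited = count_gmm_instruments(10, 2, 4)                   # lag(2 4)
collapsed = count_gmm_instruments(10, 2, 4, collapse=True)  # lag(2 4) collapse
print(full, limited, collapsed)
```

With T = 10, the uncollapsed count grows roughly with T squared, while lag limits plus collapse keep it to a handful per variable. That is the usual way to keep the instrument count below the number of groups.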

Standard Error Options

Choosing the Right Standard Errors

* Entity-clustered (default choice for firm panels)
xtreg profit revenue rnd_spending, fe cluster(firm_id)

* Two-way clustering (firm and year)
reghdfe profit revenue rnd_spending, ///
    absorb(firm_id) cluster(firm_id year)

* Driscoll-Kraay standard errors (cross-sectional dependence)
xtscc profit revenue rnd_spending i.year, fe lag(3)

* Panel-corrected standard errors (Beck-Katz) with a common AR(1) error process
* Note: xtpcse uses Prais-Winsten estimation; it is not a Newey-West estimator
xtpcse profit revenue rnd_spending i.firm_id, correlation(ar1)
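
To show what clustering changes, here is a minimal sketch of the sandwich variance for a single regressor through the origin, computed once treating observations as independent (HC0) and once summing scores within clusters. The function and toy data are hypothetical, and Stata's finite-sample corrections are deliberately left out, so the numbers will not match Stata output.

```python
from collections import defaultdict

def slope_and_variances(x, y, cluster):
    """Return (slope, HC0 variance, cluster-robust variance) for y = b*x + e."""
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid = [yi - beta * xi for xi, yi in zip(x, y)]

    # HC0: each observation contributes its own squared score.
    meat_hc0 = sum((xi * ei) ** 2 for xi, ei in zip(x, resid))

    # Cluster-robust: scores are summed within each cluster before squaring.
    scores = defaultdict(float)
    for xi, ei, g in zip(x, resid, cluster):
        scores[g] += xi * ei
    meat_cluster = sum(s * s for s in scores.values())

    return beta, meat_hc0 / sxx ** 2, meat_cluster / sxx ** 2

# Hypothetical data: residuals are perfectly correlated within each firm.
x = [1.0, 1.0, 1.0, 1.0]
y = [3.0, 3.0, 1.0, 1.0]
firm = ["A", "A", "B", "B"]

beta, var_hc0, var_cluster = slope_and_variances(x, y, firm)
print(beta, var_hc0, var_cluster)
```

Here the clustered variance is twice the HC0 variance, because errors that move together within a firm carry less independent information than HC0 assumes.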

Diagnostic Tests for Standard Error Selection

* Test for heteroskedasticity in FE model
xtreg profit revenue rnd_spending, fe
xttest3  // Modified Wald test (rejects => use robust/cluster SE)

* Test for serial correlation
xtserial profit revenue rnd_spending
* Wooldridge test (rejects => use cluster SE or Newey-West)

* Test for cross-sectional dependence
xtreg profit revenue rnd_spending, fe
xtcsd, pesaran abs
* Pesaran CD test (rejects => consider Driscoll-Kraay SE)

Advanced Specifications

Interaction Effects in Panel Models

* Continuous x continuous interaction with FE
xtreg profit c.rnd_spending##c.market_share i.year, fe cluster(firm_id)

* Visualize marginal effect
margins, dydx(rnd_spending) at(market_share=(0(0.1)1))
marginsplot, title("Marginal Effect of R&D by Market Share")
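
The quantity that margins evaluates here follows directly from the interaction specification. The coefficient labels below are assumptions (beta_1 on rnd_spending, beta_2 on market_share, beta_3 on their product).

```latex
\begin{align}
\frac{\partial\,\text{profit}_{it}}{\partial\,\text{rnd\_spending}_{it}}
  &= \beta_1 + \beta_3\,\text{market\_share}_{it} \\
\operatorname{Var}\!\left(\hat\beta_1 + \hat\beta_3 m\right)
  &= \operatorname{Var}(\hat\beta_1) + m^2 \operatorname{Var}(\hat\beta_3)
   + 2m\,\operatorname{Cov}(\hat\beta_1, \hat\beta_3)
\end{align}
```

Because the effect and its standard error both change with market share m, the coefficient on rnd_spending alone describes only the case m = 0, which is why the plot over the full range is more informative than the table.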

Instrumental Variables in Panel Data

* IV with fixed effects (xtivreg)
xtivreg profit (rnd_spending = tax_credit regulatory_change) ///
    employees size i.year, fe first

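* First-stage F-statistic check
* Report Kleibergen-Paap rk Wald F for weak instruments; xtivreg does not
* report it, but the community-contributed xtivreg2 does with robust or cluster()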

Correlated Random Effects (Mundlak)

* Mundlak (1978) approach: include within-group means
foreach var of varlist revenue rnd_spending employees {
    bysort firm_id: egen bar_`var' = mean(`var')
}

xtreg profit revenue rnd_spending employees ///
    bar_revenue bar_rnd_spending bar_employees ///
    i.year, re cluster(firm_id)

* In a balanced panel, coefficients on the time-varying vars match the FE estimates
* Coefficients on bar_ vars capture how the between-unit relationship differs from the within one
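
The Mundlak device amounts to modelling the unit effect as a linear function of the unit means. In this sketch, gamma and a_i are assumed names, and the bar_ variables in the code play the role of the X-bar terms.

```latex
\begin{align}
\mu_i &= \bar{X}_i\gamma + a_i, \qquad E\left[a_i \mid X_{i1},\dots,X_{iT}\right] = 0 \\
y_{it} &= X_{it}\beta + \bar{X}_i\gamma + a_i + \varepsilon_{it}
\end{align}
```

Setting gamma = 0 collapses the model to ordinary random effects, so a joint test of the bar_ coefficients doubles as a robust Hausman-type test. Unlike fixed effects, time-invariant regressors can stay in the model.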

Publication Tables

* Comparison table: FE vs RE vs GMM
* gmm_model must be stored first (estimates store gmm_model after xtabond2)
esttab fe_model re_model gmm_model using "tables/panel_comparison.tex", ///
    b(3) se(3) star(* 0.10 ** 0.05 *** 0.01) ///
    label title("Panel Regression Results") ///
    mtitles("Fixed Effects" "Random Effects" "System GMM") ///
    stats(N N_g r2_w ar2p hansenp, ///
        labels("Observations" "Firms" "Within R-squared" ///
               "AR(2) p-value" "Hansen p-value") ///
        fmt(0 0 3 3 3)) ///
    addnotes("Clustered standard errors in parentheses." ///
             "All models include year fixed effects.") ///
    replace

References

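Wooldridge, J.M. (2010), Econometric Analysis of Cross Section and Panel Data, 2nd ed., MIT Press
Arellano, M. & Bond, S. (1991), "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations," Review of Economic Studies 58(2)
Blundell, R. & Bond, S. (1998), "Initial Conditions and Moment Restrictions in Dynamic Panel Data Models," Journal of Econometrics 87(1)
Mundlak, Y. (1978), "On the Pooling of Time Series and Cross Section Data," Econometrica 46(1)
Roodman, D. (2009), "How to Do xtabond2: An Introduction to Difference and System GMM in Stata," Stata Journal 9(1)
Cameron, A.C. & Trivedi, P.K. (2005), Microeconometrics: Methods and Applications, Cambridge University Press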

Expert panel data regression analysis with fixed effects and GMM

1,232 stars

data-ai

Updated May 26, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status.

| Scanner | Purpose | Status |
|---------|---------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | -- | Skipped (50%) |

Last scanned: Apr 16, 2026, 3:27 PM (6.0s, 1 file scanned)

SKILL.md

name: panel-data-analyst
description: Expert panel data regression analysis with fixed effects and GMM
emoji: 📊
category: analysis
subcategory: econometrics
keywords: ["panel data", "fixed effects", "random effects", "GMM", "dynamic panel", "Hausman test"]
source: https://www.stata.com/manuals/xt.pdf

Install manually

# Clone the repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research.git

# Copy into Claude Code skills folder (global)
cp -r Awesome-Agent-Skills-for-Empirical-Research/skills/43-wentorai-research-plugins/skills/analysis/econometrics/panel-data-analyst ~/.claude/skills/

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT