skills/43-wentorai-research-plugins/skills/domains/biomedical/epidemiology-guide/SKILL.md
A skill for designing and analyzing epidemiological studies. Covers study design selection, measures of disease frequency and association, bias assessment, and public health data analysis methods.
Evidence strength, from highest to lowest:

1. Systematic Review / Meta-Analysis (highest)
2. Randomized Controlled Trial
3. Cohort Study (Prospective)
4. Case-Control Study
5. Cross-Sectional Study
6. Case Report / Case Series (lowest)
| Design | Research Question | Time | Cost | Bias Risk |
|--------|------------------|------|------|-----------|
| RCT | Does intervention X prevent outcome Y? | Years | Very high | Lowest |
| Prospective Cohort | Does exposure X increase risk of Y? | Years | High | Moderate |
| Retrospective Cohort | Historical exposure-outcome relationship? | Months | Moderate | Moderate-High |
| Case-Control | What exposures are associated with rare disease? | Months | Low | High |
| Cross-Sectional | What is the prevalence of X? | Weeks | Low | High |
| Ecological | Do population-level factors correlate with disease? | Weeks | Very low | Very high |
import numpy as np
from typing import Optional


def compute_measures(cases: int, population: int,
                     person_time: Optional[float] = None,
                     period_years: float = 1.0) -> dict:
    """
    Compute basic epidemiological measures.

    Prevalence and cumulative incidence share the same arithmetic
    (cases / population); which label applies depends on whether `cases`
    counts existing cases or new cases.

    Args:
        cases: Number of new cases (for incidence) or existing cases (for prevalence)
        population: Population at risk
        person_time: Person-years of follow-up (for incidence rate)
        period_years: Time period in years (for cumulative incidence)
    """
    if population <= 0:
        raise ValueError("population must be positive")

    measures = {}

    # Point prevalence: existing cases at a single point in time
    measures['prevalence'] = {
        'value': cases / population,
        'per_1000': (cases / population) * 1000,
        'formula': 'cases / population at a point in time'
    }

    # Cumulative incidence (risk): new cases over the follow-up period
    measures['cumulative_incidence'] = {
        'value': cases / population,
        'per_1000': (cases / population) * 1000,
        'period_years': period_years,
        'formula': 'new cases / population at risk during time period'
    }

    # Incidence rate (only if positive person-time is supplied)
    if person_time is not None and person_time > 0:
        measures['incidence_rate'] = {
            'value': cases / person_time,
            'per_1000_py': (cases / person_time) * 1000,
            'formula': 'new cases / person-time at risk'
        }

    return measures
def measures_of_association(a: int, b: int, c: int, d: int) -> dict:
    """
    Compute epidemiological measures of association from a 2x2 table.

                 Disease+   Disease-
    Exposed+        a          b        a+b
    Exposed-        c          d        c+d
                   a+c        b+d        N

    All four cells must be non-zero; the log-scale standard errors
    are undefined otherwise.

    Args:
        a: Exposed with disease
        b: Exposed without disease
        c: Unexposed with disease
        d: Unexposed without disease
    """
    # Risk in exposed and unexposed
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)

    # Risk Ratio (Relative Risk) with a Wald 95% CI on the log scale
    rr = risk_exposed / risk_unexposed
    ln_rr = np.log(rr)
    se_ln_rr = np.sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d))
    rr_ci = (np.exp(ln_rr - 1.96*se_ln_rr), np.exp(ln_rr + 1.96*se_ln_rr))

    # Odds Ratio with a Wald 95% CI on the log scale
    or_val = (a * d) / (b * c)
    ln_or = np.log(or_val)
    se_ln_or = np.sqrt(1/a + 1/b + 1/c + 1/d)
    or_ci = (np.exp(ln_or - 1.96*se_ln_or), np.exp(ln_or + 1.96*se_ln_or))

    # Attributable Risk (Risk Difference)
    ar = risk_exposed - risk_unexposed
    se_ar = np.sqrt(risk_exposed*(1-risk_exposed)/(a+b) +
                    risk_unexposed*(1-risk_unexposed)/(c+d))
    ar_ci = (ar - 1.96*se_ar, ar + 1.96*se_ar)

    # Attributable Fraction in Exposed
    af_exposed = (rr - 1) / rr

    # Population Attributable Fraction (Levin's formula); the exposure
    # prevalence is taken from the table itself
    prevalence_exposure = (a + b) / (a + b + c + d)
    paf = prevalence_exposure * (rr - 1) / (prevalence_exposure * (rr - 1) + 1)

    # Cast NumPy scalars to plain floats so the results print cleanly
    return {
        'risk_ratio': {'value': round(rr, 3),
                       'ci_95': tuple(round(float(x), 3) for x in rr_ci)},
        'odds_ratio': {'value': round(or_val, 3),
                       'ci_95': tuple(round(float(x), 3) for x in or_ci)},
        'risk_difference': {'value': round(ar, 4),
                            'ci_95': tuple(round(float(x), 4) for x in ar_ci)},
        'attributable_fraction_exposed': round(af_exposed, 3),
        'population_attributable_fraction': round(paf, 3),
        'number_needed_to_harm': round(1/ar, 1) if ar > 0 else None
    }


# Example: smoking and lung cancer
result = measures_of_association(a=80, b=920, c=10, d=990)
print(f"RR: {result['risk_ratio']['value']} ({result['risk_ratio']['ci_95']})")
print(f"OR: {result['odds_ratio']['value']} ({result['odds_ratio']['ci_95']})")
print(f"PAF: {result['population_attributable_fraction']}")
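For reference, the quantities computed by `measures_of_association` can be written out using the cell labels a, b, c, d from its docstring. The symbols R_1, R_0 and P_e are shorthand introduced here for the risk in the exposed, the risk in the unexposed, and the exposure prevalence; they do not appear in the code.

$$
\begin{aligned}
R_1 &= \frac{a}{a+b}, \qquad R_0 = \frac{c}{c+d}, \qquad P_e = \frac{a+b}{a+b+c+d} \\
\mathrm{RR} &= \frac{R_1}{R_0}, \qquad \mathrm{SE}(\ln \mathrm{RR}) = \sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}} \\
\mathrm{OR} &= \frac{ad}{bc}, \qquad \mathrm{SE}(\ln \mathrm{OR}) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} \\
\mathrm{RD} &= R_1 - R_0, \qquad \mathrm{SE}(\mathrm{RD}) = \sqrt{\frac{R_1(1-R_1)}{a+b} + \frac{R_0(1-R_0)}{c+d}} \\
\mathrm{AF}_e &= \frac{\mathrm{RR}-1}{\mathrm{RR}}, \qquad \mathrm{PAF} = \frac{P_e(\mathrm{RR}-1)}{P_e(\mathrm{RR}-1)+1}
\end{aligned}
$$

With the example counts a = 80, b = 920, c = 10, d = 990, these give R_1 = 0.08 and R_0 = 0.01, so RR = 8.0, OR = 79200/9200 ≈ 8.61, RD = 0.07, AF_e = 7/8 = 0.875, and, with P_e = 0.5, PAF = 3.5/4.5 ≈ 0.778.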
| Bias Type | Description | Mitigation Strategy |
|-----------|------------|-------------------|
| Selection bias | Non-random sample selection | Random sampling, matching |
| Information bias | Measurement error in exposure/outcome | Validated instruments, blinding |
| Recall bias | Differential recall by disease status | Use records, not self-report |
| Confounding | Third variable affects both exposure and outcome | Stratification, regression, matching |
| Lead-time bias | Earlier detection misinterpreted as longer survival | Use mortality, not survival |
| Healthy worker effect | Workers are healthier than general population | Use employed comparison group |
def assess_confounding(crude_rr: float, adjusted_rr: float,
                       threshold: float = 0.10) -> dict:
    """
    Assess whether a variable is a confounder using the change-in-estimate rule.

    The variable is flagged when adjusting for it shifts the RR by more than
    `threshold` (default 10%), measured relative to the crude estimate.
    """
    pct_change = abs(crude_rr - adjusted_rr) / crude_rr * 100
    is_confounder = pct_change > threshold * 100
    return {
        'crude_RR': crude_rr,
        'adjusted_RR': adjusted_rr,
        'percent_change': round(pct_change, 1),
        'is_confounder': is_confounder,
        'interpretation': (
            f"{'Confounder detected' if is_confounder else 'Not a confounder'}: "
            f"adjusting changed the RR by {pct_change:.1f}% "
            f"(threshold: {threshold*100:.0f}%)"
        )
    }
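`assess_confounding` takes the adjusted RR as an input and does not say where it comes from. One common source is a Mantel-Haenszel pooled estimate across strata of the suspected confounder. The sketch below is a minimal illustration under that assumption: the function names `crude_risk_ratio` and `mantel_haenszel_rr` and the stratum counts are hypothetical, chosen so that the exposure has no effect within either stratum.

```python
from typing import List, Tuple

# Each stratum is (a, b, c, d): exposed cases, exposed non-cases,
# unexposed cases, unexposed non-cases. Counts are made up for illustration.
Stratum = Tuple[int, int, int, int]


def crude_risk_ratio(strata: List[Stratum]) -> float:
    """Risk ratio after collapsing all strata into a single 2x2 table."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a / (a + b)) / (c / (c + d))


def mantel_haenszel_rr(strata: List[Stratum]) -> float:
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical strata of a confounder (e.g. two age groups).
# Within each stratum the risk is identical in exposed and unexposed.
strata = [
    (10, 90, 40, 360),   # low-risk stratum: risk 0.10 in both groups
    (160, 240, 40, 60),  # high-risk stratum: risk 0.40 in both groups
]

crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_rr(strata)
pct_change = abs(crude - adjusted) / crude * 100

print(f"Crude RR: {crude:.3f}")
print(f"Mantel-Haenszel RR: {adjusted:.3f}")
print(f"Change in estimate: {pct_change:.1f}%")
```

Here the crude RR is well above 1 only because the exposed group is concentrated in the high-risk stratum. The pooled estimate removes that imbalance, and the resulting pair of values is what `assess_confounding` expects as `crude_rr` and `adjusted_rr`.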
For time-to-event data, use Kaplan-Meier estimators for descriptive analysis, log-rank tests for group comparisons, and Cox proportional hazards regression for multivariable analysis. Always check the proportional hazards assumption using Schoenfeld residuals and report median survival times with 95% confidence intervals.
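The Kaplan-Meier step can be sketched in plain NumPy to show what the estimator does. This is a minimal sketch, not a substitute for a dedicated survival analysis library: the function names `kaplan_meier` and `median_survival` and the follow-up data are hypothetical, and the sketch omits confidence intervals, the log-rank test, Cox regression, and the Schoenfeld residual check.

```python
import numpy as np


def kaplan_meier(durations, events):
    """
    Kaplan-Meier survival estimate.

    durations: follow-up time for each subject
    events: 1 if the event was observed, 0 if the subject was censored
    Returns (event_times, survival), with survival[i] = S(event_times[i]).
    """
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=bool)

    # The estimate only changes at times where an event was observed
    event_times = np.unique(durations[events])

    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)            # still under observation at t
        deaths = np.sum((durations == t) & events)  # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)


def median_survival(event_times, survival):
    """First event time at which S(t) drops to 0.5 or below, else None."""
    below = np.where(survival <= 0.5)[0]
    return float(event_times[below[0]]) if below.size else None


# Hypothetical follow-up data: times in months, 0 marks a censored subject
durations = [2, 3, 3, 5, 6, 8, 9, 12]
events = [1, 1, 0, 1, 0, 1, 1, 0]

times, surv = kaplan_meier(durations, events)
median = median_survival(times, surv)

for t, s in zip(times, surv):
    print(f"t = {t:>4.0f}  S(t) = {s:.3f}")
print(f"Median survival: {median}")
```

Censored subjects count toward the number at risk until their censoring time and then drop out without triggering a step, which is why the curve is not simply the raw proportion still alive.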
Follow STROBE (observational studies), CONSORT (trials), or RECORD (routinely collected data) reporting guidelines. Report all measures with 95% confidence intervals. Present both crude and adjusted estimates to show the impact of confounding adjustment.