skills/domains/biomedical/epidemiology-guide/SKILL.md
Epidemiological study designs, measures of association, and public health ana...
npx skillsauth add wentorai/research-plugins epidemiology-guideInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
A skill for designing and analyzing epidemiological studies. Covers study design selection, measures of disease frequency and association, bias assessment, and public health data analysis methods.
Evidence Strength
|
Systematic Review / Meta-Analysis (Highest)
|
Randomized Controlled Trial
|
Cohort Study (Prospective)
|
Case-Control Study
|
Cross-Sectional Study
|
Case Report / Case Series (Lowest)
| Design | Research Question | Time | Cost | Bias Risk | |--------|------------------|------|------|-----------| | RCT | Does intervention X prevent outcome Y? | Years | Very high | Lowest | | Prospective Cohort | Does exposure X increase risk of Y? | Years | High | Moderate | | Retrospective Cohort | Historical exposure-outcome relationship? | Months | Moderate | Moderate-High | | Case-Control | What exposures are associated with rare disease? | Months | Low | High | | Cross-Sectional | What is the prevalence of X? | Weeks | Low | High | | Ecological | Do population-level factors correlate with disease? | Weeks | Very low | Very high |
import numpy as np
def compute_measures(cases: int, population: int,
person_time: float = None,
period_years: float = 1.0) -> dict:
"""
Compute basic epidemiological measures.
Args:
cases: Number of new cases (for incidence) or existing cases (for prevalence)
population: Population at risk
person_time: Person-years of follow-up (for incidence rate)
period_years: Time period in years (for cumulative incidence)
"""
measures = {}
# Point prevalence
measures['prevalence'] = {
'value': cases / population,
'per_1000': (cases / population) * 1000,
'formula': 'cases / population at a point in time'
}
# Cumulative incidence (risk)
measures['cumulative_incidence'] = {
'value': cases / population,
'per_1000': (cases / population) * 1000,
'period_years': period_years,
'formula': 'new cases / population at risk during time period'
}
# Incidence rate (if person-time available)
if person_time:
measures['incidence_rate'] = {
'value': cases / person_time,
'per_1000_py': (cases / person_time) * 1000,
'formula': 'new cases / person-time at risk'
}
return measures
def measures_of_association(a: int, b: int, c: int, d: int) -> dict:
"""
Compute epidemiological measures of association from a 2x2 table.
Disease+ Disease-
Exposed+ a b a+b
Exposed- c d c+d
a+c b+d N
Args:
a: Exposed with disease
b: Exposed without disease
c: Unexposed with disease
d: Unexposed without disease
"""
# Risk in exposed and unexposed
risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
# Risk Ratio (Relative Risk)
rr = risk_exposed / risk_unexposed
ln_rr = np.log(rr)
se_ln_rr = np.sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d))
rr_ci = (np.exp(ln_rr - 1.96*se_ln_rr), np.exp(ln_rr + 1.96*se_ln_rr))
# Odds Ratio
or_val = (a * d) / (b * c)
ln_or = np.log(or_val)
se_ln_or = np.sqrt(1/a + 1/b + 1/c + 1/d)
or_ci = (np.exp(ln_or - 1.96*se_ln_or), np.exp(ln_or + 1.96*se_ln_or))
# Attributable Risk (Risk Difference)
ar = risk_exposed - risk_unexposed
se_ar = np.sqrt(risk_exposed*(1-risk_exposed)/(a+b) +
risk_unexposed*(1-risk_unexposed)/(c+d))
ar_ci = (ar - 1.96*se_ar, ar + 1.96*se_ar)
# Attributable Fraction in Exposed
af_exposed = (rr - 1) / rr
# Population Attributable Fraction
prevalence_exposure = (a + b) / (a + b + c + d)
paf = prevalence_exposure * (rr - 1) / (prevalence_exposure * (rr - 1) + 1)
return {
'risk_ratio': {'value': round(rr, 3), 'ci_95': tuple(round(x, 3) for x in rr_ci)},
'odds_ratio': {'value': round(or_val, 3), 'ci_95': tuple(round(x, 3) for x in or_ci)},
'risk_difference': {'value': round(ar, 4), 'ci_95': tuple(round(x, 4) for x in ar_ci)},
'attributable_fraction_exposed': round(af_exposed, 3),
'population_attributable_fraction': round(paf, 3),
'number_needed_to_harm': round(1/ar, 1) if ar > 0 else None
}
# Example: smoking and lung cancer
result = measures_of_association(a=80, b=920, c=10, d=990)
print(f"RR: {result['risk_ratio']['value']} ({result['risk_ratio']['ci_95']})")
print(f"OR: {result['odds_ratio']['value']} ({result['odds_ratio']['ci_95']})")
print(f"PAF: {result['population_attributable_fraction']}")
| Bias Type | Description | Mitigation Strategy | |-----------|------------|-------------------| | Selection bias | Non-random sample selection | Random sampling, matching | | Information bias | Measurement error in exposure/outcome | Validated instruments, blinding | | Recall bias | Differential recall by disease status | Use records, not self-report | | Confounding | Third variable affects both exposure and outcome | Stratification, regression, matching | | Lead-time bias | Earlier detection misinterpreted as longer survival | Use mortality, not survival | | Healthy worker effect | Workers are healthier than general population | Use employed comparison group |
def assess_confounding(crude_rr: float, adjusted_rr: float,
threshold: float = 0.10) -> dict:
"""
Assess whether a variable is a confounder.
"""
pct_change = abs(crude_rr - adjusted_rr) / crude_rr * 100
return {
'crude_RR': crude_rr,
'adjusted_RR': adjusted_rr,
'percent_change': round(pct_change, 1),
'is_confounder': pct_change > threshold * 100,
'interpretation': (
f"{'Confounder detected' if pct_change > threshold * 100 else 'Not a confounder'}: "
f"adjusting changed the RR by {pct_change:.1f}% "
f"(threshold: {threshold*100:.0f}%)"
)
}
For time-to-event data, use Kaplan-Meier estimators for descriptive analysis, log-rank tests for group comparisons, and Cox proportional hazards regression for multivariable analysis. Always check the proportional hazards assumption using Schoenfeld residuals and report median survival times with 95% confidence intervals.
Follow STROBE (observational studies), CONSORT (trials), or RECORD (routinely collected data) reporting guidelines. Report all measures with 95% confidence intervals. Present both crude and adjusted estimates to show the impact of confounding adjustment.
documentation
Write Tsinghua University theses using the ThuThesis LaTeX template
development
Templates, formatting rules, and strategies for thesis and dissertation writing
documentation
Set up LaTeX templates for PhD and Master's thesis documents
documentation
Write SJTU theses using the SJTUThesis LaTeX template with full compliance