.agents/skills/statistical-analysis/SKILL.md
Apply statistical methods including descriptive stats, trend analysis, outlier detection, and hypothesis testing. Use when analyzing distributions, testing for significance, detecting anomalies, computing correlations, or interpreting statistical results.
Descriptive statistics, trend analysis, outlier detection, hypothesis testing, and guidance on when to be cautious about statistical claims.
Choose the right measure of center based on the data:
| Situation | Use | Why |
|---|---|---|
| Symmetric distribution, no outliers | Mean | Most efficient estimator |
| Skewed distribution | Median | Robust to outliers |
| Categorical or ordinal data | Mode | Only option for non-numeric |
| Highly skewed with outliers (e.g., revenue per user) | Median + mean | Report both; the gap shows skew |
Always report mean and median together for business metrics. If they diverge substantially, the data is skewed and the mean alone is misleading.
Report key percentiles to tell a richer story than mean alone:
p1: Bottom 1% (floor / minimum typical value)
p5: Low end of normal range
p25: First quartile
p50: Median (typical user)
p75: Third quartile
p90: Top 10% / power users
p95: High end of normal range
p99: Top 1% / extreme users
Example narrative: "The median session duration is 4.2 minutes, but the top 10% of users spend over 22 minutes per session, pulling the mean up to 7.8 minutes."
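A minimal sketch of how the center measures and percentiles above might be computed together. It uses only the standard library; the `summarize` helper and the sample durations are hypothetical, chosen so that one long session pulls the mean away from the median.

```python
import statistics

def summarize(values):
    """Return mean, median, and a few percentiles for a list of numbers."""
    # quantiles(n=100) returns the 99 cut points p1..p99 (index 0 is p1)
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "p25": cuts[24],
        "p75": cuts[74],
        "p90": cuts[89],
        "p99": cuts[98],
    }

# Hypothetical session durations in minutes; the 60-minute session is an outlier
durations = [1, 2, 2, 3, 3, 4, 4, 5, 6, 60]
summary = summarize(durations)
print(summary)
```

With these values the mean sits well above the median, which is the gap the narrative above is meant to explain.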
Characterize every numeric distribution you analyze:
Moving averages to smooth noise:
# 7-day moving average (good for daily data with weekly seasonality)
df['ma_7d'] = df['metric'].rolling(window=7, min_periods=1).mean()
# 28-day moving average (smooths weekly AND monthly patterns)
df['ma_28d'] = df['metric'].rolling(window=28, min_periods=1).mean()
Period-over-period comparison:
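One way to sketch a period-over-period comparison, assuming a plain list of daily values and a lag of 7 for week-over-week. The `pct_change_vs_lag` helper and the sample numbers are hypothetical.

```python
def pct_change_vs_lag(series, lag):
    """Percent change of each value vs. the value `lag` periods earlier."""
    changes = []
    for i, value in enumerate(series):
        if i < lag or series[i - lag] == 0:
            changes.append(None)  # no comparable prior period
        else:
            prior = series[i - lag]
            changes.append((value - prior) / prior)
    return changes

# Two hypothetical weeks of daily values (Mon..Sun, Mon..Sun)
daily = [100, 120, 130, 125, 140, 90, 80,
         110, 126, 130, 150, 140, 99, 72]
week_over_week = pct_change_vs_lag(daily, lag=7)
print(week_over_week[7:])
```

Comparing each day with the same weekday one week earlier avoids mistaking a normal weekend dip for a decline.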
Growth rates:
Simple growth: (current - previous) / previous
CAGR: (ending / beginning) ^ (1 / years) - 1
Log growth: ln(current / previous) -- better for volatile series
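The three growth formulas above translate directly into Python. This is a small sketch with hypothetical function names; note that the `^` in the CAGR formula means exponentiation, which is `**` in Python.

```python
import math

def simple_growth(current, previous):
    """(current - previous) / previous"""
    return (current - previous) / previous

def cagr(beginning, ending, years):
    """Compound annual growth rate over `years` years."""
    return (ending / beginning) ** (1 / years) - 1

def log_growth(current, previous):
    """Natural-log growth; additive across consecutive periods."""
    return math.log(current / previous)

print(simple_growth(120, 100))
print(cagr(100, 121, 2))
print(log_growth(120, 100) + log_growth(150, 120), log_growth(150, 100))
```

Log growth rates add up across periods and are symmetric for gains and losses, which is one reason they suit volatile series.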
Check for periodic patterns:
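One possible check for a periodic pattern is the sample autocorrelation at a candidate lag (7 for weekly patterns in daily data). The sketch below is an assumption about how such a check could look, not the skill's prescribed method; the helper name and data are hypothetical, and the series must not be constant.

```python
import statistics

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at `lag` (series must vary)."""
    mean = statistics.fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[i] - mean) * (series[i - lag] - mean)
              for i in range(lag, len(series)))
    return num / denom

# Four hypothetical weeks with an identical weekly shape
daily = [10, 12, 13, 12, 14, 6, 5] * 4
acf_7 = autocorrelation(daily, 7)
acf_3 = autocorrelation(daily, 3)
print(acf_7, acf_3)
```

A clearly higher value at lag 7 than at neighboring lags suggests a weekly cycle worth accounting for before comparing days.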
For business analysts (not data scientists), use straightforward methods:
Always communicate uncertainty. Provide a range, not a point estimate:
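As a rough illustration of reporting a range, the sketch below centers a naive forecast on the mean of recent periods and widens it by a multiple of their standard deviation. The helper, the window, and the multiplier `k` are all assumptions for illustration; the band is a rule of thumb, not a formal prediction interval.

```python
import statistics

def naive_forecast_range(history, window=4, k=2.0):
    """Return (low, center, high) from the last `window` observations."""
    recent = history[-window:]
    center = statistics.fmean(recent)
    spread = k * statistics.stdev(recent)  # needs at least 2 points
    return center - spread, center, center + spread

# Hypothetical weekly totals
history = [100, 104, 98, 102, 96, 104, 100, 100]
low, center, high = naive_forecast_range(history)
print(f"Expected next period: {low:.0f} to {high:.0f} (center {center:.0f})")
```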
When to escalate to a data scientist: Non-linear trends, multiple seasonalities, external factors (marketing spend, holidays), or when forecast accuracy matters for resource allocation.
Z-score method (for normally distributed data):
z_scores = (df['value'] - df['value'].mean()) / df['value'].std()
outliers = df[abs(z_scores) > 3] # More than 3 standard deviations
IQR method (robust to non-normal distributions):
Q1 = df['value'].quantile(0.25)
Q3 = df['value'].quantile(0.75)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
outliers = df[(df['value'] < lower_bound) | (df['value'] > upper_bound)]
Percentile method (simplest):
outliers = df[(df['value'] < df['value'].quantile(0.01)) |
(df['value'] > df['value'].quantile(0.99))]
Do NOT automatically remove outliers. Instead:
Report what you did: "We excluded 47 records (0.3%) with transaction amounts >$50K, which represent bulk enterprise orders analyzed separately."
For detecting unusual values in a time series:
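A minimal sketch of one approach for time series: compare each point with the mean and standard deviation of a trailing window that excludes the point itself. The function name, window, and threshold are hypothetical defaults, not values taken from the skill.

```python
import statistics

def rolling_anomalies(series, window=7, threshold=3.0):
    """Return indices whose value is far from the trailing-window mean."""
    flagged = []
    for i in range(window, len(series)):
        trailing = series[i - window:i]  # excludes the current point
        mean = statistics.fmean(trailing)
        sd = statistics.stdev(trailing)
        if sd > 0 and abs(series[i] - mean) / sd > threshold:
            flagged.append(i)
    return flagged

# Hypothetical daily metric with one spike at index 8
series = [100, 102, 98, 101, 99, 103, 97, 100, 180, 101, 99]
print(rolling_anomalies(series))
```

Note that the spike inflates the trailing standard deviation for the points that follow it, so later anomalies in the same window may be masked.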
Use hypothesis testing when you need to determine whether an observed difference is likely real or could be due to random chance. Common scenarios:
| Scenario | Test | When to Use |
|---|---|---|
| Compare two group means | t-test (independent) | Normal data, two groups |
| Compare two group proportions | z-test for proportions | Conversion rates, binary outcomes |
| Compare paired measurements | Paired t-test | Before/after on same entities |
| Compare 3+ group means | ANOVA | Multiple segments or variants |
| Non-normal data, two groups | Mann-Whitney U test | Skewed metrics, ordinal data |
| Association between categories | Chi-squared test | Two categorical variables |
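To make one row of the table concrete, here is a sketch of a two-sided z-test for two proportions using only the standard library. The function name and the conversion counts are hypothetical, and the normal approximation assumes reasonably large samples.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference p_b - p_a."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical test: 200/2000 conversions (control) vs 250/2000 (variant)
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```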
Statistical significance means the difference is unlikely due to chance.
Practical significance means the difference is large enough to matter for business decisions.
A difference can be statistically significant but practically meaningless (common with large samples). Always report:
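The gap between the two kinds of significance can be illustrated with a confidence interval on the difference itself. In this sketch (hypothetical helper and counts, normal approximation assumed), very large samples make a lift of 0.1 percentage points statistically distinguishable from zero, even though it may be too small to act on.

```python
from math import sqrt
from statistics import NormalDist

def diff_with_ci(success_a, n_a, success_b, n_b, confidence=0.95):
    """Return the difference in proportions and its confidence interval."""
    p_a, p_b = success_a / n_a, success_b / n_b
    diff = p_b - p_a
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, (diff - z_crit * se, diff + z_crit * se)

# Hypothetical: 10.0% vs 10.1% conversion with 2 million users per group
diff, (low, high) = diff_with_ci(200_000, 2_000_000, 202_000, 2_000_000)
print(f"lift = {diff:.4f}, 95% CI = ({low:.5f}, {high:.5f})")
```

The interval excludes zero, so the result is statistically significant, but the size of the lift is what determines whether it matters for the business.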
When you find a correlation, explicitly consider:
What you can say: "Users who use feature X have 30% higher retention"
What you cannot say without more evidence: "Feature X causes 30% higher retention"
When you test many hypotheses, some will be "significant" by chance:
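A short sketch of why this happens and of one common remedy. The Bonferroni correction used here is an assumption on my part, a widely used conservative adjustment rather than something the skill text names; the function names and p-values are hypothetical.

```python
def family_wise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests

def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that pass the Bonferroni-adjusted threshold."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]

p_values = [0.001, 0.02, 0.04, 0.30, 0.049]  # hypothetical test results
flags = bonferroni_significant(p_values)
fwer = family_wise_error_rate(20)
print(flags, round(fwer, 3))
```

With 20 independent tests at a 0.05 threshold, the chance of at least one spurious "significant" result is roughly 64%, and only the first of the five hypothetical p-values survives the adjusted threshold of 0.01.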
A trend in aggregated data can reverse when data is segmented:
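The reversal can be reproduced with a small set of illustrative counts (segment and variant names are hypothetical). Variant A has the higher success rate inside each segment, yet variant B looks better once the segments are pooled, because the variants have very different segment mixes.

```python
# Illustrative (successes, trials) counts per segment and variant
segments = {
    "segment_1": {"A": (81, 87), "B": (234, 270)},
    "segment_2": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

# Winner within each segment
segment_winners = {
    name: max(variants, key=lambda v: rate(*variants[v]))
    for name, variants in segments.items()
}

# Winner after pooling the segments together
totals = {}
for variants in segments.values():
    for variant, (successes, trials) in variants.items():
        s, t = totals.get(variant, (0, 0))
        totals[variant] = (s + successes, t + trials)
overall_winner = max(totals, key=lambda v: rate(*totals[v]))

print(segment_winners, overall_winner)
```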
You can only analyze entities that "survived" to be in your dataset:
Aggregate trends may not apply to individuals:
Be wary of false precision: