.claude/skills/triangulation/SKILL.md
# Skill: Triangulation / Sanity Check
## Purpose
Cross-reference analytical findings against multiple data sources, external benchmarks, and common sense to catch errors before they become bad decisions.
## When to Use
Apply this skill after every analysis, before presenting findings to stakeholders, and whenever a result seems surprising. If a finding would change a decision, it MUST be triangulated first.
## Instructions
### Triangulation Framework
Every finding gets checked through four lenses, starting with the most common source of misleading results:
CHECK 0: SEGMENT-FIRST → Does this hold at the segment level, or is it a Simpson's Paradox?
CHECK 1: INTERNAL → Do the numbers add up within the analysis?
CHECK 2: CROSS-REFERENCE → Does another data source agree?
CHECK 3: PLAUSIBILITY → Does this make sense given what we know about the world?
Run this check BEFORE accepting any aggregate finding. Simpson's Paradox is the #1 source of misleading analytical conclusions — an aggregate trend that reverses when you look at segments.
Default segments to always check (use whichever are available in the data):
Process for each aggregate finding:
If opposite trends detected:
⚠️ SIMPSON'S PARADOX DETECTED
The aggregate [metric] shows [aggregate trend].
However, [segment value] shows the OPPOSITE: [segment trend].
The aggregate is misleading because [explanation — e.g., the growing
segment masks the declining segment].
Action: Report segment-level findings instead of aggregate. Flag this
prominently in the Executive Summary.
If no opposite trends detected: Record: "Segment-first check PASSED — aggregate trends are consistent with [dimensions checked] segment-level trends."
Include in the Validation Report:
| Check | Result | Detail |
|-------|--------|--------|
| Segment-first (platform) | PASS/FAIL | [specifics] |
| Segment-first (user type) | PASS/FAIL | [specifics] |
This check typically takes 2-3 queries and prevents the most common analytical error. Never skip it.
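The segment-first check described above can be sketched in Python roughly as follows. The `Trend` class, the `segment_first_check` helper, and the sample numbers are hypothetical illustrations, not part of the skill. The sketch only compares trend direction and does not account for statistical noise.

```python
from dataclasses import dataclass


@dataclass
class Trend:
    """Metric value in two periods for one group (hypothetical structure)."""
    name: str
    before: float
    after: float

    @property
    def direction(self) -> int:
        # +1 for an increase, -1 for a decrease, 0 for no change.
        return (self.after > self.before) - (self.after < self.before)


def segment_first_check(aggregate: Trend, segments: list[Trend]) -> list[str]:
    """Return the names of segments whose trend opposes the aggregate trend."""
    if aggregate.direction == 0:
        return []
    return [s.name for s in segments if s.direction == -aggregate.direction]


# Made-up numbers: the aggregate rises while both platform segments fall.
aggregate = Trend("all users", before=0.045, after=0.048)
segments = [
    Trend("ios", before=0.060, after=0.055),
    Trend("android", before=0.030, after=0.028),
]
reversed_segments = segment_first_check(aggregate, segments)
status = "FAIL" if reversed_segments else "PASS"
print(status, reversed_segments)
```

A non-empty result would map to a FAIL row in the Validation Report for that dimension.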
Arithmetic checks:
Logical checks:
def check_internal_consistency(findings):
    """Return a list of (status, message) tuples, one per failed internal check."""
    checks = []
    for finding in findings:
        # Segment sum check: segments should add up to the total within 2%.
        if finding.has_segments:
            segment_sum = sum(finding.segment_values)
            total = finding.total_value
            if total == 0:
                # Avoid dividing by zero: a zero total only matches a zero segment sum.
                mismatch = segment_sum != 0
            else:
                mismatch = abs(segment_sum - total) / abs(total) > 0.02
            if mismatch:
                checks.append(("FAIL", f"Segments sum to {segment_sum}, but total is {total}"))
        # Rate bounds check: a rate must lie in [0, 1].
        if finding.is_rate:
            if finding.value < 0 or finding.value > 1:
                checks.append(("FAIL", f"{finding.name} = {finding.value} is outside [0,1]"))
        # Funnel monotonicity: no step may be larger than the step before it.
        if finding.is_funnel:
            for i in range(1, len(finding.steps)):
                if finding.steps[i] > finding.steps[i-1]:
                    checks.append(("FAIL", f"Funnel step {i} ({finding.steps[i]}) > step {i-1} ({finding.steps[i-1]})"))
    return checks
Calculate the same thing two different ways:
Compare against related metrics:
Time-based cross-reference:
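As an illustration of calculating the same thing two different ways, here is a minimal sketch. The `cross_check` helper, the 2% tolerance (borrowed from the segment-sum check earlier in this skill), and the revenue figures are assumptions made for the example.

```python
import math


def cross_check(name: str, value_a: float, value_b: float, rel_tol: float = 0.02) -> tuple[str, str]:
    """Compare two independent calculations of the same metric."""
    if math.isclose(value_a, value_b, rel_tol=rel_tol):
        return ("PASS", f"{name}: {value_a} vs {value_b} agree within {rel_tol:.0%}")
    return ("FAIL", f"{name}: {value_a} vs {value_b} differ by more than {rel_tol:.0%}")


# Hypothetical data: revenue from individual orders vs. pre-aggregated daily totals.
orders = [120.0, 80.0, 200.0, 100.0]
daily_totals = {"2024-03-01": 200.0, "2024-03-02": 300.0}

revenue_from_orders = sum(orders)
revenue_from_daily = sum(daily_totals.values())
result = cross_check("March revenue", revenue_from_orders, revenue_from_daily)
print(result)
```

The point of the design is that the two inputs come from different tables or pipelines, so agreement is evidence that neither path dropped or duplicated rows.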
Order-of-magnitude checks for common metrics:
| Metric | Typical Range | If Outside Range |
|--------|--------------|------------------|
| SaaS conversion (free → paid) | 2-5% | >10% suspicious; <1% possible but check |
| E-commerce conversion | 1-4% | >8% check for bot filtering issues |
| Email open rate | 15-30% | >50% check for pixel tracking issues |
| Click-through rate (email) | 2-5% | >15% suspicious |
| Monthly churn (SaaS) | 3-8% | <1% check for measurement window; >15% check definition |
| DAU/MAU ratio | 10-25% (B2B SaaS) | >40% unusual for non-social products |
| NPS | 20-50 (good SaaS) | >70 or <-10 check sample methodology |
| Mobile share of traffic | 50-70% (consumer) | <30% check if app traffic is included |
| Bounce rate | 40-60% | <20% check for double-firing analytics |
| Average session duration | 2-5 min (consumer) | >15 min check for session timeout definition |
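An order-of-magnitude check against these ranges might look like the sketch below. The dictionary keys and the `plausibility_check` helper are hypothetical names. Only four rows of the table are encoded, as fractions, and anything outside the typical range is reported as WARN rather than FAIL because the table treats out-of-range values as prompts for investigation.

```python
# Typical ranges taken from four rows of the benchmark table, as fractions.
TYPICAL_RANGES = {
    "saas_free_to_paid_conversion": (0.02, 0.05),
    "ecommerce_conversion": (0.01, 0.04),
    "email_open_rate": (0.15, 0.30),
    "monthly_churn_saas": (0.03, 0.08),
}


def plausibility_check(metric: str, value: float) -> tuple[str, str]:
    """Return PASS if the value is inside the typical range, WARN otherwise."""
    if metric not in TYPICAL_RANGES:
        return ("WARN", f"No benchmark recorded for {metric}")
    low, high = TYPICAL_RANGES[metric]
    if low <= value <= high:
        return ("PASS", f"{metric} = {value:.1%} is within {low:.0%}-{high:.0%}")
    return ("WARN", f"{metric} = {value:.1%} is outside {low:.0%}-{high:.0%}; investigate")


print(plausibility_check("email_open_rate", 0.72))
print(plausibility_check("ecommerce_conversion", 0.025))
```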
Benchmark sources:
What it is: A trend that appears in several groups reverses when the groups are combined.
How to check: Always look at both the aggregate AND the segmented view. If they disagree, investigate the segment sizes.
Example: Overall conversion went up, but conversion went DOWN in every segment. Cause: the highest-converting segment grew as a share of traffic.
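The arithmetic behind that example can be worked through with made-up numbers. In this sketch the segment names, visitor counts, and rates are all invented for illustration. Both segment rates fall, yet the traffic-weighted overall rate rises because the higher-converting segment grows from 20% to 60% of traffic.

```python
def overall_rate(segments: dict[str, tuple[int, float]]) -> float:
    """Traffic-weighted conversion rate; segments maps name -> (visitors, rate)."""
    visitors = sum(v for v, _ in segments.values())
    conversions = sum(v * r for v, r in segments.values())
    return conversions / visitors


# Invented data: desktop converts better and grows as a share of traffic.
before = {"desktop": (2_000, 0.10), "mobile": (8_000, 0.020)}
after = {"desktop": (6_000, 0.09), "mobile": (4_000, 0.018)}

every_segment_fell = all(after[name][1] < before[name][1] for name in before)
print(f"overall before: {overall_rate(before):.2%}")
print(f"overall after:  {overall_rate(after):.2%}")
print(f"every segment fell: {every_segment_fell}")
```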
What it is: Analyzing only the data that "survived" a selection process, ignoring what was filtered out.
How to check: Ask "what's NOT in this dataset?" Check if churned users, failed transactions, or deleted accounts are excluded.
Example: "Average revenue per user increased!" But that is only because low-spending users churned, leaving only high spenders.
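A small sketch of that revenue example, using invented users and amounts: averaging over survivors only makes revenue per user look like it rose, while averaging over the original cohort (counting churned users as zero) shows it fell. The helper names are hypothetical.

```python
from typing import Optional

# Invented monthly revenue per user; None means the user churned.
revenue_before: dict[str, Optional[float]] = {"u1": 10.0, "u2": 12.0, "u3": 90.0, "u4": 110.0}
revenue_after: dict[str, Optional[float]] = {"u1": None, "u2": None, "u3": 90.0, "u4": 110.0}


def arpu_survivors(revenue: dict[str, Optional[float]]) -> float:
    """Average revenue over users still present; churned users are dropped."""
    active = [v for v in revenue.values() if v is not None]
    return sum(active) / len(active)


def arpu_fixed_cohort(revenue: dict[str, Optional[float]], cohort: list[str]) -> float:
    """Average over the original cohort, counting churned users as zero revenue."""
    return sum((revenue.get(user) or 0.0) for user in cohort) / len(cohort)


cohort = list(revenue_before)
print(arpu_survivors(revenue_before), arpu_survivors(revenue_after))
print(arpu_fixed_cohort(revenue_before, cohort), arpu_fixed_cohort(revenue_after, cohort))
```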
What it is: Events counted in different time zones create artificial spikes or dips at day boundaries.
How to check: Look at hourly distributions. If there's a spike at midnight UTC, check if events are being bucketed incorrectly.
Example: "Signups spike at midnight" because the mobile app reports in local time but the backend stores in UTC.
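To see how the bucketing time zone changes daily counts, consider this sketch. The timestamps are invented, and a fixed UTC-8 offset stands in for a real local time zone (an assumption that keeps the example dependent only on the standard library).

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Invented signup timestamps, stored in UTC.
signups_utc = [
    datetime(2024, 3, 2, 0, 5, tzinfo=timezone.utc),
    datetime(2024, 3, 2, 1, 30, tzinfo=timezone.utc),
    datetime(2024, 3, 2, 3, 45, tzinfo=timezone.utc),
    datetime(2024, 3, 2, 15, 0, tzinfo=timezone.utc),
]

# Assumption: a fixed UTC-8 offset approximates the users' local time.
LOCAL = timezone(timedelta(hours=-8))


def daily_counts(events: list[datetime], tz: timezone) -> dict[str, int]:
    """Count events per calendar day as seen in the given time zone."""
    return dict(Counter(e.astimezone(tz).date().isoformat() for e in events))


by_utc_day = daily_counts(signups_utc, timezone.utc)
by_local_day = daily_counts(signups_utc, LOCAL)
print(by_utc_day)
print(by_local_day)
```

The same four events land on one day under UTC bucketing and split three to one across two days under local bucketing, which is the kind of disagreement worth checking before interpreting a daily spike.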
What it is: Comparing periods where one period has incomplete data (e.g., comparing full January to partial February).
How to check: Always verify the data range is complete. Check the latest event date. Compare like-for-like periods.
Example: "February revenue dropped 40%!" But it is February 15th, and you are comparing against all of January.
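A like-for-like comparison can be sketched as follows. The daily revenue figures and the `period_total` helper are invented for illustration: the data covers all of January and February only through the 15th, so the naive month-over-month change looks like a 40% drop while the comparison over the same number of days shows growth.

```python
from datetime import date

# Invented daily revenue: all of January, and February through the 15th.
daily_revenue = {date(2024, 1, d): 100.0 for d in range(1, 32)}
daily_revenue.update({date(2024, 2, d): 124.0 for d in range(1, 16)})


def period_total(revenue: dict[date, float], year: int, month: int, through_day: int = 31) -> float:
    """Sum revenue for a month, optionally only up to a given day of the month."""
    return sum(
        amount
        for day, amount in revenue.items()
        if day.year == year and day.month == month and day.day <= through_day
    )


latest = max(daily_revenue)  # latest event date in the data
naive_change = period_total(daily_revenue, 2024, 2) / period_total(daily_revenue, 2024, 1) - 1
like_for_like = (
    period_total(daily_revenue, 2024, 2, latest.day)
    / period_total(daily_revenue, 2024, 1, latest.day)
    - 1
)
print(f"latest date: {latest}, naive: {naive_change:+.0%}, like-for-like: {like_for_like:+.0%}")
```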
What it is: A rate changes not because the behavior changed, but because the pool being measured changed.
How to check: Always look at numerator and denominator separately before interpreting the ratio.
Example: "Conversion rate doubled!" But the baseline period included a marketing campaign that brought in low-intent traffic: the denominator spiked while the numerator stayed flat, which depressed the rate. When the campaign ended, the denominator dropped back and the rate recovered without any change in user behavior.
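Looking at numerator and denominator separately can be as simple as the sketch below. The `describe_rate_change` helper and the counts are hypothetical. In this invented case the rate doubles even though conversions are flat, because visitors halved.

```python
def describe_rate_change(before: tuple[int, int], after: tuple[int, int]) -> dict[str, float]:
    """Report the relative change in a rate and in its numerator and denominator.

    Each argument is (numerator, denominator), e.g. (conversions, visitors).
    """
    (n0, d0), (n1, d1) = before, after
    return {
        "rate_change": (n1 / d1) / (n0 / d0) - 1,
        "numerator_change": n1 / n0 - 1,
        "denominator_change": d1 / d0 - 1,
    }


# Invented counts: the campaign month vs. the month after the campaign ended.
campaign_month = (200, 20_000)   # 1.0% conversion
following_month = (200, 10_000)  # 2.0% conversion
change = describe_rate_change(campaign_month, following_month)
for key, value in change.items():
    print(f"{key}: {value:+.0%}")
```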
What it is: Two metrics move together, but one doesn't cause the other.
How to check: Look for confounders. Ask "what else changed at the same time?" Check if the relationship holds across different segments.
Example: "Users who use Feature X have 2x retention." But power users may both use Feature X AND have high retention because they are power users, not because Feature X causes retention.
# Validation Report: [Analysis Name]
## Date: [YYYY-MM-DD]
### Overall Confidence: [HIGH / MEDIUM / LOW]
### Finding-by-Finding Validation
#### Finding 1: [statement]
| Check | Result | Detail |
|-------|--------|--------|
| Internal consistency | PASS/WARN/FAIL | [specifics] |
| Cross-reference | PASS/WARN/FAIL | [specifics] |
| External plausibility | PASS/WARN/FAIL | [specifics] |
| Analytical errors | PASS/WARN/FAIL | [which errors checked, any found] |
| **Confidence** | **HIGH/MEDIUM/LOW** | [summary justification] |
[Repeat for each finding]
### Caveats for Stakeholders
[What should be mentioned when presenting these findings]
### Recommended Additional Validation
[What would increase confidence — more data, different analysis, A/B test]
Finding: "Mobile conversion rate increased from 2.1% to 3.4% in March"
Cross-reference check: Look at numerator and denominator separately.
Finding: "Overall activation rate improved from 45% to 48% this quarter"
Segment check:
Finding: "Email campaign achieved 72% open rate"
External plausibility: Industry average is 15-30%. 72% is extreme.
Investigation: Apple Mail Privacy Protection pre-fetches email images, inflating open rates for Apple Mail users. 68% of the list uses Apple Mail.
Verdict: WARN. True open rate is likely 25-35% after adjusting for Apple privacy pre-fetching. Report the adjusted number alongside the raw number.