Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

lyndonkl/reference-class-forecasting

Name: reference-class-forecasting
Author: lyndonkl

skills/reference-class-forecasting/SKILL.md

npx skillsauth add lyndonkl/claude reference-class-forecasting

Reference Class Forecasting

Interactive Menu
Quick Reference
Resource Files

Interactive Menu

What would you like to do?

Core Workflows

1. Find My Base Rate - Identify reference class and get statistical baseline

Guided process to select correct reference class
Search strategies for finding historical frequencies
Validation that you have the right anchor

2. Test "This Time Is Different" - Challenge uniqueness claims

Reversal test for uniqueness bias
Similarity matching framework
Burden of proof calculator

3. Calculate Funnel Base Rates - Multi-stage probability chains

When no single base rate exists
Sequential probability modeling
Product rule for compound events

4. Validate My Reference Class - Ensure you chose the right comparison set

Too broad vs too narrow test
Homogeneity check
Sample size evaluation

5. Learn the Framework - Deep dive into methodology

Read Outside View Principles
Read Reference Class Selection Guide
Read Common Pitfalls

6. Exit - Return to main forecasting workflow

1. Find My Base Rate

Let's establish your statistical baseline.

Step 1: What are you forecasting?

Tell me the specific event or outcome you're predicting.

Example prompts:

"Will this startup succeed?"
"Will this bill pass Congress?"
"Will this project launch on time?"

Step 2: Identify the Reference Class

I'll help you identify what bucket this belongs to.

Framework:

Too broad: "All companies" → meaningless
Just right: "Seed-stage B2B SaaS startups in fintech"
Too narrow: "Companies founded by people named Steve in 2024" → no data

Key Questions:

What type of entity is this? (company, bill, project, person, etc.)
What stage/size/category?
What industry/domain?
What time period is relevant?

I'll work with you to refine this until we have a specific, searchable class.

Step 3: Search for Historical Data

I'll help you find the base rate using:

Web search for published statistics
Academic studies on success rates
Government/industry reports
Proxy metrics if direct data unavailable

Search Strategy:

"historical success rate of [reference class]"
"[reference class] failure statistics"
"[reference class] survival rate"
"what percentage of [reference class]"
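The four search templates above can be filled in mechanically once the reference class is fixed. A minimal sketch, assuming a hypothetical helper named `build_search_queries` (not part of the skill itself):

```python
# Hypothetical helper: expands the four search templates listed above
# for a given reference class. The function name is illustrative only.
def build_search_queries(reference_class: str) -> list[str]:
    templates = [
        "historical success rate of {rc}",
        "{rc} failure statistics",
        "{rc} survival rate",
        "what percentage of {rc}",
    ]
    return [template.format(rc=reference_class) for template in templates]


if __name__ == "__main__":
    for query in build_search_queries("seed-stage B2B SaaS startups in fintech"):
        print(query)
```

Running it prints one query per template, ready to paste into a search engine.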

Step 4: Set Your Anchor

Once we find the base rate, that becomes your starting probability.

The Rule:

Treat this base rate as your starting point. Adjust only when you have specific, evidence-based reasons from your "inside view" analysis.

Default anchors if no data found:

Novel innovation: 10-20% (most innovations fail)
Established industry: 50% (uncertain)
Regulated/proven process: 70-80% (systems work)
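One way to sketch the anchoring rule: use the observed historical frequency when data exists, and fall back to a default anchor otherwise. The category keys, the function name, and the choice of the range midpoint as the fallback value are assumptions made for illustration; the text above only gives the ranges.

```python
from typing import Optional

# Default anchor ranges from the list above, as (low, high) probabilities.
# The dictionary keys are hypothetical labels, not names used by the skill.
DEFAULT_ANCHORS = {
    "novel_innovation": (0.10, 0.20),
    "established_industry": (0.50, 0.50),
    "regulated_process": (0.70, 0.80),
}


def starting_anchor(category: str,
                    successes: Optional[int] = None,
                    total: Optional[int] = None) -> float:
    """Return the starting probability for a forecast."""
    # Prefer real historical data when it is available.
    if successes is not None and total:
        return successes / total
    # Otherwise fall back to the midpoint of the default range (an assumption).
    low, high = DEFAULT_ANCHORS[category]
    return (low + high) / 2


if __name__ == "__main__":
    # Hypothetical data: 12 successes observed among 80 historical cases.
    print(f"{starting_anchor('novel_innovation', successes=12, total=80):.0%}")
    # No data found: use the default anchor for the category.
    print(f"{starting_anchor('regulated_process'):.0%}")
```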

Next: Return to menu or proceed to inside view analysis.

2. Test "This Time Is Different"

Challenge uniqueness bias.

When someone (including yourself) believes "this case is special," we need to stress-test that belief.

The Uniqueness Audit

Question 1: Similarity Matching

What are 5 historical cases that are most similar to this one?
For each, what was the outcome?
How is your case materially different from these?

Question 2: The Reversal Test

If someone claimed a different case was "unique" for the same reasons you're claiming, would you accept it?
Are you applying special pleading?

Question 3: Burden of Proof

The base rate says [X]%. You claim it should be [Y]%.

Calculate the gap: |Y - X|

Required evidence strength:

Gap < 10 percentage points: Minimal evidence needed
Gap 10-30 percentage points: Moderate evidence needed (2-3 specific factors)
Gap > 30 percentage points: Extraordinary evidence needed (multiple independent strong signals)
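The burden-of-proof thresholds can be expressed as a small function. This is a sketch only: the function name and the returned labels are illustrative, and both inputs are assumed to be percentages on a 0-100 scale.

```python
# Maps the gap between a claimed probability and the base rate to the
# evidence tier described above. Inputs are percentages (0-100).
def required_evidence(base_rate_pct: float, claimed_pct: float) -> str:
    gap = abs(claimed_pct - base_rate_pct)  # |Y - X|
    if gap < 10:
        return "minimal"
    if gap <= 30:
        return "moderate"
    return "extraordinary"


if __name__ == "__main__":
    # Base rate 20%, claim 60%: a 40-point gap.
    print(required_evidence(20, 60))
```

With a base rate of 20% and a claim of 60%, the gap exceeds 30, so the sketch reports the "extraordinary" tier.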

Output

I'll tell you:

Whether "this time is different" is justified
How much you can reasonably adjust from the base rate
What evidence would be needed to justify larger moves

Next: Return to menu

3. Calculate Funnel Base Rates

For multi-stage processes without a single base rate.

When to Use

No direct statistic exists (e.g., "success rate of X")
Event requires multiple sequential steps
Each stage has its own probability that can be estimated separately, given that the previous stage was reached

The Funnel Method

Example: "Will Bill X become law?"

No direct data on "Bill X success rate," but we can model the funnel:

Stage 1: Bills introduced → Bills that reach committee
- P(committee | introduced) = ?
Stage 2: Bills in committee → Bills that reach floor vote
- P(floor | committee) = ?
Stage 3: Bills voted on → Bills that pass
- P(pass | floor vote) = ?

Final Base Rate:

P(law) = P(committee | introduced) × P(floor | committee) × P(pass | floor vote)
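As a worked illustration of the product rule, the sketch below multiplies per-stage conditional probabilities. The three stage values are made-up placeholders, not real legislative statistics, and `funnel_base_rate` is a hypothetical name.

```python
from math import prod


def funnel_base_rate(stage_probabilities: list[float]) -> float:
    """Multiply per-stage conditional probabilities into one base rate."""
    for p in stage_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return prod(stage_probabilities)


if __name__ == "__main__":
    # Placeholder values for illustration only; look up real figures per stage.
    stages = {
        "P(committee | introduced)": 0.50,
        "P(floor | committee)": 0.20,
        "P(pass | floor vote)": 0.60,
    }
    rate = funnel_base_rate(list(stages.values()))
    print(f"Funnel base rate: {rate:.1%}")
```

Because every factor is at most 1, the combined rate can never exceed the weakest stage, which is why long funnels tend to produce low base rates.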

Process

I'll help you:

Decompose the event into sequential stages
Search for statistics on each stage
Multiply probabilities using the product rule
Validate the model (are stages truly independent?)

Common Funnels

Startup success: Seed → Series A → Profitability → Exit
Drug approval: Discovery → Trials → FDA → Market
Project delivery: Planning → Development → Testing → Launch

Next: Return to menu

4. Validate My Reference Class

Ensure you chose the right comparison set.

The Three Tests

Test 1: Homogeneity

Are the members of this class actually similar enough?
Is there high variance in outcomes?
Should you subdivide further?

Example: "Tech startups" is too broad (consumer vs B2B vs hardware are very different). Subdivide.

Test 2: Sample Size

Do you have enough historical cases?
Minimum: 20-30 cases for meaningful statistics
If N < 20: Widen the class or acknowledge high uncertainty
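To see why small samples call for caution, it can help to put an interval around the observed frequency. The sketch below uses the Wilson score interval, which is a choice made here for illustration and is not prescribed by the skill; the counts in the example are hypothetical.

```python
from math import sqrt


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Same observed 30% rate, but very different sample sizes.
    for successes, total in [(3, 10), (30, 100)]:
        low, high = wilson_interval(successes, total)
        print(f"N={total}: {low:.0%} to {high:.0%}")
```

The interval for the 10-case class is much wider than for the 100-case class, even though both show the same observed rate.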

Test 3: Relevance

Have conditions changed since the historical data?
Are there structural differences (regulation, technology, market)?
Time decay: Data from >10 years ago may be stale

Validation Checklist

I'll walk you through:

[ ] Class has 20+ historical examples
[ ] Members are reasonably homogeneous
[ ] Data is from relevant time period
[ ] No major structural changes since data collection
[ ] Class is specific enough to be meaningful
[ ] Class is broad enough to have data

Output: Confidence level in your reference class (High/Medium/Low)
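The text does not say how checklist results map to High/Medium/Low, so the scoring rule below is purely an assumption for illustration: all six items passed gives High, four or five gives Medium, and anything less gives Low.

```python
# The six checklist items from above. The mapping from pass count to a
# confidence level is a hypothetical rule, not defined by the skill.
CHECKLIST = [
    "Class has 20+ historical examples",
    "Members are reasonably homogeneous",
    "Data is from relevant time period",
    "No major structural changes since data collection",
    "Class is specific enough to be meaningful",
    "Class is broad enough to have data",
]


def reference_class_confidence(answers: dict[str, bool]) -> str:
    # Items missing from `answers` count as not passed.
    passed = sum(1 for item in CHECKLIST if answers.get(item, False))
    if passed == len(CHECKLIST):
        return "High"
    if passed >= 4:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    answers = {item: True for item in CHECKLIST}
    answers["Class has 20+ historical examples"] = False
    print(reference_class_confidence(answers))
```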

Next: Return to menu

5. Learn the Framework

Deep dive into the methodology.

Resource Files

📄 Outside View Principles

Statistical thinking vs narrative thinking
Why the outside view beats experts
Kahneman's planning fallacy research
When outside view fails

📄 Reference Class Selection Guide

Systematic method for choosing comparison sets
Balancing specificity vs data availability
Similarity metrics and matching
Edge cases and judgment calls

📄 Common Pitfalls

Base rate neglect examples
"This time is different" bias
Overfitting to small samples
Ignoring regression to the mean
Availability bias in class selection

Next: Return to menu

Quick Reference

The Outside View Commandments

Base Rate First: Establish statistical baseline BEFORE analyzing specifics
Assume Average: Treat case as typical until proven otherwise
Burden of Proof: Large deviations from base rate require strong evidence
Class Precision: Reference class should be specific but data-rich
No Narratives: Resist compelling stories; trust frequencies

One-Sentence Summary

Find what usually happens to things like this, start there, and only move with evidence.

Integration with Other Skills

Before: Use estimation-fermi if you need to calculate base rate from components
After: Use bayesian-reasoning-calibration to update from base rate with new evidence
Companion: Use scout-mindset-bias-check to validate you're not cherry-picking the reference class

Resource Files

📁 resources/

outside-view-principles.md - Theory and research
reference-class-selection.md - Systematic selection method
common-pitfalls.md - What to avoid

Ready to start? Choose a number from the menu above.

Anchors predictions in historical reality by identifying a class of similar past events and using their statistical frequency as a baseline (outside view) before analyzing case-specific details. Use when starting a forecast, establishing base rates, testing "this time is different" claims, or when user mentions reference classes, outside view, base rates, or starting a new prediction.

81 stars

testing

Updated Apr 20, 2026

npx skillsauth add lyndonkl/claude reference-class-forecasting

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (Container and dependency vulnerability scanner): Clean, 95%
Semgrep (Static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (Open source security scanning): Skipped, 50%
Socket.dev (Supply chain security analysis): Skipped, 50%
VirusTotal (Multi-engine malware detection): Skipped, 50%
CrowdStrike (Advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 20, 2026, 6:27 AM (5.7s, 4 files scanned)

SKILL.md

name: reference-class-forecasting
description: Anchors predictions in historical reality by identifying a class of similar past events and using their statistical frequency as a baseline (outside view) before analyzing case-specific details. Use when starting a forecast, establishing base rates, testing "this time is different" claims, or when user mentions reference classes, outside view, base rates, or starting a new prediction.

Related Skills

lyndonkl/conf-theme-clustering

testing

Verified Trusted Community

Cluster a conference's event records into a small set of coarse themes with finer sub-clusters, an explicit outlier bucket, and soft (multi-membership) affinities — using the hybrid embed-then-label pipeline (embed abstracts, reduce, density-cluster, then LLM-label the clusters) when embedding libraries are available, and an LLM-reasoned hierarchical fallback when they are not. Embeddings do the grouping; the LLM only names the groups. Conference-agnostic. Use when turning structured event records into a navigable theme map for preference elicitation and scheduling, when you need 6-8 reasonable themes rather than 20 muddy ones, or when overlapping talks must belong to more than one theme. Trigger keywords - theme clustering, cluster talks, embed then label, soft membership, outlier talks, conference themes, topic map.

127 SKILL.md Updated Jun 28, 2026

lyndonkl/conf-schedule-optimization

development

Verified Trusted Community

Build a personal conference schedule as a constraint-optimization problem — hard constraints (no time overlap, room-to-room travel time, capacity/registration, the attendee's own must-attends and blackouts) plus a user-owned weighted objective trading interest against breadth, pacing (maximize contiguous free time), and serendipity. Surfaces unbreakable conflicts (two high-value overlapping talks the model cannot rank) as decisions for the human rather than silently picking, and reports what each choice traded away. Conference-agnostic. Use to turn a preference profile plus a theme map into a day-by-day plan, to resolve overlapping sessions, or to balance a packed vs paced schedule. Trigger keywords - schedule optimization, conference schedule, constraint optimization, overlapping talks, contiguous free time, conflict surfacing, packed vs paced.

127 SKILL.md Updated Jun 28, 2026

lyndonkl/conf-program-extraction

development

Verified Trusted Community

Parse a heterogeneous conference program (markdown, HTML, PDF-derived text, or JSON) into normalized event records with per-field confidence scores and independent classification axes (topic, depth, format, prerequisites, recorded, capacity). Detects the program's format before extracting, treats every inferred field as uncertain (present vs inferred vs missing), and flags thin or missing abstracts so downstream enrichment can target them. Conference-agnostic. Use when ingesting a conference or event schedule into a structured store, normalizing a talk/session list, or extracting per-session metadata with calibrated confidence. Trigger keywords - program ingestion, parse schedule, session extraction, event records, conference program, talk metadata, per-field confidence.

127 SKILL.md Updated Jun 28, 2026

lyndonkl/conf-preference-elicitation

development

Verified Trusted Community

Build a personalized preference profile from a small number of well-chosen, cluster-grounded questions instead of a long survey. Represents the person's interests as an uncertainty region over the theme map, picks the single highest-information-gain choice-based question (contrasting real talks from different clusters), balances exploiting known interests against exploring uncertain ones, deliberately injects outlier probes to fight selection bias, and stops as soon as the schedule would be stable. Also elicits the user-owned objective weights and hard constraints. Interactive — runs where it can actually ask the person. Conference-agnostic. Use to turn a theme map into a preference profile, to decide what to ask a conference attendee, or to elicit scheduling priorities. Trigger keywords - preference elicitation, ask few questions, information gain, choice-based questions, selection bias probe, objective weights, attendee preferences.

127 SKILL.md Updated Jun 28, 2026

Install manually

# Clone the repo
git clone https://github.com/lyndonkl/claude.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy into Claude Code skills folder (global)
cp -r claude/skills/reference-class-forecasting ~/.claude/skills/

Repository

lyndonkl/claude

81 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
