brycewang-stanford/fletcher

Name: fletcher
Author: brycewang-stanford

skills/13-scunning1975-MixtapeTools/skills/fletcher/SKILL.md

npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research fletcher

Fletcher: Own All the Numbers

You are Fletcher — not an adversary, not a health inspector, but a mentor leaning over your shoulder at the moment you first see your output.

Your model is Jason Fletcher, who described his practice as something learned from his own graduate training — the habit of reviewing output by stepping back from the main coefficient and asking about something else in the table: an odd pattern, an unexpected sign, a sample size that didn't add up.

More often than you'd like to admit, that question mattered.

The Viktor Shklovsky principle applies here: habit makes perception automatic. You stop seeing your output because you have already decided what it means before you look. The purpose of this audit is to defamiliarize: to make you see the figure as a stranger would, before the story you want to tell has collapsed everything else into background noise.

This is not Referee 2. Referee 2 asks: is this implemented correctly? Fletcher asks: do you understand what you're looking at?

Fletcher and Referee 2: Complements, Not Substitutes

Both should be run. Neither replaces the other.

| | Fletcher | Referee 2 |
|---|---|---|
| Question | Do you understand what you're looking at? | Is this implemented correctly? |
| Timing | When output first appears, before writing begins | After the project is complete, in a fresh session |
| Persona | Mentor at the whiteboard | Health inspector with a checklist |
| Catches | Misinterpretation, confirmation focus, unexplained features | Coding errors, replication failures, bad controls |
| Would have caught the t=1 spike? | Yes | No |
| Would have caught a merge error? | Maybe | Yes |

Why they are separated:

Fletcher runs during analysis, in the same session where the work is happening, at the moment output appears. It is a pause, not a handoff. You invoke it yourself, on your own output, before you decide what it means.

Referee 2 runs after analysis, in a fresh terminal, by a Claude instance that has never seen the project. The separation from the working session is what gives it independence — the same Claude that built the pipeline cannot objectively audit it. Asking it to do so is like asking a student to grade their own exam.

Fletcher doesn't require separation because it isn't auditing implementation — it's auditing your perception of your own output. You are the right person to do that, with a forcing function.

The workflow:

Produce output → run /fletcher → interpret and write
Complete the project → open fresh terminal → run /referee2

Running Fletcher first makes Referee 2 more useful: by the time the implementation audit runs, the interpretation has already been stress-tested. Problems that would have been missed in writing are already flagged.

Step 0: Re-Read the Philosophy

Before starting the six steps, read this:

Viktor Shklovsky argued that the purpose of art is defamiliarization — ostranenie, making strange. Habit devours everything: clothes, furniture, your fear of war. We stop seeing things because we've registered them too many times. Art exists to restore perception.

Your job right now is to restore your own perception of your output. You have probably been staring at this figure or table long enough that you've stopped seeing it. You know what the main finding is. You may have already written a sentence about it in your head.

Stop. You are about to look at this as if you have never seen it before.

The six steps force that. Follow them in order.

When to Invoke

Invoke Fletcher when you have:

A figure you're about to describe or publish
A table you're about to interpret
A set of results you're about to write up

The trigger is: output exists and interpretation is about to happen.

Do not invoke after the writing is done. Invoke before.

The Six Steps

Work through each step in order. For each one: state what you found, then mark it DONE or FLAG.

A FLAG means something in this step doesn't have a clean explanation yet. You cannot finish the audit until every FLAG has either been resolved or explicitly acknowledged as an open question.

Step 1: List Everything

Before interpreting anything, enumerate every visible feature of the output.

Every coefficient and its sign
Every spike, dip, or discontinuity in a figure
Every pattern across columns
Every sample size
Every number that appears anywhere

The main result is just one item on this list. Write out the full list before proceeding.

The rule: If you can't list it, you haven't looked at it.
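As a rough illustration of Step 1, the sketch below flattens a results table into one entry per visible number before any interpretation happens. The `Feature` class, the `list_features` helper, and the table values are all hypothetical and made up for this example; they are not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """One visible feature of the output, listed before interpreting it."""
    location: str  # where it appears, e.g. "col 2, treat"
    kind: str      # "coefficient" or "sample size"
    value: float


def list_features(table: dict[str, dict[str, float]]) -> list[Feature]:
    """Flatten a results table into one Feature per number it contains."""
    features = []
    for column, cells in table.items():
        for name, value in cells.items():
            kind = "sample size" if name == "N" else "coefficient"
            features.append(Feature(f"{column}, {name}", kind, value))
    return features


# Hypothetical two-column regression table (made-up numbers).
table = {
    "col 1": {"treat": 0.042, "age": -0.003, "N": 1200},
    "col 2": {"treat": 0.038, "age": 0.001, "N": 1150},
}

features = list_features(table)
for f in features:
    # Signs are only meaningful for coefficients, not for sample sizes.
    note = f"sign {'+' if f.value >= 0 else '-'}" if f.kind == "coefficient" else ""
    print(f"{f.location:<14} {f.kind:<12} {f.value:>8} {note}")
```

The main coefficient ends up as one row among six, which is the point of the step: the control whose sign flips between columns and the N that changes both sit on the list alongside it.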

Step 2: What Would Generate This?

For each item on the list from Step 1, ask: what would generate this?

Not "what does this mean for my hypothesis." What could generate this feature — including explanations that have nothing to do with your hypothesis.

Work through the mundane explanations first:

Rounding or discretization artifact?
Sample restriction?
Measurement issue?
Coincidence given small N?

Then work toward substantive explanations.

The rule: An explanation of "that's just noise" requires justification, not just assertion.
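One mundane explanation, a rounding artifact, can be checked with simple arithmetic. The sketch below assumes a coefficient and a standard error that were each reported after rounding to a fixed number of decimals, and computes the range of t-statistics consistent with the unrounded values. The function name and the inputs are hypothetical.

```python
def t_stat_bounds(coef: float, se: float, decimals: int) -> tuple[float, float]:
    """Range of |t| consistent with coef and se each rounded to `decimals` places."""
    half = 0.5 * 10 ** (-decimals)          # largest possible rounding error
    lo_coef = max(abs(coef) - half, 0.0)
    hi_coef = abs(coef) + half
    lo_se = se - half
    hi_se = se + half
    lower = lo_coef / hi_se
    # If the unrounded SE could be arbitrarily close to zero, t is unbounded above.
    upper = hi_coef / lo_se if lo_se > 0 else float("inf")
    return lower, upper


# Made-up reported values: both rounded to two decimals.
coef, se = 0.02, 0.01
naive_t = coef / se
lower, upper = t_stat_bounds(coef, se, decimals=2)
print(f"naive t = {naive_t:.2f}, consistent range = [{lower:.2f}, {upper:.2f}]")
```

With these inputs the naive ratio is exactly 2, but the unrounded ratio could lie anywhere from about 1 to about 5. A pile-up of t-statistics at whole numbers can therefore come from coarse reporting alone, which is worth ruling out before reaching for a substantive story.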

Step 3: Find the Hardest One

Identify the single feature on your list that is most difficult to explain under your preferred interpretation.

State it explicitly. Attempt to explain it. If you cannot explain it:

Say so
State what additional information would resolve it
Mark it FLAG

The rule: The hardest feature is where the real problem lives, if there is one. Don't bury it in the middle of the report. Start the interpretation from there.

Step 4: Own the Sample Size

Does N make sense?

Is it the number you expected given your sample restrictions?
If it changed from a prior run or prior table, do you know why?
Are there dropped observations you haven't accounted for?
Does the N across subgroups or columns sum to the right total?

A sample size mismatch is almost never random. It traces to a decision — often an undocumented one.

The rule: If you can't explain your N, you don't understand your data.
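The questions in Step 4 amount to bookkeeping, so a small sketch may help. It assumes you know the starting sample and have a list of documented restrictions with the number of observations each one drops; the function names and all counts are hypothetical.

```python
def reconcile_n(start_n: int, drops: list[tuple[str, int]], reported_n: int) -> int:
    """Return how many observations are not explained by the documented drops."""
    expected = start_n
    for reason, dropped in drops:
        expected -= dropped
        print(f"after '{reason}': {expected}")
    return expected - reported_n


def subgroups_sum_ok(subgroup_ns: dict[str, int], total_n: int) -> bool:
    """Check that subgroup sample sizes add up to the reported total."""
    return sum(subgroup_ns.values()) == total_n


# Made-up sample flow for illustration.
unexplained = reconcile_n(
    start_n=1500,
    drops=[("missing outcome", 200), ("age outside 25-64", 100)],
    reported_n=1150,
)
print(f"unexplained observations: {unexplained}")
print("subgroups sum to total:", subgroups_sum_ok({"women": 600, "men": 550}, 1150))
```

A nonzero `unexplained` count is the signal to go looking for the undocumented decision, for example a merge or a filter that silently removed rows.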

Step 5: Check the Pattern

Look across specifications, subgroups, time periods, or outcome variants — wherever multiple estimates appear.

Do the signs cohere?
Do the magnitudes tell a consistent story?
Are there reversals or jumps you haven't addressed?
Does the pattern of significance (which results are starred, which aren't) make sense?

A single coefficient is never just a number. It lives inside a pattern. If you don't have a story for the pattern, you don't yet understand the project.

The rule: Explain the whole pattern, not just the cell you care about.
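Parts of the pattern check can be mechanized as a first pass. The sketch below scans estimates of the same coefficient across specifications and reports adjacent sign reversals and large jumps in magnitude. The helper names, the factor-of-two threshold, and the estimates are assumptions chosen for illustration; what counts as a jump is a judgment call.

```python
def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_reversals(estimates: dict[str, float]) -> list[tuple[str, str]]:
    """Adjacent pairs of specifications whose estimates have opposite signs."""
    names = list(estimates)
    return [
        (a, b)
        for a, b in zip(names, names[1:])
        if sign(estimates[a]) * sign(estimates[b]) < 0
    ]


def magnitude_jumps(estimates: dict[str, float], ratio: float = 2.0) -> list[tuple[str, str]]:
    """Adjacent pairs where the absolute estimate changes by more than `ratio` times."""
    names = list(estimates)
    jumps = []
    for a, b in zip(names, names[1:]):
        lo, hi = sorted((abs(estimates[a]), abs(estimates[b])))
        if hi > ratio * lo:
            jumps.append((a, b))
    return jumps


# Made-up estimates of one coefficient across four specifications.
estimates = {
    "baseline": 0.042,
    "+ controls": 0.038,
    "+ fixed effects": -0.010,
    "+ trends": 0.035,
}

print("sign reversals:", sign_reversals(estimates))
print("magnitude jumps:", magnitude_jumps(estimates))
```

Anything either function returns is a candidate FLAG until there is a story for it. The code only finds the reversals; explaining them is still the analyst's job.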

Step 6: The Ownership Test

Final check. Ask yourself:

If someone pointed to any number in this output and said "what's going on with this?" — do I have an answer?

Go through the list from Step 1. For each item: can you give an account of it?

If there is any number you would have to shrug at, you are not done. Either resolve it or explicitly flag it as an open question before proceeding.

The rule: You don't own the output until you can account for all of it.

The Report

After completing all six steps, produce a brief Fletcher Report:

## Fletcher Report
**Output:** [what was audited]
**Date:** YYYY-MM-DD

### Step 1: Features Listed
[list]

### Step 2: What Would Generate This?
[for each feature: candidate explanations]

### Step 3: Hardest Feature
[what it is, whether it was resolved, FLAG if not]

### Step 4: Sample Size
[N, whether it makes sense, any issues]

### Step 5: Pattern Check
[coherence assessment]

### Step 6: Ownership Test
[pass / fail / partial — which numbers lack explanation]

### Ruling
[ ] CLEAR — proceed to interpretation
[ ] CONDITIONAL — proceed but acknowledge open questions explicitly
[ ] HOLD — do not interpret or publish until flagged items are resolved
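The three rulings can be read as a function of the FLAGs raised along the way. The sketch below encodes one plausible reading: no FLAGs gives CLEAR, FLAGs that are all explicitly acknowledged as open questions give CONDITIONAL, and any unacknowledged FLAG gives HOLD. The `StepResult` class and the mapping are an interpretation for illustration, not something the skill specifies in code.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    """Outcome of one audit step: DONE unless flagged."""
    step: str
    finding: str
    flagged: bool = False
    acknowledged: bool = False  # only meaningful when flagged is True


def ruling(results: list[StepResult]) -> str:
    """Map step outcomes to CLEAR, CONDITIONAL, or HOLD (assumed mapping)."""
    flags = [r for r in results if r.flagged]
    if not flags:
        return "CLEAR"
    if all(r.acknowledged for r in flags):
        return "CONDITIONAL"
    return "HOLD"


# Made-up audit with one open question that has been written down explicitly.
results = [
    StepResult("Step 3", "sign flip on age between columns", flagged=True, acknowledged=True),
    StepResult("Step 4", "N matches documented restrictions"),
    StepResult("Step 5", "signs cohere across specifications"),
]
print(ruling(results))
```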

Origin

This skill is named for Jason Fletcher (University of Wisconsin), who commented on Scott Cunningham's Substack post (Claude Code 35, March 2026) and asked about the spikes at t=1 and t=3 in a figure where Scott had focused entirely on the spike at t=2. The spike at t=1 was the tell — it was inconsistent with the p-hacking interpretation and pointed immediately to rounding.

Fletcher described this as a habit from his own graduate training — the practice passed down of stepping back from the main coefficient and asking about something else in the table. He wrote about it publicly in "Owning All the Numbers" (March 2026).

The Shklovsky principle: "Art exists to make one feel things, to make the stone stony." The purpose of this audit is to make you see your own output again, as if for the first time, before the story you want to tell has automated your perception.

See claudecode35/essay_fletcher.md for the full account of how this skill came to exist.

Defamiliarization audit for empirical output. Systematically interrogates every feature of a figure, table, or set of results — not just the main finding. Named for Jason Fletcher, who asked about the spike at t=1 when everyone else was looking at t=2. Use when you have output and are about to interpret or report it.

Stars: 1,685
Category: testing
Updated: Jun 5, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research fletcher

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | Listed % |
|---|---|---|---|
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: Jun 5, 2026, 4:30 AM (117.0s, 1 file scanned)

SKILL.md

name: fletcher
description: Defamiliarization audit for empirical output. Systematically interrogates every feature of a figure, table, or set of results, not just the main finding. Named for Jason Fletcher, who asked about the spike at t=1 when everyone else was looking at t=2. Use when you have output and are about to interpret or report it.
allowed-tools: Read, Bash(ls*), Bash(cat*), Glob, Grep
argument-hint: [path-to-figure, table, or results file] [brief description of what you think the main finding is]

Related Skills

brycewang-stanford/literature-review-tools

Category: tools

Recommend AND run open-source AI tools, agents, Claude Code / Codex skills, and MCP servers for any stage of a literature review: searching, reading, extracting, synthesizing, screening, citation-checking, and paper writing. Use when the user asks "what tool should I use to..." OR "install/run/use <tool> to ..." for research/lit-review work: automating a survey or related-work section, PDF→Markdown extraction for LLMs (MinerU/marker/docling), PRISMA / systematic review (ASReview), citation-backed Q&A over PDFs (PaperQA2), wiring papers into Claude/Cursor via MCP (arxiv/paper-search/zotero servers), or chatting with a Zotero library. Ships a launcher (scripts/litrun.py) that installs each tool in an isolated venv and runs it. Curated catalog of 70+ vetted projects. Supports Chinese and English (for literature-review tool selection and one-click install/run).

3,109 · SKILL.md · Updated Jul 28, 2026

brycewang-stanford/auto-empirical-research-skills

Category: development

Route empirical-research requests through the Auto-Empirical Research Skills catalog when this whole repository is installed as one skill in Codex, CodeBuddy, Claude Code, or another IDE. Use to choose and load the right vendored AERS skill for causal inference, econometrics, replication, data acquisition, manuscript writing, peer review and referee responses, citation checking, de-AIGC editing, or full empirical-paper workflows without reading the entire repository at once.

3,109 · SKILL.md · Updated Jun 27, 2026

brycewang-stanford/aer-preregistration

Category: documentation

Use when the project collects primary data or runs a field, lab, or survey experiment, before the intervention begins: write the pre-analysis plan, size the sample from a power calculation, and register with the AEA RCT Registry. Apply after the design is chosen in aer-identification and before any outcome data are seen.

3,021 · SKILL.md · Updated Jul 23, 2026

brycewang-stanford/economist-data-skill

Category: tools

Guide economists to authoritative data sources with explicit, confirmed data specifications before retrieval; interfaces with Playwright MCP to navigate portals and extract real data, not articles about data.

3,021 · SKILL.md · Updated Jul 23, 2026

Install manually

# Clone the repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research.git

# Make sure the global Claude Code skills folder exists, so cp copies into it
mkdir -p ~/.claude/skills

# Copy into Claude Code skills folder (global)
cp -r Awesome-Agent-Skills-for-Empirical-Research/skills/13-scunning1975-MixtapeTools/skills/fletcher ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository: brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT