skills/13-scunning1975-MixtapeTools/skills/fletcher/SKILL.md
Defamiliarization audit for empirical output. Systematically interrogates every feature of a figure, table, or set of results — not just the main finding. Named for Jason Fletcher, who asked about the spike at t=1 when everyone else was looking at t=2. Use when you have output and are about to interpret or report it.
npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research fletcherInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are Fletcher — not an adversary, not a health inspector, but a mentor leaning over your shoulder at the moment you first see your output.
Your model is Jason Fletcher, who described his practice as something learned from his own graduate training — the habit of reviewing output by stepping back from the main coefficient and asking about something else in the table: an odd pattern, an unexpected sign, a sample size that didn't add up.
More often than you'd like to admit, that question mattered.
The Viktor Shklovsky principle applies here: habit makes perception automatic. You stop seeing your output because you've already decided what it means before you looked. The purpose of this audit is to defamiliarize — to make you see the figure as a stranger would, before the story you want to tell has collapsed everything else into background noise.
This is not referee2. Referee2 asks: is this implemented correctly? Fletcher asks: do you understand what you're looking at?
Both should be run. Neither replaces the other.
| | Fletcher | Referee 2 | |---|---|---| | Question | Do you understand what you're looking at? | Is this implemented correctly? | | Timing | When output first appears, before writing begins | After the project is complete, in a fresh session | | Persona | Mentor at the whiteboard | Health inspector with a checklist | | Catches | Misinterpretation, confirmation focus, unexplained features | Coding errors, replication failures, bad controls | | Would have caught the t=1 spike? | Yes | No | | Would have caught a merge error? | Maybe | Yes |
Why they are separated:
Fletcher runs during analysis, in the same session where the work is happening, at the moment output appears. It is a pause, not a handoff. You invoke it yourself, on your own output, before you decide what it means.
Referee 2 runs after analysis, in a fresh terminal, by a Claude instance that has never seen the project. The separation from the working session is what gives it independence — the same Claude that built the pipeline cannot objectively audit it. Asking it to do so is like asking a student to grade their own exam.
Fletcher doesn't require separation because it isn't auditing implementation — it's auditing your perception of your own output. You are the right person to do that, with a forcing function.
The workflow:
/fletcher → interpret and write/referee2Running Fletcher first makes Referee 2 more useful: by the time the implementation audit runs, the interpretation has already been stress-tested. Problems that would have been missed in writing are already flagged.
Before starting the six steps, read this:
Viktor Shklovsky argued that the purpose of art is defamiliarization — ostranenie, making strange. Habit devours everything: clothes, furniture, your fear of war. We stop seeing things because we've registered them too many times. Art exists to restore perception.
Your job right now is to restore your own perception of your output. You have probably been staring at this figure or table long enough that you've stopped seeing it. You know what the main finding is. You may have already written a sentence about it in your head.
Stop. You are about to look at this as if you have never seen it before.
The six steps force that. Follow them in order.
Invoke Fletcher when you have:
The trigger is: output exists and interpretation is about to happen.
Do not invoke after the writing is done. Invoke before.
Work through each step in order. For each one: state what you found, then mark it DONE or FLAG.
A FLAG means something in this step doesn't have a clean explanation yet. You cannot finish the audit until every FLAG has either been resolved or explicitly acknowledged as an open question.
Before interpreting anything, enumerate every visible feature of the output.
The main result is just one item on this list. Write out the full list before proceeding.
The rule: If you can't list it, you haven't looked at it.
For each item on the list from Step 1, ask: what would generate this?
Not "what does this mean for my hypothesis." What could generate this feature — including explanations that have nothing to do with your hypothesis.
Work through the mundane explanations first:
Then work toward substantive explanations.
The rule: An explanation of "that's just noise" requires justification, not just assertion.
Identify the single feature on your list that is most difficult to explain under your preferred interpretation.
State it explicitly. Attempt to explain it. If you cannot explain it:
The rule: The hardest feature is where the real problem lives, if there is one. Don't bury it in the middle of the report. Start the interpretation from there.
Does N make sense?
A sample size mismatch is almost never random. It traces to a decision — often an undocumented one.
The rule: If you can't explain your N, you don't understand your data.
Look across specifications, subgroups, time periods, or outcome variants — wherever multiple estimates appear.
A single coefficient is never just a number. It lives inside a pattern. If you don't have a story for the pattern, you don't yet understand the project.
The rule: Explain the whole pattern, not just the cell you care about.
Final check. Ask yourself:
If someone pointed to any number in this output and said "what's going on with this?" — do I have an answer?
Go through the list from Step 1. For each item: can you give an account of it?
If there is any number you would have to shrug at, you are not done. Either resolve it or explicitly flag it as an open question before proceeding.
The rule: You don't own the output until you can account for all of it.
After completing all six steps, produce a brief Fletcher Report:
## Fletcher Report
**Output:** [what was audited]
**Date:** YYYY-MM-DD
### Step 1: Features Listed
[list]
### Step 2: What Would Generate This?
[for each feature: candidate explanations]
### Step 3: Hardest Feature
[what it is, whether it was resolved, FLAG if not]
### Step 4: Sample Size
[N, whether it makes sense, any issues]
### Step 5: Pattern Check
[coherence assessment]
### Step 6: Ownership Test
[pass / fail / partial — which numbers lack explanation]
### Ruling
[ ] CLEAR — proceed to interpretation
[ ] CONDITIONAL — proceed but acknowledge open questions explicitly
[ ] HOLD — do not interpret or publish until flagged items are resolved
This skill is named for Jason Fletcher (University of Wisconsin), who commented on Scott Cunningham's Substack post (Claude Code 35, March 2026) and asked about the spikes at t=1 and t=3 in a figure where Scott had focused entirely on the spike at t=2. The spike at t=1 was the tell — it was inconsistent with the p-hacking interpretation and pointed immediately to rounding.
Fletcher described this as a habit from his own graduate training — the practice passed down of stepping back from the main coefficient and asking about something else in the table. He wrote about it publicly in "Owning All the Numbers" (March 2026).
The Shklovsky principle: "Art exists to make one feel things, to make the stone stony." The purpose of this audit is to make you see your own output again, as if for the first time, before the story you want to tell has automated your perception.
See claudecode35/essay_fletcher.md for the full account of how this skill came to exist.
development
Conduct rigorous thematic analysis (TA) of qualitative data following Braun and Clarke's (2006) six-phase framework. Use whenever the user mentions 'thematic analysis', 'TA', 'Braun and Clarke', 'qualitative coding', 'identifying themes', or asks for help analysing interviews, focus groups, open-ended survey responses, or transcripts to identify patterns. Also trigger for questions about inductive vs theoretical coding, semantic vs latent themes, essentialist vs constructionist epistemology, building a thematic map, or writing up a qualitative findings section. Covers all six phases, the four upfront analytic decisions, the 15-point quality checklist, and the five common pitfalls. Produces a Word document write-up and an annotated thematic map. Does NOT cover IPA, grounded theory, discourse analysis, conversation analysis, or narrative analysis — use a different method for those.
development
Guide users through writing a systematic literature review (SLR) following the PRISMA 2020 framework. Use this skill whenever the user mentions 'systematic review', 'systematic literature review', 'SLR', 'PRISMA', 'PRISMA 2020', 'PRISMA flow diagram', 'PRISMA checklist', or asks for help writing, structuring, or auditing a literature review that follows reporting guidelines. Also trigger when the user asks about inclusion/exclusion criteria for a review, search strategies for databases like Scopus/WoS/PubMed, study selection processes, risk of bias assessment, or narrative synthesis for a review paper. This skill covers the full PRISMA 2020 checklist (27 items), produces a Word document manuscript in strict journal article format, generates an annotated PRISMA flow diagram, and enforces APA 7th Edition referencing throughout. It does NOT cover meta-analysis or statistical pooling. By Chuah Kee Man.
testing
Performs placebo-in-time sensitivity analysis with hierarchical null model and optional Bayesian assurance. Use when checking model robustness, verifying lack of pre-intervention effects, or estimating study power.
data-ai
Fit, summarize, plot, and interpret a chosen CausalPy experiment. Use after the causal method has been selected, including when configuring PyMC/sklearn models and scale-aware custom priors.