paper2skill/paper2skill-paradigm-challenge/SKILL.md
Convert papers that disprove conventional wisdom into paradigm-challenge skills. Extracts the prior belief, the falsifying experiment, and the revised principle. Use this skill when extracting skills from Category 3 (Paradigm Challenge) papers — papers that say 'rethinking', 'revisiting', or 'do we really need X', where the core move is adversarial (proving the community wrong).
npx skillsauth add ADu2021/skillXiv paper2skill-paradigm-challenge
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
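Apply this skill when you encounter arXiv papers that signal a challenge to conventional wisdom, for example titles containing 'rethinking', 'revisiting', or 'do we really need X'.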
Direction: ADVERSARIAL — the paper is arguing "you're wrong, here's why."
Examples: "Understanding Deep Learning Requires Rethinking Generalization" (showing that deep networks can fit randomly labeled data, so explanations of generalization based on model capacity alone do not hold), "Rethinking Transformers in Solving POMDPs" (arguing that transformers are less capable than commonly assumed in partially observable settings), "The Lottery Ticket Hypothesis" (showing that dense networks contain sparse subnetworks that can be trained in isolation to comparable accuracy, challenging the assumption that the full network is necessary).
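As a first-pass screen, the trigger phrases named in the skill description ('rethinking', 'revisiting', 'do we really need') can be checked against a paper title. The sketch below is only a heuristic under that assumption; the function name and pattern list are hypothetical, and a title match does not replace reading the paper to confirm that its core move is adversarial.

```python
import re

# Hypothetical trigger phrases, taken from the skill description.
TRIGGER_PATTERNS = [
    r"\brethinking\b",
    r"\brevisiting\b",
    r"\bdo we really need\b",
]


def looks_like_paradigm_challenge(title: str) -> bool:
    """Return True if the title contains one of the trigger phrases."""
    lowered = title.lower()
    return any(re.search(pattern, lowered) for pattern in TRIGGER_PATTERNS)


if __name__ == "__main__":
    for title in [
        "Understanding Deep Learning Requires Rethinking Generalization",
        "Rethinking Transformers in Solving POMDPs",
        "The Lottery Ticket Hypothesis",
    ]:
        print(title, "->", looks_like_paradigm_challenge(title))
```

Note that "The Lottery Ticket Hypothesis" does not match any trigger phrase even though it is listed as an example, so a title screen like this can only shortlist candidates, not rule papers out.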
Do not use this skill for:
Extract what the community widely believed before this paper.
**Prior Belief:** What did the community assume to be true?
**Who Believes It:** Research groups, practitioners, textbooks, or folk wisdom?
**Why It Seemed True:** What evidence or intuition supported this belief?
**Implications:** What decisions did people make based on this belief?
Identify the controlled experiment that disproves the belief.
**Core Experiment:** What is the minimal experiment that proves the belief false?
**Experimental Design:**
- Control variables: What is held constant?
- Test variable: What is changed to test the belief?
- Measurement: How is the outcome measured?
- Sample size/conditions: How robust is the result?
**Why It's Convincing:** Why is this experiment hard to argue with?
**Edge Cases Tested:** Did the authors test boundary conditions?
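The fields above can be captured in a small record so that unanswered questions are easy to spot. This is a minimal sketch, not part of the skill itself: the class name, attribute names, and `missing_fields` helper are hypothetical, and the example values are a loose paraphrase of the random-label experiment from "Understanding Deep Learning Requires Rethinking Generalization", filled in only partially on purpose.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class FalsifyingExperiment:
    """Hypothetical container mirroring the extraction fields listed above."""

    core_experiment: str = ""
    control_variables: List[str] = field(default_factory=list)
    test_variable: str = ""
    measurement: str = ""
    conditions: str = ""
    why_convincing: str = ""
    edge_cases: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Names of fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Partially filled example (illustrative paraphrase, not a quotation).
experiment = FalsifyingExperiment(
    core_experiment="Train a standard network on data whose labels are replaced by random ones",
    test_variable="label assignment (true vs. random)",
)

print(experiment.missing_fields())
```

Running the check before writing the output skill gives a simple list of what still has to be extracted from the paper.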
Document the correct understanding that replaces the prior belief.
**Revised Principle:** What is now known to be true instead?
**Scope:** When does the revised principle apply? When doesn't it?
**Why It Matters:** How does this change practice?
**Remaining Questions:** What doesn't the revised principle explain?
Extract how the revised principle changes what practitioners should do.
**For Research:** What research directions are now invalid?
**For Engineering:** How should deployment strategies change?
**For Intuition:** What mental models need updating?
**False Hope:** What does the revised principle NOT enable?
Synthesize the change in understanding.
**Old Model:** [Prior belief and its predictions]
**New Model:** [Revised principle and its predictions]
**What Changed:** [The specific contradiction]
**Validation:** [How to verify the new model in your own work]
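The four-line summary above can be rendered from extracted text with a small helper. The sketch below assumes the bold-label layout shown in the template; the function name and its refusal to accept empty entries are illustrative choices, not requirements stated by the skill.

```python
def render_shift_summary(old_model: str, new_model: str,
                         what_changed: str, validation: str) -> str:
    """Render the paradigm-shift summary in the bold-label layout above."""
    rows = [
        ("Old Model", old_model),
        ("New Model", new_model),
        ("What Changed", what_changed),
        ("Validation", validation),
    ]
    for label, text in rows:
        # Assumption: an empty entry means the extraction is incomplete.
        if not text.strip():
            raise ValueError(f"'{label}' must not be empty")
    return "\n".join(f"**{label}:** {text.strip()}" for label, text in rows)


if __name__ == "__main__":
    print(render_shift_summary(
        old_model="[Prior belief and its predictions]",
        new_model="[Revised principle and its predictions]",
        what_changed="[The specific contradiction]",
        validation="[How to verify the new model in your own work]",
    ))
```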
Generate a new SKILL.md with:
Frontmatter:
---
name: [kebab-case-paradigm-name]
title: [Paradigm Challenge: Rethinking {Belief}]
version: 0.0.2
engine: skillxiv-v0.0.2-claude-opus-4.6
license: MIT
url: [verified arxiv link to source paper]
keywords: [paradigm-challenge, falsified-belief, revised-principle, {domain}, {old-assumption}]
description: Overturn the assumption that {prior belief} by understanding {revised principle}. Includes the falsifying experiment from {paper source}, implications for {practice area}, enabling practitioners to {outcome} when {prior approach} fails.
---
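One way to fill this template programmatically is sketched below using only the standard library. The function names and parameters are hypothetical. The fixed values (`version`, `engine`, `license`, and the first three keywords) are copied from the template above. The title and description are written as double-quoted strings because a plain YAML scalar cannot contain a colon followed by a space, which the title pattern always does. The URL is passed through unchanged and still has to be verified by hand.

```python
import json
import re


def to_kebab_case(text: str) -> str:
    """Lowercase the text and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def build_frontmatter(paradigm_name: str, belief: str, url: str,
                      domain: str, old_assumption: str, description: str) -> str:
    """Assemble the frontmatter block as text (a sketch, not a YAML library)."""
    keywords = [
        "paradigm-challenge",
        "falsified-belief",
        "revised-principle",
        to_kebab_case(domain),
        to_kebab_case(old_assumption),
    ]
    lines = [
        "---",
        f"name: {to_kebab_case(paradigm_name)}",
        # json.dumps yields a double-quoted string, which YAML also accepts.
        f"title: {json.dumps('Paradigm Challenge: Rethinking ' + belief)}",
        "version: 0.0.2",
        "engine: skillxiv-v0.0.2-claude-opus-4.6",
        "license: MIT",
        f"url: {url}",
        f"keywords: [{', '.join(keywords)}]",
        f"description: {json.dumps(description)}",
        "---",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_frontmatter(
        paradigm_name="Rethinking Generalization",
        belief="Generalization",
        url="<verified arxiv link>",  # placeholder, must be filled in manually
        domain="deep learning",
        old_assumption="capacity explains generalization",
        description="Overturn the assumption that {prior belief} by understanding {revised principle}.",
    ))
```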
Content structure:
Length: 150-250 lines
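The length target can be checked mechanically before a generated skill is saved. The sketch below assumes that every line counts toward the total, including blank lines and the frontmatter; the source does not say how lines are counted, so treat that as an assumption. The function name is hypothetical.

```python
from typing import Tuple


def check_length(skill_text: str, min_lines: int = 150,
                 max_lines: int = 250) -> Tuple[bool, int]:
    """Return (within_range, line_count) for a generated SKILL.md."""
    # Assumption: blank lines and frontmatter lines are counted too.
    line_count = len(skill_text.splitlines())
    return (min_lines <= line_count <= max_lines, line_count)


if __name__ == "__main__":
    draft = "placeholder line\n" * 200
    print(check_length(draft))
```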
Paradigm Challenge (use this skill):
Exploratory (do NOT use this skill):