paper2skill/paper2skill-field-foundation/SKILL.md
Convert foundational papers that create new subfields into conceptual framework skills. Extracts problem definitions, vocabulary, founding experiments, and opened research directions. Use this skill when extracting skills from Category 8 (Field Foundation) papers — MAML-style paradigm-creating papers or 'Deep Learning' review-style papers that define entire research communities.
This skill specializes in converting foundational papers — those that establish new fields, paradigms, or problem definitions — into structured agent skills that convey conceptual frameworks, vocabulary, and the empirical foundations of a research direction.
Foundational papers are the hardest category to extract skills from because they don't have a single "algorithm" to implement. Instead, they define a landscape: new vocabulary, new problem formulations, new ways of thinking about old problems, and opening moves in what becomes a field.
A field foundation paper does one or more of these:
Creates a new problem class. E.g., MAML framed gradient-based "meta-learning" as a distinct, model-agnostic problem; before MAML, people mostly approached transfer learning and few-shot learning ad hoc.
Introduces foundational vocabulary and concepts. The LeCun/Bengio/Hinton "Deep Learning" review established the vocabulary and conceptual framework the entire field uses.
Proposes a new paradigm over existing approaches. E.g., "Attention Is All You Need" (Transformers) didn't just introduce an architecture — it fundamentally changed how we think about sequence modeling.
Runs founding experiments that validate a new direction. E.g., the GPT papers showed that scaling language-model pretraining was a viable path forward, which enabled the current era of large language models.
Opens multiple future research directions. A field foundation paper is one where you can point to dozens of follow-on papers and trace their lineage back.
A distinction: Infrastructure papers (PyTorch, HELM) introduce tools. Field foundation papers introduce ideas and problem definitions. Both can create new research areas, but in different ways.
Type 8A: Algorithmic Founding Papers
Examples: MAML, Transformers, Vision Transformers
These have a clear algorithm at their core, BUT they also define a problem class or paradigm.
Extraction: Do both — extract the algorithm (like you would for a technique paper), BUT also extract the paradigm they establish. What new problem do they define that didn't exist before?
Type 8B: Conceptual/Survey Founding Papers
Examples: "Deep Learning" review (LeCun et al.), "A Primer on Neural Network Architectures"
These have no single algorithm. Instead they establish vocabulary, categorize problems, and set the research agenda.
Extraction: This is purely conceptual. Extract vocabulary, problem taxonomy, opening directions, and founding experiments.
Type 8C: Paradigm-Shift Papers
Examples: "Attention Is All You Need" (shifted from RNNs to Transformers), "Scaling Laws for Neural Language Models"
These reframe how we think about a problem.
Extraction: What old assumptions did this paper invalidate? What new assumptions did it establish? What experiments prove the new paradigm works?
| Does this paper... | Points | Notes |
|---|---|---|
| Introduce new terminology that becomes field-standard? | 1 | Check citations and follow-on papers |
| Define a new problem class (not previously studied this way)? | 1 | Is there a clear "before and after" shift? |
| Propose a new paradigm or alternative to existing approaches? | 1 | Does it reframe the problem? |
| Run founding experiments validating the direction? | 1 | Are these canonical/replicated often? |
| Open multiple subsequent research directions? | 1 | Do you see 10+ papers building on this idea? |
Threshold: Need 3+ points to extract as a field foundation paper.
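The rubric above is a simple tally, so it can be sketched in a few lines of Python. This is a minimal illustration only: the class name, field names, and the example answers for MAML are assumptions made here, not part of the skill.

```python
from dataclasses import dataclass, fields


@dataclass
class FoundationChecklist:
    """One boolean per rubric row. Field names are illustrative assumptions."""
    introduces_standard_terminology: bool = False
    defines_new_problem_class: bool = False
    proposes_new_paradigm: bool = False
    runs_founding_experiments: bool = False
    opens_research_directions: bool = False

    def score(self) -> int:
        # Each satisfied criterion is worth exactly one point.
        return sum(bool(getattr(self, f.name)) for f in fields(self))

    def qualifies(self, threshold: int = 3) -> bool:
        # The rubric asks for 3 or more points out of 5.
        return self.score() >= threshold


# Hypothetical triage answers for a paper such as MAML.
maml = FoundationChecklist(
    introduces_standard_terminology=True,
    defines_new_problem_class=True,
    proposes_new_paradigm=True,
    runs_founding_experiments=True,
)
print(maml.score(), maml.qualifies())
```

A paper scoring below the threshold would be routed to another category rather than extracted as a field foundation paper.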
Read: intro, abstract, key figures
Read: method section (if there is one), early experiments
Read: related work (reframed through new vocabulary), discussion
Read: results, ablations, qualitative analysis
Read: conclusion, future work
PAPER: [title]
ARXIV: [verified ID]
URL: [full URL]
PROBLEM STATEMENT:
What is the problem this paper addresses?
Why is this an important problem?
What existing approaches are inadequate? (and why)
NEW PARADIGM OR PROBLEM CLASS:
What new way of thinking about this problem does it introduce?
What terminology becomes field-standard?
How does this reframe the problem?
CORE ALGORITHM/APPROACH:
High-level pseudocode or conceptual description (2-3 sentences)
Key innovation (what's novel vs prior work)
FOUNDING EXPERIMENTS:
What are the canonical experiments that validate this direction?
Key results (qualitative + quantitative)
Ablations: what matters, what doesn't
OPENED RESEARCH DIRECTIONS:
What follow-up questions does this paper raise?
What sub-problems or extensions are obvious next steps?
(List 3-5 obvious directions)
VOCABULARY INTRODUCED:
New terms or concepts that become standard (3-5 terms)
Brief definition for each
KEYWORDS: [5-10 foundational keywords]
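If the notes are collected programmatically, the template above could be held in a small container and rendered back to text. The sketch below is one possible shape, assuming a hypothetical `AlgorithmicFoundingNotes` class; the field names and the `[TODO]` marker are inventions for illustration, and the arXiv ID is left as a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class AlgorithmicFoundingNotes:
    """Hypothetical container mirroring the template headings above."""
    title: str
    arxiv_id: str
    url: str
    problem_statement: list[str] = field(default_factory=list)
    new_paradigm: list[str] = field(default_factory=list)
    core_approach: list[str] = field(default_factory=list)
    founding_experiments: list[str] = field(default_factory=list)
    opened_directions: list[str] = field(default_factory=list)
    vocabulary: dict[str, str] = field(default_factory=dict)
    keywords: list[str] = field(default_factory=list)

    def render(self) -> str:
        lines = [f"PAPER: {self.title}", f"ARXIV: {self.arxiv_id}", f"URL: {self.url}"]
        sections = [
            ("PROBLEM STATEMENT", self.problem_statement),
            ("NEW PARADIGM OR PROBLEM CLASS", self.new_paradigm),
            ("CORE ALGORITHM/APPROACH", self.core_approach),
            ("FOUNDING EXPERIMENTS", self.founding_experiments),
            ("OPENED RESEARCH DIRECTIONS", self.opened_directions),
        ]
        for heading, items in sections:
            lines.append(f"{heading}:")
            # Flag sections that still need to be filled in.
            lines.extend(items or ["[TODO]"])
        lines.append("VOCABULARY INTRODUCED:")
        lines.extend(f"{term}: {definition}" for term, definition in self.vocabulary.items())
        lines.append("KEYWORDS: " + ", ".join(self.keywords))
        return "\n".join(lines)


notes = AlgorithmicFoundingNotes(
    title="[title]",
    arxiv_id="XXXX.XXXXX",  # placeholder, to be replaced with a verified ID
    url="https://arxiv.org/abs/XXXX.XXXXX",
    problem_statement=["Learning new tasks from few examples."],
    keywords=["meta-learning", "few-shot learning"],
)
print(notes.render())
```

Rendering with `[TODO]` markers makes unfilled sections easy to spot before the notes are turned into a skill.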
PAPER: [title]
ARXIV: [verified ID]
URL: [full URL]
FIELD/PARADIGM BEING DEFINED:
What is the research area or way of thinking about a problem?
What makes this a coherent field/paradigm vs scattered topics?
PROBLEM TAXONOMY:
How does the paper categorize problems in this area?
Key axes of variation (e.g., supervised vs unsupervised, offline vs online)
Sub-problem classes with examples
FOUNDATIONAL CONCEPTS:
Core ideas that define how researchers in this field think (5-7 concepts)
For each: definition, why it's essential, relation to other concepts
VOCABULARY STANDARDIZED:
Terms that become field-standard after this paper (5-8 terms)
Brief definition for each, why the term matters
FOUNDING EXPERIMENTS & EVIDENCE:
What are the canonical experiments validating this paradigm?
Key empirical findings
What was surprising or counter-intuitive?
RESEARCH LANDSCAPE (BEFORE/AFTER):
How did people study this problem before?
What changed after this paper?
What became possible that wasn't before?
OPENED RESEARCH DIRECTIONS:
Explicit (listed in paper)
Implicit (obvious extensions of the framework)
Sub-fields this spawned (if applicable)
KEYWORDS: [5-10 vocabulary/paradigm keywords]
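The template above gives item-count ranges for several sections (5-7 concepts, 5-8 terms, 5-10 keywords). A quick check of a draft against those ranges might look like the following sketch; the dictionary keys and the `check_counts` helper are hypothetical names chosen here.

```python
# Item-count ranges taken from the template above: (minimum, maximum).
EXPECTED_COUNTS = {
    "foundational_concepts": (5, 7),
    "vocabulary": (5, 8),
    "keywords": (5, 10),
}


def check_counts(notes: dict[str, list[str]]) -> list[str]:
    """Return one message per section whose item count is out of range."""
    problems = []
    for section, (low, high) in EXPECTED_COUNTS.items():
        count = len(notes.get(section, []))
        if not low <= count <= high:
            problems.append(f"{section}: expected {low}-{high} items, got {count}")
    return problems


# Hypothetical draft with too few vocabulary terms.
draft = {
    "foundational_concepts": ["c1", "c2", "c3", "c4", "c5"],
    "vocabulary": ["t1", "t2", "t3"],
    "keywords": ["k1", "k2", "k3", "k4", "k5", "k6"],
}
print(check_counts(draft))
# → ['vocabulary: expected 5-8 items, got 3']
```

An empty result means the draft meets the stated ranges; it says nothing about the quality of the entries themselves.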
Title:
# [Technique/Paradigm Name]: [Outcome — what it lets you do]
Example: "Meta-Learning (MAML): Learn New Tasks From Few Examples"
Section 1: The Problem (2-3 paragraphs)
Example: "Before MAML, the field of learning from small datasets was scattered. Transfer learning worked via fine-tuning — but required many labeled examples. Few-shot learning was studied ad-hoc in specific domains (vision, NLP) with hand-engineered features. No one had a principled algorithm to learn how to learn from small datasets across different tasks."
Section 2: The Paradigm Shift
Example: "MAML reframed the problem. Instead of asking 'How do I fine-tune on small data?', it asked 'What initial parameters would let any model learn new tasks quickly with few examples?' This transformed learning-from-small-data into 'meta-learning' — a subfield with its own problem class, metrics, and benchmark tasks."
Section 3: The Core Idea (1-2 paragraphs)
For MAML: "The key insight: you don't need to learn the final model; you need to learn parameters positioned in the loss landscape such that a few gradient steps land in a good solution. MAML optimizes the initial parameters to minimize loss after 1-5 fine-tuning steps on a new task."
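When a skill quotes a core idea like this, a toy sketch can make it concrete. The example below assumes each task is a scalar quadratic loss `(theta - target)**2` with its own optimum, so the gradient through the inner update has a closed form. The function names, step sizes, and task optima are illustrative assumptions and are not taken from the MAML paper.

```python
def inner_adapt(theta: float, target: float, alpha: float, steps: int = 1) -> float:
    """Take `steps` gradient steps on the task loss (theta - target)**2."""
    for _ in range(steps):
        theta = theta - alpha * 2.0 * (theta - target)
    return theta


def meta_gradient(theta: float, target: float, alpha: float, steps: int = 1) -> float:
    """Gradient of the post-adaptation loss with respect to the initial theta.

    For this quadratic, each inner step scales (theta - target) by (1 - 2*alpha),
    so the adapted loss is ((1 - 2*alpha)**steps * (theta - target))**2.
    """
    shrink = (1.0 - 2.0 * alpha) ** steps
    return 2.0 * shrink * shrink * (theta - target)


def maml_train(targets, theta=0.0, alpha=0.1, beta=0.05, steps=1, iterations=500):
    """Outer loop: move theta to reduce the average loss after adaptation."""
    for _ in range(iterations):
        grad = sum(meta_gradient(theta, t, alpha, steps) for t in targets) / len(targets)
        theta -= beta * grad
    return theta


tasks = [-1.0, 2.0, 5.0]  # hypothetical task optima
theta0 = maml_train(tasks)
print(theta0, inner_adapt(theta0, tasks[2], alpha=0.1))
```

For this toy family the learned initialization settles near the mean of the task optima (2.0), from which a single inner step moves toward whichever task is presented. Real uses differentiate through the inner update of a neural network rather than relying on a closed form.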
Section 4: Founding Experiments (2-3 paragraphs)
Example table for MAML:
| Benchmark | MAML | Transfer Learning | Few-Shot Baseline |
|-----------|------|------------------|-------------------|
| Omniglot (1-shot) | 98.7% | 94.5% | 89.2% |
| miniImageNet (5-shot) | 63.2% | 61.4% | 52.1% |
Also include: MAML worked across diverse problem types (regression, image classification, RL) without changes to the algorithm, showing the paradigm was general.
Section 5: Vocabulary & Concepts
Example for MAML:
Section 6: Opened Research Directions
Example for MAML:
Section 7: When to Use This Paradigm
When this direction is appropriate:
When NOT:
Reference:
Paper: https://arxiv.org/abs/XXXX.XXXXX
Code: [URL if available]
Related subfield: Meta-Learning
Title:
# [Field/Paradigm Name]: [Outcome — what thinking this enables]
Example: "Deep Learning: Representation Learning for AI and Massive Data"
Section 1: What is This Field? (2-3 paragraphs)
Example: "Deep Learning is the study of learning hierarchical representations from raw data. Unlike traditional machine learning (hand-engineered features + shallow learning), deep learning learns feature hierarchies automatically. The field emerged because: (1) neural networks are fundamentally scalable, (2) massive datasets changed what was tractable, (3) GPUs made large-scale training feasible."
Section 2: Problem Taxonomy
Example table for Deep Learning:
| Dimension | Variants | Examples |
|-----------|----------|----------|
| Architecture | CNNs, RNNs, Transformers | Image (CNN), Sequence (RNN/Transformer) |
| Task | Supervised, Unsupervised, RL | Classification, Clustering, Policy Learning |
| Scale | Small models, Medium, Large | ResNet-50, BERT, GPT |
| Data Regime | Lots of labels, Few labels, No labels | ImageNet, Few-shot, Self-supervised |
Section 3: Foundational Concepts (3-5 paragraphs)
Example concepts for Deep Learning:
Section 4: Vocabulary & Terminology
Example for Deep Learning:
| Term | Definition | Used When |
|------|-----------|-----------|
| Activation Function | Element-wise nonlinearity (ReLU, tanh, sigmoid) that enables networks to learn nonlinear representations | Defining any neural network architecture |
| Backpropagation | Algorithm to compute gradients of loss w.r.t. all parameters by reverse-mode AD | Training any supervised model |
| Convolutional Layer | Parameter-efficient layer exploiting spatial structure via weight sharing and local receptive fields | Processing images, sequences, or grid-structured data |
| Batch Normalization | Normalizing layer inputs to accelerate training and stabilize learning | Standard in modern deep nets for stability/speed |
| Fine-tuning | Adapting a pre-trained model to a downstream task via continued training | Transfer learning workflow |
Section 5: Founding Experiments & Empirical Evidence
Example for Deep Learning:
Section 6: Pre vs. Post This Paper
Before this field/paradigm:
After this field/paradigm:
Example:
Before Deep Learning:
After Deep Learning:
Top-5 accuracy above 95% on ImageNet is routine
Section 7: Opened Research Directions & Subfields
Example for Deep Learning:
Section 8: Scope & Limitations
Example for Deep Learning:
Reference:
Paper: https://arxiv.org/abs/XXXX.XXXXX
Related field: [Domain]
Survey/Review: [Yes/No, and if yes, what does it survey]
Don't oversimplify to a single algorithm. If the paper doesn't have one, say so. Foundational papers often define a landscape, not a technique.
Vocabulary matters. What terms does this paper introduce that become field-standard? These are the core extractable knowledge.
Reframe before/after. What changed about how people think about this problem after this paper?
Mine for opened directions. Field foundation papers are valuable because they open future research. Explicitly list what they make possible.
Include founding experiments. What are the canonical benchmarks or results that made this direction credible?
Be honest about limitations. No paradigm is universal. State where this field applies and where it fails.
When triaging foundational papers:
Field foundation extraction adapted from Anthropic's skills guide. Unlike algorithm or infrastructure extraction, foundational papers require distilling concepts, vocabulary, and paradigms. The output skill teaches frameworks and problem formulations, not procedures.