paper2skill/paper2skill-component-innovation/SKILL.md
Convert component innovation papers into drop-in replacement guides. Extracts what was swapped, why, conditions for when it helps, and the performance delta. Use this skill when extracting skills from Category 5 (Component Innovation) papers — BatchNorm-style papers, ResNet skip connections, new loss functions, or any paper proposing one elegant modification with outsized impact.
Use this extraction for papers that:
Value signal: These papers enable drop-in replacements with known performance impact. Practitioners can immediately A/B test.
Examples: Batch Normalization, ResNet skip connections, Transformers Without Normalization, RMSprop optimizer, Focal Loss, Mixup data augmentation, Rotary Embeddings
Skip this category if:
Clearly name what is being replaced:
Component type: [normalization / attention mechanism / loss function / optimizer / data augmentation / activation / regularization]
Old component: [What existed before, or what this improves upon]
New component: [What is being proposed]
Positioned as: [A swap / A modification / A replacement]
Why was the old component insufficient?
Problem with the old approach:
- Technical limitation: [E.g., "Batch normalization has reduced expressiveness at test time"]
- Practical pain point: [E.g., "Difficult to train on sequences with variable length"]
- Theoretical gap: [E.g., "Inconsistency between training and inference"]
- Empirical observation: [E.g., "Performance saturates despite better architectures"]
The paper's insight:
[One-sentence explanation of why the new component is better]
Show the change as code (must be <20 lines for surgical changes):
# Clear explanation: what specifically changes from old to new
def old_component(x):
    """Original implementation"""
    pass

def new_component(x):
    """Modified implementation — minimal surgical change"""
    pass
If the code runs to 20 lines or more, move it to the scripts/ folder and reference it.
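As an illustration of a filled-in swap, here is a minimal sketch using Focal Loss (one of the examples named above) as the new component and plain cross-entropy as the old one. It assumes the input `p_t` is the probability the model assigns to the true class, and it leaves out the class-balancing weight that the full formulation also uses. The `gamma=2.0` default is a commonly used value, not a recommendation for every setup.

```python
import math

def old_component(p_t: float) -> float:
    """Cross-entropy on the probability assigned to the true class."""
    return -math.log(p_t)

def new_component(p_t: float, gamma: float = 2.0) -> float:
    """Focal loss sketch: cross-entropy scaled by (1 - p_t) ** gamma.

    Assumption: the class-balancing weight is omitted to keep the diff minimal.
    """
    modulating_factor = (1.0 - p_t) ** gamma
    return modulating_factor * -math.log(p_t)

# Confident predictions (p_t close to 1) are down-weighted far more than hard ones.
for p_t in (0.6, 0.9, 0.99):
    print(p_t, old_component(p_t), new_component(p_t))
```

With `gamma=0` the new component reduces to the old one, which is a useful sanity check that the swap really is a single surgical change.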
Extract exact numbers from ablations:
Baseline (old component): [Metric] = X.XX
With new component:
- Test accuracy: +Y.YY percentage points (Z% relative improvement)
- Training time: [faster/slower] by W%
- Convergence speed: [number of steps/epochs to reach accuracy]
- Memory usage: [higher/lower] by V%
Ablation variants tested:
- Without [subcomponent of new approach]: -W.WW (shows each part matters)
- In combination with [related trick]: +Z.ZZ (shows interactions)
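The absolute and relative figures in the template are easy to mix up, so a small helper can keep them consistent. This is a sketch only: `performance_delta` is a hypothetical name, and the accuracy values in the usage line are made-up placeholders, not results from any paper.

```python
from typing import Tuple

def performance_delta(baseline: float, new: float) -> Tuple[float, float]:
    """Return (absolute change in metric points, relative change in percent)."""
    if baseline == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    absolute = new - baseline
    relative = absolute / baseline * 100.0
    return absolute, relative

# Hypothetical numbers for illustration only.
absolute, relative = performance_delta(baseline=76.0, new=77.9)
print(f"{absolute:+.2f} points ({relative:.2f}% relative)")
```

Report both values: a gain of a couple of points reads very differently on a 76% baseline than on a 20% one.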
Critical: when does this swap help or hurt?
This component swap works best when:
- Model architecture: [transformer / CNN / RNN / specific family]
- Dataset scale: [small / medium / large / ImageNet-scale]
- Task domain: [vision / language / speech / multimodal]
- Training regime: [LR magnitude, batch size, optimization details]
- Data properties: [distribution assumptions, input characteristics]
This swap may hurt or provide no benefit when:
- [Opposite of above conditions]
- [Specific architectural conflicts, e.g., "incompatible with batch norm"]
- [Regime where old component was already optimal]
Surprising findings:
- [Unexpected interaction or limitation discovered in ablations]
Practical guidance for practitioners:
To swap this component:
1. Replace [old API] with [new API]
   Code pattern: [2-3 line snippet showing the swap]
2. If your code has [specific pattern], adjust [specific way]
3. Verify: [what to measure to confirm the swap worked]
4. Optional tuning: [if performance doesn't match, try]
   - [Hyperparameter A]: suggested range [X-Y]
   - [Hyperparameter B]: suggested range [X-Y]
5. Known issues:
   - [Issue 1]: workaround is [solution]
   - [Issue 2]: usually not a problem unless [condition]
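For the verification step, one rough way to confirm the swap worked is to run both variants over several seeds and check that the mean gain is larger than the seed-to-seed spread. The sketch below is a heuristic, not a statistical test; `swap_helped`, its `min_gain` parameter, and the scores in the usage lines are all hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence

def swap_helped(old_scores: Sequence[float],
                new_scores: Sequence[float],
                min_gain: float = 0.0) -> bool:
    """Heuristic check: the mean gain must beat min_gain and the run-to-run noise.

    Each sequence holds one metric value per random seed (at least two seeds).
    """
    gain = mean(new_scores) - mean(old_scores)
    noise = max(stdev(old_scores), stdev(new_scores))
    return gain > min_gain and gain > noise

# Hypothetical per-seed accuracies for illustration only.
print(swap_helped([76.1, 75.9, 76.0], [77.8, 78.0, 77.9]))
print(swap_helped([76.1, 75.9, 76.0], [76.2, 75.8, 76.1]))
```

If the gain sits inside the noise, treat the swap as unverified on your benchmark and fall back to the optional tuning step before drawing conclusions.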
Generate a SKILL.md for the component swap:
---
name: [component-type-identifier]
title: [Paper title — action-oriented "Swap X with Y"]
version: 0.0.2
engine: skillxiv-v0.0.2-claude-opus-4.6
license: MIT
url: [arXiv HTML link]
keywords: [component-type, old-name, new-name, impact-metric, condition-tag]
description: |
  Swap [old component] with [new component] to gain [X% improvement] on [metric]. Works best for [conditions].
  Trigger: When optimizing [model family] on [task type] and want to improve [metric], test this component swap.
---
## What This Skill Does
Replace [old component] with [new component] to improve [outcome metric] by [X%] under [conditions].
## The Swap
[Minimal code showing the surgical change: old vs new]
## Performance Impact
- Improvement: +Y.YY on [metric] (Z% relative)
- Cost: [memory/speed/complexity change, if any]
- Ablation: [subcomponents that matter most]
## When to Use
- Optimizing [model family] on [task type]
- [Other condition where this swap applies]
- When you can verify improvement on your specific benchmark
## When NOT to Use
- If your model has [incompatible property]
- On [different task/domain] where it wasn't tested
- If [specific condition] is true for your setup
## Implementation Checklist
[Swap checklist with verification steps]
## Related Work
This builds on [prior approaches] and relates to [similar swaps].
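Before publishing a generated SKILL.md, it can help to check that every frontmatter field from the template is present. The sketch below is a deliberately simple line scanner rather than a YAML parser; `missing_frontmatter_keys` is a hypothetical helper, and the required key set is taken from the template above.

```python
REQUIRED_KEYS = {
    "name", "title", "version", "engine",
    "license", "url", "keywords", "description",
}

def missing_frontmatter_keys(skill_md: str) -> set:
    """Return the template keys that do not appear in the frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return set(REQUIRED_KEYS)  # no frontmatter block at all
    found = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        if line[:1].isspace() or ":" not in line:
            continue  # indented continuation of a multi-line value
        found.add(line.split(":", 1)[0].strip())
    return REQUIRED_KEYS - found

# Hypothetical, incomplete document for illustration only.
draft = "---\nname: focal-loss\ntitle: Swap X with Y\nversion: 0.0.2\n---\n## What This Skill Does\n"
print(sorted(missing_frontmatter_keys(draft)))
```

A full YAML library would be more robust for nested or quoted values; the scanner is enough to catch a field dropped during generation.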
For extraction success:
Common pitfalls to avoid: