skills/43-wentorai-research-plugins/skills/domains/ai-ml/kolmogorov-arnold-networks-guide/SKILL.md
Papers and tutorials on KAN learnable activation networks
Kolmogorov-Arnold Networks (KANs) are a neural network architecture that places learnable activation functions on edges (where an MLP has scalar weights) instead of fixed activations on nodes. Motivated by the Kolmogorov-Arnold representation theorem, KANs parameterize each edge activation as a B-spline. In the original paper (Liu et al., 2024), small KANs matched or exceeded the accuracy of larger MLPs on function-fitting and PDE-solving tasks while being easier to interpret. This collection tracks the rapidly growing KAN literature.
```
Traditional MLP:
  x → [fixed activation(linear transform)] → y
  Activations on nodes, weights on edges

KAN:
  x → [learnable spline functions on edges] → sum → y
  Each edge learns its own activation function (B-spline)

Kolmogorov-Arnold Theorem:
  f(x₁,...,xₙ) = Σ_{i=1}^{2n+1} Φᵢ( Σ_{j=1}^{n} φᵢⱼ(xⱼ) )
  Any continuous multivariate function on a bounded domain =
  finite composition of continuous univariate functions and addition
```
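The "learnable spline functions on edges, then sum" picture can be made concrete with a small sketch in plain Python. It assumes a uniform knot vector and uses the standard Cox-de Boor recursion; the helper names (`uniform_knots`, `bspline_basis`, `kan_layer`) are hypothetical and are not pykan's API. The sketch also leaves out the SiLU base term that the original paper adds to each spline.

```python
def uniform_knots(lo, hi, grid, k):
    """Uniform knot vector on [lo, hi], extended by k knots on each side."""
    h = (hi - lo) / grid
    return [lo + (i - k) * h for i in range(grid + 2 * k + 1)]


def bspline_basis(x, knots, k):
    """Values of all degree-k B-spline basis functions at x (Cox-de Boor)."""
    # Degree 0: indicator of each half-open knot interval.
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    for d in range(1, k + 1):
        nxt = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * basis[i] if left_den else 0.0
            right = ((knots[i + d + 1] - x) / right_den * basis[i + 1]
                     if right_den else 0.0)
            nxt.append(left + right)
        basis = nxt
    return basis  # length grid + k


def kan_layer(x, coefs, knots, k):
    """One KAN layer: output j is the sum over inputs i of phi_ji(x_i).

    coefs[j][i] holds the spline coefficients of the edge from input i
    to output j (these would be the learnable parameters).
    """
    bases = [bspline_basis(xi, knots, k) for xi in x]
    return [sum(sum(c * b for c, b in zip(coefs[j][i], bases[i]))
                for i in range(len(x)))
            for j in range(len(coefs))]


G, k = 5, 3
knots = uniform_knots(-1.0, 1.0, G, k)
x = [0.3, -0.7]
# One output node with two incoming edges; all coefficients set to 1.
coefs = [[[1.0] * (G + k) for _ in x]]
y = kan_layer(x, coefs, knots, k)
print(y)
```

With every coefficient set to 1, each edge evaluates to 1 inside the grid range because B-spline bases sum to one there, so the single output is the number of inputs. Training a KAN amounts to adjusting the `G + k` coefficients on each edge.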
```bibtex
@article{liu2024kan,
  title={KAN: Kolmogorov-Arnold Networks},
  author={Liu, Ziming and Wang, Yixuan and Vaidya, Sachin and
          Ruehle, Fabian and Halverson, James and
          Solja{\v{c}}i{\'c}, Marin and Hou, Thomas Y. and
          Tegmark, Max},
  journal={arXiv preprint arXiv:2404.19756},
  year={2024}
}
```
```python
# Using pykan (official implementation)
# pip install pykan
import torch
from kan import KAN

# Create a KAN model
model = KAN(
    width=[2, 5, 1],  # Input: 2, Hidden: 5, Output: 1
    grid=5,           # Spline grid resolution (number of intervals)
    k=3,              # Spline order (cubic)
)

# Training data: y = sin(x1) + cos(x2)
x = torch.randn(1000, 2)
y = (torch.sin(x[:, 0]) + torch.cos(x[:, 1])).unsqueeze(1)

# Train on an 800/200 train/test split
dataset = {"train_input": x[:800], "train_label": y[:800],
           "test_input": x[800:], "test_label": y[800:]}
# Note: newer pykan releases rename this method to model.fit(...)
model.train(dataset, steps=100, lr=0.01)

# Visualize learned functions
model.plot()

# Prune and simplify
model = model.prune()
model.plot()
```
```python
# Comparison on function approximation
import torch.nn as nn
from kan import KAN

# KAN: learnable activations on edges
kan_model = KAN(width=[2, 5, 1], grid=5, k=3)
# Parameters: ~150 (15 edges x 8 spline coefficients, plus per-edge scales;
# the exact count depends on the pykan version)

# MLP: fixed activations on nodes
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, 50),
            nn.ReLU(),
            nn.Linear(50, 50),
            nn.ReLU(),
            nn.Linear(50, 1),
        )

    def forward(self, x):
        return self.net(x)

mlp_model = MLP()
# Parameters: 2,751 (weights and biases)

# KAN advantages:
# - Fewer parameters for the same accuracy on some low-dimensional fitting tasks
# - Interpretable (visualize learned functions)
# - Better for scientific discovery (symbolic regression)
# - Grid refinement for progressive accuracy

# MLP advantages:
# - Faster training
# - Better scaling to high dimensions
# - More mature tooling and optimization
```
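The parameter counts quoted in the comparison can be checked by hand. The sketch below counts weights and biases for the MLP and only the spline coefficients for the KAN (one vector of `grid + k` values per edge). Both helper names are hypothetical. pykan stores extra per-edge and per-node parameters whose number depends on the release, so the KAN figure here is best read as a lower bound.

```python
def mlp_param_count(widths):
    """Weights plus biases of a fully connected MLP with these layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(widths, widths[1:]))


def kan_spline_coef_count(widths, grid, k):
    """Spline coefficients only: (grid + k) values on each edge."""
    edges = sum(n_in * n_out for n_in, n_out in zip(widths, widths[1:]))
    return edges * (grid + k)


mlp_params = mlp_param_count([2, 50, 50, 1])
kan_coefs = kan_spline_coef_count([2, 5, 1], grid=5, k=3)
print(mlp_params)  # 2751
print(kan_coefs)   # 120
```

Adding two scale factors per edge (15 edges, so 30 more values) brings the KAN total to 150, which is consistent with the "~150" figure in the comparison.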
| Variant | Innovation | Application |
|---------|-----------|-------------|
| KAN 2.0 | MultKAN with multiplication nodes | Improved scaling |
| Temporal KAN | Time-series adaptation | Forecasting |
| ConvKAN | KAN + convolutions | Image processing |
| GraphKAN | KAN on graph structures | Graph learning |
| FourierKAN | Fourier basis instead of splines | Periodic functions |
| WavKAN | Wavelet-based activations | Signal processing |
| BSRBF-KAN | B-spline + radial basis | Function approximation |
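To illustrate what "Fourier basis instead of splines" means for a single edge, here is a minimal sketch of an edge activation built from a truncated Fourier series. The function name and coefficient layout are assumptions for illustration; published FourierKAN implementations may differ in normalization and may add a bias term.

```python
import math


def fourier_edge(x, cos_coefs, sin_coefs):
    """phi(x) = sum over k = 1..K of a_k * cos(k x) + b_k * sin(k x)."""
    return sum(a * math.cos(k * x) + b * math.sin(k * x)
               for k, (a, b) in enumerate(zip(cos_coefs, sin_coefs), start=1))


# Hypothetical coefficients; in a trained model these are learned per edge.
a = [0.5, -0.2, 0.1]
b = [1.0, 0.3, 0.0]
value = fourier_edge(0.7, a, b)
shifted = fourier_edge(0.7 + 2 * math.pi, a, b)
print(value, shifted)
```

The two printed values agree up to rounding because every basis function has period 2π, which is why this basis suits periodic targets and needs no grid range.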
```python
# KAN for symbolic regression (discovering equations)
import torch
from kan import KAN

# Generate data from the "unknown" equation f(x, y) = x * exp(y)
x = torch.rand(1000, 2) * 2
y = x[:, 0:1] * torch.exp(x[:, 1:2])
dataset = {"train_input": x[:800], "train_label": y[:800],
           "test_input": x[800:], "test_label": y[800:]}

model = KAN(width=[2, 1, 1], grid=10, k=3)
# Note: newer pykan releases rename this method to model.fit(...)
model.train(dataset, steps=200)

# Symbolic fitting: replace each learned spline with its best-matching function
model.auto_symbolic()
# Target: an expression equivalent to x₁ * exp(x₂), e.g. exp(log(x₁) + x₂).
# The recovered formula carries fitted constants and may not match exactly.
```
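A `[2, 1, 1]` KAN can represent x·exp(y) because x·exp(y) = exp(log(x) + y): the two inner edges learn log and the identity, and the outer edge learns exp. Symbolic fitting then has to recognize each learned curve. The sketch below shows a simplified version of that matching step in plain Python: it scores each candidate function by the R² of the best fit `y ≈ c·f(x) + d` and keeps the winner. The candidate library and helper names are hypothetical. The procedure described in the KAN paper also searches over an inner affine transform, `f(a·x + b)`, which is omitted here.

```python
import math


def r_squared(xs, ys, f):
    """R^2 of the least-squares fit ys ≈ c * f(x) + d (squared correlation)."""
    fs = [f(x) for x in xs]
    n = len(xs)
    mean_f, mean_y = sum(fs) / n, sum(ys) / n
    sff = sum((v - mean_f) ** 2 for v in fs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sfy = sum((v - mean_f) * (y - mean_y) for v, y in zip(fs, ys))
    if sff == 0 or syy == 0:
        return 0.0
    return sfy * sfy / (sff * syy)


# Hypothetical candidate library (inputs are kept positive so log is defined).
CANDIDATES = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "exp": math.exp,
    "sin": math.sin,
    "log": math.log,
}


def best_symbolic(xs, ys):
    """Return the best-matching candidate name and all R^2 scores."""
    scores = {name: r_squared(xs, ys, f) for name, f in CANDIDATES.items()}
    return max(scores, key=scores.get), scores


# Stand-ins for sampled edge activations: an exp-shaped and a log-shaped curve.
xs = [0.1 + 1.9 * i / 49 for i in range(50)]
outer_name, outer_scores = best_symbolic(xs, [2 * math.exp(x) + 1 for x in xs])
inner_name, inner_scores = best_symbolic(xs, [math.log(x) for x in xs])
print(outer_name, inner_name)
```

Because the score is invariant to the outer scale `c` and offset `d`, the exp-shaped curve is matched even though it was generated as `2·exp(x) + 1`.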
### Key Research Directions
1. **Scaling** — Making KANs work at LLM scale
2. **Efficiency** — Reducing spline computation overhead
3. **Theory** — Understanding approximation guarantees
4. **Architecture search** — Optimal KAN topologies
5. **Hybrid models** — Combining KAN and MLP strengths
6. **Domain applications** — Physics, chemistry, biology
7. **Interpretability** — Extracting symbolic knowledge