skills/skillxiv-v0.0.2-claude-opus-4.6/config-knowledge-distillation/SKILL.md
Improve student model robustness under covariate shift by using diffusion-based augmentation that targets spurious features via teacher-student disagreement.
npx skillsauth add ADu2021/skillXiv config-knowledge-distillationInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
When training data contains spurious features absent at test time, knowledge distillation can preserve these biases in student models. ConfiG addresses this by generating augmented images that maximize disagreement between teacher and student, targeting exactly the spurious correlations the student learned. This diffusion-based approach enables students to overcome dataset biases while maintaining knowledge transfer.
import torch
import numpy as np
from typing import Tuple, List
class CovariateShiftAnalyzer:
"""Analyze how spurious features affect generalization"""
def decompose_error(self, teacher_model, student_model,
train_data, test_data,
group_labels_test) -> Dict:
"""
Decompose generalization error into two components:
1. Teacher quality: how well teacher generalizes
2. Distributional gap: how much student differs from teacher distribution
Key insight: ConfiG targets reducing the gap, not improving teacher quality.
"""
# Evaluate on training distribution (in-distribution)
train_acc = self.evaluate_accuracy(student_model, train_data)
# Evaluate on test distribution
test_acc = self.evaluate_accuracy(student_model, test_data)
# Evaluate per test group
group_accs = {}
for group_id in np.unique(group_labels_test):
group_mask = group_labels_test == group_id
group_test = test_data[group_mask]
group_accs[group_id] = self.evaluate_accuracy(student_model, group_test)
# Decompose error
overall_gap = train_acc - test_acc
group_gaps = {g: train_acc - acc for g, acc in group_accs.items()}
print("=== Generalization Analysis ===")
print(f"Overall train accuracy: {train_acc:.1%}")
print(f"Overall test accuracy: {test_acc:.1%}")
print(f"Overall gap: {overall_gap:.1%}")
print("\nPer-group performance:")
for group_id, acc in group_accs.items():
print(f" Group {group_id}: {acc:.1%} (gap: {group_gaps[group_id]:.1%})")
return {
'overall_gap': overall_gap,
'group_gaps': group_gaps,
'group_accs': group_accs,
}
def identify_spurious_correlations(self, train_data,
train_labels,
group_labels) -> List[Tuple]:
"""
Identify features that are predictive in training but spurious.
Example: Blond hair → Female in CelebA, but this breaks in real data.
"""
spurious_correlations = []
# Analyze correlations between features and groups
unique_groups = np.unique(group_labels)
for group_id in unique_groups:
group_mask = group_labels == group_id
# Extract feature statistics for this group
group_data = train_data[group_mask]
other_data = train_data[~group_mask]
# Compute feature divergence between groups
# Features with high divergence are likely spurious
feature_divergence = self._compute_feature_divergence(
group_data, other_data
)
# Identify top divergent features
top_divergent = sorted(
enumerate(feature_divergence),
key=lambda x: x[1],
reverse=True
)[:5]
for feature_idx, divergence in top_divergent:
spurious_correlations.append({
'group': group_id,
'feature': feature_idx,
'divergence': divergence,
})
return spurious_correlations
def _compute_feature_divergence(self, group_data, other_data) -> np.ndarray:
"""Measure KL divergence of feature distributions"""
# Simplified: compute histogram divergence per feature
divergences = []
for feature_idx in range(group_data.shape[1]):
# Histogram of feature in group vs out
hist_group = np.histogram(group_data[:, feature_idx], bins=20)[0]
hist_other = np.histogram(other_data[:, feature_idx], bins=20)[0]
# Normalize
hist_group = hist_group / (np.sum(hist_group) + 1e-8)
hist_other = hist_other / (np.sum(hist_other) + 1e-8)
# KL divergence
kl = np.sum(hist_group * np.log((hist_group + 1e-8) / (hist_other + 1e-8)))
divergences.append(kl)
return np.array(divergences)
import torch.nn.functional as F
class ConfidenceGuidedDiffusion:
"""Generate augmented images targeting spurious features"""
def __init__(self, diffusion_model, teacher_model, student_model):
self.diffusion = diffusion_model # Pretrained diffusion (e.g., Stable Diffusion)
self.teacher = teacher_model
self.student = student_model
def generate_augmented_sample(self, image: torch.Tensor,
label: int,
gamma: float = 2.0,
num_iterations: int = 100) -> torch.Tensor:
"""
Generate augmented image by optimizing latent vector z.
Objective: maximize loss(z) = t(z)^γ + (1-f(z))^γ
where:
t(z) = teacher confidence on augmented sample
f(z) = student confidence on augmented sample
γ = 2.0 (empirically optimal)
This targets spurious features the student learned.
"""
# Initialize random latent in diffusion space
z = torch.randn(1, 4, image.shape[1]//8, image.shape[2]//8)
z.requires_grad = True
# Optimizer for latent variables
optimizer = torch.optim.Adam([z], lr=0.01)
for iteration in range(num_iterations):
# Decode latent to image
augmented_image = self.diffusion.decode(z)
# Get predictions
with torch.no_grad():
teacher_logits = self.teacher(augmented_image)
student_logits = self.student(augmented_image)
teacher_probs = F.softmax(teacher_logits, dim=-1)
student_probs = F.softmax(student_logits, dim=-1)
# Extract confidence for true label
teacher_conf = teacher_probs[0, label] # High = good, preserve label
student_conf = student_probs[0, label] # Low = good, challenge student
# Confidence-guided loss
loss = (teacher_conf ** gamma) + ((1.0 - student_conf) ** gamma)
# Backward step
optimizer.zero_grad()
(-loss).backward() # Maximize by minimizing negative
optimizer.step()
if (iteration + 1) % 20 == 0:
print(f"Iter {iteration + 1}: "
f"teacher_conf={teacher_conf:.3f}, "
f"student_conf={student_conf:.3f}, "
f"loss={loss:.3f}")
# Decode final latent
with torch.no_grad():
augmented = self.diffusion.decode(z)
return augmented
def generate_augmented_dataset(self, train_images: torch.Tensor,
train_labels: torch.Tensor,
augmentation_ratio: float = 0.5) -> Tuple[torch.Tensor, torch.Tensor]:
"""
Generate augmented samples for a subset of training data.
Focus on misclassified or uncertain samples.
"""
# Identify samples where student is uncertain
with torch.no_grad():
student_logits = self.student(train_images)
student_probs = F.softmax(student_logits, dim=-1)
student_confidence = torch.max(student_probs, dim=-1)[0]
# Select samples with low confidence (more need augmentation)
num_to_augment = int(len(train_images) * augmentation_ratio)
uncertain_indices = torch.argsort(student_confidence)[:num_to_augment]
# Generate augmentations
augmented_images = []
augmented_labels = []
for idx in uncertain_indices:
image = train_images[idx].unsqueeze(0)
label = train_labels[idx].item()
print(f"Generating augmentation {len(augmented_images) + 1}/{num_to_augment}")
augmented = self.generate_augmented_sample(image, label)
augmented_images.append(augmented)
augmented_labels.append(label)
# Concatenate original and augmented
all_images = torch.cat([train_images] + augmented_images, dim=0)
all_labels = torch.cat([
train_labels,
torch.tensor(augmented_labels, device=train_labels.device)
], dim=0)
return all_images, all_labels
class KDWithConfiG:
"""Knowledge distillation enhanced with confidence-guided augmentation"""
def __init__(self, teacher_model, student_model,
diffusion_model, temperature: float = 4.0):
self.teacher = teacher_model
self.student = student_model
self.diffusion = diffusion_model
self.temperature = temperature
self.confidence_aug = ConfidenceGuidedDiffusion(
diffusion_model, teacher_model, student_model
)
self.optimizer = torch.optim.Adam(student_model.parameters(), lr=1e-4)
def distillation_loss(self, student_logits: torch.Tensor,
teacher_logits: torch.Tensor) -> torch.Tensor:
"""KL divergence between student and teacher distributions"""
student_probs = F.softmax(student_logits / self.temperature, dim=-1)
teacher_probs = F.softmax(teacher_logits / self.temperature, dim=-1)
kl_loss = F.kl_div(
torch.log(student_probs + 1e-8),
teacher_probs.detach(),
reduction='batchmean'
)
return kl_loss * (self.temperature ** 2)
def training_step(self, batch_images: torch.Tensor,
batch_labels: torch.Tensor) -> float:
"""Single training step on original + augmented data"""
# Forward pass
student_logits = self.student(batch_images)
with torch.no_grad():
teacher_logits = self.teacher(batch_images)
# Compute distillation loss
loss = self.distillation_loss(student_logits, teacher_logits)
# Backward
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
return loss.item()
def train_with_augmentation(self, train_images: torch.Tensor,
train_labels: torch.Tensor,
num_epochs: int = 10) -> Dict:
"""Train student with periodic augmentation"""
history = {'loss': []}
for epoch in range(num_epochs):
print(f"\n=== Epoch {epoch + 1}/{num_epochs} ===")
# Original training step
loss = self.training_step(train_images, train_labels)
history['loss'].append(loss)
print(f"Original data loss: {loss:.4f}")
# Every N epochs: generate augmentations and finetune
if (epoch + 1) % 3 == 0:
print("Generating confidence-guided augmentations...")
aug_images, aug_labels = self.confidence_aug.generate_augmented_dataset(
train_images, train_labels, augmentation_ratio=0.3
)
print(f"Augmented dataset size: {len(aug_images)}")
# Fine-tune on augmented data
for aug_epoch in range(2):
aug_loss = self.training_step(aug_images, aug_labels)
print(f" Augmented epoch {aug_epoch + 1}: loss={aug_loss:.4f}")
return history
class HybridBiasMitigation:
"""Combine data-centric (ConfiG) and model-centric (TAB) approaches"""
def __init__(self, teacher_model, student_model, diffusion_model):
self.kd_config = KDWithConfiG(
teacher_model, student_model, diffusion_model
)
# Model-centric: Train-aware batch normalization (TAB)
self.model_centric = TrainAwareBatchNorm(student_model)
def train_hybrid(self, train_images: torch.Tensor,
train_labels: torch.Tensor,
group_labels: torch.Tensor) -> Dict:
"""
Data-centric: ConfiG augmentation targeting spurious features
Model-centric: TAB encouraging invariant features
"""
print("Starting hybrid bias mitigation training...")
print("Data-centric: Confidence-guided augmentation")
print("Model-centric: Train-aware batch normalization")
# Phase 1: Data-centric augmentation
kd_history = self.kd_config.train_with_augmentation(
train_images, train_labels, num_epochs=10
)
# Phase 2: Model-centric refinement with TAB
tab_history = self.model_centric.train_with_tab(
train_images, train_labels, group_labels, num_epochs=5
)
return {
'kd_history': kd_history,
'tab_history': tab_history,
}
Identify Spurious Correlations: Start by analyzing your training data for features that correlate with labels but are likely absent in test data (e.g., background, lighting, specific objects).
Teacher Quality Matters: ConfiG assumes teacher model is robust. If teacher is biased, augmentation won't help. Use a well-trained teacher or synthetic data.
Gamma Parameter: gamma=2.0 is empirically optimal. Higher γ concentrates learning on high-disagreement samples; lower γ spreads it more broadly.
Augmentation Ratio: Augment 30-50% of training data, focusing on uncertain samples. Over-augmentation can degrade in-distribution performance.
Hybrid Approach Works Best: Combine data-centric (ConfiG) augmentation with model-centric approaches (TAB, group normalization). They're complementary and achieve better results together.
Computational Cost: Diffusion-based augmentation is expensive (100+ iterations per image). Pre-generate augmented dataset in batch, don't do online.
testing
Uses flow maps as look-ahead operators to enable principled reward-guided diffusion by predicting trajectory endpoints at any denoising step. Deploy when applying rewards or preferences to diffusion trajectories with meaningful gradients throughout generation.
testing
Train language models where each expert learns independently on closed datasets, enabling flexible inference with selective data inclusion or exclusion. 41% performance improvement while allowing users to opt out of specific data sources without retraining.
data-ai
Understand how token generation flexibility in diffusion LMs paradoxically constrains reasoning, as models exploit ordering flexibility to avoid uncertain tokens, and apply simplified approaches that preserve parallel decoding benefits. Use when optimizing diffusion-based language models for reasoning tasks.
devops
Enable LLM agents to improve continuously during deployment by constructing structured experience libraries through self-reflection on successes and failures—achieving 23% improvement on reasoning without gradient-based parameter updates or external training.