engineering/advanced-ml-engineering/skills/self-healing-models/SKILL.md
This skill should be used when the user asks about "concept drift", "data drift", "model degradation", "model staleness", "production model monitoring", "self-healing", "automatic retraining", "shadow model", "champion-challenger", "blue-green deployment", "canary deployment", "Kolmogorov-Smirnov test", "PSI", "Population Stability Index", "model refresh", "continuous learning", "online learning", or when a production model's performance has degraded over time due to distribution shift.
Install this skill globally with one command: `npx skillsauth add harsh040506/claude-code-unified-skill-plugin-library self-healing-models`. Works with Claude Code, Cursor, and Windsurf.
Provides a framework for detecting statistical distribution shifts in production ML systems and responding autonomously with shadow retraining and zero-downtime model updates. It implements the "Self-Healing Model" concept, in which detected drift triggers an end-to-end recovery cycle without manual intervention.
| Drift Type | Description | Detection Method |
|---|---|---|
| Sudden drift | Abrupt change at a specific time point (e.g., system change, world event) | Immediate KS test comparison |
| Gradual drift | Slow, continuous shift (e.g., user behavior evolution) | Rolling window PSI trend |
| Recurrent drift | Seasonal or cyclic patterns reappear (e.g., holiday behavior) | Season-aware baseline comparison |
| Incremental drift | Monotonic directional drift over time | Linear trend test on PSI |
| Feature drift | Input distribution changes but relationship to target is stable | Input-only KS/PSI (re-calibrate, don't retrain) |
| Label drift | Relationship between features and target changes (true concept drift) | Requires label monitoring + performance tracking |
Kolmogorov-Smirnov (KS) Test for continuous features:
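A minimal sketch of this check, assuming NumPy and SciPy are available. It uses SciPy's two-sample test `ks_2samp`; the helper name `ks_drift`, the `alpha=0.05` cutoff, and the synthetic data are illustrative assumptions rather than values taken from this skill.

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_drift(reference, current, alpha=0.05):
    """Compare a reference window with a current window for one continuous feature.

    `alpha` is an assumed significance level, not a value from the skill.
    """
    result = ks_2samp(reference, current)
    return {
        "statistic": float(result.statistic),  # max distance between the two ECDFs
        "p_value": float(result.pvalue),
        "drifted": bool(result.pvalue < alpha),
    }


# Synthetic example: the current window has its mean shifted by one standard deviation.
rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=2000)
shifted = rng.normal(loc=1.0, scale=1.0, size=2000)
print(ks_drift(reference, shifted))
```

With large windows the KS test flags even tiny shifts as significant, which is one reason to pair it with a magnitude measure such as PSI.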
Chi-Square Test for categorical features:
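One possible sketch, assuming SciPy is available: build a 2 x K table of category counts (reference row, current row) and pass it to `scipy.stats.chi2_contingency`. The helper name `chi_square_drift` and the `alpha=0.05` cutoff are hypothetical.

```python
from collections import Counter

from scipy.stats import chi2_contingency


def chi_square_drift(reference, current, alpha=0.05):
    """Test whether category frequencies differ between two windows.

    `alpha` is an assumed significance level, not a value from the skill.
    """
    ref_counts, cur_counts = Counter(reference), Counter(current)
    # Use the union of categories so unseen values count as zero in one row.
    categories = sorted(set(ref_counts) | set(cur_counts))
    table = [
        [ref_counts.get(c, 0) for c in categories],
        [cur_counts.get(c, 0) for c in categories],
    ]
    chi2, p_value, dof, _expected = chi2_contingency(table)
    return {
        "chi2": float(chi2),
        "p_value": float(p_value),
        "dof": int(dof),
        "drifted": bool(p_value < alpha),
    }


# Synthetic example: the share of category "c" grows from 20% to 50%.
reference = ["a"] * 500 + ["b"] * 300 + ["c"] * 200
current = ["a"] * 200 + ["b"] * 300 + ["c"] * 500
print(chi_square_drift(reference, current))
```

The chi-square approximation is unreliable when expected counts are very small, so rare categories are often merged into an "other" bucket before testing.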
Population Stability Index (PSI) for magnitude quantification:
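A minimal sketch using the standard definition, PSI = Σ (aᵢ − eᵢ) · ln(aᵢ / eᵢ), where eᵢ and aᵢ are the reference and current fractions in bin i. It assumes NumPy. The quantile binning, the bin count, and the `eps` floor are illustrative choices, and `population_stability_index` is a hypothetical name.

```python
import numpy as np


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two samples of one continuous feature.

    Bins come from reference quantiles (an assumed choice); `eps` floors
    empty bins so the logarithm stays finite.
    """
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Interior edges only, so the first and last bins are open-ended.
    edges = np.quantile(reference, np.linspace(0.0, 1.0, n_bins + 1))[1:-1]
    edges = np.unique(edges)  # drop duplicate edges caused by tied values
    n_cells = len(edges) + 1
    ref_counts = np.bincount(np.searchsorted(edges, reference, side="right"), minlength=n_cells)
    cur_counts = np.bincount(np.searchsorted(edges, current, side="right"), minlength=n_cells)
    ref_frac = np.clip(ref_counts / ref_counts.sum(), eps, None)
    cur_frac = np.clip(cur_counts / cur_counts.sum(), eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))


rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=5000)
shifted = rng.normal(loc=1.0, scale=1.0, size=5000)
print(population_stability_index(reference, shifted))
```

A widely used rule of thumb reads PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as a significant shift. These cutoffs are a convention, not something this skill specifies here.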
Drift aggregation rule (avoids false positives from individual feature noise):
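The exact rule is not shown here, so the following is only one plausible form, stated as an assumption: a feature counts as drifted when both its test p-value and its PSI cross their thresholds, and model-level drift is declared only when a minimum fraction of features drift. The function name and all three thresholds are hypothetical.

```python
def aggregate_drift(feature_results, min_fraction=0.3, psi_threshold=0.25, alpha=0.05):
    """Combine per-feature results into one model-level drift decision.

    feature_results maps feature name -> {"p_value": float, "psi": float}.
    All thresholds are assumed values, not taken from the skill.
    """
    if not feature_results:
        return {"drift": False, "drifted_features": [], "fraction": 0.0}
    # Require both statistical significance and a large magnitude per feature.
    drifted = sorted(
        name
        for name, r in feature_results.items()
        if r["p_value"] < alpha and r["psi"] > psi_threshold
    )
    fraction = len(drifted) / len(feature_results)
    return {
        "drift": fraction >= min_fraction,
        "drifted_features": drifted,
        "fraction": fraction,
    }


results = {
    "age": {"p_value": 0.001, "psi": 0.40},
    "income": {"p_value": 0.60, "psi": 0.02},
    "tenure": {"p_value": 0.01, "psi": 0.05},  # significant but small: ignored
    "region": {"p_value": 0.30, "psi": 0.01},
}
print(aggregate_drift(results))
```

Requiring agreement across several features means a single noisy feature cannot trigger retraining on its own.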
See references/drift-detection.md for complete test implementations, window size selection, and rolling-window monitoring patterns.
1. Drift Detected (KS + PSI thresholds exceeded)
↓
2. Shadow Training Triggered (new data distribution)
↓
3. Challenger Model Evaluated vs. Champion
↓
4. Statistical Significance Test (paired t-test, p < 0.05)
↓
5. Blue-Green Deployment (10% canary → 100% cutover)
↓
6. Champion model archived (24-hour rollback window)
Shadow training runs the full training pipeline on the drifted data distribution without disrupting the production model:
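As a rough sketch of the idea, the registry below keeps serving the champion while a challenger is trained and scored on the same requests. `ModelRegistry`, `train_mean_model`, and `shadow_log` are hypothetical names, and the toy mean predictor stands in for the real training pipeline.

```python
from dataclasses import dataclass, field
from statistics import fmean
from typing import Callable, Optional

Model = Callable[[float], float]


def train_mean_model(xs, ys) -> Model:
    """Toy stand-in for the full training pipeline: always predicts the mean target."""
    mean_y = fmean(ys)
    return lambda x: mean_y


@dataclass
class ModelRegistry:
    champion: Model
    challenger: Optional[Model] = None
    shadow_log: list = field(default_factory=list)

    def train_challenger(self, xs, ys, train_fn=train_mean_model):
        # Train on the recent (drifted) data; the champion is left untouched.
        self.challenger = train_fn(xs, ys)

    def predict(self, x):
        served = self.champion(x)  # only the champion's output reaches callers
        if self.challenger is not None:
            # Record the challenger's prediction for offline comparison.
            self.shadow_log.append((x, served, self.challenger(x)))
        return served


registry = ModelRegistry(champion=lambda x: 1.0)
registry.train_challenger([0.0, 1.0, 2.0], [10.0, 20.0, 30.0])
print(registry.predict(0.5), registry.shadow_log)
```

In a real system the challenger would typically be trained and scored asynchronously so that it adds no latency to the serving path.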
Before any deployment, the challenger must beat the champion on the current data distribution:
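A sketch of this gate, assuming SciPy is available and using the paired t-test at p < 0.05 named in the recovery cycle. It assumes both models are scored on the same evaluation folds and that a higher score is better. `should_promote` is a hypothetical name.

```python
import numpy as np
from scipy.stats import ttest_rel


def should_promote(champion_scores, challenger_scores, alpha=0.05):
    """Promote only if the challenger is better on average and the gain is significant.

    Scores must be paired: index i of both lists comes from the same fold.
    """
    champion_scores = np.asarray(champion_scores, dtype=float)
    challenger_scores = np.asarray(challenger_scores, dtype=float)
    if champion_scores.shape != challenger_scores.shape:
        raise ValueError("scores must be paired, one per shared evaluation fold")
    mean_gain = float(np.mean(challenger_scores - champion_scores))
    if mean_gain <= 0:
        return False  # never promote a challenger that is not better on average
    result = ttest_rel(challenger_scores, champion_scores)
    return bool(result.pvalue < alpha)


champion = [0.80, 0.81, 0.79, 0.82, 0.80]
clear_winner = [0.85, 0.86, 0.84, 0.86, 0.85]
noisy = [0.82, 0.79, 0.80, 0.81, 0.81]
print(should_promote(champion, clear_winner), should_promote(champion, noisy))
```

The test here is two-sided, with the direction enforced by the mean-gain check. A one-sided test would be a reasonable alternative.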
See references/deployment-strategies.md for blue-green, canary, and shadow traffic routing patterns with Kubernetes ingress configurations.
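For illustration only, a canary split can also be expressed at the application level with a deterministic hash, so each request ID always reaches the same model. This is a sketch under that assumption, not the Kubernetes ingress configuration from the reference file. `route` is a hypothetical name, and the 10% default mirrors the canary stage of the recovery cycle.

```python
import hashlib


def route(request_id: str, canary_fraction: float = 0.10) -> str:
    """Return which model serves this request: 'challenger' or 'champion'."""
    if not 0.0 <= canary_fraction <= 1.0:
        raise ValueError("canary_fraction must be between 0 and 1")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "challenger" if bucket < canary_fraction else "champion"


ids = [f"user-{i}" for i in range(10_000)]
share = sum(route(i) == "challenger" for i in ids) / len(ids)
print(f"canary share: {share:.3f}")
```

Raising `canary_fraction` to 1.0 completes the cutover, and setting it back to 0.0 returns all traffic to the champion.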
If post-deployment monitoring detects regression within 24 hours:
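One way to express this check, as a minimal sketch: roll back when the live metric falls more than a tolerated amount below the archived champion's baseline while still inside the 24-hour window. The function name, the `max_relative_drop` tolerance, and the higher-is-better metric are assumptions.

```python
from datetime import datetime, timedelta, timezone

ROLLBACK_WINDOW = timedelta(hours=24)  # matches the 24-hour archive window


def should_rollback(deployed_at, now, baseline_metric, current_metric,
                    max_relative_drop=0.05):
    """Decide whether to restore the archived champion.

    Assumes a higher-is-better metric; max_relative_drop is an assumed tolerance.
    """
    within_window = timedelta(0) <= now - deployed_at <= ROLLBACK_WINDOW
    regressed = current_metric < baseline_metric * (1.0 - max_relative_drop)
    return within_window and regressed


deployed = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(should_rollback(deployed, deployed + timedelta(hours=6), 0.90, 0.80))
```

After the window closes the function returns `False`, so later regressions would go through the normal drift-detection cycle instead.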