engineering/advanced-ml-engineering/skills/mlops-pipeline/SKILL.md
This skill should be used when the user asks about "MLOps", "ML pipeline", "data pipeline", "feature engineering", "feature store", "data preprocessing", "model deployment", "model serving", "model registry", "experiment tracking", "MLflow", "Weights and Biases", "model versioning", "CI/CD for ML", "model monitoring", "data quality", "schema validation", "reproducibility", "technical debt in ML", or when operationalizing a machine learning model for production.
Install this skill globally with one command: `npx skillsauth add harsh040506/claude-code-unified-skill-plugin-library mlops-pipeline`. Works with Claude Code, Cursor, and Windsurf.
Provides systematic guidance for operationalizing machine learning models, from data ingestion and feature engineering through experiment tracking, deployment, and production monitoring. Directly addresses the problems described in "Hidden Technical Debt in Machine Learning Systems" (Sculley et al.) by making the surrounding infrastructure as rigorous as the model itself.
Data Sources → Feature Store → Training Pipeline → Model Registry → Serving Infrastructure → Monitoring
Each stage must be reproducible, versioned, and auditable.
Schema inference is the first gate:
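A minimal sketch of such a gate, using only the Python standard library. The function names `infer_schema` and `schema_violations`, and the example columns, are illustrative assumptions rather than part of this skill; a production pipeline would more likely rely on a dedicated validation library.

```python
from typing import Any


def infer_schema(rows: list[dict[str, Any]]) -> dict[str, str]:
    """Infer a column -> type-name mapping from sample rows."""
    schema: dict[str, str] = {}
    for row in rows:
        for column, value in row.items():
            if value is None:
                continue  # nulls carry no type information
            type_name = type(value).__name__
            previous = schema.setdefault(column, type_name)
            if previous != type_name:
                schema[column] = "mixed"  # conflicting types in one column
    return schema


def schema_violations(expected: dict[str, str], actual: dict[str, str]) -> list[str]:
    """List every way the inferred schema departs from the expected one."""
    problems = []
    for column, type_name in expected.items():
        if column not in actual:
            # Also triggered when a column contains only nulls.
            problems.append(f"missing column: {column}")
        elif actual[column] != type_name:
            problems.append(f"{column}: expected {type_name}, got {actual[column]}")
    for column in sorted(actual.keys() - expected.keys()):
        problems.append(f"unexpected column: {column}")
    return problems
```

A pipeline would call `schema_violations` on each incoming batch and stop before feature engineering if the returned list is non-empty.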
Outlier detection (run before any feature engineering):
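One common way to do this is the interquartile-range (IQR) rule, sketched below with the standard library. The source does not name a specific method, so the IQR rule, the function name `iqr_outlier_flags`, and the multiplier `k = 1.5` are assumptions chosen for illustration.

```python
import statistics


def iqr_outlier_flags(values: list[float], k: float = 1.5) -> list[bool]:
    """Flag values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


readings = [10, 12, 11, 13, 12, 11, 95]
flags = iqr_outlier_flags(readings)  # only the last reading is flagged
```

Flagging rather than dropping keeps the decision (remove, clip, or keep) explicit and auditable in later stages.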
See references/data-engineering.md for full data quality checks, validation schemas, and data contract patterns.
Normalization (always log transformation params for inference parity):
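The sketch below illustrates the point about logging parameters: the statistics are fitted once on training data, serialized, and reloaded at inference time instead of being recomputed. Z-score standardization and the helper names `fit_standardizer` and `standardize` are assumptions for illustration.

```python
import json
import statistics


def fit_standardizer(values: list[float]) -> dict[str, float]:
    """Compute z-score parameters from training data only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Guard against constant features, which would divide by zero.
    return {"mean": mean, "std": std if std > 0 else 1.0}


def standardize(values: list[float], params: dict[str, float]) -> list[float]:
    """Apply previously fitted parameters; never refit at inference time."""
    return [(v - params["mean"]) / params["std"] for v in values]


train = [2.0, 4.0, 6.0, 8.0]
params = fit_standardizer(train)

# Log the parameters with the run so serving can reload the exact same values.
logged = json.dumps(params)
restored = json.loads(logged)
served = standardize([5.0], restored)
```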
Encoding:
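The source does not say which encodings it recommends, so the following is only one plausible sketch: one-hot encoding with a vocabulary fitted on training data and an extra slot for categories first seen at inference time. The names `fit_vocabulary` and `one_hot` are hypothetical.

```python
def fit_vocabulary(values: list[str]) -> dict[str, int]:
    """Map each training category to a stable index (sorted for determinism)."""
    return {category: index for index, category in enumerate(sorted(set(values)))}


def one_hot(value: str, vocabulary: dict[str, int]) -> list[int]:
    """Encode a category; the last slot is reserved for unseen categories."""
    vector = [0] * (len(vocabulary) + 1)
    vector[vocabulary.get(value, len(vocabulary))] = 1
    return vector


vocabulary = fit_vocabulary(["red", "green", "red", "blue"])
known = one_hot("green", vocabulary)
unseen = one_hot("purple", vocabulary)
```

As with normalization parameters, the fitted vocabulary should be stored with the model so that training and serving agree on the index of every category.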
Temporal features: extract as cyclical sin/cos pairs to preserve periodicity:
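A minimal sketch of the sin/cos encoding, assuming hour-of-day and day-of-week as the periodic features; the function and key names are illustrative.

```python
import math
from datetime import datetime


def cyclical(value: float, period: float) -> tuple[float, float]:
    """Map a periodic value onto the unit circle as a (sin, cos) pair."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def time_features(ts: datetime) -> dict[str, float]:
    """Encode hour of day (period 24) and day of week (period 7)."""
    hour_sin, hour_cos = cyclical(ts.hour, 24)
    dow_sin, dow_cos = cyclical(ts.weekday(), 7)
    return {
        "hour_sin": hour_sin,
        "hour_cos": hour_cos,
        "dow_sin": dow_sin,
        "dow_cos": dow_cos,
    }


features = time_features(datetime(2024, 1, 1, 6, 0))  # a Monday at 06:00
```

With this encoding, 23:00 and 00:00 are as close to each other as 00:00 and 01:00, which a raw integer hour would not capture.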
See references/feature-stores.md for feature store architecture, point-in-time correct joins, and online/offline serving patterns.
Every training run must record:

| Artifact | Purpose |
|---|---|
| Git commit SHA | Code reproducibility |
| Dataset hash (MD5/SHA256) | Data reproducibility |
| Full hyperparameter config | Experiment reproducibility |
| Random seed | Run reproducibility |
| Environment (Python + library versions) | Dependency reproducibility |
Use MLflow or Weights & Biases for automatic artifact logging.
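Independently of the tracking tool, the required artifacts can be collected into a single manifest. The sketch below uses only the standard library; `build_run_manifest` and its helpers are hypothetical names, and it assumes training is launched from inside a Git checkout (otherwise the commit is recorded as "unknown").

```python
import hashlib
import platform
import subprocess
from importlib import metadata


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a dataset file in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def git_commit_sha() -> str:
    """Return the current commit, or 'unknown' outside a Git checkout."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def library_versions(names) -> dict[str, str]:
    """Look up installed versions of the libraries the run depends on."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def build_run_manifest(dataset_path, hyperparameters, seed, libraries=()):
    """Collect every artifact from the table above into one dictionary."""
    return {
        "git_commit": git_commit_sha(),
        "dataset_sha256": file_sha256(dataset_path),
        "hyperparameters": hyperparameters,
        "random_seed": seed,
        "environment": {
            "python": platform.python_version(),
            "libraries": library_versions(libraries),
        },
    }
```

The resulting dictionary is JSON-serializable (as long as the hyperparameters are), so it can be attached to a run in whichever tracker is in use.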
- Dev: Local FastAPI endpoint for integration testing
- Staging: Docker container → Kubernetes (namespace: staging) + smoke tests
- Production: Blue-green or canary deployment (see self-healing-models skill)
Model serialization formats:
- model.pt: PyTorch, for fine-tuning and retraining
- model.onnx: runtime-agnostic, for cross-platform serving
- model.pkl: Scikit-learn pipeline including preprocessing steps

Always include the preprocessing pipeline in the serialized model artifact to prevent training-serving skew.
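To illustrate why bundling matters, the sketch below stores the preprocessing parameters and the model weights in one pickled artifact, so serving cannot apply a different transformation than training did. It is a toy linear model built from plain dictionaries; `build_artifact` and `predict` are hypothetical names. With scikit-learn, the equivalent would be serializing a fitted `Pipeline` object rather than the bare estimator.

```python
import pickle


def build_artifact(train_features, weights, bias):
    """Bundle fitted preprocessing parameters together with model weights."""
    n = len(train_features)
    means = [sum(column) / n for column in zip(*train_features)]
    return {"preprocessing": {"means": means}, "weights": weights, "bias": bias}


def predict(artifact, features):
    """Apply the stored preprocessing first, then the linear model."""
    means = artifact["preprocessing"]["means"]
    centered = [x - m for x, m in zip(features, means)]
    return sum(w * x for w, x in zip(artifact["weights"], centered)) + artifact["bias"]


artifact = build_artifact([[1.0, 10.0], [3.0, 30.0]], weights=[0.5, 0.1], bias=1.0)

# Round-trip through pickle, as a serving process would after loading model.pkl.
restored = pickle.loads(pickle.dumps(artifact))
prediction = predict(restored, [4.0, 30.0])
```

Because the feature means travel inside the artifact, the serving code has no separate preprocessing configuration that could drift from the one used in training.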
See references/monitoring.md for production monitoring setup, alerting rules, and dashboard templates.
Before promoting any model to production: