.claude/skills/mlops-engineer/SKILL.md
Provides MLOps patterns for ML CI/CD pipelines, model registries, monitoring, and data drift detection. Use when setting up ML infrastructure or when the user mentions MLOps, model deployment, ML pipeline, or model monitoring.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add tranhieutt/software_development_department mlops-engineer
| Need | Tool | When to use |
|---|---|---|
| Experiment tracking | MLflow | Open-source, self-hosted |
| Experiment tracking | W&B | Cloud, rich visualization |
| Pipeline orchestration | Kubeflow | Kubernetes-native |
| Pipeline orchestration | Prefect | Python-first, dynamic |
| Data version control | DVC | Git-based datasets & models |
| Feature store | Feast | Open-source, online+offline |
| Model serving | KServe | K8s serverless inference |
| Model serving | SageMaker Endpoints | AWS managed |
| Monitoring / drift | Evidently | Open-source, alerting |
| CI/CD for ML | GitHub Actions + DVC | Lightweight |

import mlflow
import mlflow.sklearn

mlflow.set_tracking_uri("http://mlflow-server:5000")
mlflow.set_experiment("model-training")

with mlflow.start_run():
    # Log params
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 5)

    # Train (train() and evaluate() are project-specific helpers, not defined here)
    model = train(X_train, y_train)
    metrics = evaluate(model, X_test, y_test)

    # Log metrics
    mlflow.log_metric("accuracy", metrics["accuracy"])
    mlflow.log_metric("f1", metrics["f1"])

    # Log model + register
    mlflow.sklearn.log_model(
        model, "model",
        registered_model_name="fraud-detector",
    )

# Promote to production via API
# (model stages are deprecated in recent MLflow releases in favor of aliases)
client = mlflow.tracking.MlflowClient()
client.transition_model_version_stage(
    name="fraud-detector", version=3, stage="Production"
)
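
The promotion call above hardcodes a version number. As a minimal sketch of the decision that might precede it, the function below compares a candidate's logged metrics with those of the current production model. The name `should_promote`, the `min_gain` margin, and the plain-dict metric records are assumptions for illustration; fetching the records from the registry is left out.

```python
from typing import Optional


def should_promote(
    candidate: dict,
    production: Optional[dict],
    metric: str = "f1",
    min_gain: float = 0.01,
) -> bool:
    """Promote when no production model exists yet, or when the candidate
    beats the production model on `metric` by at least `min_gain`."""
    if production is None:
        return True
    return candidate[metric] - production[metric] >= min_gain
```

Only when this returns `True` would the pipeline go on to call the registry API shown above.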
name: ML Pipeline
on:
  push:
    paths: ["data/**", "src/**", "params.yaml"]

jobs:
  train-and-validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: iterative/setup-dvc@v1

      - name: Pull data
        run: dvc pull

      - name: Run training pipeline
        run: dvc repro

      - name: Validate model metrics
        run: |
          python scripts/check_metrics.py \
            --min-accuracy 0.92 \
            --min-f1 0.88

      - name: Register model if metrics pass
        if: github.ref == 'refs/heads/main'
        run: python scripts/register_model.py
        env:
          MLFLOW_TRACKING_URI: ${{ secrets.MLFLOW_URI }}
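
The workflow calls `scripts/check_metrics.py` without showing it. The following is one possible sketch, not the skill author's script. It assumes the evaluation stage writes a flat JSON object with `accuracy` and `f1` keys to `metrics/eval.json`; the `--metrics-file` flag and the helper name `find_failures` are hypothetical.

```python
import argparse
import json
import sys
from pathlib import Path


def find_failures(metrics: dict, thresholds: dict) -> list:
    """Return one message per metric that is missing or below its threshold."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing from metrics file")
        elif value < minimum:
            failures.append(f"{name}: {value:.4f} < required {minimum:.4f}")
    return failures


def main(argv=None) -> int:
    parser = argparse.ArgumentParser()
    parser.add_argument("--min-accuracy", type=float, required=True)
    parser.add_argument("--min-f1", type=float, required=True)
    # Assumed default: the metrics file produced by the evaluate stage.
    parser.add_argument("--metrics-file", default="metrics/eval.json")
    args = parser.parse_args(argv)

    metrics = json.loads(Path(args.metrics_file).read_text())
    failures = find_failures(
        metrics, {"accuracy": args.min_accuracy, "f1": args.min_f1}
    )
    for message in failures:
        print(f"FAILED {message}", file=sys.stderr)
    # A non-zero return value is meant to become the process exit code.
    return 1 if failures else 0


# As a script, finish with: raise SystemExit(main())
```

Returning a non-zero exit code is what makes the "Validate model metrics" step fail, which in turn stops the later registration step from running.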
import os

import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI

app = FastAPI()

MODEL_NAME = os.environ["MODEL_NAME"]
MODEL_STAGE = os.environ.get("MODEL_STAGE", "Production")

# Load once on startup (cold start cost paid once)
model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME}/{MODEL_STAGE}")


@app.post("/predict")
def predict(features: dict):
    # Plain `def`: FastAPI runs it in a threadpool, so the blocking
    # model.predict() call does not stall the event loop.
    df = pd.DataFrame([features])  # one row per request
    predictions = model.predict(df)
    return {"predictions": predictions.tolist()}


@app.get("/health")
async def health():
    return {"status": "healthy", "model": MODEL_NAME, "stage": MODEL_STAGE}
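
The `/predict` endpoint accepts any `dict`, so a malformed payload only fails once it reaches the model. A minimal sketch of a pre-check is shown below. The feature names in `EXPECTED_FEATURES` are hypothetical placeholders; in practice they would come from the model's signature or the training config.

```python
import math

# Hypothetical feature list, for illustration only.
EXPECTED_FEATURES = ("amount", "account_age_days", "num_transactions_24h")


def validate_features(features: dict, expected=EXPECTED_FEATURES) -> list:
    """Return a list of problems; an empty list means the payload is usable."""
    problems = []
    for name in expected:
        if name not in features:
            problems.append(f"missing feature: {name}")
            continue
        value = features[name]
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"non-numeric feature: {name}")
        elif not math.isfinite(value):
            problems.append(f"non-finite feature: {name}")
    # Report keys the model was not trained on, in a stable order.
    unexpected = sorted(set(features) - set(expected))
    problems.extend(f"unexpected feature: {name}" for name in unexpected)
    return problems
```

Inside `predict`, a non-empty result could be turned into a client error, for example by raising FastAPI's `HTTPException(status_code=422, detail=problems)` before the DataFrame is built.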
import pandas as pd
from evidently.metric_preset import DataDriftPreset
from evidently.report import Report


def check_drift(reference_data: pd.DataFrame, production_data: pd.DataFrame) -> dict:
    # Assumes the Evidently 0.4.x Report API; later releases reorganized these imports.
    report = Report(metrics=[DataDriftPreset()])
    report.run(reference_data=reference_data, current_data=production_data)

    # Index results by metric name: DataDriftPreset emits the dataset-level
    # verdict and the per-column table as separate entries.
    results = {m["metric"]: m["result"] for m in report.as_dict()["metrics"]}
    drift_detected = results["DatasetDriftMetric"]["dataset_drift"]
    drifted_features = [
        column
        for column, stats in results["DataDriftTable"]["drift_by_columns"].items()
        if stats["drift_detected"]
    ]
    return {"drift_detected": drift_detected, "drifted_features": drifted_features}


# Trigger retraining if drift detected
# (ref, prod, and trigger_retraining_pipeline() are placeholders for project code)
if check_drift(ref, prod)["drift_detected"]:
    trigger_retraining_pipeline()
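
To make the idea of a per-column drift score concrete, here is a standalone sketch of the Population Stability Index (PSI) for one numeric feature. It is not what the Evidently preset above computes by default; it only illustrates how a reference sample and a production sample can be reduced to a single number. The bin count and the `1e-6` floor are assumptions.

```python
import math


def population_stability_index(reference, current, bins: int = 10) -> float:
    """PSI between two numeric samples, using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width when all values are equal

    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps log() finite for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))
```

A commonly quoted rule of thumb treats PSI below about 0.1 as stable and above about 0.25 as a significant shift, though the right threshold depends on the feature and the cost of retraining.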
- Use `pip freeze`, not just `requirements.txt`
- Use `params.yaml` (DVC) or `config.yaml` for hyperparameters; never hardcode them in training scripts

stages:
  preprocess:
    cmd: python src/preprocess.py
    deps: [src/preprocess.py, data/raw/]
    outs: [data/processed/]
    params: [preprocess]  # key in params.yaml (DVC's default params file)
  train:
    cmd: python src/train.py
    deps: [src/train.py, data/processed/]
    outs: [models/model.pkl]
    params: [train]  # key in params.yaml
    metrics: [metrics/train.json]
  evaluate:
    cmd: python src/evaluate.py
    deps: [src/evaluate.py, models/model.pkl, data/processed/]
    metrics: [metrics/eval.json]
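
The `evaluate` stage declares `metrics/eval.json` as a metrics file, which means `src/evaluate.py` has to write it. A minimal sketch of that last step follows, assuming binary labels with `1` as the positive class. The function names are hypothetical, and a real script would more likely use a library such as scikit-learn for the metrics.

```python
import json
from pathlib import Path


def binary_metrics(y_true, y_pred) -> dict:
    """Accuracy and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("no predictions to evaluate")
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    denom = 2 * tp + fp + fn
    return {
        "accuracy": correct / len(pairs),
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives.
        "f1": 2 * tp / denom if denom else 0.0,
    }


def write_metrics(metrics: dict, path: str = "metrics/eval.json") -> None:
    """Write metrics as flat JSON so DVC can track and diff them."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(metrics, indent=2))
```

Keeping the file a flat JSON object makes it easy for `dvc metrics show` and for any CI threshold check to read the same values.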