i18n/de/skills/monitor-model-drift/SKILL.md
Implement comprehensive model drift monitoring using Evidently AI, statistical tests (PSI, KS), and custom metrics to detect data drift and concept drift in production ML systems. Set up automated alerting and reporting workflows to catch degradation before it impacts business metrics. Use when production models show unexplained performance degradation, when new data distributions differ from training data, when seasonal shifts affect input features, or when regulatory requirements mandate model monitoring.
See Extended Examples for complete configuration files and templates.
Detect and alert on data drift and concept drift in production ML models using statistical tests and automated monitoring.
Set up the monitoring framework with the appropriate dependencies.
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install Evidently and dependencies
pip install evidently pandas scikit-learn prometheus-client
# Create monitoring directory structure
mkdir -p monitoring/{reports,config,alerts,logs}
Create the configuration file:
# monitoring/config/drift_config.py
from evidently.metric_preset import DataDriftPreset, TargetDriftPreset
from evidently.metrics import (
DatasetDriftMetric,
DatasetMissingValuesMetric,
ColumnDriftMetric,
)
# ... (see EXAMPLES.md for complete implementation)
Expected: Configuration file created with thresholds matching your model's tolerance.
On failure: Start with conservative thresholds (PSI > 0.2, KS p-value < 0.01) and tune them based on the false positive rate.
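To make the PSI threshold concrete, here is a minimal standard-library sketch of the Population Stability Index for one numeric feature. It is not Evidently's implementation: the function name `psi`, the `bins` and `eps` parameters, and the choice of reference-quantile bin edges are all assumptions made for illustration.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples (illustrative)."""
    ref_sorted = sorted(reference)
    # Interior bin edges at reference quantiles, so each bin holds roughly
    # the same share of the reference data.
    edges = [ref_sorted[int(len(ref_sorted) * i / bins)] for i in range(1, bins)]

    def shares(values):
        counts = [0] * bins
        for v in values:
            # The bin index is the number of edges at or below the value.
            idx = sum(1 for e in edges if v >= e)
            counts[idx] += 1
        total = len(values)
        # eps keeps empty bins from producing log(0) or a division by zero.
        return [max(c / total, eps) for c in counts]

    ref_share = shares(reference)
    cur_share = shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_share, cur_share))


if __name__ == "__main__":
    reference = list(range(1000))
    shifted = [x + 500 for x in reference]
    score = psi(reference, shifted)
    # 0.2 is the conservative starting threshold suggested above.
    print(f"PSI={score:.3f} drifted={score > 0.2}")
```

Quantile-based edges keep the reference shares balanced, which makes the score less sensitive to sparsely populated bins than equal-width binning.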
Create a drift detection pipeline with multiple statistical tests.
# monitoring/drift_detector.py
import pandas as pd
import numpy as np
from scipy.stats import ks_2samp, chi2_contingency
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset
from evidently.metrics import ColumnDriftMetric, DatasetDriftMetric
from datetime import datetime, timedelta
# ... (see EXAMPLES.md for complete implementation)
Expected: Drift detection runs successfully, produces a JSON report with per-feature statistics, and identifies drifted features.
On failure: Check for missing values (impute or drop), ensure reference and current data have the same columns, and verify that data types match between datasets.
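The per-feature KS check can be sketched without third-party packages to show what the test computes. The pipeline above imports `scipy.stats.ks_2samp`, which is the better choice in practice; the version below is a hand-written stand-in that uses the asymptotic Kolmogorov approximation for the p-value, and every function name in it (`ks_statistic`, `ks_pvalue`, `drifted_features`) is hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs."""
    ref_sorted, cur_sorted = sorted(reference), sorted(current)
    n, m = len(ref_sorted), len(cur_sorted)
    d = 0.0
    # Both ECDFs are step functions, so the gap only changes at sample points.
    for x in ref_sorted + cur_sorted:
        f_ref = bisect_right(ref_sorted, x) / n
        f_cur = bisect_right(cur_sorted, x) / m
        d = max(d, abs(f_ref - f_cur))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is effectively 1.
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(max(2.0 * total, 0.0), 1.0)


def drifted_features(reference, current, alpha=0.01):
    """reference/current: dicts mapping feature name -> list of numbers.

    A feature missing from `current` raises KeyError, which surfaces the
    "same columns" requirement early.
    """
    report = {}
    for name, ref_values in reference.items():
        cur_values = current[name]
        d = ks_statistic(ref_values, cur_values)
        p = ks_pvalue(d, len(ref_values), len(cur_values))
        report[name] = {"ks_statistic": d, "p_value": p, "drifted": p < alpha}
    return report
```

The returned dictionary is JSON-serializable, so it can be written out with `json.dumps` as the per-feature part of a drift report.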
Create visual HTML reports for human review and debugging.
# monitoring/generate_reports.py
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset, TargetDriftPreset
from evidently.metrics import (
ColumnDriftMetric,
DatasetDriftMetric,
DatasetMissingValuesMetric,
)
# ... (see EXAMPLES.md for complete implementation)
Expected: HTML reports generated in monitoring/reports/, viewable in a browser with interactive charts showing distribution comparisons.
On failure: Verify write permissions on the output directory, check that the Evidently version is >= 0.4.0, and ensure the data frames have sufficient rows (>100 recommended).
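The three failure checks above can be run before report generation. The sketch below is one possible pre-flight helper using only the standard library; `preflight` and `parse_version` are hypothetical names, and the simple version parser only handles plain dotted versions with an optional suffix on the last part.

```python
import os
from importlib import metadata
from itertools import takewhile


def parse_version(text):
    """Turn '0.4.11' or '0.4.0rc1' into a tuple of ints such as (0, 4, 11)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def preflight(output_dir, n_rows, min_rows=100, min_version=(0, 4, 0),
              package="evidently"):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not os.path.isdir(output_dir) or not os.access(output_dir, os.W_OK):
        problems.append(f"{output_dir} is missing or not writable")
    if n_rows <= min_rows:
        problems.append(f"only {n_rows} rows; more than {min_rows} recommended")
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        problems.append(f"{package} is not installed")
    else:
        if parse_version(installed) < min_version:
            problems.append(f"{package} {installed} is older than required")
    return problems
```

Returning a list rather than raising on the first problem lets a single run report every issue at once.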
Monitor prediction performance to detect concept drift (a change in the relationship between features and target).
# monitoring/concept_drift.py
import pandas as pd
import numpy as np
from sklearn.metrics import roc_auc_score, mean_squared_error, accuracy_score
from typing import Dict, List
import json
# ... (see EXAMPLES.md for complete implementation)
Expected: Performance monitoring detects when model accuracy/AUC drops below the threshold, signaling potential concept drift.
On failure: Ensure ground truth labels are available (may require a delayed validation batch job), verify prediction scores are properly calibrated (0-1 range for classification), and check for label leakage in features.
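The threshold check at the core of this step might look like the sketch below. It computes plain accuracy by hand to stay dependency-free; in the module above, `accuracy_score` or `roc_auc_score` from scikit-learn would take its place. The names `check_concept_drift` and `max_drop`, and the 0.05 default, are illustrative assumptions rather than values from the source.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def check_concept_drift(baseline_accuracy, y_true, y_pred, max_drop=0.05):
    """Compare accuracy on a recent labeled batch against a baseline."""
    current = accuracy(y_true, y_pred)
    drop = baseline_accuracy - current
    return {
        "baseline_accuracy": baseline_accuracy,
        "current_accuracy": current,
        "drop": drop,
        # Flag only when the drop exceeds the tolerated margin.
        "concept_drift_suspected": drop > max_drop,
    }


if __name__ == "__main__":
    labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    preds = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]
    print(check_concept_drift(0.90, labels, preds))
```

Because ground truth often arrives late, a check like this typically runs on the most recent batch whose labels are complete rather than on live traffic.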
Integrate drift detection with alerting systems (Slack, PagerDuty, email).
# monitoring/alerting.py
import requests
import json
from typing import Dict, List
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# ... (see EXAMPLES.md for complete implementation)
Expected: Alerts sent to Slack/PagerDuty when drift is detected, with severity based on drift share and critical feature involvement.
On failure: Test webhook URLs with curl first, verify the PagerDuty integration key has the correct permissions, check firewall rules for outbound HTTPS, and implement retry logic for transient network failures.
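Severity classification and retry logic can be sketched independently of any particular alerting backend. In the example below the sender is passed in as a callable, so no webhook URL or network access is assumed. The function names and the 0.3 / 0.5 drift-share cutoffs are hypothetical and should be tuned to your model.

```python
import time


def classify_severity(drift_share, drifted_features, critical_features,
                      warn_share=0.3, critical_share=0.5):
    """Map drift share and critical-feature involvement to a severity label."""
    if drift_share >= critical_share or set(drifted_features) & set(critical_features):
        return "critical"
    if drift_share >= warn_share:
        return "warning"
    return "info"


def send_with_retry(send, payload, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send(payload), retrying with exponential backoff on OSError.

    OSError covers the built-in ConnectionError and TimeoutError; the last
    failure is re-raised so the caller can log it.
    """
    for attempt in range(1, attempts + 1):
        try:
            return send(payload)
        except OSError:
            if attempt == attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Injecting `send` and `sleep` keeps the retry path testable without real network calls or real waiting.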
Automate drift detection to run on a schedule (daily or weekly).
# monitoring/scheduler.py
import schedule
import time
import logging
from datetime import datetime, timedelta
import pandas as pd
logging.basicConfig(
# ... (see EXAMPLES.md for complete implementation)
Alternatively, use cron:
# Add to crontab (crontab -e)
# Run daily at 2 AM
0 2 * * * cd /path/to/monitoring && /path/to/venv/bin/python scheduler.py >> logs/cron.log 2>&1
Or use an Airflow DAG:
# airflow/dags/drift_monitoring_dag.py
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime, timedelta
default_args = {
'owner': 'ml-team',
'depends_on_past': False,
# ... (see EXAMPLES.md for complete implementation)
Expected: Monitoring runs automatically on schedule, generates reports, sends alerts only when drift exceeds thresholds, and logs all activity.
On failure: Check that the scheduler process is running (ps aux | grep scheduler), verify the cron service is active, ensure data sources are accessible, review logs for exceptions, and set up a dead man's switch alert if the job doesn't run.
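One simple way to build the dead man's switch mentioned above is a heartbeat file: the monitoring job writes a timestamp after each successful run, and a separate check raises an alert when that timestamp is too old. The sketch below assumes this file-based approach; the function names and the heartbeat path are hypothetical.

```python
import time
from pathlib import Path


def record_heartbeat(path, now=None):
    """Write the current Unix time; call this at the end of a successful run."""
    Path(path).write_text(str(time.time() if now is None else now))


def heartbeat_is_stale(path, max_age_seconds, now=None):
    """Return True if the heartbeat is missing, unreadable, or too old."""
    now = time.time() if now is None else now
    try:
        last = float(Path(path).read_text().strip())
    except (FileNotFoundError, ValueError):
        return True
    return now - last > max_age_seconds


if __name__ == "__main__":
    # For a daily job, allow some slack beyond 24 hours before alerting.
    if heartbeat_is_stale("heartbeat.txt", max_age_seconds=26 * 3600):
        print("drift monitoring has not reported recently")
```

The staleness check should run from a different scheduler or host than the job itself, otherwise a single outage silences both.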
- detect-anomalies-aiops - Time series anomaly detection for operational metrics
- deploy-ml-model-serving - Model deployment patterns and versioning
- setup-prometheus-monitoring - Infrastructure metrics collection
- review-data-analysis - Statistical analysis validation and peer review