curiositech/data-quality-guardian

Name: data-quality-guardian
Author: curiositech

skills/data-quality-guardian/SKILL.md

npx skillsauth add curiositech/windags-skills data-quality-guardian

Data Quality Guardian

Implement comprehensive data quality frameworks with automated validation, anomaly detection, data contracts, and SLA monitoring.

Activation Triggers

Activate on: "data quality", "data validation", "Great Expectations", "data contract", "anomaly detection", "SLA", "freshness check", "schema validation", "data observability", "Soda", "elementary"

NOT for: dbt project layout → dbt-analytics-engineer | Schema evolution strategy → schema-evolution-manager | Data migration validation → data-migration-specialist

Quick Start

1. Define data contracts: agree on schema, freshness, volume, and value ranges with upstream producers
2. Implement tests: dbt tests for SQL models, Great Expectations for raw/external data
3. Monitor freshness: alert when source tables are stale beyond SLA
4. Detect anomalies: statistical checks for volume, distribution, and null rate changes
5. Build quality dashboard: centralized view of all quality metrics across pipelines
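
As an illustration of the freshness step, here is a minimal sketch of an SLA staleness check. The function name `is_stale` and the example timestamps are hypothetical; in practice the last-load timestamp would come from your warehouse metadata or a loader-maintained column.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(last_loaded_at: datetime, sla: timedelta,
             now: Optional[datetime] = None) -> bool:
    """Return True when the newest load is older than the SLA allows."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at > sla


# Hypothetical example values (all timestamps are timezone-aware UTC).
now = datetime(2026, 4, 4, 12, 0, tzinfo=timezone.utc)
fresh_load = datetime(2026, 4, 4, 11, 0, tzinfo=timezone.utc)
stale_load = datetime(2026, 4, 4, 9, 0, tzinfo=timezone.utc)

print(is_stale(fresh_load, timedelta(hours=2), now))  # False (1 hour old)
print(is_stale(stale_load, timedelta(hours=2), now))  # True (3 hours old)
```

Passing `now` explicitly keeps the check deterministic in tests; a scheduler would normally omit it.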

Core Capabilities

| Domain | Technologies |
|--------|-------------|
| Testing | dbt tests, Great Expectations 1.x, Soda Core 3.x |
| Observability | elementary (dbt), Monte Carlo, Anomalo |
| Contracts | Soda data contracts, dbt model contracts, Protobuf schemas |
| Anomaly Detection | elementary anomaly monitors, custom z-score, Prophet |
| Alerting | Slack/PagerDuty integration, SLA miss alerts |

Architecture Patterns

Multi-Layer Quality Checks

Raw Data (Landing)
    ↓
Layer 1: Schema Validation
    - Column types match contract
    - No unexpected NULLs in required fields
    - Row count within expected range
    ↓
Layer 2: Business Rule Validation
    - Referential integrity (FK relationships hold)
    - accepted_values constraints
    - Custom SQL assertions (e.g., revenue >= 0)
    ↓
Layer 3: Statistical Anomaly Detection
    - Row count deviation from 7-day rolling average
    - Null rate spike detection
    - Distribution shift (KL divergence)
    ↓
Layer 4: Freshness & SLA Monitoring
    - Source loaded within 2 hours
    - Downstream models built within 4 hours
    - Dashboard data <6 hours old
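
The distribution-shift check in Layer 3 can be sketched with a small KL divergence function over categorical values. This is a hedged illustration, not a prescribed implementation: the function name, the additive smoothing constant, and the sample data are assumptions, and numeric columns would first need to be binned.

```python
import math
from collections import Counter


def kl_divergence(baseline, current, smoothing=1e-9):
    """KL(current || baseline) over categorical values.

    Additive smoothing avoids division by zero when a category
    appears in only one of the two samples.
    """
    categories = set(baseline) | set(current)
    b_counts, c_counts = Counter(baseline), Counter(current)
    b_total = len(baseline) + smoothing * len(categories)
    c_total = len(current) + smoothing * len(categories)
    divergence = 0.0
    for category in categories:
        p = (c_counts[category] + smoothing) / c_total
        q = (b_counts[category] + smoothing) / b_total
        divergence += p * math.log(p / q)
    return divergence


# Hypothetical samples: the failure share jumps from 10% to 50%.
baseline = ["succeeded"] * 90 + ["failed"] * 10
shifted = ["succeeded"] * 50 + ["failed"] * 50

print(kl_divergence(baseline, baseline))  # identical samples: 0.0
print(kl_divergence(baseline, shifted))   # clearly positive
```

The alerting threshold on the divergence is a tuning decision; comparing against the divergence observed between recent "normal" days is one reasonable starting point.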

Data Contract YAML

# contracts/payments-contract.yml
contract:
  name: stripe_payments
  owner: payments-team
  version: "2.0"
  sla:
    freshness: 2h          # must be updated within 2 hours
    volume_min: 1000        # at least 1000 rows per day
    volume_max: 500000      # no more than 500k (anomaly if exceeded)
  schema:
    - name: payment_id
      type: string
      required: true
      unique: true
    - name: amount_cents
      type: integer
      required: true
      checks:
        - type: range
          min: 0
          max: 10000000     # $100k max
    - name: status
      type: string
      required: true
      checks:
        - type: accepted_values
          values: [succeeded, failed, pending, refunded]
    - name: created_at
      type: timestamp
      required: true
      checks:
        - type: not_in_future
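
To show how the schema section of such a contract might be enforced, here is a minimal row-level validator. It is a sketch under stated assumptions: the contract is mirrored as a plain Python structure so no YAML parser is needed, `validate_row` is a hypothetical helper, and the `unique` constraint is omitted because it requires the whole dataset rather than a single row.

```python
from datetime import datetime, timezone

# Mirrors the schema section of the contract above (assumption: types
# are mapped to Python classes by hand rather than parsed from YAML).
SCHEMA = [
    {"name": "payment_id", "type": str, "required": True},
    {"name": "amount_cents", "type": int, "required": True,
     "checks": [{"type": "range", "min": 0, "max": 10_000_000}]},
    {"name": "status", "type": str, "required": True,
     "checks": [{"type": "accepted_values",
                 "values": ["succeeded", "failed", "pending", "refunded"]}]},
    {"name": "created_at", "type": datetime, "required": True,
     "checks": [{"type": "not_in_future"}]},
]


def validate_row(row, schema=SCHEMA, now=None):
    """Return a list of human-readable contract violations for one row."""
    now = now or datetime.now(timezone.utc)
    violations = []
    for col in schema:
        name, value = col["name"], row.get(col["name"])
        if value is None:
            if col.get("required"):
                violations.append(f"{name}: required value is missing")
            continue
        if not isinstance(value, col["type"]):
            violations.append(f"{name}: expected {col['type'].__name__}")
            continue
        for check in col.get("checks", []):
            kind = check["type"]
            if kind == "range" and not check["min"] <= value <= check["max"]:
                violations.append(f"{name}: {value} is out of range")
            elif kind == "accepted_values" and value not in check["values"]:
                violations.append(f"{name}: {value!r} is not an accepted value")
            elif kind == "not_in_future" and value > now:
                violations.append(f"{name}: timestamp is in the future")
    return violations


# Hypothetical rows for illustration.
NOW = datetime(2026, 4, 4, 12, 0, tzinfo=timezone.utc)
good = {"payment_id": "pay_1", "amount_cents": 2500, "status": "succeeded",
        "created_at": datetime(2026, 4, 4, 11, 0, tzinfo=timezone.utc)}
bad = dict(good, amount_cents=-5, status="chargeback")

print(validate_row(good, now=NOW))  # []
print(validate_row(bad, now=NOW))   # two violations
```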

Anomaly Detection with elementary

# dbt schema file attaching elementary anomaly tests to a model.
# The filter below uses Snowflake-style SQL; adjust it for other warehouses.
version: 2
models:
  - name: fct_orders
    tests:
      - elementary.volume_anomalies:
          timestamp_column: created_at
          where_expression: "created_at > dateadd(day, -30, current_date())"
          anomaly_sensitivity: 3    # z-score threshold
      - elementary.column_anomalies:
          column_name: total_amount
          where_expression: "created_at > dateadd(day, -30, current_date())"
      - elementary.freshness_anomalies:
          timestamp_column: _loaded_at
          anomaly_sensitivity: 2
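
The configuration above describes its sensitivity value as a z-score threshold. The following sketch shows what such a threshold means for a custom row-count check. It is not elementary's implementation; the function name and the sample history are hypothetical.

```python
from statistics import mean, stdev


def is_volume_anomaly(history, today, sensitivity=3.0):
    """Flag today's row count when it lies more than `sensitivity`
    standard deviations from the mean of the historical counts."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs >= 2 points
    if sigma == 0:
        # A perfectly flat history: any change at all is unusual.
        return today != mu
    return abs(today - mu) / sigma > sensitivity


# Hypothetical daily row counts for the previous 7 days.
history = [1000, 1040, 980, 1010, 995, 1025, 990]

print(is_volume_anomaly(history, 1015))  # False: within normal variation
print(is_volume_anomaly(history, 400))   # True: far below the usual range
```

A lower sensitivity flags smaller deviations, which catches more problems at the cost of more false alarms.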

Anti-Patterns

Testing only in production — run quality checks in CI on sample data before merging; do not discover issues in production
Alert fatigue — too many low-severity alerts get ignored; tier alerts: P1 (data loss), P2 (quality regression), P3 (informational)
No contract with upstream — without an agreed contract, schema changes break pipelines silently
Static thresholds only — hardcoded "row count > 1000" breaks during holidays; use statistical anomaly detection
Quality checks at the end — validate at each layer (landing, staging, marts); catching errors early is cheaper
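
The alert tiers mentioned above can be sketched as a small classification and routing table. Everything here is illustrative: the two boolean inputs, the rule that only P1 pages someone, and the channel names are assumptions rather than part of the skill.

```python
from enum import Enum


class Severity(Enum):
    P1 = "data loss"
    P2 = "quality regression"
    P3 = "informational"


# Hypothetical routing: only P1 interrupts a human immediately.
ROUTES = {
    Severity.P1: "pagerduty",
    Severity.P2: "slack-channel",
    Severity.P3: "daily-digest",
}


def classify(rows_missing: bool, check_failed: bool) -> Severity:
    """Map a check outcome to a tier; data loss outranks everything else."""
    if rows_missing:
        return Severity.P1
    if check_failed:
        return Severity.P2
    return Severity.P3


def route(severity: Severity) -> str:
    """Return the destination for an alert of the given severity."""
    return ROUTES[severity]


print(route(classify(rows_missing=True, check_failed=True)))    # pagerduty
print(route(classify(rows_missing=False, check_failed=True)))   # slack-channel
print(route(classify(rows_missing=False, check_failed=False)))  # daily-digest
```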

Quality Checklist

[ ] Data contracts defined for all critical sources (schema, freshness, volume)
[ ] dbt tests on every model: unique, not_null, accepted_values, relationships
[ ] Freshness monitoring with SLA thresholds and alerting
[ ] Anomaly detection on row counts, null rates, and value distributions
[ ] Quality dashboard shows current status across all pipelines
[ ] Alert severity tiered (P1/P2/P3) to prevent fatigue
[ ] Quality checks run in CI (pre-merge) and production (post-load)
[ ] Root cause analysis supported via data lineage integration
[ ] Monthly data quality review with trending metrics
[ ] Upstream teams notified within 15 minutes of contract violations

Great Expectations, dbt tests, anomaly detection, and data contracts for data quality. Activate on: data quality, data validation, Great Expectations, data contract, anomaly detection, SLA, freshness check, schema validation. NOT for: dbt model structure (use dbt-analytics-engineer), schema evolution (use schema-evolution-manager).

testing

Updated Apr 4, 2026

Install this skill globally with one command:

npx skillsauth add curiositech/windags-skills data-quality-guardian

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 4, 2026, 2:12 PM. Scan duration: 141.7s. 1 file scanned.

SKILL.md

license: Apache-2.0
name: data-quality-guardian
description: Great Expectations, dbt tests, anomaly detection, and data contracts for data quality. Activate on: data quality, data validation, Great Expectations, data contract, anomaly detection, SLA, freshness check, schema validation. NOT for: dbt model structure (use dbt-analytics-engineer), schema evolution (use schema-evolution-manager).
allowed-tools: Read,Write,Edit,Bash(npm:*,npx:*,python:*,dbt:*)
category: Data & Analytics
- skill: airflow-dag-orchestrator
  reason: Quality checks run as orchestrated pipeline steps

Install manually


# Clone the repo
git clone https://github.com/curiositech/windags-skills.git

# Copy into Claude Code skills folder (global)
cp -r windags-skills/skills/data-quality-guardian ~/.claude/skills/


Repository: curiositech/windags-skills

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT