curiositech/airflow-dag-orchestrator

Name: airflow-dag-orchestrator
Author: curiositech

skills/airflow-dag-orchestrator/SKILL.md

npx skillsauth add curiositech/windags-skills airflow-dag-orchestrator

Airflow DAG Orchestrator

Design and operate Apache Airflow DAGs for reliable data pipeline orchestration with proper dependency management, SLAs, and monitoring.

Activation Triggers

Activate on: "Airflow", "DAG", "operator", "sensor", "scheduler", "task dependency", "SLA", "backfill", "XCom", "TaskFlow API", "MWAA", "Cloud Composer"

NOT for: dbt model execution → dbt-analytics-engineer (though Airflow can trigger dbt) | Stream processing → streaming-pipeline-architect | Workflow engine (Temporal) → distributed-transaction-manager

Quick Start

1. Define DAG: use the TaskFlow API (@dag and @task decorators) for Python-native DAGs
2. Set schedule: cron expression or timetable, with catchup=False unless backfill is intentional
3. Configure retries: retries=2, retry_delay=timedelta(minutes=5) on every task
4. Add SLAs: sla=timedelta(hours=2) on critical path tasks (Airflow 2.x; the sla parameter was removed in Airflow 3.0)
5. Test locally: run airflow dags test my_dag 2026-01-01 before deploying
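
To make the retry settings concrete: retries=2 allows up to three attempts in total (the first try plus two retries), with retry_delay between them. The sketch below imitates that behaviour in plain Python. It is not Airflow's implementation, and run_with_retries and flaky_extract are hypothetical names used only for illustration.

```python
import time
from datetime import timedelta


def run_with_retries(fn, retries=2, retry_delay=timedelta(seconds=0)):
    """Call fn, retrying up to `retries` times after the first failure."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return fn(), attempts
        except Exception:
            if attempts > retries:
                raise  # retries exhausted: the task would be marked failed
            time.sleep(retry_delay.total_seconds())


calls = {"n": 0}


def flaky_extract():
    """Hypothetical task body that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result, attempts = run_with_retries(flaky_extract, retries=2)
print(result, attempts)
```

With retries=1 the same function would raise on the second attempt, which is why the retry count should be sized to the kind of transient failure you expect.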

Core Capabilities

| Domain | Technologies |
|--------|-------------|
| Airflow | Apache Airflow 2.10+, MWAA, Cloud Composer 3 |
| Operators | BashOperator, PythonOperator, KubernetesPodOperator |
| Providers | apache-airflow-providers-{snowflake, google, aws, dbt-cloud} |
| Executors | CeleryExecutor, KubernetesExecutor, LocalExecutor |
| Monitoring | SLA misses, task duration, Airflow metrics → Prometheus |

Architecture Patterns

TaskFlow API DAG

from datetime import datetime, timedelta

from airflow.decorators import dag, task

# NOTE: stripe_client, shopify_client, today(), run_duckdb_transform() and
# snowflake_copy_into() are placeholders for project-specific helpers; they
# are not defined in this example.

@dag(
    schedule="0 6 * * *",          # daily at 6am UTC
    start_date=datetime(2026, 1, 1),
    catchup=False,
    default_args={
        "retries": 2,
        "retry_delay": timedelta(minutes=5),
        "sla": timedelta(hours=2),  # Airflow 2.x only
    },
    tags=["finance", "daily"],
)
def daily_revenue_pipeline():

    @task()
    def extract_payments() -> dict:
        """Extract from Stripe API"""
        data = stripe_client.list_payments(date=today())
        # Return only metadata; the payload itself lives in S3
        return {"count": len(data), "path": "s3://raw/payments/"}

    @task()
    def extract_orders() -> dict:
        """Extract from Shopify API"""
        data = shopify_client.list_orders(date=today())
        return {"count": len(data), "path": "s3://raw/orders/"}

    @task()
    def transform(payments: dict, orders: dict) -> str:
        """Join and transform in DuckDB"""
        result_path = run_duckdb_transform(payments["path"], orders["path"])
        return result_path

    @task()
    def load(path: str):
        """Load to Snowflake"""
        snowflake_copy_into("fct_revenue", path)

    # Define dependencies via function calls
    payments = extract_payments()
    orders = extract_orders()
    transformed = transform(payments, orders)
    load(transformed)

# Calling the decorated function registers the DAG
daily_revenue_pipeline()
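
The function calls at the end of the DAG body imply a dependency graph: both extracts feed transform, which feeds load. As a rough illustration of the ordering the scheduler has to respect, the sketch below encodes the same edges with Python's standard-library graphlib. This is an analogy under that assumption, not how Airflow stores or schedules the graph.

```python
from graphlib import TopologicalSorter

# Map each task to the tasks it depends on (same edges as the DAG above).
dependencies = {
    "extract_payments": set(),
    "extract_orders": set(),
    "transform": {"extract_payments", "extract_orders"},
    "load": {"transform"},
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()

# Tasks whose upstream work is done can run in parallel, wave by wave.
waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())
    waves.append(ready)
    sorter.done(*ready)

print(waves)
```

The first wave contains both extract tasks, which is why they can run concurrently while transform waits for both.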

Dynamic Task Mapping (Fan-Out/Fan-In)

import logging

from airflow.decorators import task

log = logging.getLogger(__name__)

# NOTE: process() is a placeholder for project-specific logic.

@task()
def get_partitions() -> list[str]:
    return ["2026-01-01", "2026-01-02", "2026-01-03"]

@task()
def process_partition(partition_date: str) -> dict:
    """Runs in parallel for each partition"""
    return {"date": partition_date, "rows": process(partition_date)}

@task()
def aggregate(results: list[dict]):
    """Fan-in: receives all partition results"""
    total = sum(r["rows"] for r in results)
    log.info("Processed %s total rows", total)

# Call these inside a @dag-decorated function.
# Dynamically maps process_partition across all partitions
partitions = get_partitions()
results = process_partition.expand(partition_date=partitions)
aggregate(results)
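
The fan-out/fan-in shape can be tried without a scheduler. The sketch below is a plain-Python analogy using concurrent.futures, not Airflow's mapping mechanism. The process function is a hypothetical stand-in that returns a fake row count.

```python
from concurrent.futures import ThreadPoolExecutor


def process(partition_date: str) -> int:
    """Hypothetical stand-in for real work; returns a fake row count."""
    return int(partition_date[-2:]) * 100


def process_partition(partition_date: str) -> dict:
    return {"date": partition_date, "rows": process(partition_date)}


def aggregate(results: list[dict]) -> int:
    """Fan-in: combine every partition result into one total."""
    return sum(r["rows"] for r in results)


partitions = ["2026-01-01", "2026-01-02", "2026-01-03"]

# Fan-out: one call per partition, run concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(process_partition, partitions))

total = aggregate(results)
print(total)
```

As in the mapped version, the number of parallel calls is decided by the length of the partition list rather than being fixed in the code.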

dbt + Airflow Integration

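from datetime import datetime

from airflow.operators.bash import BashOperator
from cosmos import DbtDag, ProjectConfig, ProfileConfig

# Option 1: cosmos (recommended)
dbt_dag = DbtDag(
    project_config=ProjectConfig("/opt/airflow/dbt/"),
    profile_config=ProfileConfig(
        profile_name="default",
        target_name="prod",
        # ProfileConfig needs either profiles_yml_filepath or profile_mapping;
        # this path is an assumption, adjust it to your project.
        profiles_yml_filepath="/opt/airflow/dbt/profiles.yml",
    ),
    schedule="@daily",
    start_date=datetime(2026, 1, 1),  # assumed; a scheduled DAG needs a start_date
    catchup=False,
    dag_id="dbt_daily",
)

# Option 2: BashOperator (simple); define it inside a DAG context
dbt_run = BashOperator(
    task_id="dbt_run",
    bash_command="cd /opt/airflow/dbt && dbt build --select tag:daily",
)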

Anti-Patterns

Fat tasks: tasks should be atomic units; do not put extract+transform+load in one task
XCom for large data: XCom is for metadata (paths, counts); pass data via S3/GCS, not XCom pickles
catchup=True by accident: unless you want backfill, set catchup=False; otherwise Airflow schedules a run for every missed interval since start_date
No retries: transient failures are common; always set retries >= 1 with a delay
Top-level code in DAG files: DAG files are re-parsed frequently (every 30s by default); heavy imports or API calls at module level slow the scheduler
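
To see why an accidental catchup=True is costly, it helps to count the runs it creates. The sketch below uses a deliberately simplified model of a daily schedule (one run per fully elapsed interval). The function missed_daily_runs is a hypothetical helper, not an Airflow API.

```python
from datetime import datetime, timedelta


def missed_daily_runs(start_date: datetime, now: datetime, catchup: bool) -> list[datetime]:
    """Return the interval start dates a daily DAG would run for (simplified model)."""
    interval = timedelta(days=1)
    starts = []
    current = start_date
    # A run is only due once its whole interval has elapsed.
    while current + interval <= now:
        starts.append(current)
        current += interval
    if catchup:
        return starts
    return starts[-1:]  # without catchup, only the most recent interval


start = datetime(2026, 1, 1)
deployed = datetime(2026, 1, 11)

print(len(missed_daily_runs(start, deployed, catchup=True)))
print(len(missed_daily_runs(start, deployed, catchup=False)))
```

In this model, a DAG deployed ten days after its start_date gets ten runs with catchup enabled and one without. An hourly schedule or an older start_date multiplies that number accordingly.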

Quality Checklist

[ ] TaskFlow API used for Python-native DAGs (not legacy operator chaining)
[ ] catchup=False unless backfill is intentional
[ ] Retries configured on all tasks (minimum 1 retry)
[ ] SLA set on critical path tasks
[ ] XCom passes references (S3 paths), not data payloads
[ ] Dynamic task mapping used for fan-out parallelism
[ ] DAG tested locally with airflow dags test before deployment
[ ] Task idempotency: re-running a task produces the same result
[ ] Alerting configured for task failures and SLA misses
[ ] DAG tags applied for filtering in Airflow UI
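
Two checklist items, idempotency and passing references instead of payloads, can be illustrated together. The sketch below is one possible approach, not the only one: a task overwrites a partition keyed by date and returns only a small reference. A local temporary directory stands in for S3/GCS, and write_partition is a hypothetical helper name.

```python
import json
import tempfile
from pathlib import Path


def write_partition(base_dir: Path, partition_date: str, rows: list[dict]) -> dict:
    """Overwrite the partition file so re-runs give the same result."""
    target = base_dir / f"date={partition_date}" / "data.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(rows))  # overwrite, never append
    # Return a small reference (what would go through XCom), not the rows.
    return {"path": str(target), "count": len(rows)}


base = Path(tempfile.mkdtemp())
rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}]

first = write_partition(base, "2026-01-01", rows)
second = write_partition(base, "2026-01-01", rows)  # simulated retry

print(first == second)
print(json.loads(Path(first["path"]).read_text()) == rows)
```

Because the write replaces the partition rather than appending to it, a retry or a manual re-run leaves the stored data unchanged.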

Apache Airflow DAGs, operators, SLA monitoring, and workflow orchestration. Activate on: Airflow, DAG, operator, sensor, scheduler, task dependency, SLA, backfill, XCom. NOT for: dbt transformations (use dbt-analytics-engineer), streaming pipelines (use streaming-pipeline-architect).

development

Updated Apr 4, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add curiositech/windags-skills airflow-dag-orchestrator

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---------|-------------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: Apr 4, 2026, 1:30 PM (153.0s, 1 file scanned)

SKILL.md

license: Apache-2.0
name: airflow-dag-orchestrator
description: Apache Airflow DAGs, operators, SLA monitoring, and workflow orchestration. Activate on: Airflow, DAG, operator, sensor, scheduler, task dependency, SLA, backfill, XCom. NOT for: dbt transformations (use dbt-analytics-engineer), streaming pipelines (use streaming-pipeline-architect).
allowed-tools: Read,Write,Edit,Bash(npm:*,npx:*,python:*,airflow:*)
category: Agent & Orchestration
- skill: streaming-pipeline-architect
  reason: Batch orchestration complements streaming pipelines

Install manually

# Clone the repo
git clone https://github.com/curiositech/windags-skills.git

# Copy into Claude Code skills folder (global)
cp -r windags-skills/skills/airflow-dag-orchestrator ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

curiositech/windags-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT