curiositech/data-warehouse-optimizer

Name: data-warehouse-optimizer
Author: curiositech

skills/data-warehouse-optimizer/SKILL.md

npx skillsauth add curiositech/windags-skills data-warehouse-optimizer

Data Warehouse Optimizer

Optimize query performance and resource utilization in Snowflake, BigQuery, and Redshift through clustering, partitioning, materialized views, and query profiling.

Activation Triggers

Activate on: "Snowflake optimization", "BigQuery performance", "Redshift tuning", "query optimization", "clustering key", "partitioning", "materialized view", "warehouse sizing", "query profile", "slow query"

NOT for: dbt project structure → dbt-analytics-engineer | Dimensional modeling → dimensional-modeler | Cost optimization beyond warehouse → data-cost-optimizer

Quick Start

1. Profile slow queries: use Query Profile (Snowflake), INFORMATION_SCHEMA.JOBS (BigQuery), or the STL system tables (Redshift).
2. Partition large tables: usually by a date column, so that queries filtering on that column scan only the matching partitions; the reduction in scanned data can reach 10-100x.
3. Add clustering: co-locate frequently filtered or joined columns within partitions.
4. Materialize expensive aggregations: use materialized views for dashboards and pre-aggregated metrics.
5. Right-size warehouses: auto-suspend when idle, auto-scale for concurrency, and match size to workload.

Core Capabilities

| Domain | Technologies |
|--------|-------------|
| Snowflake | Micro-partitions, clustering keys, search optimization, warehouses |
| BigQuery | Partitioning, clustering, BI Engine, materialized views |
| Redshift | Sort keys, dist keys, VACUUM, WLM, Redshift Serverless |
| General | Query plans, statistics, result caching, spill-to-disk analysis |
| Monitoring | Snowflake Account Usage, BigQuery INFORMATION_SCHEMA, CloudWatch |

Architecture Patterns

Snowflake Clustering and Search Optimization

-- Cluster a large fact table by commonly filtered columns
ALTER TABLE fct_events
  CLUSTER BY (event_date, customer_id);

-- Verify clustering depth (lower = better, target < 2.0)
SELECT SYSTEM$CLUSTERING_INFORMATION('fct_events', '(event_date, customer_id)');

-- Search optimization for point lookups on high-cardinality columns
ALTER TABLE fct_events ADD SEARCH OPTIMIZATION
  ON EQUALITY(order_id), EQUALITY(email);

-- Result: range scans use clustering, point lookups use search optimization
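
The depth check above can be scripted. The following is a minimal sketch, assuming the JSON string returned by SYSTEM$CLUSTERING_INFORMATION has already been fetched; it reads the `average_depth` field and compares it with the 2.0 target from the comment above. The sample payload and its numbers are invented for illustration, and the helper name `is_well_clustered` is hypothetical.

```python
import json

# Hypothetical sample of the JSON string returned by
# SYSTEM$CLUSTERING_INFORMATION; the numbers are made up for illustration.
SAMPLE_INFO = """
{
  "cluster_by_keys": "LINEAR(event_date, customer_id)",
  "total_partition_count": 1200,
  "average_overlaps": 0.8,
  "average_depth": 1.6
}
"""


def is_well_clustered(info_json: str, max_depth: float = 2.0) -> bool:
    """Return True when average_depth is below the target depth."""
    info = json.loads(info_json)
    return float(info["average_depth"]) < max_depth


print(is_well_clustered(SAMPLE_INFO))
```

A table that fails this check is a candidate for reviewing its clustering key, not necessarily for changing it; the 2.0 threshold is the target stated above rather than a universal rule.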

BigQuery Partitioning + Clustering

-- Partition by date, cluster by high-cardinality filter columns
CREATE TABLE `project.dataset.fct_events`
PARTITION BY DATE(event_timestamp)
CLUSTER BY customer_id, event_type
AS
SELECT * FROM `project.dataset.raw_events`;

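-- Query benefits: partition pruning + cluster pruning
-- Only scans partitions matching the WHERE clause.
-- The half-open range covers all of January 31; with a TIMESTAMP column,
-- BETWEEN ... AND '2026-01-31' would stop at midnight at the start of that day.
SELECT customer_id, COUNT(*)
FROM `project.dataset.fct_events`
WHERE event_timestamp >= TIMESTAMP '2026-01-01'
  AND event_timestamp < TIMESTAMP '2026-02-01'
  AND event_type = 'purchase'
GROUP BY customer_id;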

-- Check bytes scanned reduction
-- Target: 90%+ reduction vs unpartitioned table
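
To make the 90% target concrete, here is a small arithmetic sketch. It assumes a 1 TB table holding one year of evenly sized daily partitions and a query that reads only January; both figures are illustrative assumptions, not measurements. In practice the two byte counts would come from the bytes-processed figures of the actual jobs.

```python
def scan_reduction_pct(bytes_full: int, bytes_pruned: int) -> float:
    """Percentage reduction in bytes scanned relative to a full scan."""
    if bytes_full <= 0:
        raise ValueError("bytes_full must be positive")
    return 100.0 * (bytes_full - bytes_pruned) / bytes_full


# Assumed example: 365 evenly sized daily partitions, 1 TB in total.
DAYS_IN_TABLE = 365
DAYS_QUERIED = 31  # January only
full_scan = 10**12
pruned_scan = full_scan * DAYS_QUERIED // DAYS_IN_TABLE

reduction = scan_reduction_pct(full_scan, pruned_scan)
print(f"{reduction:.1f}% fewer bytes scanned")
print("meets 90% target:", reduction >= 90.0)
```

Under these assumptions, pruning to one month gives a reduction of about 91.5%, just above the target. Wider date ranges or skewed partition sizes would lower that figure.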

Warehouse Sizing Strategy (Snowflake)

Workload Type          Recommended Size     Auto-Suspend    Concurrency
─────────────          ────────────────     ────────────    ───────────
Dashboard queries      X-Small/Small        60s             Auto-scale (max 3)
Analyst ad-hoc         Medium               300s            1 cluster
dbt daily build        Large                Immediate       1 cluster
Data science / ML      X-Large+             Immediate       1 cluster

Key: separate workloads into different warehouses
     to prevent resource contention and enable per-workload billing
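
One way to apply the table is to keep it as data and generate the warehouse settings from it. The sketch below builds Snowflake ALTER WAREHOUSE statements from hypothetical warehouse names. It makes one assumption: "Immediate" is mapped to `AUTO_SUSPEND = 60`, because a value of 0 disables auto-suspend rather than suspending at once; a job that needs earlier suspension would have to suspend the warehouse itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseSpec:
    name: str            # hypothetical warehouse name
    size: str            # Snowflake WAREHOUSE_SIZE value
    auto_suspend_s: int  # seconds of inactivity before suspending
    max_clusters: int    # values above 1 need a multi-cluster warehouse


# "Immediate" is represented as 60 s here (an assumption of this sketch).
SPECS = [
    WarehouseSpec("DASHBOARD_WH", "XSMALL", 60, 3),
    WarehouseSpec("ADHOC_WH", "MEDIUM", 300, 1),
    WarehouseSpec("DBT_WH", "LARGE", 60, 1),
    WarehouseSpec("ML_WH", "XLARGE", 60, 1),
]


def alter_statement(spec: WarehouseSpec) -> str:
    """Build one ALTER WAREHOUSE statement for a spec."""
    parts = [
        f"ALTER WAREHOUSE {spec.name} SET",
        f"WAREHOUSE_SIZE = {spec.size}",
        f"AUTO_SUSPEND = {spec.auto_suspend_s}",
        "AUTO_RESUME = TRUE",
    ]
    if spec.max_clusters > 1:
        # Only multi-cluster warehouses get cluster-count settings.
        parts.append("MIN_CLUSTER_COUNT = 1")
        parts.append(f"MAX_CLUSTER_COUNT = {spec.max_clusters}")
    return " ".join(parts) + ";"


for spec in SPECS:
    print(alter_statement(spec))
```

Keeping one spec per workload mirrors the note above about separating workloads into dedicated warehouses.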

Anti-Patterns

Scanning full tables: partition large tables (usually by date) and filter on the partition column; a full scan of a 1 TB table can cost 10-50x more than a pruned scan
Too many clustering keys: keep to 2-4 columns; additional columns reduce clustering effectiveness
Oversized warehouses: bigger does not always mean faster; profile first, right-size second
Ignoring spill-to-disk: queries that spill to remote storage can run 10-100x slower; increase warehouse size or optimize the query
Materializing volatile data: materialized views on rapidly changing tables cause constant refresh overhead
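
The spill-to-disk item can be checked from query history. The sketch below is a minimal example that assumes rows have already been exported from Snowflake's ACCOUNT_USAGE.QUERY_HISTORY view with its spill columns lower-cased as dictionary keys; the sample rows and query IDs are invented, and `remote_spillers` is a hypothetical helper.

```python
from typing import Dict, List

# Invented sample rows; keys mirror QUERY_HISTORY spill columns.
SAMPLE_ROWS: List[Dict] = [
    {"query_id": "q1", "bytes_spilled_to_local_storage": 0,
     "bytes_spilled_to_remote_storage": 0},
    {"query_id": "q2", "bytes_spilled_to_local_storage": 5_000_000_000,
     "bytes_spilled_to_remote_storage": 0},
    {"query_id": "q3", "bytes_spilled_to_local_storage": 8_000_000_000,
     "bytes_spilled_to_remote_storage": 2_000_000_000},
    {"query_id": "q4", "bytes_spilled_to_local_storage": 1_000_000_000,
     "bytes_spilled_to_remote_storage": 7_000_000_000},
]


def remote_spillers(rows: List[Dict]) -> List[str]:
    """Return IDs of queries that spilled to remote storage, worst first."""
    spilled = [r for r in rows if r["bytes_spilled_to_remote_storage"] > 0]
    spilled.sort(key=lambda r: r["bytes_spilled_to_remote_storage"],
                 reverse=True)
    return [r["query_id"] for r in spilled]


print(remote_spillers(SAMPLE_ROWS))
```

Local-only spills (such as `q2` here) are ignored on purpose, since the anti-pattern above singles out remote spilling as the costly case.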

Quality Checklist

[ ] Large tables (>1B rows) partitioned by date column
[ ] Clustering keys set on top 2-3 filter/join columns
[ ] Query profile reviewed for top 10 slowest queries monthly
[ ] Spill-to-disk queries identified and optimized (increase size or rewrite)
[ ] Materialized views created for expensive dashboard aggregations
[ ] Warehouses auto-suspended when idle (60-300s)
[ ] Workloads separated into dedicated warehouses
[ ] Result cache hit rate >50% for repeated analytical queries
[ ] Bytes scanned tracked and reduced quarter-over-quarter
[ ] Unused tables/views identified and dropped quarterly
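
The cache hit rate item is simple to compute once per-query cache flags are available. As a hedged sketch, the function below takes a list of booleans such as the `cache_hit` column of BigQuery's INFORMATION_SCHEMA.JOBS view and compares the resulting rate with the 50% threshold from the checklist; the sample flags are invented.

```python
from typing import Iterable


def cache_hit_rate(cache_hits: Iterable[bool]) -> float:
    """Fraction of queries served from the result cache (0.0 if none ran)."""
    flags = list(cache_hits)
    if not flags:
        return 0.0
    return sum(1 for hit in flags if hit) / len(flags)


# Invented sample: four repeated dashboard queries, three served from cache.
SAMPLE_FLAGS = [True, False, True, True]

rate = cache_hit_rate(SAMPLE_FLAGS)
print(f"cache hit rate: {rate:.0%}")
print("meets >50% target:", rate > 0.5)
```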

data-ai

Updated Apr 4, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percent |
|---------|-------------|--------|---------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 4, 2026, 2:11 PM (39.3s, 1 file scanned)

SKILL.md

license: Apache-2.0
name: data-warehouse-optimizer
description: Snowflake, BigQuery, clustering, partitioning, and materialized views for warehouse performance. Activate on: Snowflake, BigQuery, Redshift, query optimization, clustering, partitioning, materialized view, warehouse cost, query profile. NOT for: dbt model structure (use dbt-analytics-engineer), data modeling (use dimensional-modeler).
allowed-tools: Read,Write,Edit,Bash(npm:*,npx:*,python:*,snowsql:*,bq:*)
category: Data & Analytics
- skill: dimensional-modeler
  reason: Physical model design affects query performance

Install manually

# Clone the repo
git clone https://github.com/curiositech/windags-skills.git

# Copy into Claude Code skills folder (global)
cp -r windags-skills/skills/data-warehouse-optimizer ~/.claude/skills/

Repository: curiositech/windags-skills

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT