Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

curiositech/dag-performance-profiler

Name: dag-performance-profiler
Author: curiositech

skills/dag-performance-profiler/SKILL.md

npx skillsauth add curiositech/windags-skills dag-performance-profiler

DAG Performance Profiler

You analyze DAG execution performance to identify bottlenecks and optimization opportunities through systematic profiling of latency, token usage, cost, and resource consumption.

DECISION POINTS

1. Bottleneck Classification

Primary bottleneck detected?
├─ High latency (>3x avg node time)
│  ├─ Sequential dependency chain → Restructure for parallelization
│  └─ Single slow node → Break into smaller tasks or downgrade model
├─ High cost (>40% of total budget)
│  ├─ Token usage >5000/node → Context reduction strategy
│  └─ Expensive model overuse → Model selection optimization
└─ Resource contention (wait time >50% of execution time)
   ├─ Tool latency bottleneck → Cache or parallelize tool calls
   └─ Dependency blocking → DAG restructuring
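The top-level thresholds in this tree can be sketched in Python. This is a minimal sketch, not the skill's implementation: `NodeMetrics`, its field names, and `classify_bottlenecks` are hypothetical, and it assumes that "node time" means wait plus execution time and that the contention test compares a node's wait time with its own execution time.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Per-node measurements (hypothetical schema for illustration)."""
    name: str
    exec_s: float    # time spent executing
    wait_s: float    # time spent blocked on dependencies or tools
    cost_usd: float  # dollar cost attributed to this node


def classify_bottlenecks(nodes: list[NodeMetrics], budget_usd: float) -> dict[str, list[str]]:
    """Apply the three top-level thresholds of the tree to every node."""
    avg_node_time = sum(n.exec_s + n.wait_s for n in nodes) / len(nodes)
    findings: dict[str, list[str]] = {n.name: [] for n in nodes}
    for n in nodes:
        # High latency: more than 3x the average node time.
        if n.exec_s + n.wait_s > 3 * avg_node_time:
            findings[n.name].append("high_latency")
        # High cost: more than 40% of the total budget.
        if n.cost_usd > 0.40 * budget_usd:
            findings[n.name].append("high_cost")
        # Resource contention: wait time above 50% of execution time.
        if n.exec_s > 0 and n.wait_s > 0.50 * n.exec_s:
            findings[n.name].append("resource_contention")
    return findings
```

A node can land in more than one category, in which case each matching branch of the tree applies. Because a node cannot exceed the average by more than the node count, the latency test can only fire in DAGs with at least four nodes.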

2. Optimization Priority Matrix

Impact vs Effort analysis:
├─ High Impact (>20% improvement) + Low Effort → IMMEDIATE (same day)
│  ├─ Model downgrade for simple tasks → Execute immediately
│  └─ Remove obvious sequential dependencies → Execute immediately
├─ High Impact + High Effort → PLANNED (next sprint)
│  ├─ Major DAG restructuring → Schedule with stakeholders
│  └─ Tool replacement/caching → Plan implementation
├─ Low Impact (<10% improvement) → DEFER
│  └─ Minor optimizations → Document but don't implement
└─ Negative Impact → REJECT
   └─ Optimizations that hurt other metrics → Explicitly reject
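The matrix can be expressed as a small function. This is a sketch under stated assumptions: `prioritize` and its parameters are hypothetical names, and because the matrix does not say what to do with improvements between 10% and 20%, the sketch returns a placeholder "REVIEW" label for that band rather than guessing.

```python
def prioritize(impact_pct: float, low_effort: bool, hurts_other_metrics: bool = False) -> str:
    """Map an optimization's estimated impact and effort to a priority label."""
    if hurts_other_metrics or impact_pct < 0:
        return "REJECT"     # negative impact
    if impact_pct < 10:
        return "DEFER"      # low impact: document but don't implement
    if impact_pct > 20:
        # High impact: effort decides between same-day and next-sprint work.
        return "IMMEDIATE" if low_effort else "PLANNED"
    # 10-20% is not covered by the matrix; this label is an assumption.
    return "REVIEW"


print(prioritize(25, low_effort=True))   # model downgrade for a simple task
print(prioritize(25, low_effort=False))  # major DAG restructuring
```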

3. Cost-Latency Trade-off Decision

Performance requirement context?
├─ Cost-sensitive (budget constrained)
│  ├─ Accept up to 20% latency increase for 30%+ cost reduction → Recommend
│  └─ <20% cost savings → Keep current configuration
├─ Latency-critical (real-time requirements)
│  ├─ Accept up to 40% cost increase for 20%+ latency reduction → Recommend
│  └─ <15% latency improvement → Reject cost increase
└─ Balanced requirements
   ├─ Cost/latency ratio improvement >15% → Recommend
   └─ <10% improvement in either metric → No change recommended

FAILURE MODES

Over-Optimization Syndrome

Symptoms: Recommending micro-optimizations that save <5% while ignoring major bottlenecks
Detection: If the optimization list has >5 items with <10% individual impact each
Fix: Rank by impact percentage and focus only on the top 2-3 optimizations with >15% impact. Defer the others explicitly.
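The detection rule and the fix can be combined in one helper. This is a minimal sketch: `select_optimizations` and the candidate names in the usage lines are hypothetical, and the impact percentages are made-up inputs rather than measurements.

```python
def select_optimizations(
    candidates: dict[str, float],
    min_impact_pct: float = 15.0,
    top_n: int = 3,
) -> tuple[list[str], list[str], bool]:
    """Return (selected, deferred, over_optimizing) for a name -> impact % map."""
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    # Fix: keep only the top few items above the impact threshold.
    selected = [name for name, pct in ranked if pct > min_impact_pct][:top_n]
    deferred = [name for name, _ in ranked if name not in selected]
    # Detection: more than 5 items that each deliver under 10%.
    over_optimizing = sum(1 for pct in candidates.values() if pct < 10) > 5
    return selected, deferred, over_optimizing


selected, deferred, flagged = select_optimizations(
    {"split-node": 27.0, "parallelize": 18.0, "cache-tool": 16.0,
     "model-swap": 15.5, "trim-context": 4.0}
)
print(selected, deferred, flagged)
```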

False Bottleneck Attribution

Symptoms: Misidentifying wait time as an execution bottleneck, blaming the wrong nodes
Detection: If the "slow node" has high wait time but normal execution time relative to task complexity
Fix: Separate wait time from execution time in the analysis. Focus on the dependency structure causing waits, not node speed.

Cost Underestimation Trap

Symptoms: Providing token savings calculations without accounting for model pricing differences
Detection: If cost savings percentages don't match token reduction ratios by model type
Fix: Always calculate actual cost: (token_change / 1000) × model_price_per_1k. Show both token AND dollar impact.
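The formula in the fix is easy to apply directly. The sketch below uses a hypothetical helper name, `dollar_impact`, and the illustrative per-1K prices that appear in this document's analytics example ($0.15 for Opus, $0.007 for Haiku); they are not real price quotes. It shows why the same token reduction is worth very different amounts depending on the model.

```python
def dollar_impact(token_change: int, price_per_1k: float) -> float:
    """Dollar value of a token change: (token_change / 1000) x price per 1K."""
    return (token_change / 1000) * price_per_1k


saved_tokens = 2_000
# Illustrative prices per 1K tokens, taken from the worked example.
for model, price in {"Opus": 0.15, "Haiku": 0.007}.items():
    print(f"{model}: {saved_tokens} tokens saved = ${dollar_impact(saved_tokens, price):.3f}")
```

Reporting both columns, tokens and dollars, keeps a large token saving on a cheap model from being mistaken for a large cost saving.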

Parallelization Fantasy

Symptoms: Suggesting parallelization for inherently sequential tasks with data dependencies
Detection: If recommending parallel execution for nodes where the output of A feeds the input of B
Fix: Map actual data dependencies before suggesting parallelization. Only truly independent nodes can run in parallel.
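One way to map dependencies before recommending parallel execution is to group nodes into stages with Python's standard `graphlib.TopologicalSorter` (Python 3.9+). This is a sketch: `parallel_stages` is a hypothetical helper, and the edges in the usage example are assumed for illustration, since this document does not list the code review DAG's actual dependencies.

```python
from graphlib import TopologicalSorter


def parallel_stages(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into stages; nodes within one stage share no data dependency.

    `deps` maps each node to the set of nodes whose output it consumes.
    Raises graphlib.CycleError if the graph is not a DAG.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every node whose inputs are complete
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Assumed edges, for illustration only.
deps = {
    "analyze-complexity": {"extract-code"},
    "check-security": {"extract-code"},
    "review-performance": {"extract-code"},
    "generate-report": {"analyze-complexity", "check-security", "review-performance"},
}
for i, stage in enumerate(parallel_stages(deps), start=1):
    print(i, stage)
```

Two nodes are only candidates for parallel execution if they end up in the same stage; a pair where one feeds the other will always fall into different stages.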

Single-Metric Tunnel Vision

Symptoms: Optimizing one metric while catastrophically degrading another
Detection: If optimizing for cost increases latency >50%, or optimizing for latency increases cost >100%
Fix: Always provide a trade-off analysis, for example: "20% cost savings, 15% latency increase, 5% accuracy impact"
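The detection thresholds translate into a simple guard. In this sketch, `is_tunnel_vision` is a hypothetical name, and the sign convention (negative means a reduction, positive an increase) is an assumption of the example.

```python
def is_tunnel_vision(cost_change_pct: float, latency_change_pct: float) -> bool:
    """Flag a recommendation that improves one metric by badly degrading the other.

    Changes are percentages relative to the baseline: negative = reduction.
    """
    cost_win_hurts_latency = cost_change_pct < 0 and latency_change_pct > 50
    latency_win_hurts_cost = latency_change_pct < 0 and cost_change_pct > 100
    return cost_win_hurts_latency or latency_win_hurts_cost


print(is_tunnel_vision(-20, 15))   # modest latency cost for a cost saving
print(is_tunnel_vision(-30, 60))   # cost saving that adds 60% latency
```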

WORKED EXAMPLES

Code Review DAG Analysis

Initial State: 5-node code review DAG: 45s total, $0.42 cost

extract-code: 4.2s, 2,400 tokens, Sonnet
analyze-complexity: 8.1s (3.4s wait + 4.7s exec), 4,200 tokens, Sonnet
check-security: 6.8s, 3,100 tokens, Sonnet
review-performance: 12.4s, 8,900 tokens, Opus
generate-report: 13.5s (9.2s wait + 4.3s exec), 5,200 tokens, Sonnet

Step 1 - Bottleneck Classification
Primary bottleneck: review-performance at 12.4s (27% of total) - Single slow node pattern
Secondary: Dependency blocking causing 12.6s total wait time

Step 2 - Apply Decision Tree
High latency bottleneck + resource contention → Restructure for parallelization + break down slow node

Step 3 - Optimization Recommendations

HIGH IMPACT: Split review-performance into check-patterns (3s, Sonnet) + assess-complexity (4s, Sonnet)
- Saves: 5.4s latency, $0.08 cost (model downgrade)
MEDIUM IMPACT: Parallelize analyze-complexity + check-security (currently sequential)
- Saves: 6.8s latency by removing wait time
DEFER: Context reduction could save $0.05 but <5% impact

Final Result: 28s total (38% faster), $0.34 cost (19% cheaper)
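The arithmetic in this example can be rechecked with a few lines of Python. This is a sketch that copies the per-node figures from the initial state and assumes zero wait time for nodes listed without a wait/exec breakdown. It reproduces the 45s total, the 12.6s of wait time, and the $0.34 cost. Note that the two itemized latency savings sum to 12.2s, which on their own give 32.8s (about 27% faster); the 28s final figure would need further savings that the example does not itemize.

```python
# (wait_s, exec_s) per node, copied from the initial state.
# Assumption: nodes listed with a single time have no wait component.
nodes = {
    "extract-code": (0.0, 4.2),
    "analyze-complexity": (3.4, 4.7),
    "check-security": (0.0, 6.8),
    "review-performance": (0.0, 12.4),
    "generate-report": (9.2, 4.3),
}

total_s = sum(wait + run for wait, run in nodes.values())
total_wait_s = sum(wait for wait, _ in nodes.values())
# Rank by execution time only, so waiting is not mistaken for slowness.
slowest_exec = max(nodes, key=lambda name: nodes[name][1])

itemized_savings_s = 5.4 + 6.8          # split slow node + parallelize two nodes
projected_s = total_s - itemized_savings_s
cost_after = 0.42 - 0.08                # model downgrade saving

print(f"total {total_s:.1f}s, wait {total_wait_s:.1f}s, slowest exec: {slowest_exec}")
print(f"projected {projected_s:.1f}s ({itemized_savings_s / total_s:.0%} faster)")
print(f"cost ${cost_after:.2f} ({0.08 / 0.42:.0%} cheaper)")
```

Ranking by total time instead would put generate-report (13.5s) first even though 9.2s of that is waiting, which is the false-attribution pattern described under FAILURE MODES.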

High-Cost Analytics Pipeline

Initial State: 8-node data analysis: 67s total, $2.40 cost, 95% Opus usage

Step 1 - Cost Analysis Discovery

extract-tables: 2,800 tokens, Opus ($0.42) - Simple extraction task
clean-data: 3,200 tokens, Opus ($0.48) - Pattern matching task
statistical-analysis: 12,600 tokens, Opus ($1.89) - Complex reasoning
generate-insights: 9,400 tokens, Opus ($1.41) - Moderate analysis

Step 2 - Model Selection Decision Tree
Using complexity assessment:

Extract/clean: Simple → Haiku ($0.007 vs $0.15/1k)
Statistical: Complex reasoning → Keep Opus
Insights: Moderate → Sonnet ($0.003 vs $0.15/1k)

Step 3 - Impact Calculation

Extract + clean: 6,000 tokens × $0.143 savings/1K = $0.86 savings
Insights: 9,400 tokens × $0.012 savings/1K = $0.11 savings
Total: $0.97 savings (40% cost reduction), 2s latency increase (3%)

Expert Decision: Accept trade-off - massive cost savings for minimal latency impact in non-critical analytics pipeline.
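The savings can be recomputed from the per-1K prices exactly as listed in Step 2. This is a sketch: the prices are the example's illustrative figures, not real price quotes, and `switch_savings` is a hypothetical helper. With those prices the extract and clean row matches the $0.86 above, but the Insights row comes to about $1.38, because $0.15 minus $0.003 is $0.147 per 1K; the $0.012 rate used in Step 3 would correspond to a different Sonnet price. The four listed node costs also sum to $4.20, so the percentage reduction depends on which baseline is used.

```python
# Per-1K prices as listed in Step 2 (illustrative figures from the example).
OPUS, HAIKU, SONNET = 0.15, 0.007, 0.003


def switch_savings(tokens: int, old_price_per_1k: float, new_price_per_1k: float) -> float:
    """Dollar savings from running the same tokens on a cheaper model."""
    return (tokens / 1000) * (old_price_per_1k - new_price_per_1k)


savings = {
    "extract + clean": switch_savings(2_800 + 3_200, OPUS, HAIKU),
    "insights": switch_savings(9_400, OPUS, SONNET),
}
total_savings = sum(savings.values())
listed_baseline = 0.42 + 0.48 + 1.89 + 1.41  # the four node costs listed in Step 1

for name, amount in savings.items():
    print(f"{name}: ${amount:.2f}")
print(f"total: ${total_savings:.2f} of ${listed_baseline:.2f} listed")
```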

QUALITY GATES

Performance profiling complete when:

[ ] Execution metrics parsed with node-by-node timing breakdown
[ ] Token usage calculated per node with model cost attribution
[ ] Wait time separated from execution time for each node
[ ] Critical path identified with percentage of total duration
[ ] Bottlenecks ranked by impact (>15% improvement threshold)
[ ] Cost-latency trade-offs quantified for major recommendations
[ ] Model selection recommendations matched to task complexity levels
[ ] Parallelization suggestions verified against actual data dependencies
[ ] Implementation effort estimated (immediate/planned/complex) for each optimization
[ ] Performance improvement projections include confidence intervals
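For the critical-path gate, a longest-path pass over the DAG in topological order is one straightforward approach. This is a sketch, not the skill's method: `critical_path` is a hypothetical helper, it needs Python 3.9+ for `graphlib`, every node (including predecessors) must appear in `durations`, and the edges in the usage example are assumed because this document does not list them.

```python
from graphlib import TopologicalSorter
from typing import Optional


def critical_path(durations: dict[str, float], deps: dict[str, set[str]]) -> tuple[list[str], float]:
    """Return the longest-duration chain through the DAG and its length."""
    graph = {node: deps.get(node, set()) for node in durations}
    finish: dict[str, float] = {}
    prev: dict[str, Optional[str]] = {}
    for node in TopologicalSorter(graph).static_order():
        # A node starts when its latest-finishing predecessor is done.
        latest = max(graph[node], key=lambda p: finish[p], default=None)
        start = finish[latest] if latest is not None else 0.0
        finish[node] = start + durations[node]
        prev[node] = latest
    # Walk back from the node that finishes last.
    cursor: Optional[str] = max(finish, key=lambda n: finish[n])
    length = finish[cursor]
    path: list[str] = []
    while cursor is not None:
        path.append(cursor)
        cursor = prev[cursor]
    return path[::-1], length


# Execution times from the code review example; edges are assumed.
durations = {"extract-code": 4.2, "analyze-complexity": 4.7, "check-security": 6.8,
             "review-performance": 12.4, "generate-report": 4.3}
deps = {"analyze-complexity": {"extract-code"}, "check-security": {"extract-code"},
        "review-performance": {"extract-code"},
        "generate-report": {"analyze-complexity", "check-security", "review-performance"}}
path, length = critical_path(durations, deps)
print(path, f"{length:.1f}s")
```

Each node's share of the critical path is then its duration divided by the returned length.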

NOT-FOR BOUNDARIES

This skill should NOT be used for:

Real-time execution monitoring → Use dag-execution-tracer instead
Failure root cause analysis → Use dag-failure-analyzer instead
DAG structural design from scratch → Use dag-architect instead
Automatic optimization implementation → Use dag-auto-optimizer instead
Resource allocation planning → Use dag-task-scheduler instead

Delegate when:

Need live execution logs or traces → dag-execution-tracer
Performance issue is masking failures → dag-failure-analyzer
Recommendations require major DAG redesign → dag-architect
User wants hands-off optimization → dag-auto-optimizer
Need to schedule optimized DAG → dag-task-scheduler

This skill focuses on analysis and actionable recommendations, not monitoring, design, or automatic implementation.

Profiles DAG execution performance including latency, token usage, cost, and resource consumption. Identifies bottlenecks and optimization opportunities. Activate on 'performance profile', 'execution metrics', 'latency analysis', 'token usage', 'cost analysis'. NOT for execution tracing (use dag-execution-tracer) or failure analysis (use dag-failure-analyzer).

testing

Updated Apr 4, 2026

npx skillsauth add curiositech/windags-skills dag-performance-profiler

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy | Container and dependency vulnerability scanner | Clean | 95%
Semgrep | Static code analysis for vulnerabilities | Clean | 95%
mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95%
Snyk (dep) | Open source security scanning | Skipped | 50%
Socket.dev | Supply chain security analysis | Skipped | 50%
VirusTotal | Multi-engine malware detection | Skipped | 50%
CrowdStrike | Advanced threat intelligence | Skipped | 50%
OSV-Scanner | Open Source Vulnerability database check | Skipped | 50%
OWASP Dep-Check | | Skipped | 50%

Last scanned: Apr 4, 2026, 2:09 PM (19.4s, 1 file scanned)

SKILL.md

license: BSL-1.1
name: dag-performance-profiler
description: Profiles DAG execution performance including latency, token usage, cost, and resource consumption. Identifies bottlenecks and optimization opportunities. Activate on 'performance profile', 'execution metrics', 'latency analysis', 'token usage', 'cost analysis'. NOT for execution tracing (use dag-execution-tracer) or failure analysis (use dag-failure-analyzer).
category: Agent & Orchestration
- skill: dag-task-scheduler
  reason: Scheduling optimization

Install manually

# Clone the repo
git clone https://github.com/curiositech/windags-skills.git

# Copy into Claude Code skills folder (global)
cp -r windags-skills/skills/dag-performance-profiler ~/.claude/skills/

Repository

curiositech/windags-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT