curiositech/dag-quality

Name: dag-quality
Author: curiositech

skills/dag-quality/SKILL.md

npx skillsauth add curiositech/windags-skills dag-quality

DAG Quality

DAG Quality is the quality gate between DAG nodes: it validates outputs, scores confidence, detects hallucinations, monitors convergence, decides whether to iterate, and synthesizes feedback. It consolidates dag-output-validator, dag-confidence-scorer, dag-hallucination-detector, dag-convergence-monitor, dag-iteration-detector, and dag-feedback-synthesizer.

When to Use

✅ Use for:

Validating node output against declared schema
Scoring confidence on agent outputs (0-1)
Detecting fabricated content, false citations, unverifiable claims
Deciding whether to iterate (re-execute a node or loop)
Generating structured improvement feedback for re-execution
Monitoring quality trends across iterations

❌ NOT for:

Executing nodes (use dag-runtime)
Planning DAG structure (use dag-planner)
Grading skills themselves (use skill-grader)

Quality Pipeline

flowchart TD
  O[Node output] --> SV[Schema validation]
  SV -->|Invalid| REJ[Reject + specific errors]
  SV -->|Valid| CV[Content validation]
  CV --> CS[Confidence scoring]
  CS --> HD[Hallucination detection]
  HD --> D{Quality above threshold?}
  D -->|Yes| ACC[Accept → pass to downstream]
  D -->|Below threshold, iteration < max| FB[Generate feedback]
  FB --> RE[Re-execute with feedback]
  D -->|Below threshold, iteration = max| ESC[Escalate to human]

Schema Validation

Structural check: does the output match the node's declared output contract?

Required fields present
Types correct (string, number, array, object)
Constraints met (min/max length, ranges, enums)
Nested structures valid
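
The four structural checks above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual validator: the contract format (`required`, `type`, `enum`, `min_length`, `max_length`, `minimum`, `maximum`, `properties`) and the function name `validate_schema` are assumptions made for this example.

```python
from typing import Any

# Hypothetical contract format for illustration: field name -> rules.
TYPE_MAP = {"string": str, "number": (int, float), "array": list, "object": dict}


def validate_schema(output: dict[str, Any], contract: dict[str, dict[str, Any]]) -> list[str]:
    """Return specific error messages; an empty list means the output is valid."""
    errors: list[str] = []
    for name, rules in contract.items():
        if name not in output:
            if rules.get("required", False):
                errors.append(f"{name}: required field is missing")
            continue
        value, kind = output[name], rules["type"]
        # bool is a subclass of int in Python, so reject it explicitly for numbers.
        if not isinstance(value, TYPE_MAP[kind]) or (kind == "number" and isinstance(value, bool)):
            errors.append(f"{name}: expected {kind}, got {type(value).__name__}")
            continue
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: {value!r} is not an allowed value")
        if kind in ("string", "array"):
            if len(value) < rules.get("min_length", 0):
                errors.append(f"{name}: length {len(value)} is below the minimum")
            if len(value) > rules.get("max_length", float("inf")):
                errors.append(f"{name}: length {len(value)} is above the maximum")
        if kind == "number":
            if not rules.get("minimum", value) <= value <= rules.get("maximum", value):
                errors.append(f"{name}: {value} is out of range")
        if kind == "object" and "properties" in rules:
            # Nested structures: validate recursively and prefix the parent field name.
            errors.extend(f"{name}.{err}" for err in validate_schema(value, rules["properties"]))
    return errors
```

Returning a list of messages rather than a boolean matches the "Reject + specific errors" branch of the pipeline: the rejection can name every failing field at once.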

Content Validation

Semantic check: is the content reasonable?

Non-empty meaningful content (not just filler)
Length within expected range for the task
Internal consistency (no contradictions)
References exist (cited files, URLs, identifiers)
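
Three of these semantic checks are mechanical enough to sketch directly. The example below is an illustration under assumptions: the filler heuristic (a short list of placeholder words), the function name `validate_content`, and its parameters are all hypothetical. Internal consistency is deliberately left out, since it usually needs a judge model rather than a rule.

```python
import re
from pathlib import Path

# Assumption: a small set of placeholder strings counts as "filler".
FILLER = re.compile(r"todo|tbd|n/?a|[\W_]+", re.IGNORECASE)


def validate_content(text: str, min_chars: int, max_chars: int,
                     cited_files: tuple[str, ...] = (), root: str = ".") -> list[str]:
    """Return a list of content issues; an empty list means no issue was found."""
    issues: list[str] = []
    stripped = text.strip()
    # Non-empty, meaningful content (not just filler).
    if not stripped or FILLER.fullmatch(stripped):
        issues.append("content is empty or filler")
    # Length within the expected range for the task.
    if not min_chars <= len(stripped) <= max_chars:
        issues.append(f"length {len(stripped)} is outside {min_chars}-{max_chars}")
    # References exist: here only cited file paths, resolved against root.
    for rel in cited_files:
        if not (Path(root) / rel).exists():
            issues.append(f"cited file not found: {rel}")
    return issues
```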

Confidence Scoring

Aggregate four evaluator signals (see skill-lifecycle.md for full architecture):

| Evaluator | Weight | Signal |
|-----------|--------|--------|
| Self-evaluation | 0.15 | Agent's own assessment (sycophancy-biased) |
| Peer evaluation | 0.25 | Separate judge agent with skill-grader |
| Downstream evaluation | 0.35 | Next node reports usability |
| Human evaluation | 0.50 | At human gates, gold standard |

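Final score = weighted average of the available signals. The listed weights sum to 1.25, and not every signal is present at every node, so divide by the sum of the weights of the signals that are present (equivalently, normalize those weights to sum to 1.0).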
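
As a worked sketch of that normalization, assuming each signal is reported as a score in the 0-1 range and that a missing signal is simply absent from the input (the function name and the signal keys are hypothetical):

```python
# Weights from the evaluator table; the dictionary keys are illustrative names.
WEIGHTS = {"self": 0.15, "peer": 0.25, "downstream": 0.35, "human": 0.50}


def confidence_score(signals: dict[str, float]) -> float:
    """Weighted average over the signals that are present, with weights renormalized."""
    if not signals:
        raise ValueError("at least one evaluator signal is required")
    for name, score in signals.items():
        if name not in WEIGHTS:
            raise ValueError(f"unknown evaluator: {name}")
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} score must be between 0 and 1")
    total_weight = sum(WEIGHTS[name] for name in signals)
    weighted_sum = sum(WEIGHTS[name] * score for name, score in signals.items())
    return weighted_sum / total_weight
```

With only self-evaluation (0.9) and peer evaluation (0.7) available, the score is (0.15 × 0.9 + 0.25 × 0.7) / (0.15 + 0.25) = 0.31 / 0.40 = 0.775.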

Hallucination Detection

Specific checks for fabricated content:

Citation verification: Do cited sources exist? Do they say what's claimed?
Internal consistency: Does the output contradict itself or the input?
Confidence calibration: Does the output make high-confidence claims on topics where uncertainty is expected?
Entity verification: Do named entities (people, tools, APIs) actually exist?
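
Only part of citation verification is mechanical. The sketch below covers the "do cited sources exist?" half, assuming numeric bracket citations such as `[3]` and a source list keyed by number; both conventions and the function name are assumptions for this example. Checking whether a source actually says what is claimed would need retrieval plus a judge, which is out of scope here.

```python
import re


def find_dangling_citations(text: str, sources: dict[int, str]) -> list[int]:
    """Return citation numbers used in the text that have no entry in the source list."""
    cited = {int(number) for number in re.findall(r"\[(\d+)\]", text)}
    return sorted(cited - set(sources))
```

A non-empty result is a concrete hallucination signal that can be passed straight into the feedback for the next iteration.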

Iteration Decision

Evaluate the rules in order; the first match wins:

If quality_score >= 0.8: ACCEPT
If quality_score < 0.3 on the first attempt: ESCALATE immediately (fundamentally wrong)
If quality_score < 0.8 AND iterations < max_iterations: ITERATE with feedback
If quality_score < 0.5 AND iterations >= max_iterations: ESCALATE to human
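
One way to encode these rules as a function is sketched below. Two things are assumptions rather than statements from the skill: `iterations` counts completed re-executions, so 0 means the first attempt; and a score between 0.5 and 0.8 with the iteration budget exhausted is escalated, following the "Below threshold, iteration = max" branch of the pipeline flowchart, because the rules themselves do not cover that case.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    ITERATE = "iterate"
    ESCALATE = "escalate"


def decide(quality_score: float, iterations: int, max_iterations: int) -> Decision:
    """Apply the iteration rules; the immediate-escalation rule is checked before iterating."""
    if quality_score >= 0.8:
        return Decision.ACCEPT
    # Assumption: iterations == 0 means this is the first attempt.
    if iterations == 0 and quality_score < 0.3:
        return Decision.ESCALATE
    if iterations < max_iterations:
        return Decision.ITERATE
    # Budget exhausted. Scores below 0.5 escalate per the rules; scores in
    # [0.5, 0.8) escalate too, which is an assumption taken from the flowchart.
    return Decision.ESCALATE
```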

Feedback Synthesis

When iterating, produce structured improvement guidance:

{
  "overall_score": 0.65,
  "specific_issues": [
    {"field": "recommendations", "issue": "Only 2 of 5 required recommendations provided", "fix": "Add 3 more recommendations addressing scalability, testing, and deployment"},
    {"field": "citations", "issue": "Source [3] returns 404", "fix": "Replace with a working source or remove the claim"}
  ],
  "strengths_to_preserve": ["Clear structure", "Good code examples"],
  "iteration_guidance": "Focus on completeness (missing recommendations) and citation accuracy. Do not rewrite the well-structured sections."
}
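
If the feedback is produced by code rather than written by hand, typed records make it harder to omit a field. The following is a sketch only; the class names `Issue` and `Feedback` are hypothetical, while the field names mirror the JSON above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Issue:
    field: str  # output field the issue refers to
    issue: str  # what is wrong
    fix: str    # concrete instruction for the next attempt


@dataclass
class Feedback:
    overall_score: float
    specific_issues: list[Issue]
    strengths_to_preserve: list[str]
    iteration_guidance: str

    def to_json(self) -> str:
        # asdict() converts nested dataclasses (the Issue list) recursively.
        return json.dumps(asdict(self), indent=2)


feedback = Feedback(
    overall_score=0.65,
    specific_issues=[Issue("citations", "Source [3] returns 404",
                           "Replace with a working source or remove the claim")],
    strengths_to_preserve=["Clear structure"],
    iteration_guidance="Focus on citation accuracy.",
)
```

Calling `feedback.to_json()` yields a document with the same four top-level keys as the example above.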

Convergence Monitoring

Track quality scores across iterations to detect:

Improving: Score trending up → continue iterating
Plateauing: Score stable for 2+ iterations → stop, accept current best
Degrading: Score declining → stop, revert to best previous iteration
Oscillating: Score alternating up/down → stop, the feedback is contradictory
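
The four trends can be told apart from the differences between consecutive scores. The sketch below is one possible classifier: the 0.02 tolerance, the two-step window, and the function names are assumptions, not values defined by the skill.

```python
def classify_trend(scores: list[float], tolerance: float = 0.02) -> str:
    """Classify the recent score trend from the last two score changes."""
    if len(scores) < 2:
        return "insufficient_data"
    deltas = [later - earlier for earlier, later in zip(scores, scores[1:])]
    recent = deltas[-2:]
    if len(recent) == 2:
        # Stable for two consecutive iterations: stop and accept the current best.
        if all(abs(delta) <= tolerance for delta in recent):
            return "plateauing"
        # Two significant changes in opposite directions: feedback is likely contradictory.
        if recent[0] * recent[1] < 0 and all(abs(delta) > tolerance for delta in recent):
            return "oscillating"
    if recent[-1] < -tolerance:
        return "degrading"
    # A single flat step after a gain is still treated as improving.
    return "improving"


def best_iteration(scores: list[float]) -> int:
    """Index of the highest-scoring iteration, used when reverting."""
    return max(range(len(scores)), key=scores.__getitem__)
```

For a degrading or oscillating run, `best_iteration` identifies which earlier output to revert to.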

Replaces

Consolidates: dag-output-validator, dag-confidence-scorer, dag-hallucination-detector, dag-convergence-monitor, dag-iteration-detector, dag-feedback-synthesizer

Updated Apr 4, 2026

Install globally

npx skillsauth add curiositech/windags-skills dag-quality

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 4, 2026, 2:05 PM (23.1s, 1 file scanned)

SKILL.md

license: BSL-1.1
name: dag-quality
description: Validates agent outputs against schemas and quality criteria, scores confidence, detects hallucinations, monitors convergence, decides when to iterate, and synthesizes actionable feedback. Use when checking if a node's output is acceptable, scoring confidence, detecting fabricated content, deciding whether to re-execute, or generating improvement feedback. Activate on "validate output", "check quality", "confidence score", "hallucination check", "should we iterate", "improvement feedback". NOT for executing DAGs (use dag-runtime), planning DAGs (use dag-planner), or matching skills (use dag-skills-matcher).
allowed-tools: Read,Write,Edit,Grep,Glob
category: Agent & Orchestration

Install manually

# Clone the repo
git clone https://github.com/curiositech/windags-skills.git

# Create the Claude Code skills folder (global) if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into it
cp -r windags-skills/skills/dag-quality ~/.claude/skills/

Repository

curiositech/windags-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT