skills/windags-architect/SKILL.md
Build WinDAGs — the orchestration platform where AI agents accumulate genuine expertise through DAGs of skillful agents. Covers DAG design, execution engines, meta-DAG architecture, skill selection, dynamic mutation, visualization, and deployment. Activate on "windags", "agent DAG", "DAG of agents", "workflow orchestration", "agent pipeline", "dynamic DAG", "meta-DAG", "build windags", "implement windags". NOT for understanding WHY decisions were made (use windags-avatar), creating individual skills (use skill-architect), or managing skill libraries (use windags-librarian).
npx skillsauth add curiositech/windags-skills windags-architectInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Build WinDAGs: directed acyclic graphs of skillful agents that accumulate genuine expertise. Each node is an agent with curated skills; each edge is a typed dependency. The system builds DAGs, executes them in waves, mutates them at runtime, evaluates quality across four layers, and crystallizes reusable skills from execution data.
Problem type assessment:
├── Simple task (< 5 nodes, single domain)
│ ├── If user has CLI access → Use local mode
│ └── If web interface needed → Use embedded mode
├── Medium task (5-20 nodes, 2-3 domains)
│ ├── If team collaboration needed → Use web mode
│ ├── If execution history important → Use web mode
│ └── If isolated/private → Use local mode
└── Complex task (20+ nodes, multiple domains)
├── If distributed execution needed → Use web mode
├── If real-time monitoring required → Use web mode
└── If cost optimization critical → Use web mode with smart routing
Problem complexity:
├── Sequential chain (A→B→C)
│ ├── If error-prone steps → Add approval gates between nodes
│ └── If simple pipeline → Use basic data-flow edges
├── Fan-out pattern (A→[B,C,D]→E)
│ ├── If independent work streams → Use parallel execution
│ ├── If failure domains overlap → Isolate to separate waves
│ └── If results must merge → Add aggregator node
├── Iterative refinement (A→B→C→B)
│ ├── If quality improvement loop → Use loop_back mutation
│ ├── If human oversight needed → Add approval gates
│ └── If automatic iteration → Use quality threshold triggers
└── Multi-stage validation (parallel checks)
├── If binary validation → Use Floor/Wall evaluation only
├── If process quality matters → Enable Ceiling evaluation
└── If stress testing needed → Monitor Envelope metrics
Task certainty assessment:
├── COMMITTED (implementation clear)
│ ├── If skill signature known → Direct skill assignment
│ ├── If template available → Use template with parameters
│ └── If standard pattern → Apply seed template
├── TENTATIVE (approach unclear)
│ ├── If 2-3 viable approaches → Create parallel exploration nodes
│ ├── If dependency on prior results → Mark as vague node
│ └── If expert consultation needed → Add human approval gate
└── EXPLORATORY (research needed)
├── If domain expertise required → Use research-synthesis template
├── If multiple unknowns → Break into sub-DAG
└── If high uncertainty → Add PreMortem analysis
Risk assessment:
├── Node-level breakers
│ ├── If node failure rate > 20% → Set 3-attempt limit with escalation
│ ├── If expensive model calls → Set cost threshold per node
│ └── If time-sensitive → Set timeout with fallback skill
├── Skill-level breakers
│ ├── If skill success rate < 60% → Disable for 1 hour, use backup
│ ├── If skill shows degrading performance → Trigger Thompson sampling reset
│ └── If skill causes cascade failures → Add to temporary blacklist
└── Model-level breakers
├── If API rate limits hit → Route to secondary provider
├── If model availability < 90% → Use tier fallback (Sonnet→Haiku)
└── If cost drift > 30% of budget → Switch to cheaper model tier
Execution state analysis:
├── Quality below threshold (< 0.6 floor score)
│ ├── If single node failure → Try replace_node with different skill
│ ├── If edge protocol mismatch → Try add_edge for missing dependency
│ └── If topology issue → Try split_parallel for ambiguous task
├── Resource constraints hit
│ ├── If cost exceeds budget → Try remove_node for redundant tasks
│ ├── If time exceeds deadline → Try split_parallel for faster execution
│ └── If model availability issues → Try replace_node with different model tier
└── Escalation ladder exhausted
├── If 3+ mutation attempts failed → escalate_human with full context
├── If critical path blocked → escalate_human with options
└── If novel failure pattern → escalate_human for learning
Detection: Multiple nodes fail simultaneously with correlated error patterns Symptoms: Wave fails entirely when one node fails; cascade failure rate > 15% Fix: Implement failure domain isolation (BC-PLAN-003). Separate nodes using same provider/model/skill into different waves. Add circuit breakers at provider level.
Detection: >30% of nodes remain vague after 2 wave completions; planning time exceeds execution time Symptoms: "Analysis paralysis" in Wave 1; Decomposer creates vague nodes instead of commitments Fix: Force commitment threshold: If Pass 1 recognition < 0.6, escalate to human. Use domain meta-skills to add structure. Apply seed templates for common patterns.
Detection: New skills selected despite poor performance; selection algorithm ignores obvious quality signals Symptoms: Repeatedly selecting untested skills over proven ones; ignoring Haiku ranking confidence Fix: Implement warm-start Beta priors from skill signature similarity. Trust Haiku ranking at cold start (Alpha=5, Beta=1). Only apply Thompson perturbation after 10+ executions.
Detection: Mutator creates mutations that trigger more mutations; execution never stabilizes Symptoms: >5 mutation cycles in single DAG; Mutator modifies its own topology Fix: Implement mutation depth limits (max 3 per DAG). Prevent meta-agent self-modification. Add "mutation storm" detection with automatic escalation.
Detection: Stage 2 (Ceiling) evaluation costs exceed node execution costs; evaluation time > 40% of total runtime
Symptoms: Expensive quality checks on trivial nodes; Ceiling evaluation triggered inappropriately
Fix: Tune conditional trigger: failureProbability × downstreamWaste > reviewCost. Skip Ceiling for nodes with < $0.01 downstream impact. Use Haiku for Ceiling evaluation on low-stakes nodes.
Detection: Embedding narrowing bypassed; Thompson sampling applied to full 191-skill library Symptoms: Selection time > 2 seconds per node; poor skill matches despite good library coverage Fix: Enforce 3-step cascade: embeddings → Haiku → Thompson. Never skip Step 1 unless library < 15 skills. Log cascade timing and enforce budget limits per step.
Detection: Nodes in Wave N depend on incomplete nodes from Wave N+1; topological sort failures Symptoms: Deadlock during wave execution; "circular dependency" errors in scheduler Fix: Validate wave assignments with strict topological ordering. Prevent vague nodes from creating forward dependencies. Use Kahn's algorithm validation before wave execution starts.
Problem: "Review this pull request for security issues and code quality"
Decision Process:
DAG Construction:
const dag = builder('code-review-pr')
.skillNode('security-scan', 'security-auditor') // Wave 0
.skillNode('quality-check', 'code-reviewer') // Wave 1
.dependsOn('security-scan')
.skillNode('summary', 'review-synthesizer') // Wave 2
.dependsOn('quality-check')
.approvalGate('human-review', { // Wave 3
prompt: 'Approve changes?',
options: [
{ id: 'approve', label: 'LGTM', action: 'approve' },
{ id: 'revise', label: 'Needs work', action: 'revise', branchTo: 'quality-check' }
]
}).dependsOn('summary')
Execution Flow:
Expert vs Novice: Expert catches that security scan should use Sonnet (complex reasoning), while quality-check can use Haiku (pattern matching). Novice uses same model for all nodes.
Problem: "Design and implement a caching layer for our API"
Decision Process:
Initial DAG (with vague nodes):
const dag = builder('api-caching-layer')
.skillNode('requirements', 'system-analyst') // Wave 0 - COMMITTED
.vagueNode('cache-strategy', { // Wave 1 - TENTATIVE
role_description: 'Choose caching strategy (Redis/Memcached/in-memory)',
dependency_list: ['requirements']
})
.vagueNode('implementation', { // Wave 2 - EXPLORATORY
role_description: 'Implement chosen caching solution',
dependency_list: ['cache-strategy']
})
.skillNode('integration', 'integration-tester') // Wave 3 - COMMITTED
.dependsOn('implementation')
Wave-by-Wave Resolution:
Wave 0: system-analyst produces requirements (recognition = 0.95 → plan Wave 1 immediately)
Wave 1: cache-strategy node needs resolution
distributed-systems-architect skillWave 2: implementation node needs resolution
redis-implementation skillExpert vs Novice: Expert recognizes that cache-strategy requires trade-off analysis and uses expensive model (Sonnet) for decision quality. Novice assigns cheap model and gets poor architectural decisions. Expert also anticipates implementation complexity and pre-plans for split_parallel mutation.
Problem: "Analyze customer churn data and recommend retention strategies"
Scenario: Mid-execution, data-analyzer node fails repeatedly due to malformed data
Initial Failure:
Mutation Decision Tree:
Failure analysis:
├── Error type: Data format issue (not reasoning failure)
│ ├── Try replace_node with data-preprocessing skill
│ └── If still fails → Try add_node for format conversion
├── Check dependencies: Previous node output format unknown
│ ├── Try add_edge for explicit format specification
│ └── If format mismatch → Try split_parallel for multiple parsers
└── Resource check: Still within budget and time limits
└── Apply mutation, don't escalate yet
Mutation Applied: replace_node
Quality Check: Stage 2 evaluation triggered (high downstream impact)
Expert vs Novice: Expert recognizes data format issues early and chooses data-preprocessing skill. Expert also sets appropriate circuit breaker thresholds (3 attempts for data issues, 1 attempt for API issues). Novice doesn't differentiate error types and applies wrong mutation.
Do NOT use windags-architect for:
windags-avatar for ADR provenance, tradition attribution, and constitutional detailsskill-architect for YAML skill creation, L3 procedural content, and skill validationwindags-librarian for skill discovery, curation, and library organizationmermaid-graph-renderer for flowcharts and visual documentationcognitive-debugger for task analysis and error diagnosisllm-cost-optimizer for model selection and provider routingwindags-observer for execution monitoring and alertingDelegate instead when:
windags-avatarskill-architectwindags-librarianwindags-observertools
Building resilient distributed systems with circuit breakers, retries with full-jitter exponential backoff, retry budgets (per-request 3-attempt + per-client 10% ratio per Google SRE), deadline propagation, and the cascading-failure math (4 layers × 3 retries = 64x amplification). Grounded in Resilience4j, Microsoft Cloud Patterns, AWS Architecture Blog (Marc Brooker), and Google SRE Book.
testing
Designing HTTP cache headers that work correctly across browsers, CDNs, and shared proxies — `Cache-Control` directives per RFC 9111, `stale-while-revalidate` and `stale-if-error` per RFC 5861, the Vary header for varying responses, and surrogate keys for tag-based purging. Grounded in IETF RFCs and Cloudflare/Fastly docs.
development
Use when designing or fixing a Content Security Policy on a real site, choosing between nonce-based and hash-based CSP, adding strict-dynamic, debugging "Refused to execute inline script" errors, deploying CSP in report-only mode first, configuring report-to / report-uri, or auditing an existing policy for unsafe-inline / unsafe-eval / wildcards. Triggers: "CSP blocks legitimate inline script", strict-dynamic, nonce-{RANDOM}, sha256-{HASH}, object-src none, base-uri none, frame-ancestors, Trusted Types, X-Content-Security-Policy obsolete, report-only vs enforced. NOT for general HTTP security headers (HSTS, COOP/COEP), Trusted Types deep dive, CORS configuration, or building a WAF.
tools
Choosing and operating an HTTP API versioning strategy that doesn't break clients — Stripe's date-based pinned versions, the Deprecation/Sunset header pair (RFC 9745 + RFC 8594), URI vs header vs media-type approaches, and the version-transformer pattern. Grounded in Stripe's published architecture and IETF RFCs.