skills/dag-skills-matcher/SKILL.md
Matches natural language task descriptions to appropriate skills using semantic similarity, ranks candidates by fit and performance history, and maintains the skill catalog. Use when assigning skills to DAG nodes, searching for the right skill for a task, ranking competing skills, or browsing the skill catalog. Activate on "find skill", "match skill", "which skill", "skill for this task", "skill catalog", "rank skills", "best skill". NOT for executing DAGs (use dag-runtime), creating skills (use skill-architect), or grading skills (use skill-grader).
npx skillsauth add curiositech/windags-skills dag-skills-matcher
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Matches tasks to skills, ranks candidates, and maintains the skill catalog.
task_description → extract_intent()
├─ fit_score ≥ 0.5 for top candidate?
│ ├─ YES → check NOT clauses
│ │ ├─ passes → assign skill
│ │ └─ fails → try next candidate
│ └─ NO → retry with broader search terms
│ └─ still no fit_score ≥ 0.5?
│ └─ escalate to skill-architect (new skill needed)
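The tree above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the `Candidate` class, the `not_clauses` field, and the substring-based NOT-clause check are all hypothetical, and the sketch assumes that fallback candidates must also meet the 0.5 threshold.

```python
from dataclasses import dataclass, field
from typing import List, Optional

FIT_THRESHOLD = 0.5  # threshold taken from the decision tree


@dataclass
class Candidate:
    name: str
    fit_score: float
    # Hypothetical: phrases from the skill's NOT clauses, stored as plain strings.
    not_clauses: List[str] = field(default_factory=list)


def violates_not_clause(task: str, candidate: Candidate) -> bool:
    """Assumed check: a NOT clause fails if its phrase appears in the task text."""
    text = task.lower()
    return any(clause.lower() in text for clause in candidate.not_clauses)


def select_skill(task: str, candidates: List[Candidate]) -> Optional[str]:
    """Return the best-fitting skill that passes its NOT clauses, or None.

    None means the caller should retry with broader search terms and, if that
    also fails, escalate to skill-architect.
    """
    ranked = sorted(candidates, key=lambda c: c.fit_score, reverse=True)
    for candidate in ranked:
        if candidate.fit_score < FIT_THRESHOLD:
            break  # remaining candidates are below the threshold too
        if not violates_not_clause(task, candidate):
            return candidate.name
    return None
```

A `None` result maps to the NO branch of the tree: broaden the search once, then hand off to skill-architect.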
task_complexity = count(technical_terms, domain_words, constraints)
├─ complexity ≤ 3 → keyword_search_only
├─ 4-8 → keyword + semantic_similarity
└─ >8 → full_pipeline (keyword + semantic + domain_tags + thompson)
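As an illustration, the complexity routing above might be expressed like this. The sketch assumes that `count(...)` means the sum of the three counts; the function names and stage labels are hypothetical.

```python
from typing import List


def task_complexity(technical_terms: int, domain_words: int, constraints: int) -> int:
    """Assumption: complexity is the sum of the three counts."""
    return technical_terms + domain_words + constraints


def search_stages(complexity: int) -> List[str]:
    """Map a complexity score to the search stages listed in the tree."""
    if complexity <= 3:
        return ["keyword"]
    if complexity <= 8:
        return ["keyword", "semantic_similarity"]
    return ["keyword", "semantic_similarity", "domain_tags", "thompson"]
```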
candidate_count → ranking_strategy
├─ 1-2 candidates → simple fit_score ranking
├─ 3-5 candidates → weighted: fit(0.5) + elo(0.3) + cost(0.2)
└─ >5 candidates → full scoring with thompson sampling
task_priority + budget_constraints → selection_criteria
├─ high_priority + unlimited_budget → maximize fit_score + elo
├─ medium_priority → balanced scoring (default weights)
└─ low_priority + cost_sensitive → weight cost(0.5) + fit(0.3) + elo(0.2)
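The two weight profiles with explicit numbers can be sketched as a simple weighted sum. The source does not say how Elo and cost are normalized, so this sketch assumes the caller supplies both already scaled to [0, 1], with a higher `cost_norm` meaning a cheaper skill. The profile names are hypothetical.

```python
# Weights taken from the trees above; profile names are illustrative only.
WEIGHTS = {
    "default": {"fit": 0.5, "elo": 0.3, "cost": 0.2},
    "cost_sensitive": {"fit": 0.3, "elo": 0.2, "cost": 0.5},
}


def weighted_score(fit: float, elo_norm: float, cost_norm: float,
                   profile: str = "default") -> float:
    """Combine fit, normalized Elo, and normalized cost into one score.

    Assumption: all three inputs are in [0, 1] and higher is better.
    """
    w = WEIGHTS[profile]
    return w["fit"] * fit + w["elo"] * elo_norm + w["cost"] * cost_norm


def best_candidate(candidates: dict, profile: str = "default") -> str:
    """candidates maps a skill name to a (fit, elo_norm, cost_norm) tuple."""
    return max(candidates, key=lambda name: weighted_score(*candidates[name], profile))
```

Under these assumptions, the same candidate set can produce a different winner once the cost-sensitive profile is applied, which is the intended effect of the low-priority branch.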
Symptoms: fit_scores consistently < 0.3, many "skill not found" escalations
Detection: If >20% of searches in 24h escalate to skill-architect
Fix: Update skill descriptions with recent task language patterns
Symptoms: Same 2-3 skills always selected, no skill performance comparison data
Detection: If top skill selection rate > 80% for any domain over 100 tasks
Fix: Increase Thompson sampling beta parameter by 10%, force exploration
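For readers unfamiliar with the technique, Thompson sampling over skill outcomes might look like the sketch below. The source does not specify which variant the matcher uses; this assumes a Beta-Bernoulli model in which each skill tracks accepted and rejected uses, and `thompson_pick` is a hypothetical name.

```python
import random
from typing import Dict, Optional, Tuple


def thompson_pick(stats: Dict[str, Tuple[int, int]],
                  rng: Optional[random.Random] = None) -> str:
    """Pick a skill by sampling from each skill's Beta posterior.

    stats maps a skill name to (successes, failures). Each skill gets one draw
    from Beta(successes + 1, failures + 1), and the highest draw wins. Skills
    with little history have wide posteriors, so they still get picked
    occasionally, which is what provides exploration.
    """
    rng = rng or random.Random()
    draws = {
        name: rng.betavariate(successes + 1, failures + 1)
        for name, (successes, failures) in stats.items()
    }
    return max(draws, key=draws.get)
```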
Symptoms: Skills assigned to incompatible tasks, high downstream failure rates
Detection: If downstream acceptance_rate < 0.7 for any skill
Fix: Strengthen NOT-clause checking, add regex patterns for exclusion terms
Symptoms: Skills matched on superficial word similarity, not actual capability
Detection: If fit_score > 0.7 but downstream acceptance_rate < 0.5
Fix: Add domain-specific negative embeddings, weight keyword matching higher
Symptoms: Always selecting cheapest skills, degrading output quality
Detection: If avg_cost_per_use drops >30% while acceptance_rate drops >15%
Fix: Set minimum fit_score threshold (0.5), reject candidates below threshold regardless of cost
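The five detection rules could be checked together in one pass, as in this sketch. The `MatcherMetrics` fields and the flag names are hypothetical labels chosen for illustration; only the thresholds come from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MatcherMetrics:
    # All field names are illustrative; rates and drops are fractions in [0, 1].
    escalation_rate_24h: float       # share of searches escalated to skill-architect
    top_skill_selection_rate: float  # selection share of the most-picked skill in a domain
    tasks_observed: int              # tasks behind the selection-rate figure
    min_acceptance_rate: float       # lowest downstream acceptance rate of any skill
    high_fit_acceptance_rate: float  # acceptance rate among matches with fit_score > 0.7
    cost_drop: float                 # relative drop in avg_cost_per_use
    acceptance_drop: float           # relative drop in acceptance_rate


def detect_failure_modes(m: MatcherMetrics) -> List[str]:
    """Return a flag for each detection rule that fires."""
    flags = []
    if m.escalation_rate_24h > 0.20:
        flags.append("stale_descriptions")
    if m.tasks_observed >= 100 and m.top_skill_selection_rate > 0.80:
        flags.append("selection_monopoly")
    if m.min_acceptance_rate < 0.7:
        flags.append("not_clause_violations")
    if m.high_fit_acceptance_rate < 0.5:
        flags.append("superficial_matching")
    if m.cost_drop > 0.30 and m.acceptance_drop > 0.15:
        flags.append("cost_over_quality")
    return flags
```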
Task: "Build a scikit-learn classifier for customer churn prediction with hyperparameter tuning"
Process:
sklearn-tuner (fit=0.85, elo=1750), automl-skill (fit=0.72, elo=1820), python-ml-basic (fit=0.65, elo=1680)
Novice miss: Would pick automl-skill due to its higher Elo, missing that sklearn-tuner is the more specific match
Expert catch: Recognizes that the hyperparameter tuning keyword strongly favors sklearn-tuner despite its slightly lower Elo
Task: "Review this code for issues"
Process:
code-review-general (fit=0.55)
Expert insight: Recognizes that vague tasks need either clarification or new specialized skills
This skill should NOT be used for:
Creating skills: use skill-architect instead
Grading skills: use skill-grader instead
Executing DAGs: use dag-runtime instead
Use skill-configurator instead
Use skill-lifecycle-manager instead
Domain boundaries:
Use skill-marketplace instead
Use skill-analytics instead
Use skill-federation instead