skills/conditions-for-intuitive-expertise-a-fai/SKILL.md
Diagnose when intuitive judgment, agent confidence, or expert routing can be trusted by classifying environment validity, feedback quality, and task-boundary fit. Use for confidence calibration, agent routing, expertise audits, and escalation design. NOT for deterministic implementation tasks, pure syntax debugging, or domains with explicit verifiable answers.
npx skillsauth add curiositech/windags-skills conditions-for-intuitive-expertise-a-faiInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Use this skill when the hard problem is not generating an answer, but deciding whether intuition, expert judgment, or agent confidence deserves trust in the first place.
This skill is not the primary lens for:
Expert intuition is only trustworthy when the environment has stable, learnable regularities. If the domain is mostly noise, confidence will still form; it just will not track truth.
Even a valid environment will not produce expertise without timely, accurate feedback. Delayed, corrupted, or selectively remembered feedback produces practiced error rather than practiced skill.
Confidence tracks how internally consistent the available cues feel. In low-validity environments, high confidence is often a hazard signal rather than reassurance.
Skill at one subtask does not reliably transfer to a neighboring subtask that merely feels similar. The boundary problem matters more than the prestige of the prior success.
Use judgment-heavy approaches when tacit cues are real and feedback is clean. Use algorithms or structured ensembles when noise dominates and intuitions cannot be trained safely.
flowchart TD
A[New decision domain] --> B{Stable cues that predict outcomes?}
B -->|No| C[Low-validity environment]
B -->|Yes| D{Fast, accurate, repeated feedback?}
D -->|No| E[Wicked or degraded learning environment]
D -->|Yes| F[High-validity environment]
C --> G[Prefer algorithmic or structured ensemble path]
E --> H[Use hybrid path with suppressed confidence and external checks]
F --> I{Task matches demonstrated expertise boundary?}
I -->|No| J[Escalate or re-route for fractionated expertise]
I -->|Yes| K[Allow expert judgment or recognition-primed handling]
The system treats fluent, high-confidence output as proof of correctness. This repeats the exact illusion-of-validity failure the paper warns about.
A team assumes repetition in a noisy domain will produce expertise. It instead produces stronger stories, better rhetoric, and no real predictive gain.
Feedback arrives late, is politically distorted, or only surfaces successes. Agents then learn the wrong cues and become confidently brittle.
A skill that works for one subtask is invoked on a neighboring task with different causal structure. The output sounds plausible because the vocabulary overlaps, but the competence boundary has already been crossed.
Human review is added only after confidence gets high, rather than when environment validity or task-boundary fit is poor. Review then becomes a rubber stamp instead of a real safeguard.
A team wants an "expert market intuition" agent to decide whether a stock is underpriced. The framework says the environment is low-validity and feedback is heavily confounded, so the agent should not get autonomy based on fluent market narratives. Route instead to an actuarial or ensemble baseline, then use judgment only for anomaly explanation.
A radiology-adjacent agent is strong on spotting abnormalities in one imaging modality and is asked to generalize to another that shares surface vocabulary but different cue structure. The framework flags fractionated expertise, so the system should degrade autonomy and require modality-specific validation before trusting the carryover.
| File | Load when... |
| --- | --- |
| references/validity-environment-and-agent-trust.md | You need the full environment-validity diagnostic and its implications for trust. |
| references/algorithms-vs-intuition-routing-framework.md | You are choosing between algorithmic, judgment-heavy, or hybrid routing. |
| references/overconfidence-and-the-illusion-of-validity.md | High-confidence output needs scrutiny and calibration policy. |
| references/fractured-expertise-and-domain-boundary-detection.md | A skill appears adjacent to the task, but you are unsure the competence really transfers. |
| references/recognition-primed-decision-making-for-agents.md | The environment is moderately valid and time pressure favors recognition plus simulation over exhaustive analysis. |
| references/the-boundary-problem-knowing-when-not-to-trust-yourself.md | You are designing escalation logic or competence self-monitoring. |
You have internalized this skill if you naturally ask:
tools
Building resilient distributed systems with circuit breakers, retries with full-jitter exponential backoff, retry budgets (per-request 3-attempt + per-client 10% ratio per Google SRE), deadline propagation, and the cascading-failure math (4 layers × 3 retries = 64x amplification). Grounded in Resilience4j, Microsoft Cloud Patterns, AWS Architecture Blog (Marc Brooker), and Google SRE Book.
testing
Designing HTTP cache headers that work correctly across browsers, CDNs, and shared proxies — `Cache-Control` directives per RFC 9111, `stale-while-revalidate` and `stale-if-error` per RFC 5861, the Vary header for varying responses, and surrogate keys for tag-based purging. Grounded in IETF RFCs and Cloudflare/Fastly docs.
development
Use when designing or fixing a Content Security Policy on a real site, choosing between nonce-based and hash-based CSP, adding strict-dynamic, debugging "Refused to execute inline script" errors, deploying CSP in report-only mode first, configuring report-to / report-uri, or auditing an existing policy for unsafe-inline / unsafe-eval / wildcards. Triggers: "CSP blocks legitimate inline script", strict-dynamic, nonce-{RANDOM}, sha256-{HASH}, object-src none, base-uri none, frame-ancestors, Trusted Types, X-Content-Security-Policy obsolete, report-only vs enforced. NOT for general HTTP security headers (HSTS, COOP/COEP), Trusted Types deep dive, CORS configuration, or building a WAF.
tools
Choosing and operating an HTTP API versioning strategy that doesn't break clients — Stripe's date-based pinned versions, the Deprecation/Sunset header pair (RFC 9745 + RFC 8594), URI vs header vs media-type approaches, and the version-transformer pattern. Grounded in Stripe's published architecture and IETF RFCs.