skills/cse-state-of-practice/SKILL.md
Review of Cognitive Systems Engineering applications and current practice in safety-critical domains
npx skillsauth add curiositech/windags-skills cse-state-of-practice
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Apply insights from cognitive systems engineering research to design resilient agent architectures, diagnose coordination failures, and encode expert knowledge that performs under pressure.
START: Need to design multi-agent system
│
├─ Is this a well-defined, stable task sequence?
│ ├─ YES → Use pipeline architecture BUT build 3 failure recovery paths
│ └─ NO → Use goal-oriented architecture with alternative methods
│
├─ Does task require expertise under time pressure?
│ ├─ YES → Implement Recognition-Primed Decision Making pattern
│ │ (situation recognition → rapid simulation → action)
│ └─ NO → Standard deliberative architecture acceptable
│
└─ Will humans supervise or intervene?
  ├─ YES → Mandatory: mode transparency + shared state representation
  └─ NO → Focus on agent-to-agent coordination interfaces
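The tree above can be read as three independent questions, each contributing one recommendation. A minimal sketch of that reading follows; the `TaskProfile` type, its field names, and `recommend_architecture` are illustrative assumptions, not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Answers to the three questions in the tree (hypothetical field names)."""
    stable_sequence: bool
    expertise_under_time_pressure: bool
    human_supervised: bool


def recommend_architecture(profile: TaskProfile) -> list[str]:
    """Return one recommendation per branch of the decision tree."""
    recommendations = []
    if profile.stable_sequence:
        recommendations.append("pipeline architecture with 3 failure recovery paths")
    else:
        recommendations.append("goal-oriented architecture with alternative methods")
    if profile.expertise_under_time_pressure:
        recommendations.append("recognition-primed decision making pattern")
    else:
        recommendations.append("standard deliberative architecture")
    if profile.human_supervised:
        recommendations.append("mode transparency + shared state representation")
    else:
        recommendations.append("agent-to-agent coordination interfaces")
    return recommendations
```

Calling `recommend_architecture(TaskProfile(False, True, True))` would yield the goal-oriented, recognition-primed, and mode-transparency recommendations in that order.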
Given complex task to decompose:
│
├─ Can expert describe complete process reliably?
│ ├─ YES → Verify with Critical Decision Method anyway
│ └─ NO → Use structured cognitive task analysis FIRST
│
├─ Are there natural failure/degradation points?
│ ├─ YES → Design alternative paths for each failure mode
│ └─ NO → Suspicious - dig deeper for hidden failure modes
│
└─ Will agents need to adapt methods to context?
  ├─ YES → Separate goals from methods in specification
  └─ NO → Fixed sequence acceptable (rare case)
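Separating goals from methods, with an alternative path for each failure mode, might look like the sketch below. The `achieve` helper, the `AllMethodsFailed` exception, and the two example methods are hypothetical names chosen for illustration.

```python
from typing import Callable, Sequence


class AllMethodsFailed(Exception):
    """Raised when every alternative method for a goal has failed."""


def achieve(goal: str, methods: Sequence[Callable[[], str]]) -> str:
    """Try each method for the goal in order; return the first success."""
    errors = []
    for method in methods:
        try:
            return method()
        except Exception as exc:  # any failure triggers the next alternative
            errors.append(f"{method.__name__}: {exc}")
    raise AllMethodsFailed(f"{goal}: " + "; ".join(errors))


# Two hypothetical methods for the same goal.
def fetch_from_api() -> str:
    raise ConnectionError("API unavailable")


def fetch_from_cache() -> str:
    return "cached record"
```

The goal ("load the record") stays fixed while the method list can vary by context: `achieve("load record", [fetch_from_api, fetch_from_cache])` falls through to the cache when the API is down.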
System failing unexpectedly:
│
├─ Does it work in demos but fail in production?
│ └─ YES → Invariant sequence assumption violated
│
├─ Are humans surprised by agent actions?
│ └─ YES → Automation surprise - check mode transparency
│
├─ Do agents fail when tools/data unavailable?
│ └─ YES → Missing alternative paths in decomposition
│
└─ Does agent understand but can't execute effectively?
  └─ YES → Knowing-doing gap - check situated context
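The diagnosis tree above is effectively a symptom-to-cause lookup. A small sketch, assuming symptoms are reported as string keys (the key names are invented here):

```python
# Symptom keys are hypothetical; the diagnoses are taken from the tree above.
DIAGNOSES = {
    "works_in_demo_fails_in_production": "Invariant sequence assumption violated",
    "humans_surprised_by_agent_actions": "Automation surprise: check mode transparency",
    "fails_when_tools_or_data_unavailable": "Missing alternative paths in decomposition",
    "understands_but_cannot_execute": "Knowing-doing gap: check situated context",
}


def diagnose(symptoms: set[str]) -> list[str]:
    """Return every diagnosis whose symptom was observed, in tree order."""
    return [cause for symptom, cause in DIAGNOSES.items() if symptom in symptoms]
```

Several symptoms can be present at once, so the function returns a list rather than a single cause.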
Symptoms: System works perfectly on the happy path but crashes at the first unexpected condition
Detection Rule: If you hear "we need to handle the edge case" more than once, you're in this anti-pattern
Root Cause: Designed for an idealized sequence, with no alternative paths
Fix: Redesign with goal/method separation and build 3 recovery paths for the most common failures
Symptoms: Agent mimics expert actions but can't adapt to novel situations
Detection Rule: If the expert says "I don't know how I knew that" and the system specification doesn't capture cue recognition, you're in this anti-pattern
Root Cause: Encoded surface behavior without the underlying reasoning structure
Fix: Use the Critical Decision Method to elicit tacit knowledge and situation assessment patterns
Symptoms: Humans/agents surprised when system changes behavior or strategy
Detection Rule: If stakeholders say "I had no idea it was doing that," automation surprise is occurring
Root Cause: State changes not communicated across coordination boundaries
Fix: Make every mode transition an explicit coordination event with shared state updates
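One way to make every mode transition an explicit coordination event is a shared board that records each agent's mode and notifies subscribers on change. This is a sketch under assumptions: `ModeBoard`, `ModeTransition`, and the mode names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ModeTransition:
    agent: str
    old_mode: str
    new_mode: str
    reason: str


class ModeBoard:
    """Shared state: current mode per agent, plus change notifications."""

    def __init__(self) -> None:
        self._modes: dict[str, str] = {}
        self._subscribers: list[Callable[[ModeTransition], None]] = []

    def subscribe(self, callback: Callable[[ModeTransition], None]) -> None:
        self._subscribers.append(callback)

    def current(self, agent: str) -> str:
        return self._modes.get(agent, "unknown")

    def set_mode(self, agent: str, new_mode: str, reason: str) -> Optional[ModeTransition]:
        """Update shared state and notify everyone; no event if nothing changed."""
        old_mode = self.current(agent)
        if old_mode == new_mode:
            return None
        event = ModeTransition(agent, old_mode, new_mode, reason)
        self._modes[agent] = new_mode
        for callback in self._subscribers:
            callback(event)
        return event
```

Because agents can only change mode through `set_mode`, supervisors subscribed to the board see each transition along with the reason for it.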
Symptoms: Agent handoffs produce errors despite individual agents working correctly
Detection Rule: If Agent B consistently misinterprets output from Agent A, their representations have diverged
Root Cause: Agents maintain different models of task/world state
Fix: Explicit shared ontology and interface state verification
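Interface state verification can be as simple as checking every handoff payload against a schema that both agents import. The field names and allowed values below are assumptions made for illustration.

```python
# Hypothetical shared ontology: both agents import these definitions.
SHARED_SCHEMA = {"task_id": str, "status": str, "risk_level": str}
ALLOWED_VALUES = {
    "status": {"draft", "reviewed", "deployed"},
    "risk_level": {"low", "medium", "high"},
}


def verify_handoff(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the handoff is valid."""
    problems = []
    for field_name, expected_type in SHARED_SCHEMA.items():
        if field_name not in payload:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(payload[field_name], expected_type):
            problems.append(f"wrong type for {field_name}")
        elif (field_name in ALLOWED_VALUES
              and payload[field_name] not in ALLOWED_VALUES[field_name]):
            problems.append(f"unknown value for {field_name}: {payload[field_name]!r}")
    return problems
```

The receiving agent would call `verify_handoff` before acting, so a divergent vocabulary (for example a free-text status such as "LGTM") is caught at the boundary instead of being silently misread.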
Symptoms: System follows rules perfectly but fails under pressure or novel conditions
Detection Rule: If the system can't explain WHY it chose an action, only WHAT rule it followed, you're in this anti-pattern
Root Cause: Rule-based architecture deployed for a task that requires expertise
Fix: Upgrade to a recognition-primed or case-based reasoning architecture
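A recognition-primed loop (situation recognition, rapid simulation, action) can be sketched as below. This is a simplified illustration, not a full model of Recognition-Primed Decision Making: cases are matched by cue overlap, and `simulate` is a caller-supplied stand-in for mental simulation. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Case:
    name: str
    cues: frozenset[str]
    action: str


def recognize(observed: set[str], cases: list[Case]) -> list[Case]:
    """Rank known cases by how many of their cues are present."""
    scored = [(len(case.cues & observed), case) for case in cases]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [case for score, case in ranked if score > 0]


def decide(observed: set[str], cases: list[Case],
           simulate: Callable[[str], bool]) -> Optional[tuple[str, str]]:
    """Return (action, why) for the first recognized case whose action
    survives simulation, or None if nothing is recognized or workable."""
    for case in recognize(observed, cases):
        if simulate(case.action):
            matched = sorted(case.cues & observed)
            return case.action, f"recognized {case.name!r} from cues {matched}"
    return None
```

Unlike a bare rule lookup, the returned tuple carries the reason alongside the action, which addresses the WHY/WHAT gap described in the detection rule.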
Scenario: Design system where Agent A writes code, Agent B reviews, Agent C handles deployment
Initial Design (Flawed):
CSE Analysis Reveals:
Improved Design:
Agent A: Code Generation
├─ Includes intention metadata (what problem is being solved, why this approach)
├─ Context flags (urgency, risk level, author confidence)
Agent B: Recognition-Primed Review
├─ Situation assessment (code type, risk factors, author patterns)
├─ Pattern matching against failure libraries
├─ Graduated response: approve/iterate/escalate/reject
Agent C: Context-Sensitive Deployment
├─ Deployment strategy adapts to review confidence + context flags
├─ Rollback paths pre-planned based on risk assessment
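The improved design can be sketched as data passed between the three agents. The four verdicts come from the design above; the field names, thresholds, and strategy strings are assumptions picked only to make the example concrete.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    APPROVE = "approve"
    ITERATE = "iterate"
    ESCALATE = "escalate"
    REJECT = "reject"


@dataclass(frozen=True)
class Submission:
    """Agent A's output: code intent plus context flags."""
    intent: str               # what problem is being solved
    rationale: str            # why this approach
    urgency: str
    risk_level: str           # "low", "medium", or "high"
    author_confidence: float  # 0.0 to 1.0


def review(sub: Submission, matched_failure_patterns: int) -> Verdict:
    """Agent B: graduated response (thresholds are illustrative)."""
    if matched_failure_patterns >= 3:
        return Verdict.REJECT
    if sub.risk_level == "high" and sub.author_confidence < 0.5:
        return Verdict.ESCALATE
    if matched_failure_patterns > 0:
        return Verdict.ITERATE
    return Verdict.APPROVE


def deployment_strategy(sub: Submission, verdict: Verdict) -> str:
    """Agent C: adapt the rollout to the review outcome and context flags."""
    if verdict is not Verdict.APPROVE:
        return "hold"
    if sub.risk_level == "high":
        return "canary with pre-planned rollback"
    return "standard rollout with pre-planned rollback"
```

Keeping the context flags on the submission means Agent C reads the same risk assessment Agent B used, rather than reconstructing its own.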
Key Decision Points Applied:
Problem: AI customer service agent handles routine queries well but escalates too often on complex issues
Diagnosis Process:
Root Cause: Behavioral specification fallacy. The system was trained on successful interaction transcripts but is missing the expert reasoning about when and how to adapt.
Solution:
Task completion checklist for CSE-informed agent design:
[ ] Alternative Path Coverage: System has defined recovery paths for 3 most likely failure modes
[ ] Situation Recognition: Agent can classify situation type, not just process inputs
[ ] Mode Transparency: All state changes are observable by supervisors/coordinators
[ ] Representation Alignment: Agent handoffs use explicit, shared state models
[ ] Expertise Stage Match: Architecture complexity matches required expertise level
[ ] Tacit Knowledge Elicitation: Used structured methods (not just self-report) for expert knowledge
[ ] Context Sensitivity: System adapts methods to situational factors
[ ] Knowing-Doing Verification: Tested execution capability, not just comprehension
[ ] Coordination Failure Recovery: System handles representational divergence gracefully
[ ] Automation Surprise Prevention: Mode changes communicated across all coordination boundaries
This skill is NOT for:
Use OTHER skills for:
This skill IS specifically for: