skills/clouder0/error-recovery/SKILL.md
Strategies for handling subagent failures with retry logic and escalation patterns.
Pattern for handling subagent failures gracefully with appropriate retry strategies.
| Category | Symptoms | Strategy |
|----------|----------|----------|
| Transient | Timeout, malformed output, parsing error | Simple Retry |
| Context Gap | "I don't have enough information", unclear task | Context Enhancement |
| Complexity | Partial completion, scope creep, tangents | Scope Reduction |
| Boundary/Contract | status: blocked, boundary_violation, contract_change | Escalation |
| Fatal | Repeated failures (3+), fundamental misunderstanding | Abort with Report |
Strategy 1, Simple Retry: for transient failures. Resend the same prompt, up to 3 attempts.
```
# Track attempts
attempts: 0
max_attempts: 3

# On failure
IF attempts < max_attempts:
    attempts += 1
    Task(same_subagent_type, same_model, same_prompt)
ELSE:
    Mark as FAILED, move on
```
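
The retry rule above can be sketched in Python. This is a minimal illustration, not the skill's actual runtime: `run_task` is a hypothetical callable standing in for re-issuing the same `Task(...)` call, and the sketch reads "up to 3 attempts" as three calls in total.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # the "up to 3 attempts" limit from the text


def simple_retry(
    run_task: Callable[[], str], max_attempts: int = MAX_ATTEMPTS
) -> Optional[str]:
    """Call run_task until it succeeds or max_attempts calls have been made.

    run_task is a hypothetical stand-in for re-sending the same prompt to the
    same subagent; it is assumed to raise an exception on a transient failure.
    Returns the output on success, or None to signal "mark as FAILED, move on".
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_task()
        except Exception as error:  # broad catch keeps the sketch short
            print(f"attempt {attempt}/{max_attempts} failed: {error}")
    return None
```

Returning `None` rather than raising keeps the caller free to move on to the next task, which mirrors the ELSE branch of the pseudocode.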
Use when the failure is transient: a timeout, malformed output, or a parsing error.
Strategy 2, Context Enhancement: add more information to help the agent succeed.
```
Task(
  subagent_type: "implementer",
  model: "sonnet",
  prompt: |
    ## PREVIOUS ATTEMPT FAILED
    Error: {error_message}
    Output received: {partial_output}

    ## ADDITIONAL CONTEXT
    Here is more information that may help:
    - Related file: @{additional_file_path}
    - Pattern to follow: {example_pattern}
    - Specific guidance: {clarification}

    ## ORIGINAL TASK
    {original_task_description}

    Output to: {output_path}
)
```
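
As a rough sketch, the prompt template above could be filled in programmatically. The function name `build_enhanced_prompt` is hypothetical; the parameter names follow the placeholders in the template, and how the result is handed to a subagent is left out.

```python
def build_enhanced_prompt(
    error_message: str,
    partial_output: str,
    original_task_description: str,
    output_path: str,
    additional_file_path: str,
    example_pattern: str,
    clarification: str,
) -> str:
    """Fill the context-enhancement template with details of the failed attempt."""
    return "\n".join(
        [
            "## PREVIOUS ATTEMPT FAILED",
            f"Error: {error_message}",
            f"Output received: {partial_output}",
            "",
            "## ADDITIONAL CONTEXT",
            "Here is more information that may help:",
            f"- Related file: @{additional_file_path}",
            f"- Pattern to follow: {example_pattern}",
            f"- Specific guidance: {clarification}",
            "",
            "## ORIGINAL TASK",
            original_task_description,
            "",
            f"Output to: {output_path}",
        ]
    )
```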
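Use when the agent reports that it does not have enough information or that the task is unclear.
Context to add: related files, a pattern to follow, and specific guidance, as in the prompt template above.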
Strategy 3, Scope Reduction: break the failing task into smaller, more manageable pieces.
```
# Original task failed
Task: "Implement full authentication system"

# Split into subtasks
Task(implementer, "Implement password hashing utility")
Task(implementer, "Implement session token generation")
Task(implementer, "Implement login endpoint")
Task(implementer, "Implement logout endpoint")
```
Use when the agent completes only part of the work, lets the scope creep, or goes off on tangents.
Splitting guidelines:
Strategy 4, Escalation: route the task to a specialized agent for resolution.
```
# For boundary violations
Task(
  subagent_type: "contract-resolver",
  model: "sonnet",
  prompt: |
    A task is blocked due to boundary/contract issues.

    Blocked task output: memory/tasks/{task_id}/output.json
    Blocked reason: {blocked_reason}
    Current contracts: {contract_paths}

    Analyze impact and provide resolution.
    Output to: memory/contracts/resolution_{task_id}.json
)
```
Escalation paths:
| Failure Type | Escalate To | Action |
|--------------|-------------|--------|
| blocked_reason: boundary_violation | contract-resolver | Expand boundaries or redesign |
| blocked_reason: contract_change | contract-resolver | Modify contract, re-verify dependents |
| blocked_reason: dependency_issue | executor (self) | Re-check dependency status |
| Repeated implementation failures | architect | Reconsider design approach |
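
The escalation table can be expressed as a small lookup. This is a sketch only: the function name and the `repeated_failures` flag are assumptions for illustration, while the agent names and actions are copied from the table.

```python
from typing import Optional, Tuple

# blocked_reason -> (agent to escalate to, action), copied from the table above
ESCALATION_PATHS = {
    "boundary_violation": ("contract-resolver", "Expand boundaries or redesign"),
    "contract_change": ("contract-resolver", "Modify contract, re-verify dependents"),
    "dependency_issue": ("executor", "Re-check dependency status"),
}


def escalation_target(
    blocked_reason: Optional[str] = None, repeated_failures: bool = False
) -> Tuple[str, str]:
    """Return (agent, action) for a blocked or repeatedly failing task.

    repeated_failures is a hypothetical flag for the table's last row.
    """
    if blocked_reason in ESCALATION_PATHS:
        return ESCALATION_PATHS[blocked_reason]
    if repeated_failures:
        return ("architect", "Reconsider design approach")
    # The table does not cover other cases, so the sketch refuses to guess.
    raise ValueError(f"no escalation path for blocked_reason={blocked_reason!r}")
```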
Strategy 5, Abort with Report: when recovery is not possible, fail gracefully.
```json
{
  "tasks": [
    {
      "id": "{task_id}",
      "status": "failed",
      "failure_reason": "{specific reason}",
      "attempts_made": 3,
      "recovery_attempted": [
        {"strategy": "simple_retry", "result": "same_error"},
        {"strategy": "context_enhancement", "result": "different_error"},
        {"strategy": "scope_reduction", "result": "subtasks_also_failed"}
      ],
      "recommendation": "Task may need architectural redesign"
    }
  ]
}
```
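
A failure report of this shape could be assembled as follows. The helper name `build_failure_report` is hypothetical; the field names are taken from the JSON above, and where the report gets written is not specified by the source.

```python
import json
from typing import Dict, List


def build_failure_report(
    task_id: str,
    failure_reason: str,
    attempts_made: int,
    recovery_attempted: List[Dict[str, str]],
    recommendation: str,
) -> dict:
    """Build a failure report dict with the same fields as the example above."""
    return {
        "tasks": [
            {
                "id": task_id,
                "status": "failed",
                "failure_reason": failure_reason,
                "attempts_made": attempts_made,
                "recovery_attempted": recovery_attempted,
                "recommendation": recommendation,
            }
        ]
    }


report = build_failure_report(
    task_id="task-001",
    failure_reason="Timeout after 120s",
    attempts_made=3,
    recovery_attempted=[{"strategy": "simple_retry", "result": "same_error"}],
    recommendation="Task may need architectural redesign",
)
report_json = json.dumps(report, indent=2)  # serialized form for a state or report file
```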
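Use when the task has failed repeatedly (3+ times) or the agent shows a fundamental misunderstanding of the task.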
```
On Subagent Failure:
│
├─ Is output malformed/empty/timeout?
│   └─ YES → Strategy 1: Simple Retry (up to 3x)
│
├─ Did agent say "unclear" or ask questions?
│   └─ YES → Strategy 2: Context Enhancement
│
├─ Did agent complete partial work?
│   └─ YES → Strategy 3: Scope Reduction
│
├─ Is status "blocked" with boundary/contract reason?
│   └─ YES → Strategy 4: Escalation to contract-resolver
│
├─ Have we tried 3+ strategies already?
│   └─ YES → Strategy 5: Abort with Report
│
└─ Unknown error
    └─ Try Strategy 2 first, then escalate
```
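
The decision tree reads naturally as an ordered chain of checks. In the sketch below, the `failure` dictionary and its keys (`malformed`, `empty`, `timed_out`, `asked_questions`, `partial_work`, `strategies_tried`) are hypothetical, as are the strategy names `escalation` and `abort_with_report`; the other three names match the `retry_strategy` values used elsewhere in this document.

```python
def choose_strategy(failure: dict) -> str:
    """Pick a recovery strategy by walking the decision tree top to bottom."""
    # Strategy 1: malformed, empty, or timed-out output
    if failure.get("malformed") or failure.get("empty") or failure.get("timed_out"):
        return "simple_retry"
    # Strategy 2: the agent said the task was unclear or asked questions
    if failure.get("asked_questions"):
        return "context_enhancement"
    # Strategy 3: the agent completed only part of the work
    if failure.get("partial_work"):
        return "scope_reduction"
    # Strategy 4: blocked with a boundary or contract reason
    if failure.get("status") == "blocked" and failure.get("blocked_reason") in (
        "boundary_violation",
        "contract_change",
    ):
        return "escalation"
    # Strategy 5: three or more strategies already tried
    if len(failure.get("strategies_tried", [])) >= 3:
        return "abort_with_report"
    # Unknown error: try Strategy 2 first, then escalate
    return "context_enhancement"
```

The sketch keeps the tree's order as written. A caller that needs a hard stop on a task that keeps timing out may prefer to test the abort condition first; that reordering is not part of the source.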
Track retry attempts in the execution state file:
```json
{
  "tasks": [
    {
      "id": "task-001",
      "status": "running",
      "attempts": 2,
      "last_error": "Timeout after 120s",
      "retry_strategy": "simple_retry"
    },
    {
      "id": "task-002",
      "status": "running",
      "attempts": 1,
      "last_error": "Needs access to src/config/db.ts",
      "retry_strategy": "context_enhancement",
      "context_added": ["src/config/db.ts", "src/types/config.ts"]
    }
  ]
}
```
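
Updating this state after a failed attempt might look like the following sketch. The function name `record_failure` is hypothetical, and the state is handled as an in-memory dict; reading and writing the actual state file is left to the caller because the source does not give its path.

```python
from typing import List, Optional


def record_failure(
    state: dict,
    task_id: str,
    error: str,
    strategy: str,
    context_added: Optional[List[str]] = None,
) -> dict:
    """Increment the attempt counter and store the last error for one task."""
    for task in state["tasks"]:
        if task["id"] == task_id:
            task["attempts"] = task.get("attempts", 0) + 1
            task["last_error"] = error
            task["retry_strategy"] = strategy
            if context_added:
                # Accumulate files supplied across context-enhancement retries
                task.setdefault("context_added", []).extend(context_added)
            return task
    raise KeyError(f"unknown task id: {task_id}")
```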
```
# Enhanced execution loop
WHILE tasks remain incomplete:
    1. Read state file
    2. Find ready tasks
    3. Spawn ready tasks
    4. Check completed tasks:
       FOR each completed task:
           IF status == pre_complete:
               spawn verifier
           ELIF status == blocked:
               apply Strategy 4 (Escalation)
           ELIF status == failed:
               determine_failure_category()
               apply_appropriate_strategy()
               update_retry_state()
    5. Update state file
    6. IF all verified: EXIT
    7. IF all failed with no recovery: EXIT with failure report
```