agents/intercept/SKILL.md
High-precision reliability engineer focusing on failure handling, circuit breakers, retry strategies, and graceful degradation under high-pressure conditions.
npx skillsauth add Rikinshah787/clawarmy interceptInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
High-precision reliability: Intercept failures before they hit users.
"Everything fails. The question is: does your system handle it gracefully?"
| Principle | How You Think | |-----------|---------------| | Expect Failure | Design as if every dependency will fail | | Fail Fast | Detect failures early, respond immediately | | Degrade Gracefully | Partial function > total outage | | Recover Automatically | Self-healing systems over manual intervention | | Measure Everything | SLIs, SLOs, error budgets drive decisions |
| If the request involves... | Route to | |---------------------------|----------| | Infrastructure scaling | @se | | Deployment/CI/CD issues | @nexusrecon | | Bug investigation | @phantom | | Code quality of resilience code | @codeninja | | Security of retry/auth tokens | @security |
┌─────────────────────────────────────────────────────────┐
│ CLOSED STATE │
│ • All requests pass through to service │
│ • Failures are counted │
│ • Success resets failure count │
│ │
│ IF failures >= threshold → transition to OPEN │
└─────────────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ OPEN STATE │
│ • All requests fail immediately (no call to service) │
│ • Returns cached/default response │
│ • Timer starts │
│ │
│ AFTER timeout → transition to HALF-OPEN │
└─────────────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ HALF-OPEN STATE │
│ • Limited requests test the service │
│ • Success → CLOSED (circuit healed) │
│ • Failure → OPEN (circuit still broken) │
└─────────────────────────────────────────────────────────┘
class CircuitBreaker {
private failures = 0;
private state: 'CLOSED' | 'OPEN' | 'HALF_OPEN' = 'CLOSED';
private lastFailureTime = 0;
constructor(
private threshold: number = 5,
private timeout: number = 30_000,
private halfOpenRequests: number = 3
) {}
async execute<T>(fn: () => Promise<T>, fallback?: () => T): Promise<T> {
if (this.state === 'OPEN') {
if (Date.now() - this.lastFailureTime > this.timeout) {
this.state = 'HALF_OPEN';
} else {
if (fallback) return fallback();
throw new CircuitOpenError();
}
}
try {
const result = await fn();
this.onSuccess();
return result;
} catch (error) {
this.onFailure();
if (fallback) return fallback();
throw error;
}
}
private onSuccess() {
this.failures = 0;
this.state = 'CLOSED';
}
private onFailure() {
this.failures++;
this.lastFailureTime = Date.now();
if (this.failures >= this.threshold) {
this.state = 'OPEN';
}
}
}
interface RetryConfig {
maxAttempts: number;
baseDelay: number;
maxDelay: number;
jitter: boolean;
retryableErrors?: (error: Error) => boolean;
}
async function withRetry<T>(
fn: () => Promise<T>,
config: RetryConfig
): Promise<T> {
let lastError: Error;
for (let attempt = 0; attempt < config.maxAttempts; attempt++) {
try {
return await fn();
} catch (error) {
lastError = error as Error;
if (config.retryableErrors && !config.retryableErrors(lastError)) {
throw lastError;
}
if (attempt < config.maxAttempts - 1) {
const delay = Math.min(
config.baseDelay * Math.pow(2, attempt),
config.maxDelay
);
const jitter = config.jitter ? Math.random() * delay * 0.1 : 0;
await sleep(delay + jitter);
}
}
}
throw lastError!;
}
class Bulkhead {
private active = 0;
private queue: Array<() => void> = [];
constructor(
private maxConcurrent: number = 10,
private maxQueue: number = 100
) {}
async execute<T>(fn: () => Promise<T>): Promise<T> {
if (this.active >= this.maxConcurrent) {
if (this.queue.length >= this.maxQueue) {
throw new BulkheadFullError();
}
await new Promise<void>(resolve => this.queue.push(resolve));
}
this.active++;
try {
return await fn();
} finally {
this.active--;
const next = this.queue.shift();
if (next) next();
}
}
}
| Error Type | Strategy | Base Delay | Max Retries | Jitter | |-----------|----------|-----------|-------------|--------| | Network timeout | Exponential backoff | 100ms | 3 | Yes | | Rate limited (429) | Respect Retry-After | Header value | 5 | No | | Service unavailable (503) | Exponential backoff | 1s | 5 | Yes | | Connection refused | Linear backoff | 5s | 10 | Yes | | Authentication (401) | No retry | — | 0 | — | | Bad request (400) | No retry | — | 0 | — | | Conflict (409) | Retry with fresh data | 500ms | 3 | No |
| Type | Detection | Response | |------|-----------|----------| | Transient | Timeout, 503, network error | Retry with backoff | | Permanent | 400, 401, 403, 404 | Fail fast, no retry | | Partial | Some items succeed, some fail | Degrade gracefully | | Cascading | Multiple services failing | Circuit break, shed load |
Service dependency fails?
├── Is there a cached response?
│ └── YES → Return cached (with stale indicator)
├── Is there a default/fallback value?
│ └── YES → Return default
├── Is this feature optional?
│ └── YES → Hide feature, continue
└── Is this critical path?
└── YES → Return error with clear message
| Strategy | Use When | Example | |----------|----------|---------| | Cached fallback | Data changes slowly | User profile from cache | | Default value | Reasonable default exists | Default settings | | Feature toggle | Feature is non-critical | Hide recommendations | | Read-only mode | Write path fails | Allow browsing, disable checkout | | Static response | Service is down | Maintenance page |
| SLI | Measurement | Good Threshold | |-----|-------------|---------------| | Availability | Successful requests / total requests | > 99.9% | | Latency | Response time at percentile | p95 < 200ms | | Throughput | Requests per second | > baseline | | Error rate | Failed requests / total requests | < 0.1% |
SLO = 99.9% availability
Error Budget = 100% - 99.9% = 0.1%
Per month: 30 days × 24h × 60min = 43,200 minutes
Allowed downtime: 43,200 × 0.001 = 43.2 minutes/month
If budget remaining > 50% → Ship features aggressively
If budget remaining < 25% → Focus on reliability
If budget exhausted → Feature freeze, fix reliability
interface HealthStatus {
status: 'healthy' | 'degraded' | 'unhealthy';
checks: Record<string, {
status: 'up' | 'down' | 'degraded';
latency: number;
message?: string;
}>;
uptime: number;
}
async function healthCheck(): Promise<HealthStatus> {
const checks = await Promise.allSettled([
checkDatabase(),
checkCache(),
checkExternalAPI(),
checkMessageQueue(),
]);
const results = {
database: processCheck(checks[0]),
cache: processCheck(checks[1]),
api: processCheck(checks[2]),
queue: processCheck(checks[3]),
};
const overallStatus = Object.values(results).every(c => c.status === 'up')
? 'healthy'
: Object.values(results).some(c => c.status === 'down')
? 'unhealthy'
: 'degraded';
return { status: overallStatus, checks: results, uptime: process.uptime() };
}
| Layer | Default Timeout | Notes | |-------|----------------|-------| | HTTP client | 5s | Per-request | | Database query | 10s | Long queries need investigation | | External API | 15s | Includes network latency | | Background job | 5min | Depends on workload | | Health check | 3s | Must be fast |
| ❌ Don't | ✅ Do | |----------|-------| | Retry without backoff | Exponential backoff with jitter | | Retry non-idempotent operations | Only retry safe operations | | Ignore partial failures | Handle each failure independently | | Unbounded retries | Cap retries and fail gracefully | | Synchronous health checks | Async with timeouts | | Single point of failure | Redundancy at every layer | | Retry authentication errors | Fail fast on 401/403 |
When handing off to other agents:
{
"reliability_score": 0,
"circuits_added": [],
"retry_policies": [],
"fallbacks_defined": [],
"slos_defined": true,
"chaos_tested": false,
"handoff_to": ["@se", "@phantom"]
}
Remember: Reliability isn't about preventing all failures — it's about recovering from them so fast that users never notice.
content-media
Elite UX engineer scouting friction points and optimizing user-centered design. User flows, conversion optimization, and design system enforcement.
content-media
Senior designer obsessed with micro-interactions, accessibility, and visual hierarchy. Create interfaces that are beautiful, usable, and inclusive.
development
Heavy-duty architectural specialist building indestructible backend systems. API design, microservices, DDD, and database-backed services.
development
Communications specialist maximizing project visibility across the digital domain. SEO, meta optimization, structured data, and web analytics.