plugins/patterns/skills/reliability-patterns/SKILL.md
Build resilient event systems with retry strategies, dead letter queues, and EventBus persistence. Handle failures gracefully in production deployments.
npx skillsauth add adaptive-enforcement-lab/claude-skills reliability-patterns
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Production event systems must handle failures gracefully. Network blips, service outages, and malformed events are inevitable. These patterns ensure events don't get lost and systems recover automatically. For the complete reference, see the official Argo Events reliability docs.
See the full implementation guide in the source documentation.
Multiple layers of protection prevent event loss:
flowchart TD
    A[Event Arrives] --> B[EventSource Retry]
    B -->|Success| C[EventBus Persistence]
    C --> D[Sensor Processing]
    D -->|Trigger Fails| E[Trigger Retry]
    E -->|Exhausted| F[Dead Letter Queue]
    D -->|Success| G[Action Complete]

    %% Ghostty Hardcore Theme
    style A fill:#65d9ef,color:#1b1d1e
    style B fill:#fd971e,color:#1b1d1e
    style C fill:#515354,color:#f8f8f3
    style D fill:#f92572,color:#1b1d1e
    style E fill:#9e6ffe,color:#1b1d1e
    style F fill:#f92572,color:#1b1d1e
    style G fill:#a7e22e,color:#1b1d1e
| Pattern | Purpose | Complexity |
| --------- | --------- | ------------ |
| Retry Strategies | Handle transient failures | Low |
| Dead Letter Queues | Capture failed events | Medium |
| Backpressure Handling | Prevent overload | Medium |
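The path from trigger retry to dead letter queue can be sketched in a few lines of Python. This is an illustrative model only, not Argo Events code: `process_with_dlq`, `max_attempts`, and the in-memory `dead_letters` list are hypothetical names chosen for the example.

```python
from typing import Any, Callable


def process_with_dlq(
    event: dict[str, Any],
    trigger: Callable[[dict[str, Any]], None],
    dead_letters: list[dict[str, Any]],
    max_attempts: int = 3,
) -> bool:
    """Try the trigger up to max_attempts times; park the event on failure.

    Returns True if the trigger succeeded, False if the event was moved
    to the dead letter list. Hypothetical helper, not an Argo Events API.
    """
    last_error = ""
    for _ in range(max_attempts):
        try:
            trigger(event)
            return True
        except Exception as exc:  # a real system would catch narrower errors
            last_error = str(exc)
    # Retries exhausted: keep the event and the failure reason for later replay.
    dead_letters.append({"event": event, "error": last_error})
    return False
```

Keeping the failed event together with its last error is what makes a dead letter queue useful: the event is not lost, and an operator can inspect and replay it once the downstream problem is fixed.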
Add retry logic to handle transient failures:
triggers:
  - template:
      name: deploy-with-retry
      argoWorkflow:
        operation: submit
        source:
          resource:
            # ...
    retryStrategy:
      steps: 3
      duration: 10s
      factor: 2
      jitter: 0.1
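This retries failed triggers up to 3 times with exponential backoff: the delay starts at 10s (duration) and is multiplied by 2 (factor) after each attempt.
The jitter adds randomness to each delay so that many failing triggers do not all retry at the same moment (the thundering herd problem).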
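To make the effect of the duration, factor, and jitter settings concrete, here is a minimal Python sketch of an exponential backoff schedule. It assumes the delay is multiplied by the factor after each attempt and that jitter adds a random extra wait of up to jitter times the delay; the exact formula Argo Events applies may differ, and `backoff_delays` is a hypothetical helper written for this example.

```python
import random


def backoff_delays(steps, duration, factor, jitter, rng=None):
    """Return a list of `steps` delays in seconds (illustrative model only).

    Each base delay is `factor` times the previous one, starting at
    `duration`. Jitter adds between 0 and jitter * delay on top.
    """
    if rng is None:
        rng = random.Random()
    delays = []
    delay = float(duration)
    for _ in range(steps):
        # Random extra wait spreads retries from many clients apart.
        delays.append(delay + rng.random() * jitter * delay)
        delay *= factor
    return delays
```

Under these assumptions, `backoff_delays(3, 10, 2, 0)` gives `[10.0, 20.0, 40.0]`, and with `jitter=0.1` each delay lands somewhere between its base value and 10% above it.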
The EventBus provides at-least-once delivery. Events persist until acknowledged:
apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default
spec:
  jetstream:
    version: "2.9.11"
    persistence:
      accessMode: ReadWriteOnce
      storageClassName: standard
      volumeSize: 10Gi
    replicas: 3
With persistence enabled, events survive EventBus pod restarts. The 3-replica configuration provides high availability.
At-Least-Once Semantics
Argo Events guarantees at-least-once delivery, not exactly-once, so the same event may be delivered more than once. Your workflows must be idempotent: processing the same event twice should produce the same result as processing it once. See Idempotency Patterns for implementation strategies.
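One common way to get idempotency under at-least-once delivery is to deduplicate on a stable event ID. The sketch below is a minimal in-memory illustration, assuming each event carries a unique `id` field; `IdempotentHandler` and its method names are hypothetical, and a production system would keep the set of seen IDs in durable, shared storage rather than in process memory.

```python
from typing import Any, Callable


class IdempotentHandler:
    """Run an action at most once per event ID (illustrative sketch)."""

    def __init__(self, action: Callable[[dict[str, Any]], None]) -> None:
        self._action = action
        self._seen: set[str] = set()

    def handle(self, event: dict[str, Any]) -> bool:
        """Return True if the action ran, False if the event was a duplicate."""
        event_id = event["id"]  # assumption: every event has a unique "id"
        if event_id in self._seen:
            return False
        self._action(event)
        # Record the ID only after success, so a failed action can be retried.
        self._seen.add(event_id)
        return True
```

A redelivered event then becomes a no-op: calling `handle` twice with the same event runs the action once and returns `False` on the second call.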
See examples.md for code examples.