plugins/backend-toolkit/skills/webhook-design/SKILL.md
Design webhooks correctly on both sides — sending (HMAC signing, retries with backoff, at-least-once) and receiving (verify signature on raw body, enqueue + 200 fast, dedupe on event id). Use when adding webhook delivery or consuming a provider's webhooks. Not for internal service-to-service events (use async-messaging) or general outbound-call retry policy (use resilience-patterns).
npx skillsauth add jaykim88/claude-ai-engineering webhook-design
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Build webhook delivery and consumption that survives the real world: signed, retried, idempotent on the sending side; signature-verified, fast-acknowledged, and deduped on the receiving side.
Universal — HMAC signing, signature verification on raw body, at-least-once + retry, and event.id dedup are protocol-level patterns independent of language.
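As one language-specific illustration of the sending-side patterns named above, here is a minimal Python sketch of signing an outbound payload together with a timestamp and computing a capped, jittered retry schedule. The header names (`X-Webhook-Timestamp`, `X-Webhook-Signature`), the `"<timestamp>.<body>"` signing format, and the function names are assumptions made for illustration, not something this skill prescribes.

```python
import hashlib
import hmac
import json
import random
import time
from typing import Dict, List, Optional, Tuple


def sign_payload(secret: bytes, raw_body: bytes, timestamp: int) -> str:
    """Hex HMAC-SHA256 over "<timestamp>.<raw body>" (assumed format)."""
    signed = str(timestamp).encode() + b"." + raw_body
    return hmac.new(secret, signed, hashlib.sha256).hexdigest()


def build_request(secret: bytes, event: dict,
                  now: Optional[int] = None) -> Tuple[bytes, Dict[str, str]]:
    """Serialize once, then sign exactly the bytes that will be sent."""
    ts = int(time.time()) if now is None else now
    raw_body = json.dumps(event, separators=(",", ":")).encode()
    headers = {
        "Content-Type": "application/json",
        # Hypothetical header names; pick your own and document them.
        "X-Webhook-Timestamp": str(ts),
        "X-Webhook-Signature": sign_payload(secret, raw_body, ts),
    }
    return raw_body, headers


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 3600.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Capped exponential backoff with full jitter, one delay per retry."""
    rng = rng or random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]
```

The body is serialized once and the same bytes are both signed and sent, so the receiver can verify against the raw body. The `cap` and `attempts` values decide how long the retry window is; the defaults here are placeholders.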
Verify the signature against the RAW (unparsed) body
now - signedAt exceeds the window

Acknowledge fast: enqueue, then return 2xx before any complex logic
background-jobs → return 2xx

Dedupe on event.id (idempotency)
event.id; on replay, skip; ideally dedupe in the SAME transaction as the business write

Sign outbound payloads (HMAC) + include a timestamp
eventType + eventVersion): downstream consumers outlive your publisher; additive-only changes once live (see async-messaging for the same discipline)

Retry with exponential backoff over a long window
resilience-patterns backoff + jitter; consider Outbox (async-messaging) so an event is never lost on crash

Validate (validation loop)
| ❌ Anti-pattern | ✅ Correct |
|---|---|
| Verify signature after body parsing | Verify on raw body bytes |
| Process webhook inline before returning 200 | Verify → enqueue → 200 fast |
| No event.id dedup | Dedupe on event id (at-least-once = duplicates happen) |
| Fire-and-forget outbound (no retry) | Retry with backoff; Outbox for crash-safety |
| Unsigned outbound payloads | HMAC sign + timestamp |
| Signature valid but timestamp ancient (replay) | Enforce a window (e.g., ±5 min) on the signed timestamp |
| Outbound payload shape changed under the same eventType | eventVersion + additive-only changes |
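The receiving-side rows of the table above (verify on raw body bytes, enforce a timestamp window) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the signature is a hex HMAC-SHA256 over `"<timestamp>.<raw body>"`, the timestamp arrives as a separate header value, and `verify_webhook` and `TOLERANCE_SECONDS` are hypothetical names. A real provider's scheme may differ, so check its documentation.

```python
import hashlib
import hmac
import time
from typing import Optional

TOLERANCE_SECONDS = 300  # the "±5 min" example window from the table


def verify_webhook(secret: bytes, raw_body: bytes, timestamp_header: str,
                   signature_header: str, now: Optional[float] = None) -> bool:
    """Return True only if the signature matches the raw bytes and is fresh."""
    now = time.time() if now is None else now
    try:
        signed_at = int(timestamp_header)
    except ValueError:
        return False
    # Reject captured payloads replayed outside the allowed window.
    if abs(now - signed_at) > TOLERANCE_SECONDS:
        return False
    # Sign the bytes exactly as received; re-serialized JSON may differ.
    signed = timestamp_header.encode() + b"." + raw_body
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing side channels.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

Only after this returns True would the handler enqueue the event and return 2xx; parsing the JSON and running business logic belong in the background job.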
| Tier | Examples | Action SLA |
|---|---|---|
| Critical | No signature verification (forged webhooks accepted); processed inline → provider timeout → unintended duplicates on retry; no event.id dedup on payment / order webhooks (double-effects) | Block release; fix immediately |
| Major | Signature verified after body-parse (mutated bytes → false rejects); no timestamp window (replay of old captured payloads); outbound retries without backoff (storming a recovering receiver) | Fix this sprint |
| Minor | Outbound payload not versioned; missing event.id on a low-risk flow; receiver timeout not tuned to the provider's window | Schedule within 2 sprints |
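The Critical-tier double-effect risk comes from processing a redelivered event twice. A minimal sketch of deduping on the event id in the same transaction as the business write is shown below, using an in-memory SQLite database purely for illustration. The `accounts` table, the `apply_payment` function, and the payment scenario are hypothetical; the `webhook_events(event_id PK)` table shape follows the stack notes in this skill.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a dedupe table and a demo table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE webhook_events (event_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts VALUES ('acct_1', 0)")
    conn.commit()
    return conn


def apply_payment(conn: sqlite3.Connection, event_id: str,
                  account_id: str, amount: int) -> bool:
    """Apply the event once; return False if it was already processed."""
    try:
        with conn:  # commits on success, rolls back on any exception
            # The primary key makes a replayed event_id fail here, so the
            # balance update below never runs a second time.
            conn.execute(
                "INSERT INTO webhook_events (event_id) VALUES (?)", (event_id,)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Treated as a duplicate delivery in this sketch; a real handler
        # should distinguish other constraint violations.
        return False
```

Because the dedupe insert and the balance update commit together, a crash between them cannot leave the event marked as processed without its effect, or the effect applied without the marker.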
feat(webhook): verify + enqueue <provider> webhooks / feat(webhook): sign + retry outbound <event>

rawBody: true + @Req() for the buffer; verify with crypto.createHmac
webhook_events(event_id PK) checked in the processing transaction
stripe.webhooks.constructEvent(rawBody, sig, secret) does verification
await request.body() for raw bytes; hmac.compare_digest; enqueue to Celery
io.ReadAll(r.Body) before parsing; hmac package; enqueue to Asynq

resilience-patterns: webhook retries + idempotency reuse those primitives
background-jobs: received webhooks are processed off the request thread
async-messaging: outbound webhooks are events; consider Outbox for reliability

event.id in the same transaction as the business write.

testing
Use transactions and isolation levels correctly — keep them short, no network calls inside, explicit isolation, retry on serialization conflicts, and choose optimistic vs pessimistic locking. Use when a write spans multiple tables, when concurrent updates corrupt data, or when designing money/inventory flows. Not for cross-service event delivery (use async-messaging Outbox) or schema-level constraints (use schema-design).
development
Backend testing pyramid — unit for pure logic, integration against a real DB (Testcontainers), and consumer-driven contract testing (Pact) for service boundaries. Use before a feature, after a bug fix, or when services break each other on deploy. Not for load testing (use performance-profiling) or security testing (use backend-security-audit).
data-ai
Design a relational schema — normalize to 3NF then denormalize with justification, choose the right Postgres index type per data shape, enforce constraints at the DB. Use when modeling a new domain, when queries are slow, or before a migration. Not for diagnosing slow queries (use query-optimization) or shipping the change without downtime (use migration-strategy).
development
Apply reliability primitives — capped exponential backoff with jitter, circuit breakers, timeouts, and idempotency keys — to every outbound call and mutating endpoint. Use when integrating an external service, when retries cause duplicate effects, or before shipping a payment/order flow. Not for job-runner retry config specifically (use background-jobs) or webhook-delivery specifics (use webhook-design, which reuses these primitives).