java/src/main/resources/targets/claude/skills/knowledge-packs/patterns-outbox/SKILL.md
Transactional Outbox Pattern: reliable event publishing with polling publisher and CDC strategies, outbox table design, and anti-patterns.
Provides structured guidance for implementing the Transactional Outbox Pattern, ensuring reliable event delivery in event-driven systems. Covers the dual-write problem, outbox table design, SELECT FOR UPDATE SKIP LOCKED, Debezium CDC, and anti-patterns. Included when architecture.outbox_pattern = true.
Publishing events directly to a message broker inside a database transaction creates a dual-write problem. Two independent systems (database and broker) cannot participate in a single atomic transaction without a distributed transaction protocol (2PC), which is impractical in microservice architectures.
| Scenario | What Happens | Consequence |
|----------|-------------|-------------|
| Commit succeeds, publish fails | Business data is persisted but the event never reaches consumers | Lost event: downstream services never learn about the state change |
| Publish succeeds, commit fails (rollback) | The event is delivered to consumers but the originating transaction rolled back | Phantom event: consumers react to a change that never happened |
| Publish before commit, crash mid-transaction | Event is delivered; transaction never commits | Phantom event: same as above |
| Network partition during publish | Timeout or partial delivery; retries may cause duplicates | Duplicate or lost event depending on retry policy |
// ANTI-PATTERN: dual-write inside a transaction
@Transactional
public void processOrder(Order order) {
    orderRepository.save(order);          // write 1: database
    eventPublisher.publish(               // write 2: broker
        new OrderCreatedEvent(order));    // NOT atomic with write 1
}
The two writes target independent systems. No single transaction boundary can guarantee both succeed or both fail.
Instead of publishing directly to the broker, write the event to an outbox table in the same database transaction as the business data change. A separate relay process reads from the outbox table and publishes to the broker asynchronously.
This guarantees atomicity: the event is persisted if and only if the business transaction commits.
CREATE TABLE outbox_events (
    id          UUID         PRIMARY KEY DEFAULT gen_random_uuid(),
    topic       VARCHAR(255) NOT NULL,
    payload     JSONB        NOT NULL,
    status      VARCHAR(20)  NOT NULL DEFAULT 'PENDING',
    retry_count INTEGER      NOT NULL DEFAULT 0,
    created_at  TIMESTAMP    NOT NULL DEFAULT NOW(),
    updated_at  TIMESTAMP    NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_outbox_pending ON outbox_events (created_at)
    WHERE status = 'PENDING';
| Column | Type | Purpose |
|--------|------|---------|
| id | UUID | Unique event identifier; used by consumers for deduplication |
| topic | VARCHAR(255) | Target broker topic or routing key |
| payload | JSONB | Serialized event data (domain event content) |
| status | VARCHAR(20) | Processing status: PENDING, PROCESSING, SENT, FAILED, DEAD_LETTER |
| retry_count | INTEGER | Number of publication attempts; incremented on each failure |
| created_at | TIMESTAMP | Event creation time; used for ordering in polling queries |
| updated_at | TIMESTAMP | Last modification time; updated on status transitions |
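The idx_outbox_pending index is a partial index filtered on status = 'PENDING'. It covers only rows that are still waiting to be published, so it stays small as SENT and DEAD_LETTER rows accumulate, and polling queries ordered by created_at remain fast as the table grows.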
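The column table notes that consumers use `id` for deduplication. A minimal sketch of that idea follows, assuming an in-memory set of processed ids; the class name `DeduplicatingConsumer` and its handler are hypothetical, and a real consumer would more likely persist processed ids in its own database.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical consumer-side helper; not part of the original pack.
final class DeduplicatingConsumer {
    // Assumption: processed ids are kept in memory only.
    private final Set<UUID> seen = ConcurrentHashMap.newKeySet();
    private final Consumer<String> handler;

    DeduplicatingConsumer(Consumer<String> handler) {
        this.handler = handler;
    }

    // Returns true if the payload was handled, false if the id was already seen.
    boolean onEvent(UUID eventId, String payload) {
        if (!seen.add(eventId)) {
            return false; // duplicate delivery: skip
        }
        try {
            handler.accept(payload);
            return true;
        } catch (RuntimeException e) {
            seen.remove(eventId); // allow a later redelivery to retry
            throw e;
        }
    }
}
```

Because the relay may publish the same row more than once (for example after a crash between publish and status update), a check of this kind keeps redelivery harmless on the consumer side.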
PENDING --> PROCESSING --> SENT
                |
                +--> FAILED --> PENDING (retry)
                        |
                        +--> DEAD_LETTER (max retries exceeded)
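The lifecycle above can be encoded as an enum that rejects illegal transitions. This is a sketch under one reading of the diagram: it assumes DEAD_LETTER is reached from FAILED once retries are exhausted and that SENT and DEAD_LETTER are terminal. The enum and method names are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical enum; constant names mirror the status column values.
enum OutboxStatus {
    PENDING, PROCESSING, SENT, FAILED, DEAD_LETTER;

    // Allowed next states, as read from the lifecycle diagram.
    Set<OutboxStatus> allowedNext() {
        switch (this) {
            case PENDING:
                return EnumSet.of(PROCESSING);
            case PROCESSING:
                return EnumSet.of(SENT, FAILED);
            case FAILED:
                return EnumSet.of(PENDING, DEAD_LETTER); // retry or give up
            default:
                return EnumSet.noneOf(OutboxStatus.class); // SENT, DEAD_LETTER: terminal
        }
    }

    boolean canTransitionTo(OutboxStatus next) {
        return allowedNext().contains(next);
    }
}
```

Calling `canTransitionTo` before every status update gives one place to catch bugs such as moving a SENT event back to PENDING.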
// CORRECT: single transaction, single database
@Transactional
public void processOrder(Order order) {
    orderRepository.save(order);
    outboxRepository.save(new OutboxEvent(
        "order.created",
        serializeEvent(new OrderCreatedEvent(order))
    ));
    // Both writes committed atomically
}
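The `OutboxEvent` class used above is not shown in this pack. One possible shape, matching the table columns, is sketched here; the field names, the JSON-string payload, and the accessor names are assumptions, and persistence annotations are omitted.

```java
import java.time.Instant;
import java.util.Objects;
import java.util.UUID;

// Hypothetical value class mirroring the outbox_events columns.
final class OutboxEvent {
    private final UUID id;
    private final String topic;
    private final String payload;   // assumption: event already serialized to JSON
    private final String status;
    private final int retryCount;
    private final Instant createdAt;

    OutboxEvent(String topic, String payload) {
        this.id = UUID.randomUUID();                 // doubles as the deduplication key
        this.topic = Objects.requireNonNull(topic, "topic");
        this.payload = Objects.requireNonNull(payload, "payload");
        this.status = "PENDING";                     // every new event starts as PENDING
        this.retryCount = 0;
        this.createdAt = Instant.now();
    }

    UUID getId() { return id; }
    String getTopic() { return topic; }
    String getPayload() { return payload; }
    String getStatus() { return status; }
    int getRetryCount() { return retryCount; }
    Instant getCreatedAt() { return createdAt; }
}
```

Generating the id in the application rather than relying on the column default makes it available to the caller before the row is flushed.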
The Polling Publisher is a background process that periodically queries the outbox table for PENDING events, publishes them to the message broker, and updates their status.
SELECT id, topic, payload, retry_count
FROM outbox_events
WHERE status = 'PENDING'
ORDER BY created_at ASC
LIMIT :batch_size
FOR UPDATE SKIP LOCKED;
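Why SKIP LOCKED: it lets several publisher instances poll the same table concurrently. Each instance skips rows that another instance has already locked instead of blocking on them, so instances never wait on each other and no row is claimed by two pollers at the same time.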
The polling interval increases exponentially when no events are found, reducing database load during idle periods:
interval = min(base_interval * 2^consecutive_empty_polls, max_interval)
| Parameter | Recommended Value |
|-----------|------------------|
| base_interval | 1 second |
| max_interval | 30 seconds |
| reset_on_events | true (reset to base when events found) |
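The interval formula and the recommended values above can be sketched as a small helper. The class and method names are hypothetical; resetting to the base interval corresponds to passing `0` for `consecutiveEmptyPolls` once events are found.

```java
import java.time.Duration;

// Hypothetical helper implementing:
// interval = min(base_interval * 2^consecutive_empty_polls, max_interval)
final class PollingBackoff {
    private PollingBackoff() {}

    static Duration nextInterval(Duration base, Duration max, int consecutiveEmptyPolls) {
        // Clamp the exponent to [0, 30] so the shift stays well inside a long
        // for typical base intervals (seconds to minutes).
        int exponent = Math.min(Math.max(consecutiveEmptyPolls, 0), 30);
        long millis = base.toMillis() * (1L << exponent);
        return Duration.ofMillis(Math.min(millis, max.toMillis()));
    }
}
```

With a 1 second base and a 30 second cap, the interval grows 1s, 2s, 4s, 8s, 16s and then stays at 30s until an event is found.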
| Parameter | Recommended Value | Rationale |
|-----------|------------------|-----------|
| batch_size | 100 | Balance between throughput and transaction duration |
| max_processing_time | 30 seconds | Prevent long-running transactions that hold locks |
| commit_interval | Per batch | Commit after each batch to release locks promptly |
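A rough sketch of how one batch might be processed within the time budget above. It assumes the rows were already fetched by the polling query and that a hypothetical `publish` callback returns true when the broker acknowledges the event. Database access and the per-batch commit are left out; all names are illustrative.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.Predicate;

// Hypothetical batch step; the caller marks rows and commits once per batch.
final class OutboxBatchProcessor {
    // Minimal view of a row returned by the polling query.
    static final class Row {
        final UUID id;
        final String topic;
        final String payload;

        Row(UUID id, String topic, String payload) {
            this.id = id;
            this.topic = topic;
            this.payload = payload;
        }
    }

    // Ids to mark SENT and ids to hand to the retry logic.
    static final class Result {
        final List<UUID> sent = new ArrayList<>();
        final List<UUID> failed = new ArrayList<>();
    }

    static Result process(List<Row> batch, Predicate<Row> publish, Duration maxProcessingTime) {
        Result result = new Result();
        long deadline = System.nanoTime() + maxProcessingTime.toNanos();
        for (Row row : batch) {
            if (System.nanoTime() - deadline > 0) {
                break; // budget exhausted: untouched rows stay PENDING for the next poll
            }
            boolean acknowledged;
            try {
                acknowledged = publish.test(row);
            } catch (RuntimeException e) {
                acknowledged = false; // treat a broker exception as a failed attempt
            }
            (acknowledged ? result.sent : result.failed).add(row.id);
        }
        return result;
    }
}
```

Stopping at the deadline rather than finishing the batch keeps the row locks short, at the cost of leaving some rows for the next poll.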
if (retryCount >= MAX_RETRIES) {
    updateStatus(eventId, "DEAD_LETTER");
    alertOps("Event exceeded max retries", eventId);
} else {
    updateStatusAndRetry(eventId, "PENDING",
        retryCount + 1);
}
| Parameter | Recommended Value |
|-----------|------------------|
| max_retries | 5 |
| retry_backoff | Exponential (1s, 2s, 4s, 8s, 16s) |
| dead_letter_alert | Mandatory — alert operations team |
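The retry parameters above can be expressed as a small policy class. This is a sketch with hypothetical names; it assumes the backoff sequence 1s, 2s, 4s, 8s, 16s is indexed by a 1-based attempt number.

```java
import java.time.Duration;

// Hypothetical policy object for the recommended retry settings.
final class RetryPolicy {
    static final int MAX_RETRIES = 5;
    static final Duration BASE_BACKOFF = Duration.ofSeconds(1);

    private RetryPolicy() {}

    // Status for an event whose publish attempt just failed.
    static String nextStatus(int retryCount) {
        return retryCount >= MAX_RETRIES ? "DEAD_LETTER" : "PENDING";
    }

    // Delay before retry number `attempt` (1-based): 1s, 2s, 4s, 8s, 16s.
    static Duration backoffFor(int attempt) {
        if (attempt < 1 || attempt > MAX_RETRIES) {
            throw new IllegalArgumentException("attempt must be in 1.." + MAX_RETRIES);
        }
        return BASE_BACKOFF.multipliedBy(1L << (attempt - 1));
    }
}
```

Keeping the threshold and the delays in one place makes it harder for the dead-letter check and the backoff schedule to drift apart.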
Change Data Capture (CDC) is an alternative relay strategy that tails the database transaction log (WAL in PostgreSQL, binlog in MySQL) instead of polling the outbox table. Debezium is the most widely adopted open-source CDC platform.
When the application inserts a row into outbox_events, Debezium captures the change from the transaction log and publishes it to the broker topic named in the row's topic column.
{
  "name": "outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "db-host",
    "database.port": "5432",
    "database.dbname": "myservice",
    "database.user": "debezium",
    "database.password": "${DEBEZIUM_DB_PASSWORD}",
    "table.include.list": "public.outbox_events",
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
    "transforms.outbox.route.topic.replacement": "${routedByValue}",
    "transforms.outbox.table.field.event.id": "id",
    "transforms.outbox.table.field.event.key": "id",
    "transforms.outbox.table.field.event.payload": "payload",
    "transforms.outbox.route.by.field": "topic"
  }
}
| Criterion | Polling Publisher | CDC (Debezium) |
|-----------|-----------------|----------------|
| Latency | Seconds (poll interval) | Milliseconds (log tailing) |
| Database load | Periodic queries + row locking | Reads transaction log (minimal query load) |
| Operational complexity | Low (application-level code) | Medium (requires Kafka Connect, Debezium, connector management) |
| Ordering guarantee | Per-batch ordering by created_at | Strict commit-order from transaction log |
| Horizontal scaling | Multiple instances with SKIP LOCKED | Single connector per database (or partitioned) |
| Infrastructure | Application only | Kafka Connect cluster + Debezium |
| Failure recovery | Application retry logic | Connector offset tracking in Kafka |
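Use Polling Publisher when latency of a few seconds is acceptable, per-batch ordering by created_at is sufficient, and you want to keep the relay inside the application without running extra infrastructure.
Use CDC (Debezium) when you need millisecond latency, strict commit-order delivery, or minimal query load on the database, and the team can operate a Kafka Connect cluster with Debezium connectors.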
// ANTI-PATTERN
@Transactional
public void process(Order order) {
    repo.save(order);
    broker.publish(event); // dual-write: NOT atomic
}
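Consequence: Lost events when the commit succeeds but the publish fails, or phantom events when the publish succeeds but the transaction rolls back.
Correct approach: Write the event to the outbox table in the same transaction as the business data and let a separate relay publish it to the broker.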
-- ANTI-PATTERN: full index on all rows
CREATE INDEX idx_outbox_status ON outbox_events (status);
Consequence: The index grows with the total row count including SENT and DEAD_LETTER rows. Polling queries scan a bloated index, degrading performance as the table grows.
Correct approach: Use a partial index on status = 'PENDING' to keep the index small and fast.
// ANTI-PATTERN: fixed interval polling
while (true) {
    List<Event> events = pollOutbox();
    publish(events);
    Thread.sleep(1000); // always 1 second
}
Consequence: Constant database load even when no events exist. Wastes CPU, I/O, and database connections during idle periods.
Correct approach: Use exponential backoff that increases the interval when no events are found and resets when events are discovered.
// ANTI-PATTERN: infinite retry
if (publishFailed) {
    event.setStatus("PENDING"); // retry forever
}
Consequence: Poison events (e.g., malformed payload, permanently unreachable topic) block the outbox indefinitely. Other events behind the poison event are delayed or never processed.
Correct approach: After a configurable number of retries, move the event to DEAD_LETTER status and alert the operations team. Provide tooling to inspect and replay dead-lettered events.
Consequence of leaving outbox lag unmonitored: Events accumulate in the outbox table without detection. By the time the lag is noticed, downstream services have stale data, SLA violations have occurred, and manual reconciliation is required.
Correct approach: Monitor the following metrics and set alerts:
| Metric | Alert Threshold | Description |
|--------|----------------|-------------|
| PENDING event count | > 1000 | Number of unprocessed events in the outbox |
| Oldest PENDING event age | > 5 minutes | Time since the oldest unprocessed event was created |
| FAILED event count | > 0 | Any event that failed to publish |
| DEAD_LETTER count (24h) | > 0 | Events moved to dead letter in the last 24 hours |
| Publish success rate | < 99.9% | Ratio of successful publishes to total attempts |
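The thresholds in the table can be checked with a helper like the following sketch. The class name, method signature, and alert messages are hypothetical, and collecting the metric values (for example by querying the outbox table) is assumed to happen elsewhere.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical evaluator for the alert thresholds listed in the table.
final class OutboxAlerts {
    private OutboxAlerts() {}

    static List<String> evaluate(long pendingCount,
                                 Duration oldestPendingAge,
                                 long failedCount,
                                 long deadLetterCount24h,
                                 double publishSuccessRate) {
        List<String> alerts = new ArrayList<>();
        if (pendingCount > 1000) {
            alerts.add("PENDING event count above 1000");
        }
        if (oldestPendingAge.compareTo(Duration.ofMinutes(5)) > 0) {
            alerts.add("Oldest PENDING event older than 5 minutes");
        }
        if (failedCount > 0) {
            alerts.add("FAILED events present");
        }
        if (deadLetterCount24h > 0) {
            alerts.add("DEAD_LETTER events in the last 24 hours");
        }
        if (publishSuccessRate < 0.999) { // success rate as a fraction, 0.999 = 99.9%
            alerts.add("Publish success rate below 99.9%");
        }
        return alerts;
    }
}
```

An empty list means every metric is within its threshold; each entry otherwise names the threshold that was crossed.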
| Pack | Relationship |
|------|-------------|
| architecture-patterns | Outbox, saga, dead letter queue, event sourcing, and idempotency pattern references |
| resilience | Retry and dead letter handling patterns |
| architecture-cqrs | Event sourcing as an alternative where the event store IS the outbox |