skills/ipc-communication-patterns/SKILL.md
Comprehensive reference for inter-process communication mechanisms -- every way processes can talk on a computer. Covers sockets (TCP, UDP, Unix domain), WebSockets, SSE, pipes (named/anonymous), shared memory (mmap, shm_open), message queues, signals, D-Bus, XPC (macOS), gRPC, REST, stdin/stdout, file-based coordination, and clipboard. Performance benchmarks, platform availability, and code examples for each. Special focus on which IPC works best for AI agent coordination. Activate on: "IPC", "inter-process communication", "process communication", "how do I talk between processes", "Unix socket vs TCP", "shared memory", "named pipe", "WebSocket vs SSE", "gRPC vs REST", "agent coordination IPC", "XPC service", "message passing", "stdout pipe", "D-Bus", "mmap". NOT for: distributed systems design (use distributed-systems), network protocol design (use networking), message queue infrastructure like Kafka (use data-pipeline-engineer).
npx skillsauth add curiositech/windags-skills ipc-communication-patterns
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Every way processes can communicate on a computer. This skill provides decision trees for selecting IPC mechanisms, failure diagnostics, and implementation patterns for AI agent coordination.
Are processes related (parent-child)?
├── YES
│   ├── Need bidirectional?
│   │   ├── YES → Unix Domain Socket (same machine) | TCP (may cross machines)
│   │   └── NO → Anonymous Pipe (stdin/stdout)
│   └── Simple one-way → Anonymous Pipe
│
└── NO (unrelated processes)
    ├── Same machine only?
    │   ├── YES
    │   │   ├── High throughput (>1GB/s) → Shared Memory + coordination
    │   │   ├── Low latency (<10us) → Unix Domain Socket
    │   │   ├── Typed RPC needed → gRPC over Unix Socket
    │   │   └── Simple messages → Named Pipe | Unix Domain Socket
    │   │
    │   └── NO (cross-machine capable)
    │       ├── Web-compatible → WebSocket | SSE | REST
    │       ├── Streaming data → gRPC streaming | WebSocket
    │       ├── Request-reply → REST/HTTP | gRPC unary
    │       └── Fire-and-forget → UDP
    │
    └── Browser client involved?
        ├── Server → Client only → SSE
        ├── Bidirectional → WebSocket
        └── Request-reply → REST/HTTP
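The tree above can also be read as a small lookup function. The following is a minimal sketch, not part of the skill itself: the `IpcRequirements` shape and the `chooseIpc` name are hypothetical, and the browser branch is left out for brevity.

```typescript
// Hypothetical requirement flags; the names are illustrative assumptions.
type Need =
  | 'throughput' | 'latency' | 'typed-rpc' | 'simple'          // same-machine needs
  | 'web' | 'streaming' | 'request-reply' | 'fire-and-forget'; // cross-machine needs

interface IpcRequirements {
  related: boolean;         // parent-child processes?
  bidirectional: boolean;   // do both sides send?
  sameMachineOnly: boolean; // never crosses a machine boundary?
  need?: Need;
}

function chooseIpc(r: IpcRequirements): string {
  if (r.related) {
    if (!r.bidirectional) return 'Anonymous Pipe';
    return r.sameMachineOnly ? 'Unix Domain Socket' : 'TCP';
  }
  if (r.sameMachineOnly) {
    switch (r.need) {
      case 'throughput': return 'Shared Memory + coordination';
      case 'latency': return 'Unix Domain Socket';
      case 'typed-rpc': return 'gRPC over Unix Socket';
      default: return 'Named Pipe | Unix Domain Socket'; // simple messages
    }
  }
  switch (r.need) {
    case 'web': return 'WebSocket | SSE | REST';
    case 'streaming': return 'gRPC streaming | WebSocket';
    case 'fire-and-forget': return 'UDP';
    default: return 'REST/HTTP | gRPC unary'; // request-reply
  }
}
```

For example, `chooseIpc({ related: false, bidirectional: true, sameMachineOnly: true, need: 'latency' })` returns the Unix Domain Socket branch.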
If latency requirement:
├── <1us → Shared Memory (lock-free ring buffer)
├── <10us → Unix Domain Socket | Shared Memory (with locks)
├── <100us → TCP loopback | gRPC
└── <1ms → REST/HTTP acceptable
If throughput requirement:
├── >10GB/s → Shared Memory only option
├── >5GB/s → Unix Domain Socket
├── >1GB/s → TCP | gRPC
└── <1GB/s → Any mechanism works
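The two threshold tables can be sketched as plain functions. The thresholds are copied from the tables above and should be read as rough guidance rather than measured guarantees; the function names `byLatency` and `byThroughput` are hypothetical.

```typescript
// budgetUs: the latency the application can tolerate, in microseconds.
function byLatency(budgetUs: number): string {
  if (budgetUs <= 1) return 'Shared Memory (lock-free ring buffer)';
  if (budgetUs <= 10) return 'Unix Domain Socket | Shared Memory (with locks)';
  if (budgetUs <= 100) return 'TCP loopback | gRPC';
  return 'REST/HTTP acceptable';
}

// gbPerSec: the sustained throughput the application needs, in GB/s.
function byThroughput(gbPerSec: number): string {
  if (gbPerSec > 10) return 'Shared Memory';
  if (gbPerSec > 5) return 'Unix Domain Socket';
  if (gbPerSec > 1) return 'TCP | gRPC';
  return 'Any mechanism';
}
```

When both constraints apply, a reasonable approach is to evaluate each and pick the more restrictive answer.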
Agent Communication Pattern → Recommended IPC
Parent spawns child agents:
├── Simple task execution → stdin/stdout pipe
├── Progress reporting needed → Unix Domain Socket
└── Web UI monitoring → stdin/stdout + SSE to browser
Orchestrator + independent agents:
├── Same machine → Unix Domain Socket
├── May scale across machines → gRPC | WebSocket
└── Simple coordination → Named Pipe
Long-running agent services:
├── macOS → XPC Service
├── Linux desktop → D-Bus
└── Cross-platform → Unix Domain Socket | TCP
Detection: Process hangs when writing to stdin while child's stdout buffer is full
Symptoms: write() blocks indefinitely, process unresponsive, strace shows blocking on pipe write
Root cause: Both stdin and stdout buffers full (~64KB Linux, ~16KB macOS), neither process can proceed
Fix: Use async I/O to drain stdout while writing stdin, or separate threads for read/write operations
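# Detect: strace shows the process blocked in an unfinished write() or read() on a pipe
strace -f -e trace=read,write -p <pid>
# Confirm that the blocked file descriptor is a pipe (Linux)
ls -l /proc/<pid>/fd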
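The async-I/O fix can be sketched in Node.js as follows. This is a minimal illustration under assumptions: `runWithInput` is a hypothetical helper name, and the child is assumed to read all of stdin and exit with code 0 on success. The key point is that the stdout listener is attached before any input is written, so the child's output keeps draining while the parent is still sending.

```typescript
import { spawn } from 'node:child_process';

// Sends `input` to a child process and collects its stdout without deadlocking.
function runWithInput(cmd: string, args: string[], input: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(cmd, args);
    const chunks: Buffer[] = [];

    child.stdout.on('data', (c: Buffer) => chunks.push(c)); // drain continuously
    child.stderr.resume(); // discard stderr so that pipe cannot fill up either

    child.on('error', reject); // spawn failure, e.g. command not found
    child.on('close', (code) => {
      if (code === 0) resolve(Buffer.concat(chunks).toString('utf8'));
      else reject(new Error(`${cmd} exited with code ${code}`));
    });

    child.stdin.on('error', reject); // e.g. EPIPE if the child exits early
    child.stdin.end(input); // Node queues the data and writes as the pipe drains
  });
}
```

A call such as `runWithInput('cat', [], 'x'.repeat(1_000_000))` pushes far more than one pipe buffer through the child in both directions, which is the situation that hangs when the parent writes everything before reading.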
Detection: EACCES error on connect(), "Permission denied" in logs
Symptoms: Client process cannot connect to Unix domain socket, socket file exists with wrong permissions
Root cause: Socket file permissions too restrictive, or client running as different user
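Fix: Give the connecting user write permission on the socket file: 0660 with a shared group, 0666 if any local user may connect, or 0600 with the correct ownership when client and server run as the same user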
# Fix socket permissions
chmod 666 /tmp/agent.sock
# Or set ownership
chown user:group /tmp/agent.sock
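Before changing permissions, it can help to confirm what the client actually sees. The sketch below is a hypothetical diagnostic (`diagnoseSocket` is not part of the skill) that reports whether the path exists, is a socket, and is writable by the current user. It uses `fs.accessSync`, which checks the process's real user and group IDs, and it does not replace an actual `connect()` attempt.

```typescript
import * as fs from 'node:fs';

// Returns 'writable' when the current user may write to the socket file,
// otherwise an errno code such as 'ENOENT' or 'EACCES'.
function diagnoseSocket(path: string): string {
  try {
    const st = fs.statSync(path);
    if (!st.isSocket()) return 'ENOTSOCK'; // path exists but is not a socket
    fs.accessSync(path, fs.constants.W_OK);
    return 'writable';
  } catch (err) {
    return (err as NodeJS.ErrnoException).code ?? 'UNKNOWN';
  }
}
```

An `EACCES` result points at the file mode or ownership (or a parent directory without search permission), while `ENOENT` suggests the server never created the socket.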
Detection: ECONNREFUSED errors, agents unable to reach orchestrator
Symptoms: Multiple agents fail simultaneously, orchestrator shows no incoming connections
Root cause: Orchestrator crashed/restarted, firewall blocking port, or port already in use
Fix: Implement exponential backoff retry, health checks, and port conflict detection
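// Detect port conflicts
import * as net from 'node:net';
const port = 9000; // Example value (assumption): use the orchestrator's port
const server = net.createServer();
server.on('error', (err: NodeJS.ErrnoException) => {
  if (err.code === 'EADDRINUSE') {
    console.error(`Port ${port} already in use`);
  }
});
server.listen(port); // A conflict surfaces here, through the 'error' event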
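On the agent side, the retry part of the fix might look like the sketch below. It assumes a TCP orchestrator on localhost; `connectWithRetry` and `backoffDelay` are hypothetical names, and the full-jitter variant (a random delay up to the exponential bound) is one common choice rather than something the skill prescribes.

```typescript
import * as net from 'node:net';

// Full jitter: a random delay in [0, min(capMs, baseMs * 2^attempt)).
function backoffDelay(attempt: number, baseMs = 100, capMs = 5000): number {
  return Math.random() * Math.min(capMs, baseMs * 2 ** attempt);
}

function connectWithRetry(port: number, host = '127.0.0.1', maxAttempts = 5): Promise<net.Socket> {
  return new Promise((resolve, reject) => {
    const attempt = (n: number): void => {
      const socket = net.connect(port, host);
      const onError = (err: NodeJS.ErrnoException): void => {
        socket.destroy();
        // Retry only refused connections, and only while attempts remain.
        if (err.code !== 'ECONNREFUSED' || n + 1 >= maxAttempts) {
          reject(err);
          return;
        }
        setTimeout(() => attempt(n + 1), backoffDelay(n));
      };
      socket.once('error', onError);
      socket.once('connect', () => {
        socket.removeListener('error', onError); // caller handles later errors
        resolve(socket);
      });
    };
    attempt(0);
  });
}
```

The jitter keeps several agents that failed at the same moment from reconnecting in lockstep when the orchestrator comes back.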
Detection: Garbage data reads, segfaults, inconsistent state between processes
Symptoms: Data races, torn reads/writes, process crashes with SIGSEGV
Root cause: Missing synchronization, incorrect memory barriers, or buffer overruns
Fix: Add proper atomics for flags, use futex/semaphore for critical sections
// Fix: Use atomic operations
atomic_store(&shared->flag, 1); // Not: shared->flag = 1;
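The same publish-then-flag discipline can be sketched in TypeScript. Note the assumption: `SharedArrayBuffer` is shared between worker threads rather than separate processes, so this is an analogue of the C pattern, not a cross-process implementation. The two-slot layout and the `publish` and `tryConsume` names are hypothetical.

```typescript
// Layout (assumption for this sketch): slot 0 = ready flag, slot 1 = payload.
const FLAG = 0;
const DATA = 1;
const shared = new Int32Array(new SharedArrayBuffer(8));

// Writer: store the payload first, then publish it with an atomic flag store.
function publish(view: Int32Array, value: number): void {
  Atomics.store(view, DATA, value);
  Atomics.store(view, FLAG, 1);
  Atomics.notify(view, FLAG); // wake any thread blocked in Atomics.wait
}

// Reader: trust the payload only after observing the flag atomically.
function tryConsume(view: Int32Array): number | undefined {
  if (Atomics.load(view, FLAG) !== 1) return undefined;
  const value = Atomics.load(view, DATA);
  Atomics.store(view, FLAG, 0); // hand the slot back to the writer
  return value;
}
```

A plain `view[FLAG] = 1` would give no ordering guarantee relative to the payload write, which is the torn-read scenario described above.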
Detection: SIGTERM not handled cleanly, processes leave stale state
Symptoms: Lock files not cleaned up, connections not closed, zombie processes
Root cause: Signal handler interrupted critical section, or handler not async-signal-safe
Fix: Use self-pipe trick or signalfd (Linux) for safe signal handling
// Fix: Only set flag in signal handler, do cleanup in main loop
#include <signal.h>
volatile sig_atomic_t shutdown_requested = 0;
void sigterm_handler(int sig) { (void)sig; shutdown_requested = 1; }
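For agents written in Node.js, the same flag-then-cleanup shape might look like this sketch. Node already delivers signals on the event loop, so the handler cannot interrupt a critical section the way a C handler can; the pattern is still useful for finishing the current unit of work before cleaning up. `mainLoop`, `doWork`, and `cleanup` are hypothetical names.

```typescript
// The handler only records the request; cleanup runs from the main loop.
let shutdownRequested = false;
process.on('SIGTERM', () => { shutdownRequested = true; });

async function mainLoop(
  doWork: () => Promise<void>,  // one unit of agent work
  cleanup: () => Promise<void>, // remove lock files, close connections
): Promise<void> {
  while (!shutdownRequested) {
    await doWork(); // never abandoned halfway by the signal
  }
  await cleanup();
}
```

Each `doWork` call should be short enough that the process exits before the supervisor escalates from SIGTERM to SIGKILL.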
Scenario: Orchestrator manages 5 agent processes, each running different AI models. Need bidirectional communication for task assignment and progress reporting.
Decision process:
Implementation walkthrough:
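// 1. Create Unix domain socket server (orchestrator)
import * as fs from 'node:fs';
import * as net from 'node:net';

const SOCKET_PATH = '/tmp/windags-orchestrator.sock';
const agents = new Map<string, net.Socket>(); // Connected agents, keyed by ID
let nextAgentId = 0; // Counter: Date.now() can repeat within one millisecond

// Placeholder handler (assumption): replace with task and progress logic
function handleAgentMessage(agentId: string, message: unknown): void {
  console.log(`Message from ${agentId}:`, message);
}

if (fs.existsSync(SOCKET_PATH)) fs.unlinkSync(SOCKET_PATH); // Clean stale socket

const server = net.createServer((connection) => {
  const agentId = `agent-${nextAgentId++}`;
  agents.set(agentId, connection);
  let buffer = '';
  connection.on('data', (chunk) => {
    buffer += chunk.toString();
    // Process complete messages (newline-delimited)
    let newlineIdx: number;
    while ((newlineIdx = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newlineIdx);
      buffer = buffer.slice(newlineIdx + 1);
      try {
        const message = JSON.parse(line);
        handleAgentMessage(agentId, message);
      } catch (err) {
        console.error(`Invalid JSON from ${agentId}:`, err);
      }
    }
  });
  // Without an 'error' listener, a failing connection would crash the process
  connection.on('error', (err) => console.error(`Socket error from ${agentId}:`, err));
  connection.on('close', () => agents.delete(agentId));
});
// Restrict permissions once the socket file exists
server.listen(SOCKET_PATH, () => fs.chmodSync(SOCKET_PATH, 0o600)); // Owner-only access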
// 2. Agent connection pattern
function connectAgent(): Promise<net.Socket> {
  return new Promise((resolve, reject) => {
    const socket = net.connect(SOCKET_PATH);
    socket.on('connect', () => resolve(socket));
    socket.on('error', reject);
  });
}

// 3. Message sending with error handling
function sendMessage(socket: net.Socket, msg: object): Promise<void> {
  return new Promise((resolve, reject) => {
    const line = JSON.stringify(msg) + '\n';
    socket.write(line, (err) => err ? reject(err) : resolve());
  });
}
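Agents need the same newline-delimited framing when they read replies from the orchestrator. One way to share it is to pull the buffering logic into a helper, sketched below. `createLineParser` is a hypothetical name, and skipping blank lines is an added assumption rather than behavior of the walkthrough code.

```typescript
// Returns a function that accepts raw chunks and emits one parsed JSON
// message per complete line, keeping any partial line for the next chunk.
function createLineParser(
  onMessage: (msg: unknown) => void,
  onError: (err: Error, line: string) => void,
): (chunk: string) => void {
  let buffer = '';
  return (chunk: string): void => {
    buffer += chunk;
    let idx: number;
    while ((idx = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 1);
      if (line.trim() === '') continue; // assumption: tolerate blank lines
      let msg: unknown;
      try {
        msg = JSON.parse(line);
      } catch (err) {
        onError(err as Error, line);
        continue;
      }
      onMessage(msg); // called outside the try so handler bugs are not mislabeled
    }
  };
}
```

A socket would use it as `socket.setEncoding('utf8'); socket.on('data', createLineParser(handle, report));`, where `handle` and `report` are the caller's own callbacks.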
Expert vs novice differences:
- unlink() socket path before binding
- write() → silent failures

This skill should NOT be used for:
- Distributed systems design: use the distributed-systems skill instead
- Message queue infrastructure like Kafka: use the data-pipeline-engineer skill instead
- Network protocol design: use the networking skill instead
- ... use the api-design skill instead

Delegate to other skills when:
- websocket-streaming skill
- daemon-development skill
- multi-agent-coordination skill
- agent-interchange-formats skill