Deadlock Finder and Fixer

Core Insight. Concurrency bugs do not come from one missing lock — they come from one lock acquired in the wrong place, at the wrong time, held across the wrong operation, by a thread that didn't know it was holding it. Find every instance of the hazard, not just the one that fired.

The Universal Rule. When you think you found the deadlock and fixed the three instances you could see, there is almost always a fourth. This is the single most common failure mode across every concurrency debugging session in this repo's history. Keep searching until you can prove exhaustively — by code audit — that no hazard remains. See THE FOURTH INSTANCE.

Quick Start: Something Is Hung

# 1. Is it CPU-alive or CPU-dead?
ps -Lp $PID -o tid,pcpu,pmem,comm --no-headers | head -20

# 2. Snapshot all thread states (pick ONE, in order of availability):
gdb --batch -ex "set pagination off" -ex "thread apply all bt full" -p $PID 2>&1 | tee /tmp/bt.txt
#   OR (if ptrace blocked / LD_PRELOAD hazard):
strace -k -f -p $PID 2>&1 | head -200
#   OR (sample /proc):
for i in 1 2 3; do cat /proc/$PID/task/*/stack 2>/dev/null | sort -u; sleep 1; done

# 3. Classify (pick the matching row from the Symptom Triage Table below).
# 4. Jump to the matching section in this skill or in gdb-for-debugging.

Diagnosis depth is in gdb-for-debugging — which already contains the Lock Graph Construction algorithm, mutex ownership inspector, async runtime analysis, and TSAN/rr workflow. This skill is the complement: it covers taxonomy, static-audit discovery, fix catalog, and prevention by design — the parts that don't need a running process.

Symptom Triage Table

| Observed Symptom | Likely Bug Class | Jump To | |------------------|------------------|---------| | Process 0% CPU, won't respond, threads in futex_wait / __lll_lock_wait | Classic deadlock (AB-BA or self) | Class 1 | | Async tasks pending but all tokio workers in epoll_wait | Mutex held across .await or channel cycle | Class 2 | | 100% CPU, futex spam, no progress | Livelock / retry storm / broken condvar | Class 3 | | database is locked, SQLITE_BUSY, timeouts | SQLite WAL contention / long transaction / writer fight | Class 4 | | Hang during library load, strlen or malloc call hangs | LD_PRELOAD / runtime-init reentrancy | Class 5 | | Test flakes under load, passes under --test-threads=1 | Data race (TSAN) or TOCTOU | Class 6 | | Agent swarm stalls; two agents editing same file | Advisory-lease race or missing reservation | Class 7 | | tmux pane hung, mux unresponsive | External process holding a shared lock / fd | Class 7 | | Task starvation: one worker CPU-pegged, others idle | Blocking call on async runtime thread | Class 2 | | Poisoned std::sync::Mutex after panic | Cascading panic-in-critical-section | Class 8 | | Lost updates, wrong counter values, weird retries | Lost wakeup / missed notification / incorrect memory ordering | Class 9 |

The Nine Classes (Taxonomy)

Class 1 — Classic Mutex Deadlock

Definition: Two or more threads each hold a lock the other needs; circular wait in the lock-wait graph.

Canonical forms:

AB-BA: T1 holds A, wants B; T2 holds B, wants A.
Self-deadlock: Single thread re-enters the same non-recursive mutex (e.g., callback path re-enters lock holder).
Reader-upgrade: Thread holds RwLock::read, then asks for RwLock::write in the same thread → guaranteed hang.
Condvar wakeup loop: Thread in pthread_cond_wait on M; its waker needs to acquire M to signal but can't.

How to spot at rest (static audit): search for any function that acquires two distinct mutexes, verify all call paths acquire them in the same order everywhere. Any deviation is a latent deadlock. See STATIC-AUDIT.md for ast-grep recipes.

How to spot at runtime: see gdb-for-debugging §"Lock Graph Construction & Deadlock Proof". The algorithm: identify all threads in __lll_lock_wait, read the __owner field on each contested pthread_mutex_t to build the wait-for graph, find a cycle.

Fix catalog:

Total lock order. Assign every mutex a global rank; assert in debug builds that locks are only acquired in ascending order. parking_lot::deadlock detector can enforce this at runtime.
Lock coalescing. Replace two separate mutexes with one covering both pieces of state if they're always used together.
Critical-section shrinking. Copy needed data out, release the lock, then do the work without it.
Don't hold a lock across a call you don't own. Never call user callbacks, foreign functions, or allocator hooks while holding a lock — they may re-enter.

Class 2 — Async / `.await` Deadlocks

Definition: The logical task graph has a cycle, or a task that holds a non-.await-aware lock yields to the runtime and is never re-polled because the next task needs the same lock.

Canonical forms:

std::sync::Mutex held across .await. The guard crosses the yield point; the task is parked with the lock still held; another task needs the lock and blocks the worker thread.
block_on inside an async runtime. Runtime thread enters a synchronous wait; the thing it's waiting for needs the runtime to make progress.
spawn_blocking missing (or misused for sync I/O from async context).
Channel cycle. Task A sends to a bounded channel whose reader is Task B; Task B sends to a bounded channel whose reader is Task A; both blocked on full/empty.
JoinHandle cycle. Task A .awaits B's handle; B .awaits A's.
Task starvation. A long-running task on a tokio-runtime-worker monopolizes the thread; other tasks cannot be polled. Looks like a deadlock to the end user.

Signature at rest: grep the codebase for let guard = lock.lock(); ... .await and std::sync::Mutex inside async fn. Use the recipes in STATIC-AUDIT.md — this is the highest-ROI static check you can run on an async Rust codebase.

Signature at runtime: workers idle in epoll_wait, but requests pending. See gdb-for-debugging §"Diagnosing Async Deadlocks".

Fix catalog:

Drop the guard before .await. Explicitly: let data = { let g = lock.lock(); g.clone() }; do_io(data).await;.
Use tokio::sync::Mutex only when you must hold the lock across .await. It is slower — prefer dropping the guard.
spawn_blocking for synchronous I/O from an async context (synchronous SQLite, std::fs::read, CPU-heavy work, C library calls).
Replace shared state with channels. Actors own their state; requests arrive via mpsc; replies via oneshot. No shared mutex, no lock-order bugs.
Bound channels carefully. Unbounded is a memory leak; bounded risks backpressure deadlock if the sender can't make progress without the receiver running. When in doubt, prefer try_send + drop-oldest policy.
Never block_on inside an async context. If you must bridge, use Handle::current().spawn_blocking(...) or restructure to avoid the bridge.

Class 3 — Livelock / Retry Storms

Definition: Threads make visible activity (futex_wake + futex_wait, high CPU, log noise) but no forward progress. Often mistaken for a deadlock.

Canonical forms:

Retry-on-BUSY without backoff. N workers hit a contended resource, all retry immediately, none wins.
Condvar ping-pong. Two threads repeatedly wake each other on edges that are no longer true.
Accept-loop without backoff. accept4 returns EAGAIN, immediately retried; no poll, no sleep.
ABA in a lock-free structure. CAS succeeds on a value that was swapped out and back in; the operation corrupts state; other threads undo and retry.

Signature: 100% CPU, strace shows a tight loop of the same syscall, logs show retry messages stacked.

Fix catalog:

Exponential backoff with jitter. Every retry loop must have it. No exceptions.
Single-writer pattern. Serialize writes to a contested resource through one owner; readers proceed in parallel.
Fairness / queue-based locks. parking_lot is unfair by default; switch to fair() if starvation is observed.
Convert edge-triggered notifications to level-triggered. Store the desired state, not the transition; the consumer can always recompute.

Class 4 — Database Concurrency (SQLite-heavy)

The recurring pain points across our projects:

SQLITE_BUSY / "database is locked". Multiple connections want the write lock simultaneously. The loser fails.
WAL checkpoint starvation. Long-running readers prevent the WAL from being reset; writes continue to append; DB size explodes.
Connection-per-request with shared file. Every connection sees its own lock state; a long transaction in one blocks every other. Connection pool can hide the problem or make it worse.
Async blocking on sync driver. rusqlite::Connection is synchronous; using it from an async handler without spawn_blocking blocks the runtime thread.
PRAGMA left at defaults. No busy_timeout, no journal_mode=WAL, no synchronous=NORMAL. Every writer serializes with exclusive locks and no retry.
Transaction escalation. BEGIN followed by a read followed by a write upgrades the lock; another writer that's already in a write transaction now deadlocks.

Fix catalog:

Set PRAGMAs on every connection open:

PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA busy_timeout = 5000;     -- ms; SQLite will retry internally
PRAGMA foreign_keys = ON;
PRAGMA temp_store = MEMORY;
PRAGMA mmap_size = 268435456;

One writer connection, many reader connections. Serialize all writes through a single Mutex<Connection> or a single actor task. Readers can use a pool.
BEGIN IMMEDIATE for transactions that will write. Acquires the write lock up-front; prevents deferred-to-immediate upgrade deadlocks.
Outer retry on SQLITE_BUSY with exponential backoff + jitter, on top of the internal busy_timeout.
Checkpoint explicitly. PRAGMA wal_checkpoint(TRUNCATE) on a schedule or after bulk writes so WAL doesn't grow unbounded.
From async: always wrap sync SQLite calls in spawn_blocking. Or use sqlx/tokio-rusqlite which do it for you.

See DATABASE.md for the full WAL semantics reference, PRAGMA matrix, retry-with-backoff Rust template, and project-sourced incident reports.

Class 5 — Runtime-Init / Reentrant Hazards

Definition: Code that runs during early process/library initialization acquires a lock, and something on the init path re-enters the same lock (or a lock held by the loader itself).

The canonical case from glibc_rust: libglibc_rs_abi.so exports strlen. When loaded via LD_PRELOAD, the dynamic loader calls strlen during symbol resolution. strlen calls into the membrane crate, which touches a OnceLock holding global policy. OnceLock::get_or_init takes a lock. The allocator inside get_or_init also goes through the same libc and re-enters the ABI. Reentrant lock on a non-recursive primitive → infinite hang.

The broader rule: Any function that may be called before main — or by a library interposition — cannot use OnceLock, std::sync::Mutex, lazy_static, RwLock, or the allocator. All of these can block.

Static-audit signature:

ast-grep run -l Rust -p '$X::get_or_init($$$)'
rg -n 'OnceLock|OnceCell|Lazy|lazy_static|thread_local!' crates/<preload_lib>/

Every hit is a potential hazard in an LD_PRELOAD context.

Fix catalog:

Atomic state machine instead of OnceLock. Encode {UNINIT=0, INIT_IN_PROGRESS=1, INIT_DONE=2} in an AtomicU8; race losers spin-wait briefly (rare path) or fall back to a null-safe default.
Compile-time constant initialization wherever possible (const fn, static).
Deferred initialization. Don't initialize on first call; initialize lazily only on paths that are safe (after main).
Signal-safety: forbid allocation and locks in any code path that might run from a signal handler. This is the same class of hazard.
Test via LD_PRELOAD the binary against a small program that calls every exported function; any hang means reentrant init.

See LD-PRELOAD.md for the full incident + fix narrative from glibc-rust/frankenlibc sessions.

Class 6 — Data Races & TOCTOU

Definition: Unsynchronized concurrent access to the same memory; one of the accesses is a write. In a language with a defined memory model (Rust, Go, Java, C11+), this is undefined behavior.

TOCTOU (time-of-check-to-time-of-use): Check a condition, then act on it, assuming it's still true. It isn't.

Discovery: TSAN is ground truth. RUSTFLAGS="-Zsanitizer=thread" cargo +nightly build ... then run the test suite with high concurrency. For Go: go test -race. For C/C++: -fsanitize=thread.

Fix catalog:

Wrap shared state in Mutex / RwLock / Atomic. The compiler enforces this in Rust; listen to it.
Replace shared state with channels / message passing.
For counters: AtomicUsize with Ordering::Relaxed only if you've read the memory-ordering rules; otherwise SeqCst. Err on the side of stronger.
For TOCTOU: eliminate the gap. Use atomic compare_exchange, transactional updates, or hold the lock across check + action.

See gdb-for-debugging §"Race Condition Methodology" for the reproduce → detect → localize → fix → verify loop.

Class 7 — Multi-Process / Swarm Races

Definition: Multiple processes (or agents) contend for a shared resource — a file, a database, a tmux session, a git working tree — without in-process synchronization.

Our typical forms:

Two agents editing the same file. Agent A writes, Agent B writes; one overwrites the other's work.
Advisory lease expiry. An agent takes a file reservation with TTL=3600, runs longer, lease expires, another agent takes it; original finishes and writes over the new work.
Cross-process SQLite. Solved by Class 4 techniques plus PRAGMA locking_mode=NORMAL (not EXCLUSIVE).
tmux mux server wedged. A child process holds an fd the mux needs; the mux blocks on I/O; every pane hangs.

Fix catalog:

Use MCP Agent Mail file reservations. See agent-mail. file_reservation_paths with an appropriate TTL + a reason tying back to the bead/task. Release explicitly; don't rely on TTL.
flock(2) for filesystem-only coordination. Advisory, cooperative. Every consumer must call it.
Single-writer process for the DB. Other processes submit work via a queue or RPC.
Monitor mux health. wezterm-mux-server is sacred — protect it explicitly (see system-performance-remediation).

Class 8 — Poisoning & Partial State

Definition: A thread panics while holding a Mutex. Rust's std::sync::Mutex poisons the mutex; subsequent .lock() calls return Err(PoisonError). If the panic left shared state partially updated, every caller must now decide: trust or discard.

Fix catalog:

parking_lot::Mutex does not poison. It's faster and simpler, but callers must handle partial state explicitly.
Treat every critical section as a transaction. Build the new state in a local, then swap atomically at the end. A panic mid-build leaves the shared state untouched.
Catch-unwind at the task boundary. If a task panics, let the runtime notice and restart it; don't let the panic propagate through shared state.

Class 9 — Memory Ordering & Lost Wakeups

Definition: Correct locks, incorrect assumptions about visibility or ordering. The observed behavior seems to violate program order — because it does, on the CPU's reordered view.

Canonical forms:

Wake-before-wait. Thread A stores a flag, signals condvar; Thread B hasn't entered wait yet; signal is lost.
Lost notification on an edge. Use of Notify::notify_one before notified().await — the notification is dropped.
Wrong memory ordering. Ordering::Relaxed on a pointer publication — reader sees a garbage object because the initializer store hasn't become visible.

Fix catalog:

Always pair a condvar with a predicate. while !ready { cv.wait(lock) }. Never if.
Store the predicate before signaling. Signal after the condition is true.
Prefer tokio Notify with notified() set up before the event can happen (see Tokio Notify docs — the notified() future must be polled at least once to subscribe).
Use Ordering::Release for the producer store and Ordering::Acquire for the consumer load when publishing a pointer / building an atomic state machine. Never Relaxed for data publication.

The Discovery Playbook

When a bug has been reported:

Capture state before touching anything. gdb thread apply all bt full → file. Once the process dies, the evidence is gone.
Classify using the Symptom Triage Table. Jump to the matching class above.
Build the lock-wait graph (gdb-for-debugging §Lock Graph). Prove the deadlock exists before you "fix" it — guessing is how you get four more of them.
Fix the root cause, not the symptom. A timeout on a deadlocked .lock() is a smoke alarm, not a fire extinguisher.
Audit for the other three. Every deadlock you find is a sample from a distribution. Run the static-audit recipes to find the rest. See THE FOURTH INSTANCE.
Add a regression test. Stress tests with --test-threads=N and loom (Rust) or go test -race. Fuzz the scheduler with rr --chaos if you have it.

When doing a preemptive audit (no bug reported yet):

Run the static audit recipes (STATIC-AUDIT.md) — ast-grep + ripgrep across the codebase for every pattern in the nine classes.
Enable parking_lot deadlock detection in debug builds; run the test suite. Any detection is a proof of deadlock.
Run TSAN on the test suite.
Run loom (if Rust) on the core concurrency primitives of the project.
Review every unsafe impl Send/Sync. Each one is a hand-written promise the compiler couldn't check.

Static Audit Recipes (High-ROI Greps)

See STATIC-AUDIT.md for the full catalog. Highlights:

# Rust: guard held across await (manual inspection required)
rg -n --type rust -U 'let\s+\w+\s*=\s*.*\.(lock|read|write)\(\).*\n[^}]*\.await' .

# Rust: std::sync::Mutex inside async fn (smell)
ast-grep run -l Rust -p 'async fn $F($$$) { $$$ std::sync::Mutex $$$ }'

# Rust: block_on inside anywhere (double-check: may be inside a sync bridge)
rg -n --type rust 'block_on' .

# Rust: OnceLock / Lazy in LD_PRELOAD libs (Class 5)
rg -n --type rust 'OnceLock|OnceCell|Lazy::new|lazy_static!|thread_local!' crates/<preload>/

# Two different lock orderings in the same code (Class 1)
rg -n --type rust 'let\s+\w+\s*=\s*self\.\w+\.lock\(\)' . | sort -u

# SQLite: missing busy_timeout (Class 4)
rg -n 'Connection::open|open_in_memory' . | rg -v 'busy_timeout'

# Rust: unbounded channel (Class 2 back-pressure risk)
rg -n 'unbounded_channel|mpsc::unbounded' --type rust .

# Missing fairness on rwlock (Class 3)
rg -n 'RwLock::new' --type rust .  # review each for writer-starvation risk

Fix Catalog (Canonical Replacements)

See FIX-CATALOG.md. Summary:

| Broken Pattern | Replace With | Why | |----------------|--------------|-----| | OnceLock on LD_PRELOAD path | AtomicU8 state machine | No allocator, no reentrancy | | std::sync::Mutex held across .await | Scoped guard dropped before .await | Task yield with lock is a bug | | Deep call holding two locks | Total lock order + assertion | Eliminate cycle possibility | | Retry-on-BUSY tight loop | Exponential backoff + jitter | Break livelock | | Connection-per-request SQLite | Single writer, read pool | Prevent lock escalation storms | | Shared Mutex<Vec<Work>> | mpsc::channel + actor | No lock for producers | | lazy_static in LD_PRELOAD | const / compile-time init | No lock needed | | std::Mutex + panic risk | parking_lot::Mutex + transaction-style updates | No poisoning, clearer semantics | | flock only in-process | flock + app-level lease + TTL | Multi-process coordination |

Prevention by Design

Prefer message passing over shared state. An actor with a private mutable state + mpsc inbox has zero lock-ordering bugs, by construction.
Single writer, many readers. Most of our DB and state-sharing incidents would not have happened if writes were serialized through one owner.
Ranks for locks. Assign a total order; assert ascending acquisition. Loom or parking_lot can check in tests.
Bound every queue / channel. Unbounded is a leak; explicit bounds force you to reason about backpressure.
Time-bound every wait. try_lock_for(Duration) over .lock(); timeout(Duration, fut).await over bare .await. Every hang becomes a log line, not a stall.
Encapsulate; don't share handles to shared state. Give callers a method that does the operation, not a guard.
Fix the worst class first. By volume of pain across our incidents, the ranking is: Class 4 (DB) > Class 2 (async) > Class 5 (runtime init) > Class 7 (swarm) > Class 1 (classic). Look at your own codebase and adjust.

Validation

Before you declare a concurrency fix done:

[ ] Reproduce reliably, or accept that you can't and run the fix past stress + TSAN + loom.
[ ] Add a test that would have caught it: #[test] with --test-threads=N, or loom::model, or a stress harness with N=100× the old workload.
[ ] Run the static audit (STATIC-AUDIT.md) and find every other instance of the same hazard.
[ ] Document the fix in commit/bead so future-you knows what you changed and why.
[ ] parking_lot deadlock_detection on in debug builds; run tests; no detections.
[ ] TSAN clean on the test suite (Rust/C/Go).
[ ] loom::model passes for the critical primitive if Rust.

References

Core

| Topic | Reference | |-------|-----------| | The Fourth Instance (find ALL hazards, not just one) | THE-FOURTH-INSTANCE.md | | Static-audit recipes (ast-grep + ripgrep, all languages) | STATIC-AUDIT.md | | Fix catalog (14+ canonical replacements) | FIX-CATALOG.md | | Diagnosis techniques (pointers to gdb-for-debugging) | DIAGNOSIS.md | | Anti-patterns (what NOT to do, all classes) | ANTI-PATTERNS.md | | Incident narratives (8+ real project stories) | INCIDENTS.md | | Validation tooling (TSAN, loom, miri, parking_lot, rr) | VALIDATION.md |

Language Cookbooks

| Language | Reference | |----------|-----------| | Rust (asupersync) — PRIMARY: Cx, Scope, obligations, lab/DPOR, structured concurrency | ASUPERSYNC.md | | Rust (tokio/std ecosystem) — tokio, parking_lot, crossbeam, rayon, dashmap, sqlx | RUST.md | | Go — goroutines, channels, sync, context, errgroup, pprof, race detector | GO.md | | Python — GIL, asyncio, threading, multiprocessing, trio/anyio, py-spy | PYTHON.md | | TypeScript / Node.js — event loop, promises, worker_threads, React, Next.js, Prisma | TYPESCRIPT.md |

Domain Deep-Dives

| Topic | Reference | |-------|-----------| | Database concurrency (SQLite WAL, PRAGMAs, retries) | DATABASE.md | | LD_PRELOAD / reentrant init (glibc-rust incident) | LD-PRELOAD.md | | Async / await (cross-language async patterns) | ASYNC.md | | Multi-process / swarm (agent-mail, flock, leases) | SWARM.md | | Distributed concurrency (Redlock, pg_advisory, etcd, CRDTs, saga, outbox) | DISTRIBUTED.md | | Creative patterns (actor, STM, CSP, structured concurrency, single-writer, "do nothing") | CREATIVE-PATTERNS.md | | Lock-free (CAS, ABA, epoch reclamation, seqlocks, flat combiner, HTM) | LOCK-FREE.md | | Formal methods (loom, DPOR, TLA+, miri, linearizability, evidence ledgers) | FORMAL-METHODS.md | | Resilience patterns (circuit breaker, bulkhead, singleflight, backpressure, hedge, quorum) | RESILIENCE-PATTERNS.md | | Concurrency operators (composable diagnostic moves with triggers + failure modes + prompts) | CONCURRENCY-OPERATORS.md | | C/C++ systems (pthread, memory model, signal safety, fork hazards, io_uring, epoll) | C-CPP.md | | Database advanced (Postgres advisory, SKIP LOCKED, SSI, MVCC, Prisma/Drizzle, Redis) | DATABASE-ADVANCED.md | | Cookbook index (dispatch by language, topic, or bug class) | COOKBOOK-INDEX.md | | Cross-language matrix (primitive equivalents, same-bug-different-language, detection tools) | CROSS-LANGUAGE.md |

Companion Skills

| Skill | Use When | |-------|----------| | /cs/gdb-for-debugging/ | Lock-graph construction, async runtime debugging, TSAN, rr | | /cs/asupersync-mega-skill/ | Full asupersync runtime, migration, all reference files | | /cs/agent-mail/ | Advisory file reservations, multi-agent coordination | | /cs/system-performance-remediation/ | Process triage, kill hierarchy, mux protection |

Deadlock Finder and Fixer

Core Insight. Concurrency bugs do not come from one missing lock — they come from one lock acquired in the wrong place, at the wrong time, held across the wrong operation, by a thread that didn't know it was holding it. Find every instance of the hazard, not just the one that fired.

The Universal Rule. When you think you found the deadlock and fixed the three instances you could see, there is almost always a fourth. This is the single most common failure mode across every concurrency debugging session in this repo's history. Keep searching until you can prove exhaustively — by code audit — that no hazard remains. See THE FOURTH INSTANCE.

Quick Start: Something Is Hung

# 1. Is it CPU-alive or CPU-dead?
ps -Lp $PID -o tid,pcpu,pmem,comm --no-headers | head -20

# 2. Snapshot all thread states (pick ONE, in order of availability):
gdb --batch -ex "set pagination off" -ex "thread apply all bt full" -p $PID 2>&1 | tee /tmp/bt.txt
#   OR (if ptrace blocked / LD_PRELOAD hazard):
strace -k -f -p $PID 2>&1 | head -200
#   OR (sample /proc):
for i in 1 2 3; do cat /proc/$PID/task/*/stack 2>/dev/null | sort -u; sleep 1; done

# 3. Classify (pick the matching row from the Symptom Triage Table below).
# 4. Jump to the matching section in this skill or in gdb-for-debugging.

Symptom Triage Table

The Nine Classes (Taxonomy)

Class 1 — Classic Mutex Deadlock

Definition: Two or more threads each hold a lock the other needs; circular wait in the lock-wait graph.

Canonical forms:

AB-BA: T1 holds A, wants B; T2 holds B, wants A.
Self-deadlock: Single thread re-enters the same non-recursive mutex (e.g., callback path re-enters lock holder).
Reader-upgrade: Thread holds RwLock::read, then asks for RwLock::write in the same thread → guaranteed hang.
Condvar wakeup loop: Thread in pthread_cond_wait on M; its waker needs to acquire M to signal but can't.

Fix catalog:

Total lock order. Assign every mutex a global rank; assert in debug builds that locks are only acquired in ascending order. parking_lot::deadlock detector can enforce this at runtime.
Lock coalescing. Replace two separate mutexes with one covering both pieces of state if they're always used together.
Critical-section shrinking. Copy needed data out, release the lock, then do the work without it.
Don't hold a lock across a call you don't own. Never call user callbacks, foreign functions, or allocator hooks while holding a lock — they may re-enter.

Class 2 — Async / `.await` Deadlocks

Definition: The logical task graph has a cycle, or a task that holds a non-.await-aware lock yields to the runtime and is never re-polled because the next task needs the same lock.

Canonical forms:

std::sync::Mutex held across .await. The guard crosses the yield point; the task is parked with the lock still held; another task needs the lock and blocks the worker thread.
block_on inside an async runtime. Runtime thread enters a synchronous wait; the thing it's waiting for needs the runtime to make progress.
spawn_blocking missing (or misused for sync I/O from async context).
Channel cycle. Task A sends to a bounded channel whose reader is Task B; Task B sends to a bounded channel whose reader is Task A; both blocked on full/empty.
JoinHandle cycle. Task A .awaits B's handle; B .awaits A's.
Task starvation. A long-running task on a tokio-runtime-worker monopolizes the thread; other tasks cannot be polled. Looks like a deadlock to the end user.

Signature at runtime: workers idle in epoll_wait, but requests pending. See gdb-for-debugging §"Diagnosing Async Deadlocks".

Fix catalog:

Drop the guard before .await. Explicitly: let data = { let g = lock.lock(); g.clone() }; do_io(data).await;.
Use tokio::sync::Mutex only when you must hold the lock across .await. It is slower — prefer dropping the guard.
spawn_blocking for synchronous I/O from an async context (synchronous SQLite, std::fs::read, CPU-heavy work, C library calls).
Replace shared state with channels. Actors own their state; requests arrive via mpsc; replies via oneshot. No shared mutex, no lock-order bugs.
Bound channels carefully. Unbounded is a memory leak; bounded risks backpressure deadlock if the sender can't make progress without the receiver running. When in doubt, prefer try_send + drop-oldest policy.
Never block_on inside an async context. If you must bridge, use Handle::current().spawn_blocking(...) or restructure to avoid the bridge.

Class 3 — Livelock / Retry Storms

Definition: Threads make visible activity (futex_wake + futex_wait, high CPU, log noise) but no forward progress. Often mistaken for a deadlock.

Canonical forms:

Retry-on-BUSY without backoff. N workers hit a contended resource, all retry immediately, none wins.
Condvar ping-pong. Two threads repeatedly wake each other on edges that are no longer true.
Accept-loop without backoff. accept4 returns EAGAIN, immediately retried; no poll, no sleep.
ABA in a lock-free structure. CAS succeeds on a value that was swapped out and back in; the operation corrupts state; other threads undo and retry.

Signature: 100% CPU, strace shows a tight loop of the same syscall, logs show retry messages stacked.

Fix catalog:

Exponential backoff with jitter. Every retry loop must have it. No exceptions.
Single-writer pattern. Serialize writes to a contested resource through one owner; readers proceed in parallel.
Fairness / queue-based locks. parking_lot is unfair by default; switch to fair() if starvation is observed.
Convert edge-triggered notifications to level-triggered. Store the desired state, not the transition; the consumer can always recompute.

Class 4 — Database Concurrency (SQLite-heavy)

The recurring pain points across our projects:

SQLITE_BUSY / "database is locked". Multiple connections want the write lock simultaneously. The loser fails.
WAL checkpoint starvation. Long-running readers prevent the WAL from being reset; writes continue to append; DB size explodes.
Connection-per-request with shared file. Every connection sees its own lock state; a long transaction in one blocks every other. Connection pool can hide the problem or make it worse.
Async blocking on sync driver. rusqlite::Connection is synchronous; using it from an async handler without spawn_blocking blocks the runtime thread.
PRAGMA left at defaults. No busy_timeout, no journal_mode=WAL, no synchronous=NORMAL. Every writer serializes with exclusive locks and no retry.
Transaction escalation. BEGIN followed by a read followed by a write upgrades the lock; another writer that's already in a write transaction now deadlocks.

Fix catalog:

Set PRAGMAs on every connection open:

PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA busy_timeout = 5000;     -- ms; SQLite will retry internally
PRAGMA foreign_keys = ON;
PRAGMA temp_store = MEMORY;
PRAGMA mmap_size = 268435456;

One writer connection, many reader connections. Serialize all writes through a single Mutex<Connection> or a single actor task. Readers can use a pool.
BEGIN IMMEDIATE for transactions that will write. Acquires the write lock up-front; prevents deferred-to-immediate upgrade deadlocks.
Outer retry on SQLITE_BUSY with exponential backoff + jitter, on top of the internal busy_timeout.
Checkpoint explicitly. PRAGMA wal_checkpoint(TRUNCATE) on a schedule or after bulk writes so WAL doesn't grow unbounded.
From async: always wrap sync SQLite calls in spawn_blocking. Or use sqlx/tokio-rusqlite which do it for you.

See DATABASE.md for the full WAL semantics reference, PRAGMA matrix, retry-with-backoff Rust template, and project-sourced incident reports.

Class 5 — Runtime-Init / Reentrant Hazards

Definition: Code that runs during early process/library initialization acquires a lock, and something on the init path re-enters the same lock (or a lock held by the loader itself).

Static-audit signature:

ast-grep run -l Rust -p '$X::get_or_init($$$)'
rg -n 'OnceLock|OnceCell|Lazy|lazy_static|thread_local!' crates/<preload_lib>/

Every hit is a potential hazard in an LD_PRELOAD context.

Fix catalog:

Atomic state machine instead of OnceLock. Encode {UNINIT=0, INIT_IN_PROGRESS=1, INIT_DONE=2} in an AtomicU8; race losers spin-wait briefly (rare path) or fall back to a null-safe default.
Compile-time constant initialization wherever possible (const fn, static).
Deferred initialization. Don't initialize on first call; initialize lazily only on paths that are safe (after main).
Signal-safety: forbid allocation and locks in any code path that might run from a signal handler. This is the same class of hazard.
Test via LD_PRELOAD the binary against a small program that calls every exported function; any hang means reentrant init.

See LD-PRELOAD.md for the full incident + fix narrative from glibc-rust/frankenlibc sessions.

Class 6 — Data Races & TOCTOU

Definition: Unsynchronized concurrent access to the same memory; one of the accesses is a write. In a language with a defined memory model (Rust, Go, Java, C11+), this is undefined behavior.

TOCTOU (time-of-check-to-time-of-use): Check a condition, then act on it, assuming it's still true. It isn't.

Discovery: TSAN is ground truth. RUSTFLAGS="-Zsanitizer=thread" cargo +nightly build ... then run the test suite with high concurrency. For Go: go test -race. For C/C++: -fsanitize=thread.

Fix catalog:

Wrap shared state in Mutex / RwLock / Atomic. The compiler enforces this in Rust; listen to it.
Replace shared state with channels / message passing.
For counters: AtomicUsize with Ordering::Relaxed only if you've read the memory-ordering rules; otherwise SeqCst. Err on the side of stronger.
For TOCTOU: eliminate the gap. Use atomic compare_exchange, transactional updates, or hold the lock across check + action.

See gdb-for-debugging §"Race Condition Methodology" for the reproduce → detect → localize → fix → verify loop.

Class 7 — Multi-Process / Swarm Races

Definition: Multiple processes (or agents) contend for a shared resource — a file, a database, a tmux session, a git working tree — without in-process synchronization.

Our typical forms:

Two agents editing the same file. Agent A writes, Agent B writes; one overwrites the other's work.
Advisory lease expiry. An agent takes a file reservation with TTL=3600, runs longer, lease expires, another agent takes it; original finishes and writes over the new work.
Cross-process SQLite. Solved by Class 4 techniques plus PRAGMA locking_mode=NORMAL (not EXCLUSIVE).
tmux mux server wedged. A child process holds an fd the mux needs; the mux blocks on I/O; every pane hangs.

Fix catalog:

Use MCP Agent Mail file reservations. See agent-mail. file_reservation_paths with an appropriate TTL + a reason tying back to the bead/task. Release explicitly; don't rely on TTL.
flock(2) for filesystem-only coordination. Advisory, cooperative. Every consumer must call it.
Single-writer process for the DB. Other processes submit work via a queue or RPC.
Monitor mux health. wezterm-mux-server is sacred — protect it explicitly (see system-performance-remediation).

Class 8 — Poisoning & Partial State

Fix catalog:

parking_lot::Mutex does not poison. It's faster and simpler, but callers must handle partial state explicitly.
Treat every critical section as a transaction. Build the new state in a local, then swap atomically at the end. A panic mid-build leaves the shared state untouched.
Catch-unwind at the task boundary. If a task panics, let the runtime notice and restart it; don't let the panic propagate through shared state.

Class 9 — Memory Ordering & Lost Wakeups

Definition: Correct locks, incorrect assumptions about visibility or ordering. The observed behavior seems to violate program order — because it does, on the CPU's reordered view.

Canonical forms:

Wake-before-wait. Thread A stores a flag, signals condvar; Thread B hasn't entered wait yet; signal is lost.
Lost notification on an edge. Use of Notify::notify_one before notified().await — the notification is dropped.
Wrong memory ordering. Ordering::Relaxed on a pointer publication — reader sees a garbage object because the initializer store hasn't become visible.

Fix catalog:

Always pair a condvar with a predicate. while !ready { cv.wait(lock) }. Never if.
Store the predicate before signaling. Signal after the condition is true.
Prefer tokio Notify with notified() set up before the event can happen (see Tokio Notify docs — the notified() future must be polled at least once to subscribe).
Use Ordering::Release for the producer store and Ordering::Acquire for the consumer load when publishing a pointer / building an atomic state machine. Never Relaxed for data publication.

The Discovery Playbook

When a bug has been reported:

Capture state before touching anything. gdb thread apply all bt full → file. Once the process dies, the evidence is gone.
Classify using the Symptom Triage Table. Jump to the matching class above.
Build the lock-wait graph (gdb-for-debugging §Lock Graph). Prove the deadlock exists before you "fix" it — guessing is how you get four more of them.
Fix the root cause, not the symptom. A timeout on a deadlocked .lock() is a smoke alarm, not a fire extinguisher.
Audit for the other three. Every deadlock you find is a sample from a distribution. Run the static-audit recipes to find the rest. See THE FOURTH INSTANCE.
Add a regression test. Stress tests with --test-threads=N and loom (Rust) or go test -race. Fuzz the scheduler with rr --chaos if you have it.

When doing a preemptive audit (no bug reported yet):

Run the static audit recipes (STATIC-AUDIT.md) — ast-grep + ripgrep across the codebase for every pattern in the nine classes.
Enable parking_lot deadlock detection in debug builds; run the test suite. Any detection is a proof of deadlock.
Run TSAN on the test suite.
Run loom (if Rust) on the core concurrency primitives of the project.
Review every unsafe impl Send/Sync. Each one is a hand-written promise the compiler couldn't check.

Static Audit Recipes (High-ROI Greps)

See STATIC-AUDIT.md for the full catalog. Highlights:

# Rust: guard held across await (manual inspection required)
rg -n --type rust -U 'let\s+\w+\s*=\s*.*\.(lock|read|write)\(\).*\n[^}]*\.await' .

# Rust: std::sync::Mutex inside async fn (smell)
ast-grep run -l Rust -p 'async fn $F($$$) { $$$ std::sync::Mutex $$$ }'

# Rust: block_on inside anywhere (double-check: may be inside a sync bridge)
rg -n --type rust 'block_on' .

# Rust: OnceLock / Lazy in LD_PRELOAD libs (Class 5)
rg -n --type rust 'OnceLock|OnceCell|Lazy::new|lazy_static!|thread_local!' crates/<preload>/

# Two different lock orderings in the same code (Class 1)
rg -n --type rust 'let\s+\w+\s*=\s*self\.\w+\.lock\(\)' . | sort -u

# SQLite: missing busy_timeout (Class 4)
rg -n 'Connection::open|open_in_memory' . | rg -v 'busy_timeout'

# Rust: unbounded channel (Class 2 back-pressure risk)
rg -n 'unbounded_channel|mpsc::unbounded' --type rust .

# Missing fairness on rwlock (Class 3)
rg -n 'RwLock::new' --type rust .  # review each for writer-starvation risk

Fix Catalog (Canonical Replacements)

See FIX-CATALOG.md. Summary:

Prevention by Design

Prefer message passing over shared state. An actor with a private mutable state + mpsc inbox has zero lock-ordering bugs, by construction.
Single writer, many readers. Most of our DB and state-sharing incidents would not have happened if writes were serialized through one owner.
Ranks for locks. Assign a total order; assert ascending acquisition. Loom or parking_lot can check in tests.
Bound every queue / channel. Unbounded is a leak; explicit bounds force you to reason about backpressure.
Time-bound every wait. try_lock_for(Duration) over .lock(); timeout(Duration, fut).await over bare .await. Every hang becomes a log line, not a stall.
Encapsulate; don't share handles to shared state. Give callers a method that does the operation, not a guard.
Fix the worst class first. By volume of pain across our incidents, the ranking is: Class 4 (DB) > Class 2 (async) > Class 5 (runtime init) > Class 7 (swarm) > Class 1 (classic). Look at your own codebase and adjust.

Validation

Before you declare a concurrency fix done:

[ ] Reproduce reliably, or accept that you can't and run the fix past stress + TSAN + loom.
[ ] Add a test that would have caught it: #[test] with --test-threads=N, or loom::model, or a stress harness with N=100× the old workload.
[ ] Run the static audit (STATIC-AUDIT.md) and find every other instance of the same hazard.
[ ] Document the fix in commit/bead so future-you knows what you changed and why.
[ ] parking_lot deadlock_detection on in debug builds; run tests; no detections.
[ ] TSAN clean on the test suite (Rust/C/Go).
[ ] loom::model passes for the critical primitive if Rust.

Adoption

tkersey/deadlock-finder-and-fixer

$ install --global

Security Scan Results

SKILL.md

Deadlock Finder and Fixer

Quick Start: Something Is Hung

Symptom Triage Table

The Nine Classes (Taxonomy)

Class 1 — Classic Mutex Deadlock

Class 2 — Async / .await Deadlocks

Class 3 — Livelock / Retry Storms

Class 4 — Database Concurrency (SQLite-heavy)

Class 5 — Runtime-Init / Reentrant Hazards

Class 6 — Data Races & TOCTOU

Class 7 — Multi-Process / Swarm Races

Class 8 — Poisoning & Partial State

Class 9 — Memory Ordering & Lost Wakeups

The Discovery Playbook

Static Audit Recipes (High-ROI Greps)

Fix Catalog (Canonical Replacements)

Prevention by Design

Validation

References

Core

Language Cookbooks

Domain Deep-Dives

Companion Skills

Related Skills

tkersey/fm

tkersey/hylo

tkersey/ledger

tkersey/review-fold

tkersey/deadlock-finder-and-fixer

$ install --global

Security Scan Results

SKILL.md

Deadlock Finder and Fixer

Quick Start: Something Is Hung

Symptom Triage Table

The Nine Classes (Taxonomy)

Class 1 — Classic Mutex Deadlock

Class 2 — Async / .await Deadlocks

Class 3 — Livelock / Retry Storms

Class 4 — Database Concurrency (SQLite-heavy)

Class 5 — Runtime-Init / Reentrant Hazards

Class 6 — Data Races & TOCTOU

Class 7 — Multi-Process / Swarm Races

Class 8 — Poisoning & Partial State

Class 9 — Memory Ordering & Lost Wakeups

The Discovery Playbook

Static Audit Recipes (High-ROI Greps)

Fix Catalog (Canonical Replacements)

Prevention by Design

Validation

References

Core

Language Cookbooks

Domain Deep-Dives

Companion Skills

Related Skills

tkersey/fm

tkersey/hylo

tkersey/ledger

tkersey/review-fold

Class 2 — Async / `.await` Deadlocks

Class 2 — Async / `.await` Deadlocks