.claude/skills/dist-sys-auditor/SKILL.md
Use when adding or modifying coordination protocols, implementing consensus or gossip mechanisms, or changing distributed state management. Audits designs against academic literature and battle-tested systems with citation requirements.
npx skillsauth add ahrav/gossip-rs dist-sys-auditorInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Enforce evidence-backed distributed systems design and implementation for the gossip-rs project. Every coordination decision must trace to established academic literature, a battle-tested production system, or a formally verified specification. No novel protocols. No uncited coordination logic.
gossip-coordination, or any code touching
Boundary 2 (Coordination) contractsgossip-contracts that relate to
coordination, cursors, shards, epochs, or OpIdsOn every invocation, read these files before doing anything else:
tmp/gossip-project-artifacts/distributed-secret-scanner-instructions.md
— master instructions with evidence hierarchy, anti-patterns, citation tabletmp/gossip-project-artifacts/gossip-contracts-references.md
— curated references organized by boundaryThese are the primary evidence sources. Patterns found here are pre-approved.
If the current work touches specific boundary contracts, also read the relevant
boundary draft files in tmp/gossip-project-artifacts/boundary_*.rs to check
locked decisions and specified APIs.
Design Mode — you are in a brainstorming session, plan mode, writing a design doc, or discussing architecture without modifying a specific code file.
Implementation Mode — you are writing, reviewing, or modifying Rust code that embodies coordination patterns.
If both apply (designing while coding), use both workflows.
Verify the proposal does not contradict any locked architectural decision:
last_key (no regression to None)If a conflict is found, BLOCK and explain the contradiction.
Verify against the 7 explicit prohibitions from Section 9 of the instructions:
If any anti-pattern is detected, BLOCK with the specific prohibition number and reference.
Every design decision must be presented in this format:
## Distributed Systems Decision: [topic]
### Locked Decisions Check
- [x] No conflict with D2.1–D2.5, D3.3, Decision A/B
(or [ ] Conflicts with [decision] — resolution required)
### Anti-Pattern Check
- [x] No novel consensus protocols
- [x] No clock synchronization assumptions
- [x] No conflating "no response" with "failure"
- [x] No unbounded queues
- [x] No wall-clock timeouts for correctness
- [x] No silent work drops
- [x] No network reliability assumptions
### Decision Record
| Field | Detail |
|-------|--------|
| **Problem** | [what we're deciding] |
| **Approach** | [pattern/technique name] |
| **Provenance** | [Level N] — [citation with URL/paper] |
| **Fit** | [why this matches our constraints] |
| **Trade-offs** | [what we give up] |
| **Safety invariant** | [what must never happen] |
| **Liveness invariant** | [what must eventually happen] |
| **Verification** | [how we test invariants hold] |
### Alternatives Considered
| Alternative | Why Rejected | Reference |
|-------------|-------------|-----------|
| [option] | [reason] | [citation] |
### Open Questions
[anything unresolved, flagged for follow-up]
The Provenance field is mandatory. Use this evidence hierarchy — prefer higher levels:
| Level | Label | Example | |-------|-------|---------| | 5 | Formally verified | TLA+ specs, Coq/Isabelle proofs (IronFleet, Verdi) | | 4 | Peer-reviewed | SOSP, OSDI, NSDI, EuroSys, VLDB, SIGMOD papers | | 3 | Battle-tested | FoundationDB, CockroachDB, Kafka, etcd, TigerBeetle | | 2 | Industry whitepapers | AWS, Google, Meta engineering blogs with depth | | 1 | Textbooks | DDIA (Kleppmann), Attiya & Welch, Tanenbaum |
If no citation from levels 1–5 exists, the recommendation is NOVEL and must be explicitly flagged:
**Provenance**: Level 0 — NOVEL/UNJUSTIFIED
**WARNING**: No established reference found. This pattern requires:
- Extra review from a second opinion
- Property-based tests covering all invariants
- Consider whether an existing proven pattern could substitute
If the pattern is not covered by the curated references, do targeted web research. Search for:
"{pattern name}" paper SOSP OSDI VLDB"{pattern name}" production implementation"{pattern name}" formal verification TLA+"{pattern name}" failure mode post-mortemWhen a valuable reference is found, recommend adding it to
gossip-contracts-references.md:
### Suggested Reference Addition
Add to section: [boundary section name]
| Title | Authors / Source | Year | Why It Matters |
|-------|-----------------|------|----------------|
| [title](URL) | [authors] | [year] | [relevance to our project] |
Scan the code being written or modified for these patterns:
For each coordination pattern identified, produce:
**Pattern: [name]** (file:line_start–line_end)
- **Invariant**: [the safety or liveness property this code upholds]
- **Reference**: [paper/system/post] — [URL or citation from references doc]
- **Violation scenario**: [concrete description of what breaks — e.g.,
"zombie worker with expired lease commits stale results, overwriting
valid progress from the new lease holder"]
- **Verification**: [specific test — e.g., "property test: for any sequence
of checkpoint operations, cursor.last_key is monotonically non-decreasing"]
These are hard safety invariants (D2.3, D2.4). Explicitly verify:
### Monotonicity & Ordering Check
- [ ] Cursor advancement: `new_key >= old_key` enforced
- [ ] Epoch/fence tokens: monotonically increasing, no reuse
- [ ] Shard bounds: half-open [start, end), start < end
- [ ] Checkpoint regression: no regression of last_key to None (D2.5)
If any of these cannot be verified from the code, emit a BLOCK.
Any coordination logic that doesn't map to a known pattern with a citation gets flagged:
### Flags
| Severity | Location | Issue |
|----------|----------|-------|
| BLOCK | file:NN | No citation for [pattern] — NOVEL/UNJUSTIFIED |
| BLOCK | file:NN | Contradicts locked decision [D2.X] |
| BLOCK | file:NN | Anti-pattern violation: [prohibition #N] |
| WARN | file:NN | Evidence level 1-2 only — consider stronger reference |
| WARN | file:NN | Missing verification strategy for [invariant] |
| INFO | file:NN | Stronger test available: [suggestion] |
Identify coordination patterns in the code that lack corresponding tests or property-based verification:
### Missing Coverage
| Pattern | Invariant | Suggested Verification |
|---------|-----------|----------------------|
| [pattern] | [invariant] | [test type + description] |
These are non-negotiable. Code must conform:
| ID | Decision | Source |
|----|----------|--------|
| D2.1 | Cursor is two-layer: last_key (coordinator-visible) + token (connector-opaque) | Spanner query restart (2017) |
| D2.2 | ShardSpec uses half-open [start, end) byte ranges | Spanner, Bigtable, FoundationDB |
| D2.3 | Cursor monotonicity: coordinator rejects new_key < old_key | Hard safety invariant |
| D2.4 | Cursor bounds: cursor must fall within shard's [start, end) | Hard safety invariant |
| D2.5 | Checkpoints require last_key — no regression to None | Hard safety invariant |
| D3.3 | Lex-ordered byte strings for all key ranges | Bigtable, FoundationDB, CockroachDB |
| A | No custom leader election — use consensus backends (etcd/Raft, ZK/ZAB) | Ongaro & Ousterhout 2014 |
| B | No gossip protocols — use leases + fencing | Gray & Cheriton 1989 |
| # | Prohibition | Reference | |---|-------------|-----------| | 1 | No novel consensus protocols | Use Raft via etcd or existing impl | | 2 | No clock synchronization assumptions | Use logical clocks or HLCs | | 3 | No conflating "no response" with "failure" | FLP result (Fischer, Lynch, Paterson 1985) | | 4 | No unbounded queues | Explicit backpressure required | | 5 | No wall-clock timeouts for correctness | Clocks for liveness only | | 6 | No silent work drops | Every scan unit reaches terminal state | | 7 | No network reliability assumptions | Bailis & Kingsbury 2014 |
| Concept | Primary Reference | |---------|-------------------| | Leases | Gray & Cheriton, SOSP 1989 | | Fencing tokens | Kleppmann, "How to do distributed locking" (2016) | | Consensus (Raft) | Ongaro & Ousterhout, USENIX ATC 2014 | | Range sharding | Corbett et al. (Spanner), OSDI 2012 | | Cursor restart | Bacon et al. (Spanner), 2017 | | Idempotency keys | Stripe (Brandur Leach), 2017 | | Exactly-once | Akidau et al. (Dataflow Model), VLDB 2015 | | Linearizability | Herlihy & Wing, TOPLAS 1990 | | Distributed snapshots | Chandy & Lamport, TOCS 1985 | | Failure detection | Hayashibara et al. (Phi accrual), SRDS 2004 | | Work stealing | Blumofe & Leiserson, JACM 1999 | | FLP impossibility | Fischer, Lynch, Paterson, JACM 1985 | | Sagas | Garcia-Molina & Salem, SIGMOD 1987 | | WAL / ARIES | Mohan et al., TODS 1992 | | DST | FoundationDB (Apple, 2021) | | TLA+ in industry | Newcombe et al., CACM 2015 | | Causal ordering | Lamport, CACM 1978 | | HLC | Kulkarni et al., OPODIS 2014 |
/deep-research — use before this skill when entering entirely new
territory. Deep-research gathers evidence; dist-sys-auditor enforces it./interface-design-review — complementary: checks API ergonomics;
dist-sys-auditor checks distributed correctness. Run both on coordination APIs./design-tournament — use when multiple viable approaches surface.
Feed tournament results into dist-sys-auditor for citation validation./test-strategy — dist-sys-auditor recommends verification strategies;
test-strategy helps implement them./security-reviewer — covers memory safety; dist-sys-auditor covers
coordination safety (zombie writes, split-brain, lost progress)./doc-rigor — after dist-sys-auditor approves a pattern, doc-rigor
ensures documentation captures why the pattern was chosen with its citation.development
Deep first-principles code explanation that builds real understanding through phased walkthroughs with diagrams. Covers algorithms, data structures, memory layout, concurrency patterns, and performance tricks — especially for systems code in Rust. Use whenever the user asks to explain, walk through, break down, deep dive into, or understand code. Trigger on "how does this work", "what's happening here", "teach me about this", "why is it done this way", or when the user references a file with @ and wants to understand it. Proactively use when examining code involving lock-free algorithms, atomics/CAS, memory ordering,
development
Use when creating implementation-ready beads tasks that need testing strategy, optimal implementation approach, and documentation requirements baked in — composes /create-task with parallel enrichment agents that analyze the codebase and produce concrete test specifications, algorithm/data-structure guidance, and doc quality standards so implementing agents don't need to re-research
development
--- name: autoresearch description: Autonomous Goal-directed Iteration. Apply Karpathy's autoresearch principles to ANY task. Loops autonomously — modify, verify, keep/discard, repeat. Supports bounded iteration via Iterations: N inline config. version: 1.9.11 --- # Claude Autoresearch — Autonomous Goal-directed Iteration Inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch). Applies constraint-driven autonomous iteration to ANY work — not just ML research. **Core id
development
Use when implementing a new feature and assessing coverage gaps, during periodic test hygiene, when test suites feel bloated, or before merging code that changes coordination or hot paths. Two-phase assess-then-improve testing pipeline.