ahrav/invariant-test-review

Name: invariant-test-review
Author: ahrav

.claude/skills/invariant-test-review/SKILL.md

npx skillsauth add ahrav/gossip-rs invariant-test-review

Invariant Test Review

Review tests with one question: does this test actually prove the property it claims to prove?

This skill is for subtle tests where the risk is not "missing coverage" in the abstract, but "the test passes without isolating the intended invariant."

When to Use

When a PR adds or rewrites tests for coordination, state machines, or multi-step workflows
When a reviewer says "this test does not isolate the invariant"
When oracle or comparable-state assertions aggregate many fields
When unordered collections appear in test comparisons
When terminal-state, rejection, replay, or idempotency behavior matters
When a regression test feels plausible but may be proving the wrong thing

When NOT to Use

Trivial input/output tests with one obvious assertion and no hidden state
Broad test-planning work where the main question is test type selection rather than proof strength
Pure documentation review without test changes

Invocation

/invariant-test-review [<test-file-or-function>]

With no argument: review recently changed Rust test files in the working tree. Prefer dedicated test files first (*_tests.rs, tests/*.rs); if none changed, inspect changed Rust files that contain inline tests or simulation/oracle helpers.
With a file path: review all relevant tests in that file.
With a test function name: review that specific test plus any nearby helper, fixture, oracle, or comparable-state code it depends on.

Core Principle

A test only proves what its setup, observation surface, and assertions uniquely force. If the test can pass because of unrelated setup, missing negative cases, or a lossy comparator, it is weaker than it looks.

Severity Levels

Classify each finding with one of these severities:

| Severity | Meaning |
|----------|---------|
| BLOCK | The test does not isolate the claimed invariant, can pass for the wrong reason, or relies on an invalid oracle/comparator. |
| WARN | The test points at the right behavior but is weaker than it looks because of missing twins, proxy observations, or confounding setup. |
| INFO | Improves clarity, discoverability, or explanation without materially changing proof strength. |

Workflow

1. State the Claimed Invariant

Rewrite the test's purpose as one precise sentence.

Good:

stale lease checkpoints are rejected
terminal failed runs never become active again
oracle comparison ignores spawned order but preserves child identity

Weak:

covers eviction
tests failure handling
checks drift

If you cannot name the invariant in one sentence, the test is underspecified.

2. Identify the Minimal Trigger

Ask:

What smallest input or state transition should make this test flip from pass to fail?
Which setup steps are required for that transition?
Which setup steps are merely cargo cult from another test?

Delete or inline anything that does not participate in the invariant.

3. Audit the Observation Surface

The assertion must observe the property directly.

For rejection behavior: assert the specific error or state rejection, not just "operation returned false"
For terminal-state behavior: assert irreversibility explicitly
For replay/idempotency behavior: assert the cached or repeated result, not only overall success
For eviction/order/drift behavior: inspect the comparable state, not a loose side effect

If the assertion only checks a proxy, call that out.
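
As a minimal sketch of the difference between a proxy and a direct observation, consider the "stale lease checkpoints are rejected" invariant. The `Shard` and `CheckpointError` types below are hypothetical and exist only for illustration; they are not taken from any real codebase. Because the operation can fail for two reasons, an `is_err()` check would pass for either one, while asserting the exact variant pins the test to the stale-lease rule.

```rust
// Hypothetical types for illustration only.
#[derive(Debug, PartialEq, Eq)]
enum CheckpointError {
    StaleLease { held: u64, presented: u64 },
    CursorRegression { current: u64, requested: u64 },
}

#[derive(Debug)]
struct Shard {
    lease_epoch: u64,
    cursor: u64,
}

impl Shard {
    /// Accepts a checkpoint only for the current lease epoch and a
    /// cursor that does not move backwards.
    fn checkpoint(&mut self, presented: u64, cursor: u64) -> Result<(), CheckpointError> {
        if presented != self.lease_epoch {
            return Err(CheckpointError::StaleLease {
                held: self.lease_epoch,
                presented,
            });
        }
        if cursor < self.cursor {
            return Err(CheckpointError::CursorRegression {
                current: self.cursor,
                requested: cursor,
            });
        }
        self.cursor = cursor;
        Ok(())
    }
}

#[test]
fn stale_lease_checkpoint_is_rejected() {
    let mut shard = Shard { lease_epoch: 2, cursor: 10 };

    let result = shard.checkpoint(1, 99);

    // Proxy (weak): `assert!(result.is_err())` would also pass for a
    // cursor regression, so it does not observe the lease rule.
    // Direct: name the exact rejection and the state it must preserve.
    assert_eq!(
        result,
        Err(CheckpointError::StaleLease { held: 2, presented: 1 })
    );
    assert_eq!(shard.cursor, 10, "a rejected checkpoint must not move the cursor");
}
```

The second assertion matters as much as the first: a rejection that still mutates the cursor would satisfy the error check alone.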

4. Add the Discriminating Twin

When the real question is "is this assertion strong enough?", add the smallest paired case that distinguishes the competing claims:

Happy path + negative path
Allowed boundary + rejected boundary
Ordered input + permuted input
Fresh lease + stale lease
Pre-terminal transition + post-terminal transition

For terminal, rejection, replay, and idempotency rules, the negative or boundary twin is usually mandatory.
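
One way to picture a discriminating twin is the pre-terminal + post-terminal pair for "terminal failed runs never become active again". The sketch below assumes a hypothetical three-state run machine (`RunState`, `transition`, and `TransitionRejected` are illustrative names, not from the source). The two tests request the same target state and differ only in the starting state, so a pass on both can only be explained by the terminal rule.

```rust
// Hypothetical state machine for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RunState {
    Pending,
    Active,
    Failed, // terminal
}

#[derive(Debug, PartialEq, Eq)]
struct TransitionRejected {
    from: RunState,
    to: RunState,
}

/// Allows Pending -> Active and Active -> Failed; rejects everything else,
/// including any transition out of the terminal Failed state.
fn transition(from: RunState, to: RunState) -> Result<RunState, TransitionRejected> {
    match (from, to) {
        (RunState::Pending, RunState::Active) | (RunState::Active, RunState::Failed) => Ok(to),
        _ => Err(TransitionRejected { from, to }),
    }
}

// Twin 1: the pre-terminal transition to Active is allowed.
#[test]
fn pending_run_can_become_active() {
    assert_eq!(
        transition(RunState::Pending, RunState::Active),
        Ok(RunState::Active)
    );
}

// Twin 2: the same target state is rejected once the run is terminal.
#[test]
fn failed_run_never_becomes_active_again() {
    assert_eq!(
        transition(RunState::Failed, RunState::Active),
        Err(TransitionRejected {
            from: RunState::Failed,
            to: RunState::Active,
        })
    );
}
```

Without the first test, an implementation that rejects every transition to `Active` would still pass the second.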

5. Audit Oracle and Comparator Semantics

Many misleading tests come from a comparator that is wrong, not the system under test.

Check for:

Order-sensitive Vec equality over logically unordered state
Snapshots that omit the field the invariant depends on
Comparable wrappers that normalize too much or too little
Equality checks that conflate identity with presentation order

If order is irrelevant, compare setwise or sort explicitly before asserting.
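
The comparator choices above can be sketched as follows, assuming a hypothetical `ChildId` newtype (not from the source). Sorting copies before comparing ignores spawn order while keeping every child, including duplicates. Collapsing into a set also ignores order, but it is an example of normalizing too much when a child spawned twice should count as a difference.

```rust
use std::collections::BTreeSet;

// Hypothetical identifier type for illustration only.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct ChildId(u32);

/// Order-insensitive comparison that preserves multiplicity:
/// sort copies of both sides, then compare element by element.
fn same_children(actual: &[ChildId], expected: &[ChildId]) -> bool {
    let mut actual = actual.to_vec();
    let mut expected = expected.to_vec();
    actual.sort();
    expected.sort();
    actual == expected
}

/// Order-insensitive comparison that also drops duplicates.
/// Only appropriate when the state is logically a set.
fn same_children_as_set(actual: &[ChildId], expected: &[ChildId]) -> bool {
    let actual: BTreeSet<&ChildId> = actual.iter().collect();
    let expected: BTreeSet<&ChildId> = expected.iter().collect();
    actual == expected
}

#[test]
fn comparison_ignores_spawn_order_but_preserves_child_identity() {
    // Permuted input: same children, different order.
    assert!(same_children(
        &[ChildId(2), ChildId(1)],
        &[ChildId(1), ChildId(2)]
    ));
    // Different identity must still be detected.
    assert!(!same_children(
        &[ChildId(1), ChildId(3)],
        &[ChildId(1), ChildId(2)]
    ));
}
```

Which of the two helpers is right depends on whether a duplicate child is a bug in the system under test; the comparator should be chosen to match that decision rather than by convenience.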

6. Confirm the Failure Mode

Ask the final question:

If the code were wrong in exactly the way we care about, would this test fail for that reason?

If the answer is "not sure" or "only indirectly," the test needs revision.
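
One way to answer that question concretely is to point the same check at a deliberately broken implementation and confirm that it fails. The sketch below is an assumption-laden illustration for a replay/idempotency invariant: the `Ledger` trait, both implementations, and both check functions are hypothetical names introduced here. The weak check observes only generic success and cannot tell the two implementations apart; the strong check observes the replayed result and can.

```rust
use std::collections::HashMap;

// Hypothetical interface for illustration only: applying a request
// returns the balance after the request has been handled.
trait Ledger {
    fn apply(&mut self, request_id: u32, amount: i64) -> i64;
}

/// Correct behavior: a replayed request id returns the cached result.
#[derive(Default)]
struct Idempotent {
    balance: i64,
    seen: HashMap<u32, i64>,
}

impl Ledger for Idempotent {
    fn apply(&mut self, request_id: u32, amount: i64) -> i64 {
        if let Some(&cached) = self.seen.get(&request_id) {
            return cached;
        }
        self.balance += amount;
        self.seen.insert(request_id, self.balance);
        self.balance
    }
}

/// Deliberately broken: ignores the request id and re-applies on replay.
#[derive(Default)]
struct ReappliesOnReplay {
    balance: i64,
}

impl Ledger for ReappliesOnReplay {
    fn apply(&mut self, _request_id: u32, amount: i64) -> i64 {
        self.balance += amount;
        self.balance
    }
}

/// Weak check: both calls "succeed", which says nothing about replay.
fn weak_check<L: Ledger>(mut ledger: L) -> bool {
    ledger.apply(7, 50) > 0 && ledger.apply(7, 50) > 0
}

/// Strong check: the replay must return exactly the first result.
fn strong_check<L: Ledger>(mut ledger: L) -> bool {
    let first = ledger.apply(7, 50);
    let replay = ledger.apply(7, 50);
    first == replay
}

#[test]
fn replay_check_fails_against_the_broken_ledger() {
    assert!(strong_check(Idempotent::default()));
    assert!(!strong_check(ReappliesOnReplay::default()));
    // The weak check passes for both, so it proves nothing about replay.
    assert!(weak_check(ReappliesOnReplay::default()));
}
```

If a check still passes against the broken variant, as `weak_check` does here, it is not observing the invariant.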

Red Flags

Typical classifications:

BLOCK: The test name claims one invariant, but the assertion only checks generic success.
BLOCK: Comparable-state assertions use order-sensitive equality for unordered data.
BLOCK: The test could fail because of an unrelated precondition before it reaches the behavior under review.
WARN: Setup acquires leases, cursors, or resources that are never used by the assertion.
WARN: A "regression test" duplicates a larger scenario instead of isolating the bug.
WARN: Terminal or rejection semantics are tested only on the happy path.
INFO: Assertion messages or rewrite guidance could name the invariant more directly.

Review Output

Return a short report in this format:

## Invariant Test Review: [test name or file]

- **Claimed invariant**: [one sentence]
- **Minimal trigger**: [smallest state/input that matters]
- **Observation surface**: [what the assertion actually observes]

### Findings

- [BLOCK|WARN|INFO] [Issue 1]: [why the current test is weaker than it looks]
- [BLOCK|WARN|INFO] [Issue 2]: [missing twin, confounder, or comparator problem]
- [BLOCK|WARN|INFO] [Issue N]: [additional issues — add as many as needed]

### Recommended Rewrite

- Remove (if applicable): [vestigial setup]
- Add (if applicable): [negative-path or boundary twin]
- Normalize (if applicable): [unordered state before comparison]
- Assert: [the direct property instead of a proxy]

Related Skills

/test-strategy — choose the right test form once the invariant is clear
/sim-review — review DST compatibility and simulation-specific constraints
/pr-comment-response — verify reviewer bug claims with the smallest proof

Use when writing or reviewing state-machine tests, simulation tests, oracle tests, or regression tests to verify they actually prove the claimed invariant. Catches hidden weaknesses like missing negative paths and order-sensitive comparisons.

1 star

testing

Updated Apr 10, 2026

npx skillsauth add ahrav/gossip-rs invariant-test-review

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report:

Trivy: Container and dependency vulnerability scanner. Clean (95%)
Semgrep: Static code analysis for vulnerabilities. Clean (95%)
mcp-scan (Snyk): Model Context Protocol security validation. Clean (95%)
Snyk (dep): Open source security scanning. Skipped (50%)
Socket.dev: Supply chain security analysis. Skipped (50%)
VirusTotal: Multi-engine malware detection. Skipped (50%)
CrowdStrike: Advanced threat intelligence. Skipped (50%)
OSV-Scanner: Open Source Vulnerability database check. Skipped (50%)
OWASP Dep-Check: Skipped (50%)

Last scanned: Apr 10, 2026, 3:11 AM · 4.7s · 1 file scanned

SKILL.md

name: invariant-test-review
description: Use when writing or reviewing state-machine tests, simulation tests, oracle tests, or regression tests to verify they actually prove the claimed invariant. Catches hidden weaknesses like missing negative paths and order-sensitive comparisons.
user-invocable: true

Install manually

# Clone the repo
git clone https://github.com/ahrav/gossip-rs.git

# Copy into Claude Code skills folder (global)
cp -r gossip-rs/.claude/skills/invariant-test-review ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

ahrav/gossip-rs

1 star

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT