Test Consolidator

Analyze test modules for opportunities to replace many similar unit tests with a single, more powerful testing construct — property-based tests, parameterized tests, or fuzz tests. The goal is comprehensive coverage with minimal code to maintain, review, and keep updated.

When to Use

A test module has 5+ unit tests for the same function with different inputs
Tests follow a pattern: call function with input X, assert output Y
You notice copy-paste test bodies differing only in values
Before a PR when test files are growing faster than source files
After refactoring: old tests may cluster around the same behavior
During periodic test hygiene alongside /test-dedup

Philosophy

Tests should scale with behaviors, not with inputs.

If a function has 20 valid inputs worth testing, the answer is NOT 20 test functions. The answer is one construct that covers all 20 — and ideally the infinite space between them.

The hierarchy of consolidation (prefer higher):

Property-based test (proptest) — When you can state an invariant that holds for ALL valid inputs. One proptest! replaces unbounded unit tests. Maximum coverage, minimum code.
Parameterized test (rstest) — The default choice for any finite set of (input, expected) pairs. One #[rstest] with #[case] attributes replaces N identical test bodies. Each case runs as a named sub-test, giving clear CI output on failure. Always prefer rstest over table-driven loops — rstest gives you per-case names, per-case failures, and the ability to #[ignore] individual cases.
Fuzz test — When exploring adversarial or untrusted input spaces. One fuzz target can subsume hundreds of hand-crafted "weird input" tests.
Table-driven test — A distant fallback. Only use when rstest is truly inappropriate (e.g., dynamically generated case lists, cases loaded from a file, or >50 cases where #[case] attributes become unwieldy). If you're reaching for let cases = [...]; for case in cases { ... }, stop and use rstest instead.
Individual unit tests — The last resort. Only when each test truly exercises unique setup, distinct error paths, or documents a specific bug.

Never consolidate for the sake of shorter code. The goal is less code to maintain — meaning fewer places to update when the function signature changes, fewer tests to rename when behavior evolves, and fewer copy-paste errors.

Analysis Process

Step 1: Identify Test Clusters

A test cluster is a group of tests that:

Test the same function or method
Have the same structure (setup → call → assert)
Differ primarily in input values and expected outputs
May also differ in minor setup variations

Scan the module and group tests into clusters. A single test can belong to multiple clusters if it tests more than one function.

Cluster: parse_duration()
  - test_parse_seconds          → parse_duration("5s")  == Duration::from_secs(5)
  - test_parse_minutes          → parse_duration("3m")  == Duration::from_secs(180)
  - test_parse_hours            → parse_duration("2h")  == Duration::from_secs(7200)
  - test_parse_zero             → parse_duration("0s")  == Duration::ZERO
  - test_parse_large            → parse_duration("999h") == Duration::from_secs(3596400)
  - test_parse_invalid_unit     → parse_duration("5x")  == Err(...)
  - test_parse_empty            → parse_duration("")     == Err(...)
  - test_parse_negative         → parse_duration("-1s")  == Err(...)

Step 2: Classify Each Cluster

For each cluster, determine the best consolidation strategy:

Can you state a universal property? → Property-based test. Examples:

"parse then format roundtrips for all valid inputs"
"output length is always ≤ input length"
"sorted output is a permutation of input"
"encoding never produces invalid UTF-8"

Is it a finite set of (input, expected) with no general property? → Parameterized test (rstest). Always rstest, not table-driven. Examples:

Known error codes mapping to messages
Specific file extensions mapping to MIME types
Configuration keys mapping to defaults

Are the tests exploring adversarial/edge inputs? → Fuzz test. Examples:

"doesn't panic on any byte sequence"
"doesn't allocate more than 10x input size"

Does each test have genuinely unique setup or assertions? → Keep as individual tests. Don't force consolidation.

Step 3: Evaluate Consolidation Candidates

For each cluster, answer:

| Question | If Yes | If No | |----------|--------|-------| | Can I state one invariant covering all cases? | proptest | Next question | | Are all test bodies structurally identical? | rstest | Partial consolidation | | Do >3 tests share the same assertion pattern? | rstest | Probably keep individual | | Would adding a new case require a new function? | Consolidate with rstest (adding a #[case] should be trivial) | Fine as-is | | Do tests differ only in values, not in logic? | Strong rstest candidate | Keep separate |

Step 4: Choose the Right Tool

Property-Based (proptest) — Best for invariants

Use when: The assertion is about a property that holds regardless of input.

// BEFORE: 6 unit tests
#[test] fn encode_ascii()    { assert_eq!(encode("hello"), "aGVsbG8="); }
#[test] fn encode_empty()    { assert_eq!(encode(""), ""); }
#[test] fn encode_unicode()  { assert_eq!(encode("café"), "Y2Fmw6k="); }
#[test] fn decode_ascii()    { assert_eq!(decode("aGVsbG8="), "hello"); }
#[test] fn decode_empty()    { assert_eq!(decode(""), ""); }
#[test] fn decode_unicode()  { assert_eq!(decode("Y2Fmw6k="), "café"); }

// AFTER: 1 property test (covers infinite inputs)
proptest! {
    #[test]
    fn roundtrip(input in "\\PC*") {
        let encoded = encode(&input);
        let decoded = decode(&encoded).unwrap();
        prop_assert_eq!(input, decoded);
    }
}
// KEEP: encode_empty as anchor (documents empty-input behavior)

Parameterized (rstest) — Best for case tables

Use when: You have specific (input, expected) pairs with no general property.

// BEFORE: 7 unit tests
#[test] fn status_200() { assert_eq!(status_text(200), "OK"); }
#[test] fn status_201() { assert_eq!(status_text(201), "Created"); }
#[test] fn status_400() { assert_eq!(status_text(400), "Bad Request"); }
#[test] fn status_404() { assert_eq!(status_text(404), "Not Found"); }
#[test] fn status_500() { assert_eq!(status_text(500), "Internal Server Error"); }
#[test] fn status_unknown() { assert_eq!(status_text(999), "Unknown"); }
#[test] fn status_zero() { assert_eq!(status_text(0), "Unknown"); }

// AFTER: 1 parameterized test
#[rstest]
#[case(200, "OK")]
#[case(201, "Created")]
#[case(400, "Bad Request")]
#[case(404, "Not Found")]
#[case(500, "Internal Server Error")]
#[case(999, "Unknown")]
#[case(0, "Unknown")]
fn status_text_mapping(#[case] code: u16, #[case] expected: &str) {
    assert_eq!(status_text(code), expected);
}

Parameterized (rstest) — Complex types and multi-field cases

rstest handles complex types well. Use #[case] even when cases involve structs, enums, booleans, or multi-field assertions.

// BEFORE: 5 individual tests with complex setup
#[test] fn gate_short_window() {
    let spec = CharClassSpec { max_lower_pct: 95, min_window_len: 32 };
    assert!(gate_passes(&[b'a'; 31], spec));
}
#[test] fn gate_all_lowercase() {
    let spec = CharClassSpec { max_lower_pct: 95, min_window_len: 32 };
    assert!(!gate_passes(&[b'a'; 40], spec));
}
#[test] fn gate_mixed_case() {
    let spec = CharClassSpec { max_lower_pct: 95, min_window_len: 32 };
    let mut window = vec![b'a'; 38];
    window.extend_from_slice(b"AB");
    assert!(gate_passes(&window, spec));
}
// ... 2 more ...

// AFTER: 1 rstest with descriptive case names
#[rstest]
#[case::short_window_passes(&[b'a'; 31], 95, 32, true)]
#[case::all_lowercase_rejected(&[b'a'; 40], 95, 32, false)]
#[case::mixed_case_passes(&[b'a'; 35, b'A', b'B', b'C', b'D', b'E'], 95, 32, true)]
#[case::at_threshold_rejected(&[b'a'; 39, b'A'], 95, 32, false)]
#[case::high_threshold_passes(&[b'a'; 40], 100, 32, true)]
fn char_class_gate(
    #[case] window: &[u8],
    #[case] max_lower_pct: u8,
    #[case] min_window_len: usize,
    #[case] expected: bool,
) {
    let spec = CharClassSpec { max_lower_pct, min_window_len };
    assert_eq!(gate_passes(window, spec), expected);
}

Table-Driven — Avoid; use rstest instead

Table-driven tests (let cases = [...]; for case in cases) are an anti-pattern in this codebase. They have real downsides compared to rstest:

A failing case stops the loop — later cases don't run (no isolation)
No per-case test names in cargo test output
Cannot #[ignore] or filter individual cases
Failure messages require manual "case={label}" formatting

The only situations where table-driven is acceptable:

Dynamically generated case lists (cases built at runtime)
Cases loaded from an external file
50 cases where #[case] attribute lines become genuinely unwieldy

// ANTI-PATTERN — do NOT write this:
#[test]
fn status_text_cases() {
    let cases = [
        (200, "OK"),
        (201, "Created"),
        (400, "Bad Request"),
    ];
    for (code, expected) in cases {
        assert_eq!(status_text(code), expected, "code={code}");
    }
}

// CORRECT — use rstest instead:
#[rstest]
#[case(200, "OK")]
#[case(201, "Created")]
#[case(400, "Bad Request")]
fn status_text_mapping(#[case] code: u16, #[case] expected: &str) {
    assert_eq!(status_text(code), expected);
}

Fuzz Test — Best for adversarial input exploration

Use when: Tests are exploring "weird inputs" to find crashes.

// BEFORE: 12 unit tests trying to crash the parser
#[test] fn parse_null_bytes()   { let _ = parse(b"\x00\x00"); }
#[test] fn parse_huge_input()   { let _ = parse(&vec![0xFF; 10_000]); }
#[test] fn parse_truncated()    { let _ = parse(b"\x01\x02"); }
// ... 9 more ...

// AFTER: 1 fuzz target (explores billions of inputs)
fuzz_target!(|data: &[u8]| {
    let _ = parse(data);
});
// KEEP: 1-2 specific regression tests if they document known bugs

Step 5: Check for Blockers

Before recommending consolidation, verify:

[ ] Feature gates: proptest is a direct dev-dependency (no feature gate for tests). Some crates gate Arbitrary impls behind test-support. If moving tests behind a feature gate, keep at least one ungated unit test as baseline.
[ ] Regression tests: Tests with bug references (// Regression: GH-123) should stay as individual tests even if technically consolidatable.
[ ] Error path tests: Tests verifying specific error messages or error variants may need individual tests if the exact error matters.
[ ] Readability anchors: Keep one simple example test per public function as documentation, even if a property test covers it.
[ ] rstest availability: rstest is a project standard (workspace dep rstest = "0.25"). If not yet in a crate's [dev-dependencies], add rstest.workspace = true. This is expected and should not be treated as a blocker.

Step 6: Produce Consolidation Plan

For each cluster, specify:

Current state: Number of tests, lines of test code
Recommended strategy: Which tool and why
What gets removed: Which specific tests are replaced
What gets kept: Which tests survive and why
New test code: The actual replacement test(s)
Net effect: Lines removed vs added, maintenance burden change

Project-Specific Conventions

Test locations in this codebase

Inline tests: #[cfg(test)] mod tests { ... } at bottom of source file
Sibling test files: crates/*/src/*_tests.rs (e.g., ring_buffer_tests.rs, error_tests.rs)
Property tests: in #[cfg(test)] modules using proptest (direct dev-dep, no feature gate)
Kani proofs: #[cfg(kani)] blocks in gossip-stdx
Simulation tests: crates/gossip-coordination/src/sim/ (CoordinationSim harness + proptest state machine)
Fuzz targets: crates/gossip-contracts/fuzz/ and crates/gossip-stdx/fuzz/
Integration tests: crates/*/tests/ (e.g., identity_smoke.rs)
- Scanner engine tests: crates/scanner-engine/src/engine/tests.rs and *_tests.rs companions
- Scanner integration tests: crates/scanner-engine-integration-tests/tests/chunked_file_scans.rs

Feature gates

test-support: Enables Arbitrary impls in gossip-contracts and sim infrastructure in gossip-coordination
kani: Enables Kani model checking proofs in gossip-stdx
tiger-harness: Enables tiger harness in scanner-engine
scheduler-sim: Enables scheduler simulation in scanner-scheduler
sim-harness: Enables simulation harness in scanner-engine and scanner-scheduler
bench: Enables benchmark scaffolding in scanner-engine and scanner-scheduler

Dependencies

rstest: Project standard (workspace dep rstest = "0.25"). Add rstest.workspace = true to crate-level [dev-dependencies] if not already present. No feature gate needed.
proptest: Direct dev-dependency in most crates. Arbitrary impls for shared types gated behind test-support in gossip-contracts.
cargo-fuzz: External tool, no Cargo.toml change needed

Output Format

## Test Consolidation Report: [module/file]

### Cluster Analysis

#### Cluster 1: `function_name()` — N tests, M lines

**Tests in cluster:**
| # | Test | Input | Assertion |
|---|------|-------|-----------|
| 1 | `test_foo_basic` | "hello" | returns "HELLO" |
| 2 | `test_foo_empty` | "" | returns "" |
| 3 | `test_foo_unicode` | "café" | returns "CAFÉ" |
| ... | ... | ... | ... |

**Pattern detected:** All tests call `foo(input)` and assert exact output.
Inputs vary, assertion structure is identical.

**Recommended strategy:** Property-based test
**Rationale:** The invariant `foo(x).to_lowercase() == x.to_lowercase()` holds
for all inputs. A single proptest replaces all 8 unit tests.

**Consolidation:**
- REPLACE tests #1-#6 with proptest `prop_foo_case_invariant`
- KEEP `test_foo_empty` as anchor (documents empty-input behavior)
- KEEP `test_foo_regression_gh_42` (regression test, bug reference)

**Proposed code:**
```rust
proptest! {
    #[test]
    fn prop_foo_case_invariant(input in "\\PC{0,100}") {
        prop_assert_eq!(foo(&input).to_lowercase(), input.to_lowercase());
    }
}

Cluster 2: `bar()` — N tests, M lines

...

Summary

| Metric | Before | After | Change | |--------|--------|-------|--------| | Total tests | 34 | 12 | -22 (65%) | | Test lines | 280 | 95 | -185 (66%) | | Behaviors covered | 8 | 8 | No change | | Input space covered | ~34 points | Continuous | Vastly improved |

Dependency Changes

[ ] Add rstest.workspace = true to [dev-dependencies] (if rstest recommended)
[ ] No new dependencies needed (if only proptest/fuzz)

Migration Order

Add proptest for Cluster 1 (highest value: 8 tests → 1)
Add rstest for Cluster 3 (6 tests → 1 parameterized)
Add rstest for Cluster 5 (4 tests → 1 parameterized)
Run full test suite to verify no coverage loss
Delete subsumed unit tests


## Decision Heuristics

### When to prefer proptest over rstest
- You can state a property about the output without knowing the exact value
- The input space is continuous or very large
- Roundtrip properties exist (encode/decode, serialize/deserialize)
- Order/sorting/containment invariants exist

### When to prefer rstest over proptest
- Each case has a specific expected output that must be exact
- The set of important cases is finite and known
- The mapping is arbitrary (no mathematical relationship)
- You want each case to appear as a named sub-test in output

### When to prefer rstest over table-driven (almost always)
- **Default to rstest for ALL finite case sets.** rstest gives you:
  - Per-case test names in `cargo test` output (e.g., `status_text_mapping::case_3`)
  - Independent execution — one failing case doesn't hide others
  - `#[ignore]` on individual cases
  - Native `cargo test --test-threads` parallelism per case
- Table-driven is only acceptable for dynamically generated cases or >50 entries

### When to prefer fuzz over all else
- The function processes untrusted/external input
- The goal is "never panic" rather than "correct output"
- You've been writing tests that try to "trick" the parser

### When NOT to consolidate
- Each test has genuinely different setup logic (not just different values)
- Tests verify different error paths with different error types
- Tests are regression tests for specific bugs (keep the history)
- The "consolidated" version would be harder to understand than the originals
- There are only 2-3 tests — consolidation overhead isn't worth it

## Judgment Calls

- **Threshold**: Don't consolidate clusters of fewer than 4 tests unless the
  consolidation is obviously cleaner (e.g., perfect roundtrip property).
- **Mixed clusters**: If 6 of 8 tests consolidate but 2 are genuinely unique,
  consolidate the 6 and keep the 2. Don't force everything into one construct.
- **Error tests**: Tests for error cases often belong in a separate rstest or
  table, not mixed into the happy-path property test. Group by success/failure.
- **Naming**: Consolidated tests should have clear names describing the
  invariant or case set, not generic names like `test_all_cases`.

## Related Skills

- `/test-dedup` — Remove tests subsumed by higher-level tests (complementary)
- `/test-strategy` — Decide what kind of test to write for new code

Test Consolidator

When to Use

A test module has 5+ unit tests for the same function with different inputs
Tests follow a pattern: call function with input X, assert output Y
You notice copy-paste test bodies differing only in values
Before a PR when test files are growing faster than source files
After refactoring: old tests may cluster around the same behavior
During periodic test hygiene alongside /test-dedup

Philosophy

Tests should scale with behaviors, not with inputs.

If a function has 20 valid inputs worth testing, the answer is NOT 20 test functions. The answer is one construct that covers all 20 — and ideally the infinite space between them.

The hierarchy of consolidation (prefer higher):

Property-based test (proptest) — When you can state an invariant that holds for ALL valid inputs. One proptest! replaces unbounded unit tests. Maximum coverage, minimum code.
Parameterized test (rstest) — The default choice for any finite set of (input, expected) pairs. One #[rstest] with #[case] attributes replaces N identical test bodies. Each case runs as a named sub-test, giving clear CI output on failure. Always prefer rstest over table-driven loops — rstest gives you per-case names, per-case failures, and the ability to #[ignore] individual cases.
Fuzz test — When exploring adversarial or untrusted input spaces. One fuzz target can subsume hundreds of hand-crafted "weird input" tests.
Table-driven test — A distant fallback. Only use when rstest is truly inappropriate (e.g., dynamically generated case lists, cases loaded from a file, or >50 cases where #[case] attributes become unwieldy). If you're reaching for let cases = [...]; for case in cases { ... }, stop and use rstest instead.
Individual unit tests — The last resort. Only when each test truly exercises unique setup, distinct error paths, or documents a specific bug.

Analysis Process

Step 1: Identify Test Clusters

A test cluster is a group of tests that:

Test the same function or method
Have the same structure (setup → call → assert)
Differ primarily in input values and expected outputs
May also differ in minor setup variations

Scan the module and group tests into clusters. A single test can belong to multiple clusters if it tests more than one function.

Cluster: parse_duration()
  - test_parse_seconds          → parse_duration("5s")  == Duration::from_secs(5)
  - test_parse_minutes          → parse_duration("3m")  == Duration::from_secs(180)
  - test_parse_hours            → parse_duration("2h")  == Duration::from_secs(7200)
  - test_parse_zero             → parse_duration("0s")  == Duration::ZERO
  - test_parse_large            → parse_duration("999h") == Duration::from_secs(3596400)
  - test_parse_invalid_unit     → parse_duration("5x")  == Err(...)
  - test_parse_empty            → parse_duration("")     == Err(...)
  - test_parse_negative         → parse_duration("-1s")  == Err(...)

Step 2: Classify Each Cluster

For each cluster, determine the best consolidation strategy:

Can you state a universal property? → Property-based test. Examples:

"parse then format roundtrips for all valid inputs"
"output length is always ≤ input length"
"sorted output is a permutation of input"
"encoding never produces invalid UTF-8"

Is it a finite set of (input, expected) with no general property? → Parameterized test (rstest). Always rstest, not table-driven. Examples:

Known error codes mapping to messages
Specific file extensions mapping to MIME types
Configuration keys mapping to defaults

Are the tests exploring adversarial/edge inputs? → Fuzz test. Examples:

"doesn't panic on any byte sequence"
"doesn't allocate more than 10x input size"

Does each test have genuinely unique setup or assertions? → Keep as individual tests. Don't force consolidation.

Step 3: Evaluate Consolidation Candidates

For each cluster, answer:

Step 4: Choose the Right Tool

Property-Based (proptest) — Best for invariants

Use when: The assertion is about a property that holds regardless of input.

// BEFORE: 6 unit tests
#[test] fn encode_ascii()    { assert_eq!(encode("hello"), "aGVsbG8="); }
#[test] fn encode_empty()    { assert_eq!(encode(""), ""); }
#[test] fn encode_unicode()  { assert_eq!(encode("café"), "Y2Fmw6k="); }
#[test] fn decode_ascii()    { assert_eq!(decode("aGVsbG8="), "hello"); }
#[test] fn decode_empty()    { assert_eq!(decode(""), ""); }
#[test] fn decode_unicode()  { assert_eq!(decode("Y2Fmw6k="), "café"); }

// AFTER: 1 property test (covers infinite inputs)
proptest! {
    #[test]
    fn roundtrip(input in "\\PC*") {
        let encoded = encode(&input);
        let decoded = decode(&encoded).unwrap();
        prop_assert_eq!(input, decoded);
    }
}
// KEEP: encode_empty as anchor (documents empty-input behavior)

Parameterized (rstest) — Best for case tables

Use when: You have specific (input, expected) pairs with no general property.

// BEFORE: 7 unit tests
#[test] fn status_200() { assert_eq!(status_text(200), "OK"); }
#[test] fn status_201() { assert_eq!(status_text(201), "Created"); }
#[test] fn status_400() { assert_eq!(status_text(400), "Bad Request"); }
#[test] fn status_404() { assert_eq!(status_text(404), "Not Found"); }
#[test] fn status_500() { assert_eq!(status_text(500), "Internal Server Error"); }
#[test] fn status_unknown() { assert_eq!(status_text(999), "Unknown"); }
#[test] fn status_zero() { assert_eq!(status_text(0), "Unknown"); }

// AFTER: 1 parameterized test
#[rstest]
#[case(200, "OK")]
#[case(201, "Created")]
#[case(400, "Bad Request")]
#[case(404, "Not Found")]
#[case(500, "Internal Server Error")]
#[case(999, "Unknown")]
#[case(0, "Unknown")]
fn status_text_mapping(#[case] code: u16, #[case] expected: &str) {
    assert_eq!(status_text(code), expected);
}

Parameterized (rstest) — Complex types and multi-field cases

rstest handles complex types well. Use #[case] even when cases involve structs, enums, booleans, or multi-field assertions.

// BEFORE: 5 individual tests with complex setup
#[test] fn gate_short_window() {
    let spec = CharClassSpec { max_lower_pct: 95, min_window_len: 32 };
    assert!(gate_passes(&[b'a'; 31], spec));
}
#[test] fn gate_all_lowercase() {
    let spec = CharClassSpec { max_lower_pct: 95, min_window_len: 32 };
    assert!(!gate_passes(&[b'a'; 40], spec));
}
#[test] fn gate_mixed_case() {
    let spec = CharClassSpec { max_lower_pct: 95, min_window_len: 32 };
    let mut window = vec![b'a'; 38];
    window.extend_from_slice(b"AB");
    assert!(gate_passes(&window, spec));
}
// ... 2 more ...

// AFTER: 1 rstest with descriptive case names
#[rstest]
#[case::short_window_passes(&[b'a'; 31], 95, 32, true)]
#[case::all_lowercase_rejected(&[b'a'; 40], 95, 32, false)]
#[case::mixed_case_passes(&[b'a'; 35, b'A', b'B', b'C', b'D', b'E'], 95, 32, true)]
#[case::at_threshold_rejected(&[b'a'; 39, b'A'], 95, 32, false)]
#[case::high_threshold_passes(&[b'a'; 40], 100, 32, true)]
fn char_class_gate(
    #[case] window: &[u8],
    #[case] max_lower_pct: u8,
    #[case] min_window_len: usize,
    #[case] expected: bool,
) {
    let spec = CharClassSpec { max_lower_pct, min_window_len };
    assert_eq!(gate_passes(window, spec), expected);
}

Table-Driven — Avoid; use rstest instead

Table-driven tests (let cases = [...]; for case in cases) are an anti-pattern in this codebase. They have real downsides compared to rstest:

A failing case stops the loop — later cases don't run (no isolation)
No per-case test names in cargo test output
Cannot #[ignore] or filter individual cases
Failure messages require manual "case={label}" formatting

The only situations where table-driven is acceptable:

Dynamically generated case lists (cases built at runtime)
Cases loaded from an external file
50 cases where #[case] attribute lines become genuinely unwieldy

// ANTI-PATTERN — do NOT write this:
#[test]
fn status_text_cases() {
    let cases = [
        (200, "OK"),
        (201, "Created"),
        (400, "Bad Request"),
    ];
    for (code, expected) in cases {
        assert_eq!(status_text(code), expected, "code={code}");
    }
}

// CORRECT — use rstest instead:
#[rstest]
#[case(200, "OK")]
#[case(201, "Created")]
#[case(400, "Bad Request")]
fn status_text_mapping(#[case] code: u16, #[case] expected: &str) {
    assert_eq!(status_text(code), expected);
}

Fuzz Test — Best for adversarial input exploration

Use when: Tests are exploring "weird inputs" to find crashes.

// BEFORE: 12 unit tests trying to crash the parser
#[test] fn parse_null_bytes()   { let _ = parse(b"\x00\x00"); }
#[test] fn parse_huge_input()   { let _ = parse(&vec![0xFF; 10_000]); }
#[test] fn parse_truncated()    { let _ = parse(b"\x01\x02"); }
// ... 9 more ...

// AFTER: 1 fuzz target (explores billions of inputs)
fuzz_target!(|data: &[u8]| {
    let _ = parse(data);
});
// KEEP: 1-2 specific regression tests if they document known bugs

Step 5: Check for Blockers

Before recommending consolidation, verify:

[ ] Feature gates: proptest is a direct dev-dependency (no feature gate for tests). Some crates gate Arbitrary impls behind test-support. If moving tests behind a feature gate, keep at least one ungated unit test as baseline.
[ ] Regression tests: Tests with bug references (// Regression: GH-123) should stay as individual tests even if technically consolidatable.
[ ] Error path tests: Tests verifying specific error messages or error variants may need individual tests if the exact error matters.
[ ] Readability anchors: Keep one simple example test per public function as documentation, even if a property test covers it.
[ ] rstest availability: rstest is a project standard (workspace dep rstest = "0.25"). If not yet in a crate's [dev-dependencies], add rstest.workspace = true. This is expected and should not be treated as a blocker.

Step 6: Produce Consolidation Plan

For each cluster, specify:

Current state: Number of tests, lines of test code
Recommended strategy: Which tool and why
What gets removed: Which specific tests are replaced
What gets kept: Which tests survive and why
New test code: The actual replacement test(s)
Net effect: Lines removed vs added, maintenance burden change

Project-Specific Conventions

Test locations in this codebase

Inline tests: #[cfg(test)] mod tests { ... } at bottom of source file
Sibling test files: crates/*/src/*_tests.rs (e.g., ring_buffer_tests.rs, error_tests.rs)
Property tests: in #[cfg(test)] modules using proptest (direct dev-dep, no feature gate)
Kani proofs: #[cfg(kani)] blocks in gossip-stdx
Simulation tests: crates/gossip-coordination/src/sim/ (CoordinationSim harness + proptest state machine)
Fuzz targets: crates/gossip-contracts/fuzz/ and crates/gossip-stdx/fuzz/
Integration tests: crates/*/tests/ (e.g., identity_smoke.rs)
- Scanner engine tests: crates/scanner-engine/src/engine/tests.rs and *_tests.rs companions
- Scanner integration tests: crates/scanner-engine-integration-tests/tests/chunked_file_scans.rs

Feature gates

test-support: Enables Arbitrary impls in gossip-contracts and sim infrastructure in gossip-coordination
kani: Enables Kani model checking proofs in gossip-stdx
tiger-harness: Enables tiger harness in scanner-engine
scheduler-sim: Enables scheduler simulation in scanner-scheduler
sim-harness: Enables simulation harness in scanner-engine and scanner-scheduler
bench: Enables benchmark scaffolding in scanner-engine and scanner-scheduler

Dependencies

rstest: Project standard (workspace dep rstest = "0.25"). Add rstest.workspace = true to crate-level [dev-dependencies] if not already present. No feature gate needed.
proptest: Direct dev-dependency in most crates. Arbitrary impls for shared types gated behind test-support in gossip-contracts.
cargo-fuzz: External tool, no Cargo.toml change needed

Output Format

## Test Consolidation Report: [module/file]

### Cluster Analysis

#### Cluster 1: `function_name()` — N tests, M lines

**Tests in cluster:**
| # | Test | Input | Assertion |
|---|------|-------|-----------|
| 1 | `test_foo_basic` | "hello" | returns "HELLO" |
| 2 | `test_foo_empty` | "" | returns "" |
| 3 | `test_foo_unicode` | "café" | returns "CAFÉ" |
| ... | ... | ... | ... |

**Pattern detected:** All tests call `foo(input)` and assert exact output.
Inputs vary, assertion structure is identical.

**Recommended strategy:** Property-based test
**Rationale:** The invariant `foo(x).to_lowercase() == x.to_lowercase()` holds
for all inputs. A single proptest replaces all 8 unit tests.

**Consolidation:**
- REPLACE tests #1-#6 with proptest `prop_foo_case_invariant`
- KEEP `test_foo_empty` as anchor (documents empty-input behavior)
- KEEP `test_foo_regression_gh_42` (regression test, bug reference)

**Proposed code:**
```rust
proptest! {
    #[test]
    fn prop_foo_case_invariant(input in "\\PC{0,100}") {
        prop_assert_eq!(foo(&input).to_lowercase(), input.to_lowercase());
    }
}

Cluster 2: `bar()` — N tests, M lines

...

Summary

Dependency Changes

[ ] Add rstest.workspace = true to [dev-dependencies] (if rstest recommended)
[ ] No new dependencies needed (if only proptest/fuzz)

Migration Order

Add proptest for Cluster 1 (highest value: 8 tests → 1)
Add rstest for Cluster 3 (6 tests → 1 parameterized)
Add rstest for Cluster 5 (4 tests → 1 parameterized)
Run full test suite to verify no coverage loss
Delete subsumed unit tests


## Decision Heuristics

### When to prefer proptest over rstest
- You can state a property about the output without knowing the exact value
- The input space is continuous or very large
- Roundtrip properties exist (encode/decode, serialize/deserialize)
- Order/sorting/containment invariants exist

### When to prefer rstest over proptest
- Each case has a specific expected output that must be exact
- The set of important cases is finite and known
- The mapping is arbitrary (no mathematical relationship)
- You want each case to appear as a named sub-test in output

### When to prefer rstest over table-driven (almost always)
- **Default to rstest for ALL finite case sets.** rstest gives you:
  - Per-case test names in `cargo test` output (e.g., `status_text_mapping::case_3`)
  - Independent execution — one failing case doesn't hide others
  - `#[ignore]` on individual cases
  - Native `cargo test --test-threads` parallelism per case
- Table-driven is only acceptable for dynamically generated cases or >50 entries

### When to prefer fuzz over all else
- The function processes untrusted/external input
- The goal is "never panic" rather than "correct output"
- You've been writing tests that try to "trick" the parser

### When NOT to consolidate
- Each test has genuinely different setup logic (not just different values)
- Tests verify different error paths with different error types
- Tests are regression tests for specific bugs (keep the history)
- The "consolidated" version would be harder to understand than the originals
- There are only 2-3 tests — consolidation overhead isn't worth it

## Judgment Calls

- **Threshold**: Don't consolidate clusters of fewer than 4 tests unless the
  consolidation is obviously cleaner (e.g., perfect roundtrip property).
- **Mixed clusters**: If 6 of 8 tests consolidate but 2 are genuinely unique,
  consolidate the 6 and keep the 2. Don't force everything into one construct.
- **Error tests**: Tests for error cases often belong in a separate rstest or
  table, not mixed into the happy-path property test. Group by success/failure.
- **Naming**: Consolidated tests should have clear names describing the
  invariant or case set, not generic names like `test_all_cases`.

## Related Skills

- `/test-dedup` — Remove tests subsumed by higher-level tests (complementary)
- `/test-strategy` — Decide what kind of test to write for new code

Adoption

ahrav/test-consolidate

$ install --global

Security Scan Results

SKILL.md

Test Consolidator

When to Use

Philosophy

Analysis Process

Step 1: Identify Test Clusters

Step 2: Classify Each Cluster

Step 3: Evaluate Consolidation Candidates

Step 4: Choose the Right Tool

Property-Based (proptest) — Best for invariants

Parameterized (rstest) — Best for case tables

Parameterized (rstest) — Complex types and multi-field cases

Table-Driven — Avoid; use rstest instead

Fuzz Test — Best for adversarial input exploration

Step 5: Check for Blockers

Step 6: Produce Consolidation Plan

Project-Specific Conventions

Test locations in this codebase

Feature gates

Dependencies

Output Format

Cluster 2: bar() — N tests, M lines

Summary

Dependency Changes

Migration Order

Related Skills

ahrav/first-principles

ahrav/task-forge

ahrav/.claude/skills/autoresearch

ahrav/test-pipeline

ahrav/test-consolidate

$ install --global

Security Scan Results

SKILL.md

Test Consolidator

When to Use

Philosophy

Analysis Process

Step 1: Identify Test Clusters

Step 2: Classify Each Cluster

Step 3: Evaluate Consolidation Candidates

Step 4: Choose the Right Tool

Property-Based (proptest) — Best for invariants

Parameterized (rstest) — Best for case tables

Parameterized (rstest) — Complex types and multi-field cases

Table-Driven — Avoid; use rstest instead

Fuzz Test — Best for adversarial input exploration

Step 5: Check for Blockers

Step 6: Produce Consolidation Plan

Project-Specific Conventions

Test locations in this codebase

Feature gates

Dependencies

Output Format

Cluster 2: bar() — N tests, M lines

Summary

Dependency Changes

Migration Order

Related Skills

ahrav/first-principles

ahrav/task-forge

ahrav/.claude/skills/autoresearch

ahrav/test-pipeline

Cluster 2: `bar()` — N tests, M lines

Cluster 2: `bar()` — N tests, M lines