Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

nwave-ai/nw-property-based-testing

Name: nw-property-based-testing
Author: nwave-ai

plugins/nw/skills/nw-property-based-testing/SKILL.md

npx skillsauth add nwave-ai/nwave nw-property-based-testing

Property-Based Testing and Mutation Testing

Deferred to Phase 2.25: Mutation testing runs ONCE per feature as final quality gate at orchestrator Phase 2.25 (after all steps complete). Do NOT run mutation testing during inner TDD loop.

Property-Based Testing (PBT)

Instead of examples ("given X, expect Y"), write properties ("for all valid inputs, condition Z holds"). The framework generates hundreds or thousands of inputs and checks the property against each one, exercising far more of the input space than hand-picked examples.

Property Patterns

Invariants: "for all inputs, condition holds" (sorted list is ordered, balance >= 0)
Roundtrip: "encode then decode = original" (serialize/deserialize, compress/decompress)
Oracle: "compare against reference implementation" (optimized vs correct-but-slow)
Metamorphic: "different operations, same result" (add(a,b)==add(b,a), filter can't increase size)
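
The four patterns above can be sketched with Hypothesis as follows. This is a minimal illustration, not code from the skill: `insertion_sort` is a hypothetical function under test, and the JSON roundtrip and even-number filter are stand-ins chosen only because they are self-contained.

```python
import json

from hypothesis import given, strategies as st


def insertion_sort(xs):
    """Hypothetical code under test: a simple, stable insertion sort."""
    out = []
    for x in xs:
        i = len(out)
        # Walk left past every element that is larger than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


@given(st.lists(st.integers()))
def test_invariant_sorted_output_is_ordered(xs):
    # Invariant: every adjacent pair in the result is in order.
    result = insertion_sort(xs)
    assert all(a <= b for a, b in zip(result, result[1:]))


@given(st.dictionaries(st.text(), st.integers()))
def test_roundtrip_json(d):
    # Roundtrip: encode then decode gives back the original value.
    assert json.loads(json.dumps(d)) == d


@given(st.lists(st.integers()))
def test_oracle_matches_builtin_sorted(xs):
    # Oracle: compare against a trusted reference implementation.
    assert insertion_sort(xs) == sorted(xs)


@given(st.lists(st.integers()))
def test_metamorphic_filter_never_grows(xs):
    # Metamorphic: filtering can never increase the size of a list.
    assert len([x for x in xs if x % 2 == 0]) <= len(xs)
```

Each test takes no arguments when called; Hypothesis supplies the generated inputs.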

Shrinking

When a property fails, the framework automatically searches for a minimal failing input, which makes the failure much easier to debug. Algorithm: find a failing input -> try simpler variants -> if a variant still fails, use it as the new candidate -> repeat until no simpler variant fails.
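
The shrinking loop can be illustrated with a toy greedy shrinker in plain Python. It is a sketch of the general idea only: `shrink_list` and `property_holds` are hypothetical names, and real frameworks such as Hypothesis use more sophisticated shrinking than this.

```python
def shrink_list(failing, still_fails):
    """Greedy shrink: adopt any simpler variant that still fails the property."""
    current = list(failing)
    improved = True
    while improved:
        improved = False
        n = len(current)
        # Simpler variants, tried in order: drop one element, halve one
        # positive element, decrement one positive element.
        candidates = [current[:i] + current[i + 1:] for i in range(n)]
        candidates += [
            current[:i] + [current[i] // 2] + current[i + 1:]
            for i in range(n) if current[i] > 0
        ]
        candidates += [
            current[:i] + [current[i] - 1] + current[i + 1:]
            for i in range(n) if current[i] > 0
        ]
        for candidate in candidates:
            if still_fails(candidate):
                # Still fails: this becomes the new candidate, then repeat.
                current = candidate
                improved = True
                break
    return current


def property_holds(xs):
    # Deliberately false property: "no list of integers sums to 10 or more".
    return sum(xs) < 10


minimal = shrink_list([7, 42, 3, 19], lambda xs: not property_holds(xs))
print(minimal)  # [10]
```

Starting from a noisy four-element counterexample, the loop ends at the single-element list `[10]`, the boundary where the property first breaks.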

PBT Tools by Language

| Language | Framework |
|----------|-----------|
| Python | Hypothesis |
| JavaScript/TypeScript | fast-check |
| Haskell | QuickCheck |
| Rust | quickcheck |
| Java | jqwik |
| C# | FsCheck |

Adopted by Amazon, Volvo, Stripe, Jane Street (ICSE 2024 study).

When PBT Adds Value

Falsifier-gate: closed-world finite → parametrize, NOT PBT

If the input domain is finite + enumerable (N known files, M known event types, K known skill names, fixed Python versions), PBT is the wrong tool:

Hypothesis import (~457ms) + per-example bookkeeping > @pytest.mark.parametrize overhead
Shrinking is irrelevant — the failing input is already a known list member, no minimization needed
Coverage is bounded by the parameter list, not the example budget — fewer assertions, same coverage

Decision rule: enumerate the domain. If listable ([a, b, c, ...]), use parametrize-collapse or dict-iteration (see nw-test-optimization §3.1, §3.2). Reserve PBT for "for all X in DOMAIN, P(X) holds" where DOMAIN is infinite (all strings, all integers, all valid JSON, all sorted lists).
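
For a listable domain, the parametrized form might look like the sketch below. The event types and `event_topic` are hypothetical stand-ins, and the example shows plain `@pytest.mark.parametrize` rather than the specific parametrize-collapse technique described in nw-test-optimization.

```python
import pytest

# Hypothetical closed-world domain: every member is known up front.
KNOWN_EVENT_TYPES = ["created", "updated", "deleted"]


def event_topic(event_type):
    """Hypothetical function under test: map an event type to a topic name."""
    return f"events.{event_type}"


@pytest.mark.parametrize("event_type", KNOWN_EVENT_TYPES)
def test_topic_is_namespaced(event_type):
    # One test case per list member; a failure names the exact member,
    # so there is nothing left for a shrinker to minimize.
    topic = event_topic(event_type)
    assert topic.startswith("events.")
    assert topic == topic.lower()
```

Coverage here is exactly the three list members, which is all the domain contains.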

Empirical anchor 2026-05-18: 155-file closed-world skill registry PBT migration was correctly aborted at recon stage by the falsifier-gate. Solution: set-difference parametrize-collapse (commit c2637f6c8), 5.42s → 0.71s (about 7.6× faster by those timings). Mass-migrating closed-world tests to PBT would have made the suite slower, not faster.

See nw-test-optimization §4-bis Paradigm-Match Decision Rule for the full shape-to-paradigm table.

PBT + TDD Integration

Start with example-based TDD for specific cases (drives detailed design)
Once basic implementation works, write properties to generalize
If property fails: found bug or need refined implementation
Refactor freely - properties verify behavior preservation

Properties = higher-level spec that survives refactoring better than examples.

Mutation Testing

Evaluates test suite quality by introducing artificial bugs (mutations) and checking if tests catch them. Mutation score = killed mutants / total mutants. Stronger metric than code coverage.
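
A toy illustration of the score, with hand-written mutants standing in for what a mutation tool would generate. All names here are hypothetical. Both suites execute every line of `is_adult`, yet only one of them kills the boundary mutants.

```python
def is_adult(age):
    """Original code under test."""
    return age >= 18


# Hand-written mutants; a real tool derives these from the source automatically.
MUTANTS = {
    "boundary: >= becomes >": lambda age: age > 18,
    "negation: >= becomes <": lambda age: age < 18,
    "constant: 18 becomes 19": lambda age: age >= 19,
}


def weak_suite(fn):
    # Checks only values far from the boundary.
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    # Adds the boundary cases on both sides of 18.
    return weak_suite(fn) and fn(18) is True and fn(17) is False


def mutation_score(suite):
    """Killed mutants / total mutants; a mutant is killed when the suite fails on it."""
    killed = sum(1 for mutant in MUTANTS.values() if not suite(mutant))
    return killed / len(MUTANTS)


print(round(mutation_score(weak_suite), 2))   # 0.33
print(mutation_score(strong_suite))           # 1.0
```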

Mutation Score Targets

| Score | Quality |
|-------|---------|
| < 60% | Weak suite, significant gaps |
| 60-80% | Moderate, some gaps |
| > 80% | Strong, few gaps |

Target: 75-80% minimum. Not all survivors indicate bad tests (equivalent mutants exist).

Mutation Operators

Mutation Testing Tools

| Language | Tool |
|----------|------|
| Java | PIT |
| JavaScript/TypeScript/C# | Stryker |
| Python | mutmut, Cosmic Ray |

Mutation testing is computationally expensive. Run it incrementally: on changed code in PRs, and on the full codebase weekly.

Combined PBT + Mutation Workflow

Write example-based tests (TDD) -> cover known scenarios
Apply mutation testing -> identify assertion gaps -> write more tests
Add PBT for complex logic -> cover input space systematically
Mutation testing again -> verify properties are comprehensive

Quality ratchet: each technique exposes gaps others miss. Prioritize critical paths and complex algorithms.

PBT Performance Guidance

Fast feedback: ~100 examples | CI/CD: ~1000 examples | Nightly builds: ~10000+ examples

Modern frameworks allow configuring example count per context.
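
With Hypothesis, one way to configure the example count per context is settings profiles. The sketch below assumes the profile names `dev`, `ci`, and `nightly`, and an environment variable called `HYPOTHESIS_PROFILE`; both are illustrative choices, not Hypothesis built-ins.

```python
import os

from hypothesis import settings

# Example budgets matching the guidance above; profile names are assumptions.
settings.register_profile("dev", max_examples=100)
settings.register_profile("ci", max_examples=1000)
settings.register_profile("nightly", max_examples=10000)

# HYPOTHESIS_PROFILE is an assumed variable name chosen for this sketch.
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "dev"))
```

A file such as `conftest.py` is a common home for this setup, so that every test in the suite picks up the active profile.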

State-Delta + Hypothesis Integration

Combines the delta-first paradigm (see nw-tdd-methodology::Delta-First Test Paradigm) with Hypothesis shrinking to cover production code that branches on input shape.

`path_strategy()` — composite Hypothesis strategy

Location: nwave_ai/state_delta/strategies/path_strategy.py

Generates realistic PATH string shapes covering 4 production branches:

Empty string (no PATH set)
$HOME/bin literal (unexpanded shell variable)
Legacy fallback path (/usr/local/bin only)
Idempotent case (target already present in PATH)

Lazy-import boundary: hypothesis is NOT imported when nwave_ai.state_delta.matcher is imported. It is loaded only when path_strategy() is called. This is verified by a subprocess-isolated test at tests/state_delta/unit/test_lazy_import.py: importing the matcher in a hypothesis-free environment must not raise ImportError.
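
The lazy-import pattern and its subprocess-isolated check can be sketched with the standard library alone. This is not the project's test: `make_strategy` is a hypothetical function, and the stdlib module `colorsys` stands in for the optional hypothesis dependency so the example runs anywhere.

```python
import subprocess
import sys
import textwrap

# Code executed in a fresh interpreter so earlier imports cannot leak in.
PROBE = textwrap.dedent(
    '''
    import sys

    def make_strategy():
        # Deferred import: the dependency loads on first call, not at definition.
        import colorsys  # stand-in for the optional heavy dependency
        return colorsys

    # Defining the function must not import the dependency.
    assert "colorsys" not in sys.modules
    make_strategy()
    # The first call is what triggers the import.
    assert "colorsys" in sys.modules
    '''
)

result = subprocess.run(
    [sys.executable, "-c", PROBE], capture_output=True, text=True
)
print(result.returncode)
```

Running the probe in a subprocess matters because `sys.modules` in the parent test process may already contain the dependency.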

Integration pattern

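from hypothesis import given, settings
from nwave_ai.state_delta.strategies.path_strategy import path_strategy
from nwave_ai.state_delta import assert_state_delta, prepended_with, unchanged

# inject_nwave_bin is the production function under test. Its import is not
# shown in the source; import it from wherever it lives in your codebase.

@given(path_strategy())
@settings(max_examples=500)
def test_path_injection_all_shapes(initial_path):
    # State before the call, including a key that must stay untouched.
    before = {"env.PATH": initial_path, "env.OTHER": "x"}

    result_path = inject_nwave_bin(initial_path)

    # State after the call.
    after = {"env.PATH": result_path, "env.OTHER": "x"}

    # Assert the exact delta: PATH gains the prefix, everything else is unchanged.
    assert_state_delta(
        before,
        after,
        universe={"env.PATH", "env.OTHER"},
        expected={"env.PATH": prepended_with("/home/user/.nwave/bin"),
                  "env.OTHER": unchanged()},
    )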

Hypothesis shrinking finds the minimal failing PATH shape automatically when a branch is broken.

When to use this combination

Production code has multiple branches over input shape (empty vs. populated, legacy vs. current format).
You want both shrinking (Hypothesis strength) and surrounding-state verification (delta-first strength).
Single @given replaces N parametrized example tests covering the same branches.

Reference

D-12 Part B hard gate: tests/state_delta/integration/test_pilot_bug48.py::test_pilot_bug48_post_fix_validated — 500 examples, GREEN in 0.88s.

Property-based testing strategies, mutation testing, shrinking, and combined PBT+mutation workflow for test quality validation

541 stars

testing

Updated Jun 10, 2026

Install this skill globally with one command:

npx skillsauth add nwave-ai/nwave nw-property-based-testing

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---------|-------------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: Jun 10, 2026, 7:19 AM · 65.3s · 1 file scanned

SKILL.md

name: nw-property-based-testing
description: Property-based testing strategies, mutation testing, shrinking, and combined PBT+mutation workflow for test quality validation
user-invocable: false
disable-model-invocation: true

Related Skills

nwave-ai/nw-distill

testing

Acceptance test creation methodology for the DISTILL wave. Domain knowledge for the acceptance designer agent: port-to-port principle, prior wave reading, wave-decision reconciliation, graceful degradation, and document back-propagation.

563 · SKILL.md · Updated May 21, 2026

nwave-ai/nw-collaboration-and-handoffs

development

Cross-agent collaboration protocols, workflow handoff patterns, and commit message formats for TDD/Mikado/refactoring workflows

563 · SKILL.md · Updated Apr 16, 2026

nwave-ai/nw-roadmap

development

Creates a phased roadmap.json for a feature goal with acceptance criteria and TDD steps. Use when planning implementation steps before execution.

563 · SKILL.md · Updated Apr 9, 2026

Install manually

# Clone the repo
git clone https://github.com/nwave-ai/nwave.git

# Copy into Claude Code skills folder (global)
cp -r nwave/plugins/nw/skills/nw-property-based-testing ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

nwave-ai/nwave

541 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
