skills/mutation-testing/SKILL.md
Finds weak or missing tests by analyzing if code changes would be caught. Use when verifying test effectiveness, strengthening test suites, or validating TDD workflows.
npx skillsauth add envy-7z/mobile-agent-skillpack mutation-testingInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
STARTER_CHARACTER = 🧬🔬
Mutation testing answers the question: "Are my tests actually catching bugs?"
Code coverage tells you what code your tests execute. Mutation testing tells you if your tests would detect changes to that code. A test suite with 100% coverage can still miss 40% of potential bugs.
The Mutation Testing Process:
The Insight: A surviving mutant represents a bug your tests wouldn't catch.
Use mutation testing analysis when:
Integration with TDD:
TDD Workflow Mutation Testing Validation
┌─────────────────┐ ┌─────────────────────────────┐
│ RED: Write test │ │ │
│ GREEN: Pass it │──────────► │ After GREEN: Verify tests │
│ REFACTOR │ │ would kill relevant mutants │
└─────────────────┘ └─────────────────────────────┘
Follow this systematic process when analyzing code on a branch:
# For JavaScript/TypeScript
git diff main...HEAD --name-only | grep -E '\.(ts|js|tsx|jsx)$' | grep -v '\.test\.'
# For Python
git diff main...HEAD --name-only | grep '\.py$' | grep -v 'test_'
# Get detailed diff for analysis
git diff main...HEAD -- src/
For each changed function/method, mentally apply mutation operators (see Language-Specific Operators below).
For each potential mutant, ask:
Categorize findings:
| Category | Description | Action Required | |----------|-------------|-----------------| | Killed | Test would fail if mutant applied | None - tests are effective | | Survived | Test would pass with mutant | Add/strengthen test | | No Coverage | No test exercises this code | Add behavior test | | Equivalent | Mutant produces same behavior | None - not a real bug |
These mutation operators apply across most languages:
| Original | Mutated | Test Should Verify |
|----------|---------|-------------------|
| a + b | a - b | Addition behavior |
| a - b | a + b | Subtraction behavior |
| a * b | a / b | Multiplication behavior |
| a / b | a * b | Division behavior |
| a % b | a * b | Modulo behavior |
Key Pattern:
// ❌ WEAK TEST - Would NOT catch mutant
calculate(10, 1) // 10 * 1 = 10, 10 / 1 = 10 (SAME!)
// ✅ STRONG TEST - Would catch mutant
calculate(10, 3) // 10 * 3 = 30, 10 / 3 = 3.33 (DIFFERENT!)
| Original | Mutated | Test Should Verify |
|----------|---------|-------------------|
| a < b | a <= b | Boundary value at equality |
| a < b | a >= b | Both sides of condition |
| a <= b | a < b | Boundary value at equality |
| a > b | a >= b | Boundary value at equality |
| a >= b | a > b | Boundary value at equality |
Key Pattern:
// ❌ WEAK TEST - Would NOT catch boundary mutant
isAdult(25) // 25 >= 18 = true, 25 > 18 = true (SAME!)
// ✅ STRONG TEST - Would catch boundary mutant
isAdult(18) // 18 >= 18 = true, 18 > 18 = false (DIFFERENT!)
| Original | Mutated | Test Should Verify |
|----------|---------|-------------------|
| a == b | a != b | Both equal and not equal cases |
| a != b | a == b | Both equal and not equal cases |
| Original | Mutated | Test Should Verify |
|----------|---------|-------------------|
| a AND b | a OR b | Case where one is true, other is false |
| a OR b | a AND b | Case where one is true, other is false |
| NOT a | a | Negation is necessary |
Key Pattern:
// ❌ WEAK TEST - Would NOT catch mutant
canAccess(true, true) // true OR true = true AND true (SAME!)
// ✅ STRONG TEST - Would catch mutant
canAccess(true, false) // true OR false = true, true AND false = false (DIFFERENT!)
| Original | Mutated | Test Should Verify |
|----------|---------|-------------------|
| true | false | Both true and false outcomes |
| false | true | Both true and false outcomes |
| Original | Mutated | Test Should Verify | |----------|---------|-------------------| | Function body | Empty function | Side effects of the function |
Key Pattern:
// ❌ WEAK TEST - Would NOT catch mutant
processOrder(order) // No assertions - empty function also doesn't throw!
// ✅ STRONG TEST - Would catch mutant
processOrder(order)
verifyOrderWasSaved(order) // Verifies side effect
| Original | Mutated | Test Should Verify |
|----------|---------|-------------------|
| "text" | "" | Non-empty string behavior |
| "" | "XX" | Empty string behavior |
| Original | Mutated | Test Should Verify |
|----------|---------|-------------------|
| [1, 2, 3] | [] | Non-empty collection behavior |
| {} | Empty or mutated | Empty collection behavior |
Different languages have specific mutation operators beyond the universal ones:
?.), nullish coalescing (??), and JS-specific methodsis), membership (in), floor division (//), and Python-specific patterns| State | Meaning | Action | |-------|---------|--------| | Killed | Test failed when mutant applied | Good - tests are effective | | Survived | Tests passed with mutant active | Bad - add/strengthen test | | No Coverage | No test exercises this code | Add behavior test | | Timeout | Tests timed out (infinite loop) | Counted as detected | | Equivalent | Mutant produces same behavior | No action - not a real bug |
killed / valid * 100 - The higher, the betterkilled + timeoutsurvived + no coverage| Score | Quality | |-------|---------| | < 60% | Weak test suite - significant gaps | | 60-80% | Moderate - many improvements possible | | 80-90% | Good - but still gaps to address | | > 90% | Strong - but watch for equivalent mutants |
Equivalent mutants produce the same behavior as the original code. They cannot be killed because there is no observable difference.
Pattern 1: Operations with identity elements
// Mutant in conditional where both branches have same effect
if (whatever) {
number += 0 // Can mutate to -= 0, *= 1, /= 1 - all equivalent!
} else {
number += 0
}
Pattern 2: Boundary conditions that don't affect outcome
// When max equals min, condition doesn't matter
max = max(a, b)
min = min(a, b)
if (a >= b) { // Mutating to <= or < has no effect when a == b
result = 10 ** (max - min) // 10 ** 0 = 1 regardless
}
Pattern 3: Dead code paths
// If this path is never reached, mutations don't matter
if (impossibleCondition) {
doSomething() // Mutating this won't affect behavior
}
When analyzing code changes on a branch:
// Original weak test
test('validates age', () => {
assert(isAdult(25) === true)
assert(isAdult(10) === false)
})
// Strengthened with boundary values
test('validates age at boundary', () => {
assert(isAdult(17) === false) // Just below
assert(isAdult(18) === true) // Exactly at boundary
assert(isAdult(19) === true) // Just above
})
// Original weak test - only tests one branch
test('returns access result', () => {
assert(canAccess(true, true) === true)
})
// Strengthened - tests all meaningful combinations
test('grants access when admin', () => {
assert(canAccess(true, false) === true)
})
test('grants access when owner', () => {
assert(canAccess(false, true) === true)
})
test('denies access when neither', () => {
assert(canAccess(false, false) === false)
})
// Weak - uses identity values
test('calculates', () => {
assert(multiply(10, 1) === 10) // x * 1 = x / 1
assert(add(5, 0) === 5) // x + 0 = x - 0
})
// Strong - uses values that reveal operator differences
test('calculates', () => {
assert(multiply(10, 3) === 30) // 10 * 3 != 10 / 3
assert(add(5, 3) === 8) // 5 + 3 != 5 - 3
})
// Weak - no verification of side effects
test('processes order', () => {
processOrder(order)
// No assertions!
})
// Strong - verifies observable outcomes
test('processes order', () => {
processOrder(order)
verifyOrderSaved(order)
verifyEmailSent(order.customerEmail)
})
Mutation testing tools automate the mutation generation and test execution:
The key question for every line of code:
"If I introduced a bug here, would my tests catch it?"
For each test, verify it would catch:
Remember:
>= vs > (boundary not tested)AND vs OR (only tested when both true/false)+ vs - (only tested with 0)* vs / (only tested with 1)| Avoid | Use Instead | |-------|-------------| | 0 (for +/-) | Non-zero values | | 1 (for */) | Values > 1 | | Empty collections | Collections with multiple items | | Identical values for comparisons | Distinct values | | All true/false for logical ops | Mixed true/false |
development
Use when you have a spec or requirements for a multi-step task, before touching code
data-ai
Use when about to claim work is complete, fixed, or passing, before committing or creating PRs - requires running verification commands and confirming output before making any success claims; evidence before assertions always
tools
Use when starting feature work that needs isolation from current workspace or before executing implementation plans - creates isolated git worktrees with smart directory selection and safety verification
testing
Applies Kent Beck's Thinkies—pattern-based thinking habits that generate ideas. Use when stuck, exploring alternatives, or reframing decisions.