.claude/skills/mutation-testing/SKILL.md
Test quality validation through mutation testing, assessing test suite effectiveness by introducing code mutations and measuring kill rate. Use when evaluating test quality, identifying weak tests, or proving tests actually catch bugs.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add proffesor-for-testing/agentic-qe mutation-testing
<default_to_action> When validating test quality or improving test effectiveness:
Quick Mutation Metrics:
Critical Success Factors:
| Score | Interpretation |
|-------|----------------|
| 90%+ | Excellent test quality |
| 80-90% | Good, minor improvements |
| 60-80% | Needs attention |
| < 60% | Significant gaps |
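The score bands above can be turned into a small lookup, for example to label a report. This is a minimal sketch: `interpretScore` is a hypothetical helper, not part of the skill, and it assumes each band includes its lower bound (so exactly 90 counts as excellent and exactly 80 as good), since the table leaves the shared endpoints ambiguous.

```javascript
// Hypothetical helper: maps a mutation score (0-100) to the bands in the table.
// Assumption: each band includes its lower bound (90 -> excellent, 80 -> good).
function interpretScore(scorePct) {
  if (typeof scorePct !== 'number' || Number.isNaN(scorePct) || scorePct < 0 || scorePct > 100) {
    throw new RangeError('scorePct must be a number between 0 and 100');
  }
  if (scorePct >= 90) return 'Excellent test quality';
  if (scorePct >= 80) return 'Good, minor improvements';
  if (scorePct >= 60) return 'Needs attention';
  return 'Significant gaps';
}

console.log(interpretScore(87.3)); // Good, minor improvements
```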
| Category | Original | Mutant |
|----------|----------|--------|
| Arithmetic | a + b | a - b |
| Relational | x >= 18 | x > 18 |
| Logical | a && b | a \|\| b |
| Conditional | if (x) | if (true) |
| Statement | return x | (removed) |
// Original code
function isAdult(age) {
  return age >= 18; // ← Mutant: change >= to >
}
// Strong test (catches mutation)
test('18 is adult', () => {
  expect(isAdult(18)).toBe(true); // Kills mutant!
});
// Weak test (mutation survives)
test('19 is adult', () => {
  expect(isAdult(19)).toBe(true); // Doesn't catch >= vs >
});
// Surviving mutant → Test needs boundary value
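To make "killed" and "survived" concrete, the check a mutation tool performs can be imitated by hand: run the same test against the original and against the mutant. The sketch below is illustrative only. The mutant is written manually, and `mutantStatus`, `strongTest`, and `weakTest` are hypothetical names; a real tool such as Stryker generates mutants and runs the actual test suite.

```javascript
// Original implementation and a hand-written mutant (>= changed to >).
const isAdult = (age) => age >= 18;
const isAdultMutant = (age) => age > 18;

// Each "test" returns true when its assertion passes for the given implementation.
const strongTest = (impl) => impl(18) === true; // asserts on the boundary value
const weakTest = (impl) => impl(19) === true;   // asserts away from the boundary

// A mutant is killed when a test passes on the original but fails on the mutant.
function mutantStatus(test, original, mutant) {
  if (!test(original)) return 'invalid'; // the test must pass on unmutated code first
  return test(mutant) ? 'survived' : 'killed';
}

console.log(mutantStatus(strongTest, isAdult, isAdultMutant)); // killed
console.log(mutantStatus(weakTest, isAdult, isAdultMutant));   // survived
```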
# Install
npm install --save-dev @stryker-mutator/core @stryker-mutator/jest-runner
# Initialize
npx stryker init
Configuration:
{
  "packageManager": "npm",
  "reporters": ["html", "clear-text", "progress"],
  "testRunner": "jest",
  "coverageAnalysis": "perTest",
  "mutate": [
    "src/**/*.ts",
    "!src/**/*.spec.ts"
  ],
  "thresholds": {
    "high": 90,
    "low": 70,
    "break": 60
  }
}
Run:
npx stryker run
Output:
Mutation Score: 87.3%
Killed: 124
Survived: 18
No Coverage: 3
Timeout: 1
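The reported score can be recomputed from the four counts. The sketch below assumes Stryker's documented metric definitions: timeouts count as detected, mutants without coverage count as undetected, and a second score is computed over covered code only. `mutationScores` is a hypothetical helper, and it ignores mutants in other states (compile errors, ignored). With the counts shown above, these definitions give 85.6% overall and 87.4% on covered code; the 87.3% in the sample corresponds to killed / (killed + survived), so treat the sample figures as illustrative.

```javascript
// Hypothetical helper: recomputes mutation scores from report counts.
// Assumes: detected = killed + timeout, undetected = survived + noCoverage.
function mutationScores({ killed, survived, noCoverage, timeout }) {
  const detected = killed + timeout;
  const undetected = survived + noCoverage;
  const valid = detected + undetected;  // all mutants that could be evaluated
  const covered = detected + survived;  // mutants reached by at least one test
  // Round to one decimal; returning 0 for an empty denominator is a choice made here.
  const pct = (n, d) => (d === 0 ? 0 : Math.round((n / d) * 1000) / 10);
  return { total: pct(detected, valid), coveredOnly: pct(detected, covered) };
}

console.log(mutationScores({ killed: 124, survived: 18, noCoverage: 3, timeout: 1 }));
// { total: 85.6, coveredOnly: 87.4 }
```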
// Surviving mutant: >= changed to >
function calculateDiscount(quantity) {
  if (quantity >= 10) { // Mutant survives!
    return 0.1;
  }
  return 0;
}
// Original weak test
test('large order gets discount', () => {
  expect(calculateDiscount(15)).toBe(0.1); // Doesn't test boundary
});
// Fixed: Add boundary test
test('exactly 10 gets discount', () => {
  expect(calculateDiscount(10)).toBe(0.1); // Kills mutant!
});
test('9 does not get discount', () => {
  expect(calculateDiscount(9)).toBe(0); // Tests below boundary
});
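Why the boundary value matters can be checked directly: a test can only kill a mutant if it asserts on an input where the original and the mutant disagree. The sketch below compares the two implementations around the threshold. `boundaryInputs` and `killingInputs` are hypothetical helpers, and the mutant is written by hand for illustration.

```javascript
// Original discount rule and the surviving mutant (>= changed to >).
const original = (quantity) => (quantity >= 10 ? 0.1 : 0);
const mutant = (quantity) => (quantity > 10 ? 0.1 : 0);

// Hypothetical helper: the three integer inputs around a threshold.
const boundaryInputs = (threshold) => [threshold - 1, threshold, threshold + 1];

// Inputs where the implementations disagree are the ones a test must assert on.
const killingInputs = (inputs) => inputs.filter((q) => original(q) !== mutant(q));

console.log(killingInputs(boundaryInputs(10))); // [ 10 ]
console.log(killingInputs([15]));               // []
```

Only the threshold itself distinguishes the two versions, which is why the test for 15 leaves the mutant alive and the test for exactly 10 kills it. The test for 9 guards against a different mutant (for example, a lowered threshold).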
// Analyze mutation score and generate fixes
await Task("Mutation Analysis", {
  targetFile: 'src/payment.ts',
  generateMissingTests: true,
  minScore: 80
}, "qe-test-generator");
// Returns:
// {
//   mutationScore: 0.65,
//   survivedMutations: [
//     { line: 45, operator: '>=', mutant: '>', killedBy: null }
//   ],
//   generatedTests: [
//     'test for boundary at line 45'
//   ]
// }
// Coverage + mutation correlation
await Task("Coverage Quality Analysis", {
  coverageData: coverageReport,
  mutationData: mutationReport,
  identifyWeakCoverage: true
}, "qe-coverage-analyzer");
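One way to act on the mutation analysis result is to reduce it to a pass/fail summary plus a list of survivors that still need a test. This is a sketch under assumptions: `summarize` is a hypothetical helper, the `result` object is copied from the "Returns" comment above rather than produced by a real agent call, and it assumes `mutationScore` is a fraction (0.65) while `minScore` is a percentage (80), as the example values suggest.

```javascript
// Example result, copied from the "Returns" comment (not from a real agent run).
const result = {
  mutationScore: 0.65,
  survivedMutations: [{ line: 45, operator: '>=', mutant: '>', killedBy: null }],
  generatedTests: ['test for boundary at line 45'],
};

// Hypothetical helper: compares the score with a minimum and lists unkilled mutants.
// Assumption: mutationScore is a fraction, minScorePct is a percentage.
function summarize(result, minScorePct) {
  const scorePct = Math.round(result.mutationScore * 100);
  const unkilled = result.survivedMutations.filter((m) => m.killedBy === null);
  return {
    scorePct,
    passes: scorePct >= minScorePct,
    todo: unkilled.map((m) => `line ${m.line}: ${m.operator} -> ${m.mutant}`),
  };
}

console.log(summarize(result, 80));
```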
aqe/mutation-testing/
├── mutation-results/* - Stryker reports
├── surviving/* - Surviving mutants
├── generated-tests/* - Tests to kill mutants
└── trends/* - Mutation score over time
const mutationFleet = await FleetManager.coordinate({
  strategy: 'mutation-testing',
  agents: [
    'qe-test-generator',     // Generate tests for survivors
    'qe-coverage-analyzer',  // Coverage correlation
    'qe-quality-analyzer'    // Quality assessment
  ],
  topology: 'sequential'
});
High code coverage ≠ good tests. A suite can reach 100% coverage with weak assertions and still miss bugs; mutation testing shows whether the tests actually catch them.
Focus on critical paths first. Don't mutation test everything; prioritize payment, authentication, and data integrity code.
With Agents: Agents run mutation analysis, identify surviving mutants, and generate the missing test cases to kill them, automating test quality improvement.
After each mutation test run, append results to run-history.json in this skill directory:
node -e "
const fs = require('fs');
const file = '.claude/skills/mutation-testing/run-history.json';
// Start a fresh history if the file does not exist yet
const h = fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, 'utf8')) : { runs: [] };
// Replace SCORE, KILLED, and SURVIVED with the numbers from the mutation report
h.runs.push({date: new Date().toISOString().split('T')[0], mutation_score_pct: SCORE, killed: KILLED, survived: SURVIVED});
fs.writeFileSync(file, JSON.stringify(h, null, 2));
"
Read run-history.json before each run to track score improvements over time.
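Tracking improvement can be as simple as comparing the latest run with the previous one. The sketch below assumes the history shape written by the script above (`{ runs: [{ date, mutation_score_pct, killed, survived }] }`). `scoreTrend` is a hypothetical helper, and the sample runs are made-up values for illustration, not real results.

```javascript
// Hypothetical helper: reports the latest score and its change since the previous run.
function scoreTrend(history) {
  const runs = history.runs ?? [];
  if (runs.length < 2) {
    // Not enough data for a comparison yet.
    return { latest: runs.at(-1)?.mutation_score_pct ?? null, delta: null };
  }
  const latest = runs.at(-1).mutation_score_pct;
  const previous = runs.at(-2).mutation_score_pct;
  // Round the difference to one decimal to avoid floating-point noise.
  return { latest, delta: Math.round((latest - previous) * 10) / 10 };
}

// Made-up sample history, for illustration only.
const history = {
  runs: [
    { date: '2025-01-01', mutation_score_pct: 72.5, killed: 100, survived: 38 },
    { date: '2025-01-08', mutation_score_pct: 80.1, killed: 113, survived: 28 },
  ],
};

console.log(scoreTrend(history)); // { latest: 80.1, delta: 7.6 }
```

In practice the `history` object would come from `JSON.parse(fs.readFileSync(...))` on run-history.json; a negative `delta` signals that recent changes weakened the suite.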
- /qe-test-generation to ensure tests exist
- /qe-coverage-analysis to prioritize improvement areas
- /qe-quality-assessment for ship/no-ship decision

- Set --testRunner jest explicitly if both jest and vitest are installed
- A >= to > mutation in date comparisons rarely gets killed; add boundary tests
- Use --mutate to target specific functions
- --concurrency defaults to a value derived from the CPU core count, which can cause OOM in containers; set it to 2