skills/qc-metrics/SKILL.md
Use when measuring and reporting QA quality — defect escape rate, test coverage analysis, flaky test rate, mean time to detect, shift-left metrics, and building quality dashboards for stakeholders.
npx skillsauth add kienbui1995/magic-powers qc-metrics
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Defect Escape Rate (most important)
Formula: Production defects / (Pre-prod defects + Production defects) × 100
Target: < 10% (less than 10% of defects found in production)
Why it matters: Measures testing effectiveness; high rate = gaps in coverage
Test Coverage
Line coverage: % of code lines executed by tests
Branch coverage: % of code branches executed
Requirement coverage: % of requirements with test cases
Target: > 80% branch coverage for business-critical paths
Caution: Coverage is a floor, not a ceiling; 100% coverage from tests with weak or missing assertions is worthless
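To make the difference between the three coverage figures concrete, here is a minimal sketch that derives each percentage from raw counts. The `CoverageCounts` fields and the sample numbers are hypothetical and do not match the output format of any particular coverage tool.

```python
from dataclasses import dataclass


@dataclass
class CoverageCounts:
    """Hypothetical raw counts; real tools report these in their own formats."""
    lines_total: int
    lines_hit: int
    branches_total: int
    branches_hit: int
    requirements_total: int
    requirements_with_tests: int


def pct(hit: int, total: int) -> float:
    """Percentage rounded to one decimal, guarding against an empty denominator."""
    return round(hit * 100 / total, 1) if total else 0.0


def coverage_summary(c: CoverageCounts, branch_target: float = 80.0) -> dict:
    branch = pct(c.branches_hit, c.branches_total)
    return {
        "line_pct": pct(c.lines_hit, c.lines_total),
        "branch_pct": branch,
        "requirement_pct": pct(c.requirements_with_tests, c.requirements_total),
        # The target is "> 80%", so exactly 80.0 does not pass.
        "meets_branch_target": branch > branch_target,
    }


# Sample data (made up): high line coverage can hide weaker branch coverage.
summary = coverage_summary(CoverageCounts(1000, 900, 200, 150, 40, 36))
print(summary)
```

In this sample, line coverage looks healthy while branch coverage falls short of the target, which is why the target above is stated in terms of branches.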
Defect Detection Distribution
% defects found per phase: Unit test | Integration | QA | UAT | Production
Target: Most defects found in unit/integration (shift-left)
Cost multiplier (rule of thumb): A defect fixed in production costs ~30x more than one caught in the unit test phase
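The per-phase distribution can be computed from a list of defects tagged with the phase in which each was found. The sketch below assumes each defect carries a lowercase phase label; the label names and sample data are illustrative only.

```python
from collections import Counter

# Phases in pipeline order; the labels are an assumption about your tracker's data.
PHASES = ["unit", "integration", "qa", "uat", "production"]


def detection_distribution(defect_phases: list[str]) -> dict[str, float]:
    """Return the percentage of defects found in each phase, in pipeline order.

    Labels outside PHASES are ignored.
    """
    counts = Counter(defect_phases)
    total = sum(counts[p] for p in PHASES)
    if total == 0:
        return {p: 0.0 for p in PHASES}
    return {p: round(counts[p] * 100 / total, 1) for p in PHASES}


def shift_left_share(distribution: dict[str, float]) -> float:
    """Share of defects caught in the two earliest phases."""
    return round(distribution["unit"] + distribution["integration"], 1)


# Made-up sample: 20 defects across the five phases.
sample = ["unit"] * 10 + ["integration"] * 6 + ["qa"] * 2 + ["uat"] + ["production"]
dist = detection_distribution(sample)
print(dist, shift_left_share(dist))
```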
Flaky Test Rate
Formula: Tests that failed non-deterministically in the last 7 days / total tests × 100
Target: < 2% flaky rate
Impact: A high flaky rate erodes trust in the test suite, and developers start ignoring failures
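One way to operationalize "failed non-deterministically" is to flag a test as flaky when it both passed and failed on the same commit inside the window. The sketch below uses that definition, which is an assumption rather than a standard; the tuple layout and sample runs are hypothetical. "Total tests" here means tests that ran at least once in the window.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flaky_rate(runs, now, window_days=7):
    """Return (sorted flaky test names, flaky rate in percent).

    runs: iterable of (test_name, commit_sha, passed, finished_at) tuples.
    A test counts as flaky if the same commit produced both a pass and a fail.
    """
    cutoff = now - timedelta(days=window_days)
    outcomes = defaultdict(set)  # (test, commit) -> set of observed results
    all_tests = set()
    for test, commit, passed, finished_at in runs:
        if finished_at < cutoff:
            continue  # outside the reporting window
        all_tests.add(test)
        outcomes[(test, commit)].add(passed)
    flaky = {test for (test, _), seen in outcomes.items() if len(seen) == 2}
    rate = len(flaky) * 100 / len(all_tests) if all_tests else 0.0
    return sorted(flaky), round(rate, 1)


now = datetime(2024, 1, 8)
recent = datetime(2024, 1, 7)
runs = [
    ("test_login", "abc", True, recent),
    ("test_login", "abc", False, recent),   # same commit, mixed results: flaky
    ("test_cart", "abc", True, recent),
    ("test_cart", "def", False, recent),    # different commit: likely a real regression
    ("test_search", "abc", True, recent),
    ("test_pay", "abc", True, recent),
    ("test_old", "abc", False, datetime(2023, 12, 1)),  # outside the window
]
flaky_tests, rate = flaky_rate(runs, now)
print(flaky_tests, rate)
```

A failure that follows a code change is deliberately not counted, since it may be a genuine regression rather than flakiness.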
Mean Time to Detect (MTTD)
Time from bug introduction (code commit) to detection
Target: < 24 hours for critical paths
Enables: Faster feedback, cheaper fixes
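As a minimal sketch, MTTD can be computed from pairs of timestamps: when the defect-introducing commit landed and when the defect was first detected. Identifying the introducing commit is assumed to happen elsewhere (for example during root-cause analysis); the sample timestamps are made up.

```python
from datetime import datetime
from statistics import mean


def mean_time_to_detect(defects):
    """Return MTTD in hours, or None if there is no usable data.

    defects: list of (introduced_at, detected_at) datetime pairs.
    Pairs where detection precedes introduction are skipped as bad data.
    """
    hours = [
        (detected - introduced).total_seconds() / 3600
        for introduced, detected in defects
        if detected >= introduced
    ]
    return round(mean(hours), 1) if hours else None


defects = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 15)),   # 6 hours
    (datetime(2024, 1, 2, 10), datetime(2024, 1, 4, 10)),  # 48 hours
    (datetime(2024, 1, 3, 8), datetime(2024, 1, 3, 14)),   # 6 hours
]
mttd_hours = mean_time_to_detect(defects)
print(mttd_hours)
```

A single slow detection pulls the mean up sharply, so it can be worth reporting the median alongside it.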
Test Execution Time
Time from code commit to test results
Target: < 10 min for unit+integration, < 30 min for full suite
Impact: Slow tests = developers don't run them locally
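The time targets above can be checked mechanically once stage durations are available. This sketch assumes durations are already measured in minutes and keyed by hypothetical stage names; how they are collected depends on your CI system.

```python
# Budgets in minutes, taken from the targets above; the stage names are illustrative.
BUDGETS_MIN = {"unit+integration": 10, "full": 30}


def over_budget(durations_min: dict[str, float], budgets: dict[str, float] = BUDGETS_MIN) -> dict[str, float]:
    """Return stages at or over budget, mapped to the minutes by which they exceed it.

    The targets are strict ("< 10 min"), so hitting the limit exactly counts as over.
    """
    return {
        stage: round(durations_min[stage] - limit, 1)
        for stage, limit in budgets.items()
        if stage in durations_min and durations_min[stage] >= limit
    }


violations = over_budget({"unit+integration": 12.5, "full": 28.0})
print(violations)
```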
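# Monthly defect escape rate report.
# The get_* helpers are placeholders for queries against your defect tracker.
def calculate_escape_rate(month: str) -> dict:
    pre_prod_defects = get_defects_found_before_production(month)
    prod_defects = get_production_incidents(month)
    total = len(pre_prod_defects) + len(prod_defects)
    # Escape rate = production defects / all defects, as a percentage
    escape_rate = len(prod_defects) / total * 100 if total > 0 else 0.0
    # Assumption: the helper takes the current month and returns the prior month's rate
    previous_rate = get_previous_month_rate(month)
    return {
        "month": month,
        "pre_prod_defects": len(pre_prod_defects),
        "production_defects": len(prod_defects),
        "escape_rate": f"{escape_rate:.1f}%",
        # Lower is better; an unchanged rate is reported as "degrading"
        "trend": "improving" if escape_rate < previous_rate else "degrading",
    }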
Healthy test pyramid:
E2E tests: 5-10% (slow, expensive, cover critical paths)
Integration tests: 20-30% (medium speed, component interactions)
Unit tests: 60-70% (fast, cheap, cover business logic)
Unhealthy patterns:
Ice cream cone (inverted pyramid):
Many E2E, few unit tests → slow CI, brittle suite
Fix: Identify which E2E tests are actually testing unit-level logic and rewrite them as unit tests
Hourglass (gaps in middle):
Many unit + many E2E, no integration
Fix: Add API/service-level integration tests
Check your pyramid:
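# Count marker lines per layer (assumes pytest markers named unit, integration, e2e)
for m in unit integration e2e; do echo "$m: $(grep -r "@pytest.mark.$m" tests/ | wc -l)"; done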
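Given per-layer test counts, the shapes described above can be classified with a few comparisons. In this sketch the healthy ranges come from the pyramid percentages in this section, while the rules for "ice cream cone" and "hourglass" are simplifying assumptions, not established definitions.

```python
def pyramid_shape(unit: int, integration: int, e2e: int) -> str:
    """Classify a test suite by the share of tests in each layer."""
    total = unit + integration + e2e
    if total == 0:
        return "no tests"
    u, i, e = (n * 100 / total for n in (unit, integration, e2e))
    if e > u:
        # More E2E than unit tests: the pyramid is inverted.
        return "ice cream cone"
    if i < u and i < e:
        # Integration is the thinnest layer: a gap in the middle.
        return "hourglass"
    if 60 <= u <= 70 and 20 <= i <= 30 and 5 <= e <= 10:
        return "healthy"
    return "off target"


for counts in [(660, 260, 80), (50, 100, 350), (600, 20, 380), (900, 80, 20)]:
    print(counts, pyramid_shape(*counts))
```

The "off target" result covers suites that have the right overall ordering but fall outside the recommended ranges, such as one that is almost entirely unit tests.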
Dashboard sections:
1. Build Health (real-time)
- Last build status: PASS/FAIL
- Build duration trend (line chart)
- Flaky test list (top 10)
2. Coverage Trends (weekly)
- Line coverage % over time (trend line)
- Files with lowest coverage (table)
- Coverage delta per PR
3. Defect Metrics (monthly)
- Defect escape rate over time
- Defect distribution by phase (bar chart)
- Open P1/P2 count (KPI card)
- Time to resolution by priority
4. Release Readiness (per release)
- Test completion % by feature
- Outstanding P1/P2 defects
- Test pass rate last 5 runs
- Sign-off status
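The Release Readiness panel can be backed by a simple gate that turns its four inputs into a ready/not-ready answer with reasons. The sketch below is illustrative: the function name, parameters, and default thresholds (100% completion, 100% pass rate) are assumptions to adjust to your own release policy.

```python
def release_readiness(
    completion_by_feature: dict[str, float],
    open_p1_p2: int,
    run_results: list[bool],
    signed_off: bool,
    min_completion: float = 100.0,
    min_pass_rate: float = 100.0,
) -> dict:
    """Summarize release readiness and list every blocker found."""
    recent = run_results[-5:]  # only the last 5 runs count
    pass_rate = sum(recent) * 100 / len(recent) if recent else 0.0
    blockers = []
    incomplete = sorted(f for f, pct in completion_by_feature.items() if pct < min_completion)
    if incomplete:
        blockers.append("incomplete testing: " + ", ".join(incomplete))
    if open_p1_p2 > 0:
        blockers.append(f"{open_p1_p2} open P1/P2 defects")
    if pass_rate < min_pass_rate:
        blockers.append(f"pass rate {pass_rate:.0f}% over last {len(recent)} runs")
    if not signed_off:
        blockers.append("sign-off pending")
    return {"ready": not blockers, "pass_rate": pass_rate, "blockers": blockers}


status = release_readiness(
    {"checkout": 100.0, "search": 90.0},
    open_p1_p2=1,
    run_results=[True, True, False, True, True, True],
    signed_off=False,
)
print(status)
```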
Measuring shift-left effectiveness:
Before shift-left (example): 70% of bugs found in QA/UAT, 30% in unit tests
After 6 months (example): 50% of bugs found in unit tests, 30% in integration, 15% in QA, <5% in production
Leading indicators (predict future quality):
- PR code review turnaround time (faster = fewer bugs pass through)
- Unit test coverage per PR (requires tests with new code)
- Static analysis violations per PR (code quality proxy)
Lagging indicators (measure past quality):
- Defect escape rate (measures testing effectiveness)
- Production incident rate (measures overall quality)
- Customer-reported bugs (ultimate quality measure)
Report: leading indicators weekly (actionable), lagging monthly (strategic)
qc-defect-management: defect data feeds escape rate and distribution metrics
ado-pipeline-optimization: test result publishing enables coverage trending
test-strategy: metrics validate and inform the overall test strategy