.claude/skills/test-metrics-dashboard/SKILL.md
Use when querying test history, analyzing flakiness rates, tracking MTTR, or building quality trend dashboards from test execution data.
npx skillsauth add proffesor-for-testing/agentic-qe test-metrics-dashboard
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Data & Analysis skill for querying test execution history, identifying trends, and surfacing actionable quality metrics.
/test-metrics-dashboard
| Metric | Formula | Target | Alert |
|--------|---------|--------|-------|
| Pass Rate | Passed / Total | > 95% | < 90% |
| Flakiness Rate | Flaky / Total | < 5% | > 10% |
| MTTR | Avg time from failure to fix | < 4 hours | > 24 hours |
| Execution Time | Total suite duration | < 10 min | > 20 min |
| Coverage Delta | Current - Previous | >= 0% | < -2% |
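The targets and alert thresholds in the table can be checked mechanically. The following is a minimal sketch, not part of the skill itself: the function name `metric_status` and the labels `ok`, `watch`, and `alert` are assumptions for illustration, with `watch` covering the band the table leaves between target and alert.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a metric value against a target and an alert threshold.
# Usage: metric_status <value> <target> <alert> <higher|lower>
#   higher: larger values are better (pass rate)
#   lower:  smaller values are better (flakiness, MTTR, execution time)
metric_status() {
  awk -v v="$1" -v t="$2" -v a="$3" -v dir="$4" 'BEGIN {
    if (dir == "higher") {
      if (v > t)      print "ok"
      else if (v < a) print "alert"
      else            print "watch"
    } else {
      if (v < t)      print "ok"
      else if (v > a) print "alert"
      else            print "watch"
    }
  }'
}

metric_status 95.2 95 90 higher   # pass rate in percent
metric_status 4.1 5 10 lower      # flakiness rate in percent
metric_status 25 10 20 lower      # execution time in minutes
```

The sketch uses strict inequalities, matching most rows of the table. Coverage Delta has a `>= 0%` target, so it would need a variant that treats a value equal to the target as `ok`.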
# Export Jest results to JSON (one file per day)
mkdir -p test-results
npx jest --json --outputFile="test-results/$(date +%Y-%m-%d).json"
# Parse results for dashboard
jq '{
  date: .startTime,  # epoch milliseconds
  total: .numTotalTests,
  passed: .numPassedTests,
  failed: .numFailedTests,
  # Sum of per-file durations; exceeds wall-clock time when Jest runs files in parallel
  duration_ms: (.testResults | map(.endTime - .startTime) | add),
  pass_rate: ((.numPassedTests / .numTotalTests) * 100),
  # Pending (skipped/todo) tests; a single run cannot identify flaky tests
  pending: .numPendingTests
}' "test-results/$(date +%Y-%m-%d).json"
# Compare last 5 runs (most recently modified files first)
for f in $(ls -t test-results/*.json | head -5); do
  jq --arg file "$f" '{
    file: $file,
    pass_rate: ((.numPassedTests / .numTotalTests) * 100 | floor),
    duration_s: ((.testResults | map(.endTime - .startTime) | add) / 1000 | floor)
  }' "$f"
done
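# Find most frequently failing test files across runs
# (in `jest --json` output, each .testResults[] entry exposes the file path as .name
# and the file-level outcome as .status)
for f in test-results/*.json; do
  jq -r '.testResults[] | select(.status == "failed") | .name' "$f"
done | sort | uniq -c | sort -rn | head -10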
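Counting failures alone does not separate consistently broken tests from flaky ones. One common heuristic, sketched here as an assumption rather than the skill's own method, treats a test file as flaky when it has both passed and failed across the stored runs. The function name `flaky_files` and its `name<TAB>status` input format are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "name<TAB>status" lines on stdin (one per test file per run)
# and print the names that appear with both "passed" and "failed".
flaky_files() {
  awk -F'\t' '
    $2 == "passed" { passed[$1] = 1 }
    $2 == "failed" { failed[$1] = 1 }
    END { for (name in failed) if (name in passed) print name }
  ' | sort
}

# Example with inline sample data (three runs, two files).
# a.test.js alternates between outcomes; b.test.js fails every time.
printf '%s\t%s\n' \
  src/a.test.js passed  src/b.test.js failed \
  src/a.test.js failed  src/b.test.js failed \
  src/a.test.js passed  src/b.test.js failed \
  | flaky_files
```

Only `src/a.test.js` is reported, since `src/b.test.js` never passes. To feed real data, something like `for f in test-results/*.json; do jq -r '.testResults[] | [.name, .status] | @tsv' "$f"; done | flaky_files` should work, assuming the per-file `name` and `status` fields of the `jest --json` output.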
Store dashboard data in ${CLAUDE_PLUGIN_DATA}/test-metrics.log:
2026-03-18|95.2|4.1|312|82.5|3
Format: date|pass_rate|flakiness_rate|duration_s|coverage_pct|failed_count
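A record in this format can be appended after each run. The sketch below is one possible way to do it; the function name `append_metrics` and its argument order are assumptions that simply mirror the field order above, and the example writes to a temporary file instead of the real log.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one run to the metrics log in the format
# date|pass_rate|flakiness_rate|duration_s|coverage_pct|failed_count
# Usage: append_metrics <log_file> <pass_rate> <flakiness_rate> <duration_s> <coverage_pct> <failed_count>
append_metrics() {
  local log_file="$1"; shift
  if [ "$#" -ne 5 ]; then
    echo "append_metrics: expected 5 metric values, got $#" >&2
    return 1
  fi
  mkdir -p "$(dirname "$log_file")"
  # The date is filled in automatically; the five metric values follow in order.
  printf '%s|%s|%s|%s|%s|%s\n' "$(date +%Y-%m-%d)" "$@" >> "$log_file"
}

# Example: write to a temporary log so the sketch does not touch real data.
demo_log="$(mktemp)"
append_metrics "$demo_log" 95.2 4.1 312 82.5 3
cat "$demo_log"
rm -f "$demo_log"
```

In real use the first argument would be `"${CLAUDE_PLUGIN_DATA}/test-metrics.log"`.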
Read history for trend detection:
# Coverage trending down? Print the lowest coverage among the last 5 runs
# and compare it against earlier values
tail -5 "${CLAUDE_PLUGIN_DATA}/test-metrics.log" | awk -F'|' '{print $5}' | sort -n | head -1
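A minimum alone does not say which direction coverage is moving. As a hedged alternative, the sketch below compares the oldest and newest of the last N entries; the function name `coverage_trend` and its output strings are assumptions, and the first two sample rows are illustrative data, not real results.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether coverage (field 5) fell between the oldest
# and newest of the last N log entries.
# Usage: coverage_trend <log_file> [N]   (N defaults to 5)
coverage_trend() {
  tail -n "${2:-5}" "$1" | awk -F'|' '
    NR == 1 { first = $5 + 0 }   # oldest entry in the window
            { last = $5 + 0 }    # overwritten until the newest entry
    END {
      if (NR < 2)            print "insufficient data"
      else if (last < first) printf "down %.1f\n", first - last
      else                   print "stable or up"
    }'
}

# Example with illustrative sample data in a temporary log.
demo_log="$(mktemp)"
printf '%s\n' \
  '2026-03-16|96.0|3.8|300|84.0|2' \
  '2026-03-17|95.5|4.0|305|83.1|2' \
  '2026-03-18|95.2|4.1|312|82.5|3' > "$demo_log"
coverage_trend "$demo_log"
rm -f "$demo_log"
```

Comparing only the endpoints is a deliberate simplification; a noisy series might call for an average or a fitted slope instead.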
Feeds into:
- /qe-quality-assessment: quality gate decisions based on metrics
- /test-failure-investigator: investigate top failing tests
- /coverage-drop-investigator: when coverage trends down