
proffesor-for-testing/test-metrics-dashboard

Name: test-metrics-dashboard
Author: proffesor-for-testing

.claude/skills/test-metrics-dashboard/SKILL.md

npx skillsauth add proffesor-for-testing/agentic-qe test-metrics-dashboard

Test Metrics Dashboard

Data & Analysis skill for querying test execution history, identifying trends, and surfacing actionable quality metrics.

Activation

/test-metrics-dashboard

Key Metrics

Test Health Metrics

| Metric | Formula | Target | Alert |
|--------|---------|--------|-------|
| Pass Rate | Passed / Total | > 95% | < 90% |
| Flakiness Rate | Flaky / Total | < 5% | > 10% |
| MTTR | Avg time from failure to fix | < 4 hours | > 24 hours |
| Execution Time | Total suite duration | < 10 min | > 20 min |
| Coverage Delta | Current - Previous | >= 0% | < -2% |
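The Target and Alert columns imply three states per metric: on target, between the two thresholds, and alerting. A minimal sketch of that check follows. The `classify` function, its argument order, and the `warn` label for the middle band are illustrative assumptions, not part of the skill. It uses strict comparisons, so the Coverage Delta row (target `>= 0%`) would need a small adjustment.

```shell
# classify VALUE TARGET ALERT DIRECTION   (hypothetical helper)
# DIRECTION is "higher" when larger values are better (pass rate) and
# "lower" when smaller values are better (flakiness, MTTR, duration).
# Prints "ok", "warn" (between target and alert), or "alert".
classify() {
  awk -v v="$1" -v t="$2" -v a="$3" -v d="$4" 'BEGIN {
    v += 0; t += 0; a += 0          # force numeric comparison
    if (d == "higher") {
      if (v > t) print "ok"; else if (v < a) print "alert"; else print "warn"
    } else {
      if (v < t) print "ok"; else if (v > a) print "alert"; else print "warn"
    }
  }'
}

classify 96.4 95 90 higher   # pass rate 96.4%      -> ok
classify 7 5 10 lower        # flakiness rate 7%    -> warn
classify 25 10 20 lower      # execution time 25min -> alert
```

Keeping the thresholds as arguments rather than hard-coding them lets each project set its own baselines.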

Data Collection

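# Export Jest results to JSON (create the output directory first)
mkdir -p test-results
out="test-results/$(date +%Y-%m-%d).json"
npx jest --json --outputFile="$out"

# Parse results for dashboard
# duration_ms sums per-file durations, so it exceeds wall-clock time when Jest runs workers in parallel.
# A single run cannot identify flaky tests: numPendingTests counts skipped/todo tests, so it is reported as "skipped".
jq '{
  date: (.startTime / 1000 | floor | todate),
  total: .numTotalTests,
  passed: .numPassedTests,
  failed: .numFailedTests,
  skipped: .numPendingTests,
  duration_ms: (.testResults | map(.endTime - .startTime) | add),
  pass_rate: (if .numTotalTests > 0 then (.numPassedTests / .numTotalTests) * 100 else null end)
}' "$out"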

Trend Analysis

# Compare last 5 runs (newest first; read line by line so paths with spaces survive)
ls -t test-results/*.json | head -5 | while IFS= read -r f; do
  jq --arg file "$f" '{
    file: $file,
    pass_rate: ((.numPassedTests / .numTotalTests) * 100 | floor),
    duration_s: ((.testResults | map(.endTime - .startTime) | add) / 1000 | floor)
  }' "$f"
done
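Since duration tends to grow with test count, the relative change across runs is often more telling than any single `duration_s` value. The sketch below computes that change from a list of durations. `growth_rate` is a hypothetical helper, and it assumes one value per line, ordered oldest first (the loop above lists newest first, so reverse its output before piping it in).

```shell
# growth_rate   (hypothetical helper)
# Reads one duration per line, oldest first, and prints the percentage
# change from the first value to the last. Prints "n/a" when there are
# fewer than two values or the first value is zero.
growth_rate() {
  awk 'NR == 1 { first = $1 }
       { last = $1 }
       END {
         if (NR < 2 || first == 0) print "n/a"
         else printf "%.1f\n", (last - first) / first * 100
       }'
}

# Example with illustrative durations in seconds:
printf '%s\n' 300 310 330 360 | growth_rate   # prints 20.0
```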

Top Failing Tests

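# Find most frequently failing test files across runs
# In Jest's --json output, each testResults entry exposes the file path as .name and the outcome as .status
for f in test-results/*.json; do
  jq -r '.testResults[] | select(.status == "failed") | .name' "$f"
done | sort | uniq -c | sort -rn | head -10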

Run History

Store dashboard data in ${CLAUDE_PLUGIN_DATA}/test-metrics.log, one pipe-delimited record per run:

2026-03-18|95.2|4.1|312|82.5|3

Read history for trend detection:

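# Coverage trending down? Compare the oldest and newest coverage values (field 5) of the last 5 runs.
# A negative delta means coverage dropped over that window.
tail -5 "${CLAUDE_PLUGIN_DATA}/test-metrics.log" | awk -F'|' 'NR == 1 { first = $5 } { last = $5 } END { printf "%.1f -> %.1f (delta %+.1f)\n", first, last, last - first }'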
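Comparing two endpoints is sensitive to one noisy run. A least-squares slope over the same window is a common alternative, sketched here. `trend_slope` is a hypothetical helper, and the only field meaning taken from the source is that field 5 holds coverage.

```shell
# trend_slope FILE FIELD [N]   (hypothetical helper)
# Prints the least-squares slope (change per run) of the pipe-delimited
# FIELD over the last N records (default 5), or "insufficient data" when
# the log holds fewer than N records.
trend_slope() {
  tail -n "${3:-5}" "$1" | awk -F'|' -v f="$2" -v need="${3:-5}" '
    { n++; sx += n; sy += $f; sxx += n * n; sxy += n * $f }
    END {
      if (n < need || n < 2) { print "insufficient data"; exit }
      printf "%.2f\n", (n * sxy - sx * sy) / (n * sxx - sx * sx)
    }'
}

# Usage (field 5 is coverage):
# trend_slope "${CLAUDE_PLUGIN_DATA}/test-metrics.log" 5
```

A negative slope means the metric is falling by roughly that many points per run. Refusing to answer with fewer than N records avoids reporting a single run as a trend.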

Composition

Feeds into:

/qe-quality-assessment — quality gate decisions based on metrics
/test-failure-investigator — investigate top failing tests
/coverage-drop-investigator — when coverage trends down

Gotchas

Metrics without baselines are meaningless — establish baselines before tracking trends
Flakiness rate is underreported: a test that fails 1 in 100 runs still breaks CI about once a week on a pipeline that runs roughly 100 times a week
Duration trends upward over time as test count grows — set alerts on rate of increase, not absolute value
Agent may report metrics from a single run as "trends" — need 5+ data points for meaningful trends
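The flakiness gotcha can be made concrete with a little arithmetic. If a test fails independently with probability p per run, the chance it fails at least once in n runs is 1 - (1 - p)^n. The sketch below evaluates that expression. `flake_break_probability` is a hypothetical helper, and independence between runs is an assumption.

```shell
# flake_break_probability P N   (hypothetical helper)
# Chance that a test failing with probability P per run fails at least
# once in N independent runs: 1 - (1 - P)^N.
flake_break_probability() {
  awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3f\n", 1 - (1 - p) ^ n }'
}

flake_break_probability 0.01 100   # prints 0.634
```

So a test that looks 99% reliable has roughly a 63% chance of breaking at least one of 100 builds, which is why low per-test flakiness rates still deserve attention.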

Use when querying test history, analyzing flakiness rates, tracking MTTR, or building quality trend dashboards from test execution data.

Stars: 304
Category: development
Updated: Apr 11, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add proffesor-for-testing/agentic-qe test-metrics-dashboard

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status; review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 11, 2026, 8:25 PM (49.6s, 1 file scanned)

SKILL.md

name: test-metrics-dashboard
description: Use when querying test history, analyzing flakiness rates, tracking MTTR, or building quality trend dashboards from test execution data.
user-invocable: true

Related Skills

proffesor-for-testing/qe-xp-practices (development)
Apply XP practices including pair programming, ensemble programming, continuous integration, and sustainable pace. Use when implementing agile development practices, improving team collaboration, or adopting technical excellence practices.

proffesor-for-testing/qe-wms-testing-patterns (development)
Warehouse Management System testing patterns for inventory operations, pick/pack/ship workflows, wave management, EDI X12/EDIFACT compliance, RF/barcode scanning, and WMS-ERP integration. Use when testing WMS platforms (Blue Yonder, Manhattan, SAP EWM).

proffesor-for-testing/qe-visual-testing-advanced (testing)
Advanced visual regression testing with pixel-perfect comparison, AI-powered diff analysis, responsive design validation, and cross-browser visual consistency. Use when detecting UI regressions, validating designs, or ensuring visual consistency.

proffesor-for-testing/qe-verification-quality (development)
Comprehensive truth scoring, code quality verification, and automatic rollback system with 0.95 accuracy threshold for ensuring high-quality agent outputs and codebase reliability.

Each related skill is listed with 304 stars and was updated Apr 11, 2026.

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

# Clone the repo
git clone https://github.com/proffesor-for-testing/agentic-qe.git

# Copy into Claude Code skills folder (global), creating the folder if it does not exist
mkdir -p ~/.claude/skills
cp -r agentic-qe/.claude/skills/test-metrics-dashboard ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

proffesor-for-testing/agentic-qe

304 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT