plugins/developer-kit-specs/skills/task-quality-kpi/SKILL.md
Objective task quality evaluation framework using quantitative KPIs. KPIs are automatically calculated by a hook when task files are modified and saved to TASK-XXX--kpi.json. Use when: reading KPI data for task evaluation, understanding quality metrics, deciding whether to iterate or approve based on data.
npx skillsauth add giuseppe-trisciuoglio/developer-kit task-quality-kpi
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
The Task Quality KPI Framework provides objective, quantitative metrics for evaluating task implementation quality.
Key Architecture: KPIs are auto-generated by a hook. You read the results; you do not run scripts.
┌─────────────────────────────────────────────────────────────┐
│ HOOK (auto-executes) │
│ Trigger: PostToolUse on TASK-*.md │
│ Script: task-kpi-analyzer.py │
│ Output: TASK-XXX--kpi.json │
├─────────────────────────────────────────────────────────────┤
│ SKILL / AGENT (reads output) │
│ Input: TASK-XXX--kpi.json │
│ Action: Make evaluation decisions │
└─────────────────────────────────────────────────────────────┘
| Problem | Solution |
|---------|----------|
| Skills can't execute scripts | Hook auto-runs on file save |
| Subjective review_status | Quantitative 0-10 scores |
| "Looks good to me" | Evidence-based evaluation |
| Binary pass/fail | Graduated quality levels |
After any task file modification, find KPI data at:
docs/specs/[ID]/tasks/TASK-XXX--kpi.json
┌─────────────────────────────────────────────────────────────┐
│ OVERALL SCORE (0-10) │
├─────────────────────────────────────────────────────────────┤
│ Spec Compliance (30%) │
│ ├── Acceptance Criteria Met (0-10) │
│ ├── Requirements Coverage (0-10) │
│ └── No Scope Creep (0-10) │
├─────────────────────────────────────────────────────────────┤
│ Code Quality (25%) │
│ ├── Static Analysis (0-10) │
│ ├── Complexity (0-10) │
│ └── Patterns Alignment (0-10) │
├─────────────────────────────────────────────────────────────┤
│ Test Coverage (25%) │
│ ├── Unit Tests Present (0-10) │
│ ├── Test/Code Ratio (0-10) │
│ └── Coverage Percentage (0-10) │
├─────────────────────────────────────────────────────────────┤
│ Contract Fulfillment (20%) │
│ ├── Provides Verified (0-10) │
│ └── Expects Satisfied (0-10) │
└─────────────────────────────────────────────────────────────┘
| Category | Weight | Why |
|----------|--------|-----|
| Spec Compliance | 30% | Most important - did we build what was asked? |
| Code Quality | 25% | Technical excellence |
| Test Coverage | 25% | Verification and confidence |
| Contract Fulfillment | 20% | Integration with other tasks |
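The weights above imply that the overall score is a weighted average of the four category scores. The following is a minimal sketch of that arithmetic, not the hook's actual code: the function name `overall_score`, the sample category scores, and the rounding to one decimal are assumptions made for illustration.

```python
from typing import Mapping

# Category weights in percent, copied from the table above.
WEIGHTS = {
    "Spec Compliance": 30,
    "Code Quality": 25,
    "Test Coverage": 25,
    "Contract Fulfillment": 20,
}


def overall_score(category_scores: Mapping[str, float]) -> float:
    """Weighted average of 0-10 category scores.

    Rounding to one decimal is an assumption; the hook may aggregate differently.
    """
    total = sum(category_scores[name] * weight / 100 for name, weight in WEIGHTS.items())
    return round(total, 1)


# Hypothetical category scores, for illustration only.
scores = {
    "Spec Compliance": 8.5,       # contributes 8.5 * 0.30 = 2.55
    "Code Quality": 7.5,          # contributes 7.5 * 0.25 = 1.875
    "Test Coverage": 8.0,         # contributes 8.0 * 0.25 = 2.0
    "Contract Fulfillment": 9.0,  # contributes 9.0 * 0.20 = 1.8
}
result = overall_score(scores)
print(result)
```

This is only meant to make the weighting concrete. In practice the value comes from the `overall_score` field of the generated KPI file.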
DO NOT run scripts. Read the auto-generated file:
Read the KPI file:
docs/specs/001-feature/tasks/TASK-001--kpi.json
The KPI file contains:
{
  "task_id": "TASK-001",
  "evaluated_at": "2026-01-15T10:30:00Z",
  "overall_score": 8.2,
  "passed_threshold": true,
  "threshold": 7.5,
  "kpi_scores": [
    {
      "category": "Spec Compliance",
      "weight": 30,
      "score": 8.5,
      "weighted_score": 2.55,
      "metrics": {
        "acceptance_criteria_met": 9.0,
        "requirements_coverage": 8.0,
        "no_scope_creep": 8.5
      },
      "evidence": [
        "Acceptance criteria: 9/10 checked",
        "Requirements coverage: 8/10"
      ]
    }
  ],
  "recommendations": [
    "Code Quality: Moderate improvements possible"
  ],
  "summary": "Score: 8.2/10 - PASSED"
}
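A reader of this file may find it convenient to parse it into typed records. The sketch below assumes only the fields shown in the sample above. The class names `CategoryScore` and `KpiReport` and the function `parse_kpi` are hypothetical, and the consistency check reflects what the sample shows (`weighted_score` equal to `score * weight / 100`), not a documented guarantee.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class CategoryScore:
    category: str
    weight: int            # percent, e.g. 30
    score: float           # 0-10
    weighted_score: float  # contribution to the overall score


@dataclass
class KpiReport:
    task_id: str
    overall_score: float
    threshold: float
    passed_threshold: bool
    categories: List[CategoryScore]
    recommendations: List[str]


def parse_kpi(raw: str) -> KpiReport:
    """Parse KPI JSON text; raises KeyError if a field from the sample is missing."""
    data = json.loads(raw)
    categories = [
        CategoryScore(c["category"], c["weight"], c["score"], c["weighted_score"])
        for c in data["kpi_scores"]
    ]
    return KpiReport(
        task_id=data["task_id"],
        overall_score=data["overall_score"],
        threshold=data["threshold"],
        passed_threshold=data["passed_threshold"],
        categories=categories,
        recommendations=data.get("recommendations", []),
    )


# Trimmed copy of the sample file above.
SAMPLE = """
{
  "task_id": "TASK-001",
  "overall_score": 8.2,
  "passed_threshold": true,
  "threshold": 7.5,
  "kpi_scores": [
    {"category": "Spec Compliance", "weight": 30, "score": 8.5, "weighted_score": 2.55}
  ],
  "recommendations": ["Code Quality: Moderate improvements possible"]
}
"""

report = parse_kpi(SAMPLE)
first = report.categories[0]
# In the sample, weighted_score matches score * weight / 100 (8.5 * 30 / 100 = 2.55).
consistent = abs(first.score * first.weight / 100 - first.weighted_score) < 1e-9
print(report.task_id, report.overall_score, consistent)
```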
Use overall_score and passed_threshold:
IF passed_threshold == true:
→ Task meets quality standards
→ Approve and proceed
IF passed_threshold == false:
→ Task needs improvement
→ Check recommendations for specific targets
→ Create fix specification
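The decision rule above can be written as a small pure function. This is a sketch under stated assumptions: the name `decide` and the shape of the returned dictionary are hypothetical, and only the `passed_threshold` and `recommendations` fields come from the KPI file format.

```python
from typing import Any, Dict


def decide(kpi: Dict[str, Any]) -> Dict[str, Any]:
    """Map KPI data to an approve/fix decision based on passed_threshold."""
    if kpi["passed_threshold"]:
        # Task meets quality standards: approve and proceed.
        return {"decision": "approve", "fix_targets": []}
    # Task needs improvement: recommendations become the fix targets.
    return {"decision": "fix", "fix_targets": list(kpi.get("recommendations", []))}


passing = decide({"overall_score": 8.2, "passed_threshold": True, "recommendations": []})
failing = decide({
    "overall_score": 6.8,
    "passed_threshold": False,
    "recommendations": ["Test Coverage: Add unit tests"],
})
print(passing["decision"], failing["decision"], failing["fix_targets"])
```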
## Review Process
1. Read KPI file: TASK-XXX--kpi.json
2. Extract overall_score and kpi_scores
3. Read task file to validate
4. Generate evaluation report
5. Decision based on passed_threshold
import json

# Check that the KPI file exists
kpi_path = spec_path / "tasks" / f"{task_id}--kpi.json"
if kpi_path.exists():
    kpi_data = json.loads(kpi_path.read_text())
    if kpi_data["passed_threshold"]:
        # Quality threshold met
        advance_state("update_done")
    else:
        # More work needed: turn the recommendations into a fix task
        fix_targets = kpi_data["recommendations"]
        create_fix_task(fix_targets)
        advance_state("fix")
else:
    # KPI not generated yet - task may not be implemented
    log_warning("No KPI data found")
Instead of capping at 3 retries, iterate until the quality threshold is met:
Iteration 1: Score 6.2 → FAILED → Fix: Improve test coverage
Iteration 2: Score 7.1 → FAILED → Fix: Refactor complex functions
Iteration 3: Score 7.8 → PASSED → Proceed
Each iteration updates the KPI file automatically on task save.
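The iteration pattern above can be sketched as a loop that re-reads the KPI data after each fix. Everything here is illustrative: `iterate_until_passed`, `read_kpi`, and `apply_fix` are hypothetical names, the KPI snapshots are simulated to mirror the three iterations listed above, and the `safety_cap` guard against endless loops is an added assumption, not part of the framework.

```python
from typing import Callable, Dict, List, Tuple


def iterate_until_passed(
    read_kpi: Callable[[], Dict],
    apply_fix: Callable[[List[str]], None],
    safety_cap: int = 20,  # assumption: a guard against endless loops
) -> List[Tuple[int, float, bool]]:
    """Re-read KPI data after each fix until passed_threshold is true."""
    history = []
    for iteration in range(1, safety_cap + 1):
        kpi = read_kpi()
        history.append((iteration, kpi["overall_score"], kpi["passed_threshold"]))
        if kpi["passed_threshold"]:
            break
        apply_fix(kpi.get("recommendations", []))
    return history


# Simulated KPI snapshots standing in for successive reads of the KPI file.
snapshots = iter([
    {"overall_score": 6.2, "passed_threshold": False, "recommendations": ["Improve test coverage"]},
    {"overall_score": 7.1, "passed_threshold": False, "recommendations": ["Refactor complex functions"]},
    {"overall_score": 7.8, "passed_threshold": True, "recommendations": []},
])
applied: List[str] = []
history = iterate_until_passed(lambda: next(snapshots), applied.extend)
print(history)
print(applied)
```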
| Score | Quality Level | Action |
|-------|---------------|--------|
| 9.0-10.0 | Exceptional | Approve, document best practices |
| 8.0-8.9 | Good | Approve with minor notes |
| 7.0-7.9 | Acceptable | Approve (if threshold 7.5) |
| 6.0-6.9 | Below Standard | Request specific improvements |
| < 6.0 | Poor | Significant rework required |
| Project Type | Threshold | Rationale |
|--------------|-----------|-----------|
| Production MVP | 8.0 | High quality required |
| Internal Tool | 7.0 | Good enough |
| Prototype | 6.0 | Functional over perfect |
| Critical System | 8.5 | No compromises |
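The two tables above (quality levels and per-project thresholds) can be expressed as lookups. This sketch uses hypothetical names (`quality_level`, `passes`, `PROJECT_THRESHOLDS`) and assumes that a score equal to the threshold counts as passing, which the tables do not state.

```python
# Thresholds per project type, copied from the table above.
PROJECT_THRESHOLDS = {
    "Production MVP": 8.0,
    "Internal Tool": 7.0,
    "Prototype": 6.0,
    "Critical System": 8.5,
}


def quality_level(score: float) -> str:
    """Map a 0-10 score to the quality level bands from the table above."""
    if score >= 9.0:
        return "Exceptional"
    if score >= 8.0:
        return "Good"
    if score >= 7.0:
        return "Acceptable"
    if score >= 6.0:
        return "Below Standard"
    return "Poor"


def passes(score: float, project_type: str) -> bool:
    """Assumption: the comparison is inclusive (score >= threshold)."""
    return score >= PROJECT_THRESHOLDS[project_type]


level = quality_level(7.8)
print(level, passes(7.8, "Internal Tool"), passes(7.8, "Production MVP"))
```

The same score can therefore pass for one project type and fail for another, which is why the "Acceptable" band is only approved conditionally.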
- Acceptance Criteria Met: (checked_criteria / total_criteria) * 10
- Requirements Coverage: from traceability-matrix.md
- No Scope Creep: (implemented_files / expected_files) * 10
- Static Analysis
- Complexity: 10 - (long_functions_ratio * 5)
- Patterns Alignment: from knowledge-graph.json
- Unit Tests Present: min(10, test_files * 5)
- Test/Code Ratio: (test_count / code_count) * 10
- Coverage Percentage: coverage_percent / 10
- Provides Verified: from provides frontmatter
- Expects Satisfied: from expects frontmatter

If TASK-XXX--kpi.json doesn't exist:
DO NOT try to calculate KPIs manually. The hook runs automatically when a TASK-*.md file is modified (PostToolUse trigger).
Before evaluating:
Check if KPI file exists:
docs/specs/[ID]/tasks/TASK-XXX--kpi.json
If missing:
- Task may not be implemented yet
- Ask user to save the task file first
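The existence check above can be wrapped in a loader that returns nothing when the file is missing instead of attempting any calculation. This is a minimal sketch: `load_kpi` is a hypothetical helper, the temporary directory stands in for `docs/specs/[ID]`, and the JSON written in the demo is a stand-in for what the hook would produce.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def load_kpi(spec_dir: Path, task_id: str) -> Optional[dict]:
    """Return parsed KPI data, or None if the hook has not generated the file yet."""
    kpi_path = spec_dir / "tasks" / f"{task_id}--kpi.json"
    if not kpi_path.is_file():
        return None  # caller should ask the user to save the task file first
    return json.loads(kpi_path.read_text(encoding="utf-8"))


# Demo in a temporary directory standing in for docs/specs/001-feature.
with tempfile.TemporaryDirectory() as tmp:
    spec_dir = Path(tmp) / "001-feature"
    (spec_dir / "tasks").mkdir(parents=True)

    missing = load_kpi(spec_dir, "TASK-001")

    # Stand-in for the file the hook would write.
    (spec_dir / "tasks" / "TASK-001--kpi.json").write_text(
        '{"overall_score": 8.2, "passed_threshold": true}', encoding="utf-8"
    )
    found = load_kpi(spec_dir, "TASK-001")

print(missing, found)
```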
The KPIs are objective. Only override them with documented evidence.
Target specific categories:
❌ "Fix code quality issues"
✅ "Improve Code Quality KPI from 5.2 to 7.0:
- Complexity: Refactor processData() (5→8)
- Patterns: Add error handling (6→8)"
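Specific targets like those above can be derived from the per-metric scores in the KPI file. The sketch below is illustrative: `fix_targets` is a hypothetical helper, the target value is chosen by the caller, and the metric keys `static_analysis`, `complexity`, and `patterns_alignment` are assumed names (the sample file only shows the Spec Compliance keys).

```python
from typing import Dict, List


def fix_targets(kpi: Dict, target: float) -> List[str]:
    """List metrics scoring below the target, lowest score first."""
    below = []
    for category in kpi["kpi_scores"]:
        for metric, score in category.get("metrics", {}).items():
            if score < target:
                below.append((score, f"{category['category']} / {metric}: {score} -> {target}"))
    # Sort by score so the weakest metric is addressed first.
    return [text for _, text in sorted(below)]


# Hypothetical KPI data; the Code Quality metric keys are assumed names.
kpi = {
    "kpi_scores": [
        {
            "category": "Spec Compliance",
            "metrics": {"acceptance_criteria_met": 9.0, "requirements_coverage": 8.0},
        },
        {
            "category": "Code Quality",
            "metrics": {"static_analysis": 7.5, "complexity": 5.0, "patterns_alignment": 6.0},
        },
    ]
}
targets = fix_targets(kpi, target=7.0)
for line in targets:
    print(line)
```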
Monitor quality over time:
Sprint 1: Average KPI 6.8
Sprint 2: Average KPI 7.3 (+0.5)
Sprint 3: Average KPI 7.9 (+0.6)
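The sprint-over-sprint change shown above is a simple difference between consecutive averages. A minimal sketch, with `sprint_deltas` as a hypothetical helper and rounding to one decimal assumed to match the figures above:

```python
from typing import List


def sprint_deltas(averages: List[float]) -> List[float]:
    """Change in average KPI between consecutive sprints, rounded to one decimal."""
    return [round(later - earlier, 1) for earlier, later in zip(averages, averages[1:])]


averages = [6.8, 7.3, 7.9]  # Sprint 1..3 averages from the example above
deltas = sprint_deltas(averages)
print(deltas)
```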
Check:
- hooks.json (hook configuration)
- TASK-*.md (file pattern that triggers the hook)
Validate:
Possible causes:
Fix the root cause, not just the score.
Read the KPI file to evaluate task quality:
docs/specs/001-feature/tasks/TASK-042--kpi.json
Based on the data:
- Overall score: 6.8/10 (below threshold)
- Lowest KPI: Test Coverage (5.0/10)
- Recommendation: Add unit tests
Decision: REQUEST FIXES - target Test Coverage improvement
Iteration 1 KPI: Score 6.2 → FAILED
- Spec Compliance: 7.0 ✓
- Code Quality: 5.5 ✗
- Test Coverage: 6.0 ✗
Fix targets:
1. Refactor complex functions (Code Quality)
2. Add test coverage (Test Coverage)
Iteration 2 KPI: Score 7.8 → PASSED ✓
import json

# In agents_loop, after implementation step
kpi_file = spec_dir / "tasks" / f"{task_id}--kpi.json"
if kpi_file.exists():
    kpi = json.loads(kpi_file.read_text())
    if kpi["passed_threshold"]:
        print(f"✅ Task passed quality check: {kpi['overall_score']}/10")
        advance_state("update_done")
    else:
        print(f"❌ Task failed quality check: {kpi['overall_score']}/10")
        print("Recommendations:")
        for rec in kpi["recommendations"]:
            print(f"  - {rec}")
        advance_state("fix")
- evaluator-agent.md - Agent that uses KPI data for evaluation
- hooks.json - Hook configuration for auto-generation
- task-kpi-analyzer.py - Hook script (do not execute directly)
- agents_loop.py - Orchestrator that reads KPI for decisions