.claude/skills/eval-sprint/SKILL.md
Adversarial evaluation of sprint spec before implementation.
npx skillsauth add leogodin217/leos_claude_starter eval-sprintInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Adversarial evaluation of sprint spec before implementation. Run in a new session to ensure fresh eyes.
Find problems in the spec that would cause implementation to fail or produce poor results. The evaluator is deliberately adversarial — looking for ways the spec could be misinterpreted, is incomplete, or violates principles.
Load ONLY:
CLAUDE.md — Principles (especially #7 and #8)docs/sprints/current/spec.md — The spec to evaluateDO NOT load:
.py files to understand the codebase. Use LSP tools instead (see below).When verifying spec claims against the codebase (class names, function signatures, caller counts, line numbers), use only LSP tools. Do NOT read entire source files — this wastes context on code irrelevant to the evaluation.
| Verification need | LSP tool | Example |
|---|---|---|
| "Does this class/function exist?" | find_definition | Spec says TypeResolver — verify the actual class name |
| "Who calls this function?" | find_references | Spec says "only two callers" — verify caller count |
| "What's the signature?" | get_hover | Spec says rng param at line 203 — verify |
| "What type is this?" | get_hover | Spec references Distribution union — verify it exists |
| "Does this symbol exist in the module?" | find_workspace_symbols | Spec says export from __init__.py — verify |
NEVER use Read on source files for this skill. If you catch yourself about to read a .py file, use an LSP tool instead.
= None, = [], = {} in signatures (Principle #7)Read the entire spec without referring to other docs. Note:
Go through each check systematically. For every failure:
For each contract, ask:
Trace through phases in order:
For each demo:
# Spec Evaluation: [Sprint Name]
**Verdict: PASS / NEEDS WORK / FAIL**
**Summary:** [One sentence assessment]
---
## Blocking Issues
Issues that MUST be fixed before implementation.
### 1. [Short Title]
**Location:** [Section/line reference]
**Category:** [Structural/Principle/Consistency/Ambiguity/Testability/Architecture]
**Problem:** [Specific description]
**Impact:** [What goes wrong if not fixed]
**Suggested Fix:** [How to fix it]
---
## Warnings
Issues that SHOULD be addressed but aren't blocking.
### 1. [Short Title]
**Location:** [Section/line reference]
**Problem:** [Description]
**Suggestion:** [How to improve]
---
## Notes
Observations that aren't issues but worth considering.
- [Note 1]
- [Note 2]
---
## Checklist Results
| Category | Pass | Fail | Issues |
|----------|------|------|--------|
| Structural | 5 | 1 | Missing file structure |
| Principles | 4 | 0 | — |
| Consistency | 3 | 1 | Undefined type |
| Ambiguity | 3 | 2 | Weasel words |
| Testability | 3 | 0 | — |
| Architecture | 4 | 0 | — |
---
## Verdict Explanation
**PASS:** No blocking issues. Warnings are minor. Spec is ready for implementation.
**NEEDS WORK:** No blocking issues but warnings are significant. Review before proceeding.
**FAIL:** Blocking issues found. Must fix and re-evaluate before implementation.
| Issue | Why It Blocks | |-------|---------------| | Missing contract for stated scope | Implementer won't know what to build | | Undefined type referenced | Code won't compile | | Default parameter in signature | Principle #7 violation | | Phase dependency violation | Phase N can't be built | | Ambiguous return type | Implementer will guess wrong |
| Issue | Why It's a Warning | |-------|-------------------| | Vague error message | Implementation will work but be unhelpful | | Missing edge case | Might cause bug but not blocking | | Inconsistent naming | Annoying but not fatal | | Demo doesn't fully exercise contract | Partial verification |
/implement-sprint/eval-sprint againdevelopment
Spec-fidelity verification tracing requirements through code.
development
Analyze Claude Code session transcripts — search, summarize, list, or inspect how a session went.
testing
Course designer mode for creating exercises, configs, and QA criteria.
testing
System architect mode for designing interfaces, contracts, and architecture decisions.