skills/design-evaluate/SKILL.md
(Evaluator) Score a feature Design Doc's mechanics quality across 6 dimensions: feedback loop, pacing, progression, replayability, extensibility, emotional arc; writes a scored evaluation report under docs/runs/<feature_id>/...
npx skillsauth add dvduongth/skills design-evaluate
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Produce a scored mechanics evaluation for a feature Design Doc, then automatically remediate weak Pacing and Emotional Arc dimensions:
Phase 1 — Score:
Phase 2 — Remediation (conditional):
Use after design-validate returns PASS for any game mechanic feature.
You are a mechanics evaluator. Your job is to assess the quality of the game design — not its structural completeness (that is design-validate's job). You evaluate whether the mechanics are well-designed, balanced, and likely to produce good player experiences.
Do not validate structure, AC format, or section presence. Focus exclusively on mechanics quality as defined by the rubric.
Output language: Vietnamese
canonical (default) or explicit path
- docs/design-docs/<feature_id>/design-doc.md
- If memory/patterns.md exists: read active patterns relevant to mechanics evaluation. Calibrate scoring accordingly. (Do NOT fail if missing)
- If memory/mistakes.md exists: read active mistakes relevant to evaluation. Watch for known recurring issues. (Do NOT fail if missing)
- templates/mechanics-evaluation-rubric.md

Section numbers differ by doc mode:
| Content | Section |
|---------|---------|
| Context & Goal | §1 |
| Users / Player motivation profile | §2 |
| Defines / Glossary | §3 |
| Mechanics & Rules (core loop, feedback loop, progression, pacing) | §4 |
| Data Model & Contracts | §3 [OPTIONAL] |
| Config & Balancing | §7 |
| Metrics | §9 |
Fields to extract from §1 and §2 during Step 0, used for calibrating evaluation:
| Field | Source | Used for |
|-------|--------|----------|
| game_mode | §1 Game at a glance | Determines if RE-6 (multiplayer downtime) applies |
| player_count | §1 Game at a glance | Multiplayer vs solo calibration |
| session_length_target | §1 Game at a glance | PA-4, PA-6 calibration |
| platform | §1 Game at a glance | PA-6 TA fit check |
| design_pillars | §1 Design pillars | Lens for all dimensions — do not penalize intentional pillar trade-offs |
| non_negotiables | §1 Non-negotiable decisions | Exclude from improvement suggestions |
| ta_frustration_tolerance | §2 Player motivation profile | EM-4, EM-6 calibration |
| ta_session_behavior | §2 Player motivation profile | PA-6 calibration |
design-validate (PASS required)
Pre-flight is BLOCKING: validate_structure.py must pass (exit 0) before mechanics evaluation begins. Mechanics evaluation requires a structurally valid doc.
Run before reading the design doc:
python tools/validate_structure.py <feature_id>
python tools/scaffold_run.py <feature_id> evaluate
Creates the run folder + input.md. Note the printed path.
Read §1 "Game at a glance", §1 "Design pillars", §1 "Non-negotiable decisions", and §2 "Player motivation profile". Extract and hold in working context all fields from the Context extract fields table (see Knowledge Layer).
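The blocking pre-flight and scaffold commands above can be sketched as a small gate. This is a minimal illustration, not the skill's own implementation: the `run_preflight` function and its injectable `runner` parameter are hypothetical names, and it assumes that both scripts signal success with exit code 0 (the source states this only for validate_structure.py).

```python
import subprocess
from typing import Any, Callable, Sequence


def run_preflight(feature_id: str,
                  runner: Callable[[Sequence[str]], Any] = subprocess.run) -> bool:
    """Return True only if structure validation passes and the run folder is scaffolded."""
    validate = runner(["python", "tools/validate_structure.py", feature_id])
    if validate.returncode != 0:
        # Blocking gate: never scaffold or evaluate a structurally invalid doc.
        return False
    # Assumption: scaffold_run.py also reports success with exit code 0.
    scaffold = runner(["python", "tools/scaffold_run.py", feature_id, "evaluate"])
    return scaffold.returncode == 0
```

Passing the runner as a parameter keeps the gate testable without the real tools on disk; by default it shells out through `subprocess.run`.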
If §1 session parameters are missing (§1 "Game at a glance" table not present or empty): note evaluation_confidence = low in output. Proceed but flag all session-parameter-dependent scores as uncertain.
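One way to hold the context extract fields in working context is a small record type. The sketch below is illustrative only: the class name `ContextExtract`, the string-typed fields, and the `"normal"` confidence label are assumptions (the source names only `low`), and an absent or empty "Game at a glance" table is modelled as all four of its fields being unset.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContextExtract:
    # From §1 "Game at a glance"
    game_mode: Optional[str] = None
    player_count: Optional[str] = None
    session_length_target: Optional[str] = None
    platform: Optional[str] = None
    # From §1 "Design pillars" and §1 "Non-negotiable decisions"
    design_pillars: List[str] = field(default_factory=list)
    non_negotiables: List[str] = field(default_factory=list)
    # From §2 "Player motivation profile"
    ta_frustration_tolerance: Optional[str] = None
    ta_session_behavior: Optional[str] = None

    def evaluation_confidence(self) -> str:
        glance = (self.game_mode, self.player_count,
                  self.session_length_target, self.platform)
        # Assumption: a missing or empty table appears as all four values unset.
        # "normal" is a placeholder label; the source only defines "low".
        return "low" if all(value is None for value in glance) else "normal"
```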
Walk CX checklist items (CX-1 through CX-5) — these are informational, not pass/fail gates.
Verify before scoring:
If either fails: write a short report noting the doc is not ready for mechanics evaluation. Set verdict to WEAK with score 0 and stop.
Trace 3 archetype scenarios through the doc's actual rules. For scenario definitions and output format, see rubric § Simulation Layer.
For each scenario:
Evidence from the trace is cited directly in Step 5 (scoring). Do not create a separate file; trace notes live in working context.
After tracing all 3 scenarios, check for the 4 anti-aesthetics listed in rubric § Anti-Aesthetic Catalog.
For each anti-aesthetic detected:
Score caps from Step 3 are applied in Step 5. If no anti-aesthetic is found: record "No anti-aesthetics detected" and continue to Step 4.
For each of the 30 checklist items (see rubric):
Apply score caps first: Check anti-aesthetic caps noted in Step 3 before assigning any score. Simulation evidence from Step 2 is preferred over doc-reading evidence at Score 4–5 — if both conflict, simulation evidence takes precedence.
For each of the 6 dimensions:
Scoring rule: A dimension scores N only if it fully satisfies the criteria for N. Partial satisfaction of level N means the score is N-1.
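The scoring rule and the anti-aesthetic caps can be combined in a few lines. This is a sketch under stated assumptions: `dimension_score` is a hypothetical helper, `level` is the highest rubric level the doc reaches for a dimension, and a cap is treated as a simple upper bound. The lower bound of the scale is not stated in this section, so no floor is applied.

```python
from typing import Optional


def dimension_score(level: int, fully_satisfied: bool, cap: Optional[int] = None) -> int:
    """Score one dimension from the highest rubric level the doc reaches."""
    # Partial satisfaction of level N means the score is N-1.
    score = level if fully_satisfied else level - 1
    # An anti-aesthetic cap noted in Step 3, if any, bounds the score from above.
    if cap is not None:
        score = min(score, cap)
    return score
```

For example, a dimension that only partially meets level 4 scores 3, and a dimension that fully meets level 5 but carries a cap of 3 also scores 3.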
Evaluate the 5 key pairs defined in the rubric:
For each pair, determine the relationship type (reinforcing / tension / independent / undermining) and explain based on doc evidence.
- EXCELLENT: normalized ≥ 0.85 AND no dimension below 3
- GOOD: normalized ≥ 0.65 AND no dimension below 2
- NEEDS_WORK: normalized ≥ 0.45 OR any dimension below 2
- WEAK: normalized < 0.45

Write all 4 output files following the schemas defined in the rubric reference doc.
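The verdict thresholds can be sketched as a single function. Two things here are assumptions rather than rules from the source. First, the NEEDS_WORK and WEAK conditions overlap when the normalized score is below 0.45 and some dimension is below 2; this sketch resolves that in favour of WEAK, the more conservative reading. Second, the function takes the normalized score as an input because its formula lives in the rubric, not in this section. The dimension key names are hypothetical.

```python
from typing import Mapping


def verdict(normalized: float, scores: Mapping[str, int]) -> str:
    """Map a normalized total and per-dimension scores to a verdict label."""
    lowest = min(scores.values())
    if normalized < 0.45:
        # Assumption: WEAK takes precedence where it overlaps with NEEDS_WORK.
        return "WEAK"
    if normalized >= 0.85 and lowest >= 3:
        return "EXCELLENT"
    if normalized >= 0.65 and lowest >= 2:
        return "GOOD"
    # Everything left has normalized >= 0.45 but misses a higher tier.
    return "NEEDS_WORK"
```

Note that a high normalized score does not rescue a doc with one very weak dimension: `verdict(0.9, {"pacing": 1, "progression": 5})` falls through to NEEDS_WORK.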
Trigger: Pacing score < 3 OR Emotional Arc score < 3
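The trigger condition can be expressed as a helper that also reports which dimension caused it, which is useful later when addressing the designer. The function name and the `pacing` / `emotional_arc` key names are hypothetical.

```python
from typing import List, Mapping

REMEDIATION_THRESHOLD = 3  # scores strictly below this trigger Phase 2


def remediation_targets(scores: Mapping[str, int]) -> List[str]:
    """Return which of the two remediable dimensions scored below the threshold."""
    watched = ("pacing", "emotional_arc")  # hypothetical key names
    return [name for name in watched if scores[name] < REMEDIATION_THRESHOLD]
```

Phase 2 runs whenever the returned list is non-empty.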
If triggered, perform the following sub-steps:
Read and extract the following generic sources (resolve section numbers using the Section number resolution table in the Knowledge Layer):
| Source | What to extract | Maps to |
|--------|----------------|---------|
| Config table (§7 Config & Balancing, both modes) | Parameter tiers/thresholds that affect intensity (e.g., multiplier levels, difficulty tiers) | Pacing phase intensity levels |
| Config session length / maxRound (§7 Config & Balancing or §1 Context & Goal) | Round count + time target | Pacing phase round ranges |
| Rules involving penalty/loss (§4 Mechanics & Rules table) | Penalty trigger + recovery mechanic | Tension source + frustration recovery |
| Rules involving reward/bonus (§4 Mechanics & Rules table) | Reward trigger + multiplier or bonus action | Relief/delight moment |
| Progression design, pacing beats field (§4 Mechanics & Rules) | Any pacing information stated | Pacing driver |
| Scenarios or early game hook (§2 Player motivation profile / §4 Mechanics & Rules) | Scenario sequence or early hook description | Implied emotional beats |
Synthesize into two draft blocks:
Pacing phases draft (for §4 Mechanics & Rules → Progression design → Pacing phases table):
| Phase | Rounds | Time (approx) | Intensity | Active mechanics | Rest moment |
Emotional arc draft (for §2 Player motivation profile → Emotional arc):
| Phase | Dominant emotion | Tension source | Relief source |
Mark all auto-derived values as UNCONFIRMED where designer intent cannot be determined from the doc.
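A draft table with UNCONFIRMED markers could be rendered along these lines. This is a minimal sketch: `render_draft_table` and `PACING_COLUMNS` are hypothetical names, the column headers are copied from the pacing phases draft above, and any cell the derivation could not fill is simply left out of the row dictionary.

```python
from typing import Dict, List, Optional

PACING_COLUMNS = ["Phase", "Rounds", "Time (approx)", "Intensity",
                  "Active mechanics", "Rest moment"]


def render_draft_table(columns: List[str], rows: List[Dict[str, Optional[str]]]) -> str:
    """Render rows as a Markdown table, marking missing cells UNCONFIRMED."""
    lines = ["| " + " | ".join(columns) + " |",
             "|" + "|".join("---" for _ in columns) + "|"]
    for row in rows:
        # A missing or empty value means designer intent could not be determined.
        cells = [row.get(col) or "UNCONFIRMED" for col in columns]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

The same function works for the emotional arc draft by passing its four column names instead. The row values in any call are illustrative placeholders, not derived data.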
Write to run folder: pacing-arc-draft.md
Structure:
# Pacing & Emotional Arc Draft — <feature_id>
> AUTO-DERIVED — requires designer review before merge
## How to review
- [ ] Confirm or adjust phase round boundaries
- [ ] Confirm or rename emotional beat labels
- [ ] Add any intent the auto-derivation cannot know (see "Designer input required" below)
- [ ] When ready: reply "confirm" or edit this file directly, then say "merge"
## Designer input required (cannot be auto-derived)
- First-win moment: Which specific mechanic/action gives the player their first identifiable win, and in which turn?
- Drama peak: Which round/phase is intended as the climax of the game?
- Signature delight: What is the one moment that defines this game's feel?
## Draft: Pacing Phases (→ will merge into §4 Mechanics & Rules)
<pacing phases table>
- Pacing driver: <derived mechanic>
- Early game hook: <best guess from scenarios — mark as UNCONFIRMED>
## Draft: Emotional Arc (→ will merge into §2 Player motivation profile)
<emotional arc table>
- First-win moment: UNCONFIRMED — designer must fill
- Frustration recovery path: <derived from penalty mechanism + recovery path>
- Surprise/delight moments: <derived from reward chaining + bonus action mechanics>
After writing pacing-arc-draft.md, pause and ask the designer:
"Pacing ({score}/5) and/or Emotional Arc ({score}/5) scored below 3. I've auto-derived a draft at
{run_folder}/pacing-arc-draft.md. Please review the draft, fill in the 3 designer-only fields, and confirm. Reply 'confirm' when ready to merge, or edit the draft first."
Wait for designer confirmation before proceeding to 7.4.
After designer confirms:
- Pacing phases table + pacing driver + early game hook
- pacing-arc-merged.md to run folder: brief diff summary of what changed

Re-run Phase 1 (Steps 1–8) on the updated design doc.
- rescore.json in the same run folder
- notes.md
- notes.md, stop and report

Create: docs/runs/<feature_id>/<YYYYMMDD_HHMM>_evaluate/
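The run folder naming pattern can be illustrated as follows. In the skill itself the folder is created by tools/scaffold_run.py; the hypothetical `run_folder` helper below only shows how the `<YYYYMMDD_HHMM>_evaluate` name is composed, assuming the timestamp is the local start time of the run.

```python
from datetime import datetime
from pathlib import PurePosixPath


def run_folder(feature_id: str, started: datetime) -> PurePosixPath:
    """Compose docs/runs/<feature_id>/<YYYYMMDD_HHMM>_evaluate for a run."""
    stamp = started.strftime("%Y%m%d_%H%M")
    return PurePosixPath("docs/runs") / feature_id / f"{stamp}_evaluate"
```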
Phase 1 files:
- input.md
- mechanics-evaluation.json
- mechanics-evaluation.md
- notes.md

Phase 2 files (if triggered):
- pacing-arc-draft.md: auto-derived draft for designer review
- pacing-arc-merged.md: diff summary after merge into design doc
- rescore.json: re-score result after merge

input.md content requirements
mechanics-evaluation.json content requirements
Follow the JSON schema defined in templates/mechanics-evaluation-rubric.md exactly.
mechanics-evaluation.md content requirements
Follow the human-readable report structure defined in templates/mechanics-evaluation-rubric.md:
notes.md content requirements
Correctness: Every score cites at least one piece of evidence from a specific doc section; improvement suggestions are actionable (not "make it better" but "add a secondary loop that...").
Completeness: All 6 dimensions scored; 30-item checklist walked; dimension relationships analyzed for all 5 pairs; verdict computed with thresholds applied.
Context-fit: Scores not inflated — when uncertain, score lower; context extract fields loaded before scoring; session-parameter-dependent scores flagged as uncertain if §1 game at a glance is absent.
Consequence: Do not penalize for sections outside scope (e.g., visual design, engineering handoff); do not penalize intentional pillar trade-offs; non-negotiable decisions excluded from improvement suggestions.