aops-core/skills/design-rubric/SKILL.md
Design-stage fitness rubric — persona immersion, scenario design, dimensions that define what excellence looks like for the people a feature serves. Two modes — author (produce a rubric for a new spec) and critique (red-team an existing spec). Output lives on the spec, not in the verification brief. Owned by pauli.
npx skillsauth add nicsuzor/academicops design-rubricInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Persona immersion done at QA time is too late. The reviewer has the artifact in front of them, no original context, and the temptation to rationalise. Persona immersion done at design time shapes the spec — the acceptance criteria encode the persona's needs, and the verifier's job becomes simple: did the artifact meet the AC the persona drove?
The rubric is part of the spec. Marsha never re-derives it; she reads it. If marsha is invoking this skill, something has gone wrong upstream.
Author mode — you are writing a new spec or epic body. The deliverable has a real user. You need to define what excellence looks like before anyone builds anything.
Critique mode — a spec or decomposition already exists. Before dispatch, you red-team it: if we build this as specified, will it serve the user, or overwhelm them? The output is the same shape as author mode (a Fitness Rubric on the spec), but the work is mostly identifying gaps and unstated assumptions.
Both modes produce a ## Fitness Rubric section on the spec or epic body.
Not needed for purely mechanical work (lint fix, test repair, dependency bump). If you can't name a user whose situation matters, you don't need a rubric.
Any one is sufficient to require a Fitness Rubric:
If a task triggers one of these and arrives at marsha without a ## Fitness Rubric, the verdict is REVISE — fitness rubric missing; escalate to pauli/design-rubric. Marsha does not improvise the rubric.
A single artifact often has both a mechanical bar (does it run / render / parse?) and a qualitative bar (is it any good for the user?). The rubric does not replace mechanical AC — it sits alongside them. At verify time, marsha checks both. Expect the qualitative judgement to take more reviewer attention than the mechanical check; if the brief makes the mechanical check dominate, the verdict will silently collapse to mechanical-pass-equals-ship.
Inhabit the user at the moment they encounter this feature. Not demographics, not identity. Situation: emotional state, what just happened, what's about to happen, cognitive constraints active right now, what success would feel like.
Example: "Nic, with ADHD, back at his desk after three hours away. Four sessions ran while he was gone. He's lost the thread. The paper deadline is this week. He opens the dashboard. He needs a lifeline, not a data dump."
If you can't write the persona paragraph without resorting to generic prose ("a busy professional who values efficiency"), you don't understand the user well enough yet. Stop and gather context.
Each scenario is a short paragraph naming entry state, real goal (which may differ from the stated goal), constraints active in that moment, and what success feels like.
The stressed path is where most features quietly fail. Spend the most time here.
A dimension is a question that requires interpretation, not a checkbox.
| Anti-pattern (don't) | Dimension (do) | | --------------------------------------- | --------------------------------------------------------------------------- | | "Does the header show session goal?" | "Can the user reconstruct their working narrative? At what cognitive cost?" | | "Are timestamps HH:MM?" | "Does temporal info help orient, or add noise?" | | "Is the dropped-threads section first?" | "Does the display create appropriate urgency without triggering shame?" |
Dimensions to consider in most rubrics:
For each dimension, describe what excellent and poor look like in prose. Not "9/10". Not "meets standard." Sentences that a reviewer can later cite.
Excellent: the dropped-threads callout creates gentle urgency — surfaces abandoned work prominently but frames it as "pick up where you left off" rather than "you failed."
Poor: dropped threads listed in the same visual register as completed work. The user must scan everything equally.
If one user concern clearly dominates the design, flag one dimension as **load-bearing**. This is not a numeric weight — it's a tie-breaker the verifier's synthesis paragraph must address explicitly. Marsha cannot reach PASS if the load-bearing dimension is poor, even if every other dimension is excellent. Use sparingly; if every dimension is load-bearing, none is.
The rubric lives on the spec or epic body under ## Fitness Rubric. Shape:
## Fitness Rubric
**Persona:** <one paragraph, situational>
**Scenarios:**
- Golden: <entry / real goal / constraints / success feel>
- Stressed: ...
- Edge: ...
**Dimensions & quality spectrum:**
- <dimension question> — excellent: <prose>; poor: <prose>
- ...
**Red-team notes (critique mode only):**
- <gap or unstated assumption surfaced during review>
That section is what marsha reads at verify time. Pauli's verify-brief points at it; the brief does not duplicate it.
When red-teaming an existing spec, the work is mostly:
The output is still a Fitness Rubric — but it now also documents the gaps that were closed (or are still open) before any worker is dispatched.
/verify against the rubric.| Anti-pattern | Why it fails | Instead | | -------------------------- | ----------------------------------------------------------- | ---------------------------------------------------- | | Pass/Fail tables | Reduces nuance to binary; reviewer stops thinking | Narrative quality spectrum | | Point scoring | False precision; 73/100 means nothing | Qualitative judgement with cited evidence | | Demographics-first persona | Treats persona as identity, not situation | Situation-first: what's happening right now | | Rubric in the brief | Becomes a checklist when copied into the verifier's hands | Rubric lives on the spec; brief links to it | | Generic personas | "Busy professional who values efficiency" tells you nothing | Specific situation, real constraints, real day | | Skipping stressed path | Most features fail at the depleted-user moment | Stressed path gets the most attention, not the least |
Task(subagent_type="aops-core:pauli",
prompt="Author a Fitness Rubric for <spec or epic>. Mode: author / critique.
The rubric is for <user persona at a glance>.
Land it on the spec body under `## Fitness Rubric`.")
tools
Program / portfolio supervision — the autonomous top loop above /supervisor. "Ready the release" → discover and decompose the constituent epics → run /supervisor on each → surface only escalations + merge-ready PRs. Stateless tick driven by /loop; all cross-tick state lives in the program task body.
development
Mirror PKB tasks onto the Cowork native task list at claim time and sync completion back to PKB. Cowork-only; ships only in the cowork build of aops-core.
testing
Instruction quality gate — reviews agent instructions (task bodies, workflow steps, skill procedures, self-test protocols) for shallow-execution vulnerabilities before deployment. Two modes: author (pre-hoc review) and audit (trace a failure back to the instruction gap). The bar is excellence, not compliance.
tools
Analyze writing samples and create a comprehensive personal writing style guide