aops-core/skills/survey/SKILL.md
Survey a corpus, classify, and dispatch outputs. Three modes: retro (transcript review → issues), trend (longitudinal performance analysis), sweep (GitHub issue triage → fix-epics). Delegates execution to junior/jr to keep main context clean.
npx skillsauth add nicsuzor/academicops surveyInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
One abstract pattern, three modes: survey a corpus, classify findings, dispatch outputs.
| Mode | Corpus | Primary output |
| ------- | ----------------------------------- | ------------------------------ |
| retro | Session transcripts (one at a time) | GitHub issues filed via gh |
| trend | Many sessions / audit files | Trend report + recommendations |
| sweep | Open GitHub issues | PKB tasks, fix-epics, closures |
Privacy rule: Anonymise all findings before filing. No real names, emails, student details, session dumps.
See also: [[../AXIOMS.md]] · [[../HEURISTICS.md]]
This skill delegates execution to keep the main context window clean. The invoking agent dispatches, passes this skill as the execution spec, and exits.
# retro or trend mode — junior has transcript/knowledge and execution access
Agent(
subagent_type='junior',
prompt='Read aops-core/skills/survey/SKILL.md. Execute in [retro|trend] mode. [user context]',
tools=['Bash','Read','Grep','Glob','Edit','Write','Skill',
'mcp__plugin_aops-core_pkb__search',
'mcp__plugin_aops-core_pkb__task_search',
'mcp__plugin_aops-core_pkb__get_document',
'mcp__plugin_aops-core_pkb__pkb_context',
'mcp__plugin_aops-core_pkb__append',
'mcp__plugin_aops-core_pkb__create_task',
'mcp__plugin_aops-core_pkb__update_task'],
)
# sweep mode — jr handles interactive confirmation gates
Agent(
subagent_type='aops-core:jr',
prompt='Read aops-core/skills/survey/SKILL.md. Execute in sweep mode. [user context]',
tools=['Bash','Read','Grep','Glob','Skill','AskUserQuestion',
'mcp__plugin_aops-core_pkb__get_task',
'mcp__plugin_aops-core_pkb__create_task',
'mcp__plugin_aops-core_pkb__update_task',
'mcp__plugin_aops-core_pkb__append',
'mcp__plugin_aops-core_pkb__task_search'],
)
Purpose: Read a recent transcript through a framework-development lens. Provide a brutal, concise critical review identifying problems. File every finding as a GitHub issue. We are aiming for EXCELLENCE, not "running code".
# List recent transcripts, newest first
# (transcripts are sharded into yyyy-mm/ subdirs after rotation; recent ones stay at top level)
find ~/.aops/sessions/transcripts -name '*.md' -printf '%T@ %p\n' | sort -rn | head -20 | cut -d' ' -f2-
# Show unreviewed transcripts (missing reviewed_by in frontmatter)
for f in $(find ~/.aops/sessions/transcripts -name '*.md' -printf '%T@ %p\n' | sort -rn | head -20 | cut -d' ' -f2-); do
grep -q "^reviewed_by:" "$f" || echo "$f"
done | head -10
If the user specified a path or session ID, use that. If no unreviewed transcripts exist, report and stop.
Read every line. Do not skim. For large files, read in chunks:
Read(file_path="<path>", offset=1, limit=500)
Read(file_path="<path>", offset=500, limit=500)
# continue until EOF
Evaluate the transcript critically. We are aiming for EXCELLENCE, not "running code".
You are trusted to find what matters. Read the transcript. Find what's worth fixing. Write it up however makes most sense. Flag your own biases.
recusal)The retro is a forensic instrument. It does not legislate. Per AXIOMS.md § recusal, the agent that just read the transcript is normatively recused from proposing framework change motivated by that transcript. Cross-incident judgment about adding/escalating/propagating rules happens in the detached sweep mode (a separate context, no prior exposure to this incident).
However, what counts as forensic includes structural articulation. You may articulate structural shape factually (e.g., "design X is impossible because Y", "the rule forces the agent into an impossible loop"). You may flag: "this looks structural; sweep should ask whether the rule shape is right, not just whether the rule fired".
For each finding, trust your judgment to provide a clear, forensic description that stops at the facts. Include what happened (quoted from the transcript), the structural context (how the framework's shape contributed), and the concrete impact.
Crucially, do not propose what should be added, escalated, or propagated. That is the sweep agent's job, not yours. You are trusted to identify the root cause category and causal chain naturally without adhering to a strict template.
You may flag the finding as severe; you may not author the legislation that severity might motivate. If you find yourself writing "we should add…", "the framework needs…", "an axiom against… would prevent this," strike it. The detached reviewer reading this report later, with the enforcement map and the incident register open, is the agent allowed to write that sentence.
## Transcript Review: <filename>
**Session**: <session_id> **Date**: <date> **Project**: <project>
**Verdict**: [EXCELLENT | GOOD | ADEQUATE | POOR | FAILING]
### Findings
[Free-form description of what went wrong, what went well, and what could be improved. You may use lists, paragraphs, or whatever structure best articulates the issues. Group symptoms under structural causes if applicable.]
### Patterns (Optional)
[If you notice a pattern across findings, articulate the single upstream/structural cause here.]
For every Finding that requires action, file or update a GitHub issue.
Group symptoms; split causes. Findings that are symptoms of one structural cause — where one edit would resolve all of them — collapse into one issue. Findings that are distinct fixable causes — even when they share a meta-class — split into one issue per cause.
Test for "distinct": each cause must be independently fixable, meaning closable by a single PR scoped to one framework surface. Same meta-class is not the disqualifier; same fix is. If two seams collapse to the same edit, they are one issue; if they require coordinated edits across different surfaces (different skill files, different agent prose, different hook code), they are separate issues. The granularity of the issue tracker should match the granularity at which the framework actually gets fixed (one feature, one surface, one PR at a time) — so that /issue-sweep can rank seam-by-seam and each PR can close one issue cleanly.
Cross-link pattern. When a meta-class spans multiple seams, the meta-class itself still earns a record: preferably a parent issue (with the per-seam children listed by number in its body) or, for simpler cases, a "meta-class anchor" note inside one of the per-seam issues. Each per-seam child links back to the meta-class via Refs #N; the meta-class anchor links forward to each child. The children are individually rankable and individually closable; the meta-class record persists as the structural-shape memory.
Search for existing issues and open PRs first — add volume to existing ones rather than duplicating:
gh issue list --repo nicsuzor/academicOps --search "<failure keywords>"
gh pr list --search "<failure keywords>" --repo nicsuzor/academicOps
If an open PR addresses the issue, post your finding there (where it is most actionable). If both an issue and a PR exist, comment on both and cross-link them.
If an existing issue (and/or PR) matches:
gh issue edit — do not comment.# Delta comment on PR — actionable feedback where the fix is happening:
gh pr comment <PR_N> --repo nicsuzor/academicOps \
--body "New incident (<date>): [what happened]. Impact: [concrete cost]."
# Delta comment on issue — new incident facts only, no background recap:
gh issue comment <ISSUE_N> --repo nicsuzor/academicOps \
--body "New incident (<date>): [what happened]. Impact: [concrete cost]."
# If both a PR and an issue exist, append the cross-link to each body:
# PR body suffix: " (Cross-posted to issue #<ISSUE_N>)"
# Issue body suffix: " (Cross-posted to PR #<PR_N>)"
# Structural correction — edit title, body, or both:
gh issue edit <ISSUE_N> --repo nicsuzor/academicOps --title "<new-title>" --body-file /tmp/issue-<slug>.md
If no existing issue, perform root cause analysis and file:
# Write root-cause report to /tmp/issue-<slug>.md, then:
gh issue create --repo nicsuzor/academicOps \
--title "Bug: <brief-slug>" \
--body-file /tmp/issue-<slug>.md \
--label "bug" --label "criticality:<level>"
New issue body: forensic fields only, no narrative preamble. Lead with the failure; stop at impact. Factual structural articulation is allowed; no remediation proposals (recusal):
Provide a clear, unstructured incident report containing:
Do NOT propose what should change — only document what existed at the time of the incident and the structural realities of why it failed.
Issues that include a "suggested axiom," "proposed gate," or any remediation are out-of-scope under recusal and will be edited down to the forensic core by the sweep agent. Volume bumps on existing issues (gh issue comment) follow the same discipline — facts and impact only.
Why this discipline: the sweep agent (or a strategic review) reads many incident reports against the enforcement map and the axiom set, and decides what to add, propagate, escalate, or leave alone. That cross-incident judgment is undermined when each incident report ships pre-packaged with the legislation its author thought it implied. The detached reviewer needs facts; the framework needs coherence; the recused incident agent provides one and protects the other by withholding the second.
In batch mode: cap at 3 issues per session.
reviewed_by:
- agent: "<model-name>"
date: "<ISO 8601 datetime>"
machine: "<hostname via bash>"
verdict: "<EXCELLENT|GOOD|ADEQUATE|POOR|FAILING>"
issues_filed: <count>
Append (do not replace) if reviewed_by already exists.
## Framework Reflection
**Prompts**: /survey retro [transcript path]
**Outcome**: [success/partial/failure]
**Accomplishments**: Reviewed <transcript>, filed <N> issues, stamped as reviewed.
**Issues filed**: [GitHub Issue URLs]
| Anti-pattern | What to do instead |
| ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------- |
| Skimming | Read every line |
| "Overall good with minor issues" | Quote specifically |
| Filing four separate issues for symptoms of one structural flaw (where one edit fixes all) | Identify the structural flaw and file one issue mapping the symptoms back to it |
| Bundling N distinct fixable causes into one issue because they share a meta-class | File one issue per cause; cross-link each child to the meta-class parent (or anchor) via Refs #N |
| Inventing praise | Only genuine strengths |
| Reviewing your own session | Review a DIFFERENT session |
| Filing > 3 issues per session | Triage first; group structural causes; cap at 3 issues |
| New issue for known pattern | Comment on existing issue |
| Restating background or title in a bump comment | Post only the new delta: date, new incident facts, new impact angle. Reader already read the issue |
| Verbose bump comment with narrative setup or recap | One short paragraph max — lead with what happened this time; stop there |
| Verbose new issue body with narrative preamble or framing | Lead with the failure facts; no throat-clearing; no framing preamble |
| Including "suggested axiom", "add a gate", or any remediation proposal in the report (recusal) | Stop at facts + structural-context + impact. The sweep agent legislates from a detached context |
| "We should change Y because I just hit X" | The agent that hit X is recused (recusal). Surface the incident; leave the change proposal to sweep |
| Citing a single session as justification for a new mechanism | Recurrence is the evidence base for framework change, not the salience of one transcript |
Purpose: Review many sessions to assess whether a SYSTEM (gate, agent, skill, workflow) is achieving its goals across the population. Produces an evidence-based trend report with recommendations.
Distinction from retro: retro reviews individual sessions for agent behavior. trend reviews MANY sessions to assess systemic effectiveness.
If vague, clarify: what component, what success criteria, what time window, what specific concern?
| Source | Contents | Location |
| -------------------- | ----------------------------------- | ------------------------------- |
| Markdown transcripts | Full conversations (primary source) | ~/.aops/sessions/transcripts/ |
| Session summaries | High-level overviews | ~/.aops/sessions/summaries/ |
| PKB tasks | Task lifecycle data | Via mcp__pkb__task_search |
Read relevant framework workflows first. For hook/gate reviews, check 09-session-hook-forensics.md.
Record your sample with filenames and selection rationale.
Read every line. Extract:
Before aggregating, define what observable success looks like. If it's not observable in the data, state that upfront.
# Trend Review: <Component Name>
**Question**: <review question> **Date**: <today>
**Corpus**: <N> files, <date range> **Sample**: <N> files (criteria: ...)
## Executive Summary
[3-5 sentences. Is it working? Trend? Biggest issue? If data can't answer, say so — that IS the finding.]
## Objectives Verdict
| Objective | Verdict | Evidence |
| --------- | -------------------------------------------- | -------- |
| [obj 1] | ANSWERED / PARTIALLY ANSWERED / UNANSWERABLE | [why] |
## Individual Assessments
### <filename> (<date>)
- **Context**: ... **Component behavior**: ... **Accuracy**: [TP/FP/FN] **Impact**: ...
## Aggregate Analysis
### Signal Quality / Temporal Trends / Coverage Map / Cost-Benefit
## Recommendations
[Specific, actionable, prioritised. Each cites evidence.]
## Confidence and Limitations
mkdir -p ~/.aops/sessions/reviews
# Save to: ~/.aops/sessions/reviews/<component>-trend-<date>.md
| Anti-pattern | What to do instead | | ---------------------------------------- | ------------------------------------------ | | Reviewing all files | Sample strategically, read deeply | | Claims without citations | Every claim cites a specific file | | No temporal comparison | Always compare early vs late | | Treating absence as evidence | Distinguish measurement gaps from findings | | Platform-specific bugs as general claims | Isolate platform variables | | Burying the primary finding | Lead with structural limitations | | Substituting proxy findings | Deliver explicit verdicts per objective |
Purpose: Run ONE cycle of the open-issue sweep on nicsuzor/academicOps. Classify ≤ 20 open issues, present the proposed dispatch plan, wait for sign-off, execute confirmed actions, log the cycle. HALT after one cycle. Fix-epics are left queued for /supervisor in a later session.
Sweep runs undirected (cursor-driven, the default) or directed — given a focus (a specific blocker, failure, or issue the user is fighting right now), it investigates and prioritises that focus first while still pulling and reviewing the general backlog. Direction is not a recency exemption — see "Directed vs undirected sweep" below.
Detached judgment role (recusal): sweep is the framework's legislative phase. The agents that diagnosed each incident are recused; the sweep agent reads their forensic reports together with the enforcement map and the axiom set, and is the one allowed to propose adding, propagating, escalating, or retiring rules. Recency exposure is what makes this work — sweep enters with no prior context on any individual incident, so the cross-incident pattern (not the salience of any one report) drives the call.
Hard halts: No silent dispatch. No improvised dispositions. No cursor in task body (labels ARE the cursor). No proposal to add or escalate enforcement (new gate, new axiom, tier-bump, new hook firing surface) without ≥3 cited recurrences (the CBA bar in ENFORCEMENT-MAP.md). Bug fixes within an existing enforcement surface at the same tier, and user-directed architectural changes, are NOT add-or-escalate proposals — a single forensic incident (or explicit user directive) is sufficient for fix-epic or single-task.
Sweep accepts an optional focus — a specific blocker, failure, or issue the user is fighting right now (passed as the invocation argument, e.g. /issue-sweep 1573 or /issue-sweep "agent infers rendered state instead of looking"). With no focus, sweep runs undirected: cursor-driven over the oldest-untriaged batch.
A focus changes what sweep attends to first, not how it judges.
recusal exists to defeat — that bias is about adopting an incident's pre-authored legislation verbatim, not about which issue you look at first. Distinguish two different user inputs: a focus ("investigate this blocker first") only sets attention; a directed change ("change rule X this way") substitutes for the recurrence bar for that specific change (per §2b). A focus is never, by itself, a directed change. So any add-or-escalate proposal arising from the focal issue MUST show its evidence inline in the cycle plan — the ≥3 recurrence links, or the explicit user directive quoted — and must NOT cite the user's focus, or the focal incident's salience, as that evidence. If you cannot show ≥3 recurrences or a directive, the disposition is defer with needs-more-recurrences, exactly as in an undirected sweep. A clear bug fix within an existing surface at the same tier still dispatches on a single incident.The point: a user fighting a live blocker can aim the sweep at it without the sweep abandoning its backlog scan or lowering its bar for what counts as a justified framework change.
| Disposition | Criterion | Action | Label |
| ----------------------- | ------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------- | ----------------------- |
| close-as-stale | > 90d + no recent comments + root cause fixed, OR describes behaviour framework no longer has (verifiable by cheap grep) | gh issue close with comment citing fix | triaged-stale |
| consolidate-duplicate | Volume bump on duplicate issue | Close duplicate; MUST comment citing canonical via #N | triaged-duplicate |
| evidence-bump | Volume bump accumulating evidence on a related open issue (e.g. fix-epic) | Leave open; MUST comment citing canonical via #N | triaged-evidence-bump |
| single-task | Atomic: AC clear, ≤ 3 files, one obvious implementation, no cross-component coordination | File polecat task with Closes #N | triaged-single |
| fix-epic | Multi-step, multi-file, or design-required | Propose to user; on y: create epic + decompose, leave queued | triaged-epic |
| defer | Real but blocked or low-criticality | Apply triaged-defer + revisit-by-YYYY-MM-DD comment | triaged-defer |
Every disposition must be decidable in < 30 seconds by a fresh agent. If longer: "Needs human triage."
gh issue list --repo nicsuzor/academicOps --state open --limit 100 \
--search 'sort:created-asc -label:triaged-stale -label:triaged-comment -label:triaged-duplicate -label:triaged-evidence-bump -label:triaged-single -label:triaged-epic -label:triaged-defer' \
--json number,title,labels,createdAt,updatedAt,comments,body \
> /tmp/issue-sweep-batch.json
# Client-side sort: criticality-desc, age-asc. Take top 20.
Directed focus (optional). When the sweep is given a focus, additionally resolve the focal issue and its root-cause cluster and rank them ahead of the cursor batch:
# Focal issue(s) + same-root-cause siblings, pulled to the front:
gh issue list --repo nicsuzor/academicOps --state open --limit 30 \
--search '<focus keywords, or the issue number>' \
--json number,title,labels,createdAt,updatedAt,comments,body \
> /tmp/issue-sweep-focus.json
# Triage the focal cluster first; then continue with the undirected batch above (dedup by issue number).
gh issue list --repo nicsuzor/academicOps --state open --limit 1 --json totalCount
gh label list --repo nicsuzor/academicOps --limit 200 | grep -E '^(triaged-|criticality:)'
Create missing labels, then read the loop epic:
mcp__pkb__get_task(id="epic-a0523a25") # halt if not in_progress
For each issue: read body + ≤ 3 recent comments. Apply rubric. Group fix-epic candidates by root cause (cap 5 issues per proposed epic). Note one-line rationale per issue.
recusal — sweep's legislative role)This step runs only for issues whose remediation would add or escalate a
rule — a new gate, a new axiom, a tier-bump (e.g. L1→L3), a new hook firing
surface. Bug fixes within an existing enforcement surface at the same
tier (correcting wrong logic or wrong prose in an existing skill, agent,
hook, or gate) follow the normal fix-epic / single-task path — they do
not require this section, and a single forensic incident is sufficient
evidence. User-directed architectural changes do NOT bypass this section —
cost-ladder reasoning still applies to establish where the fix lands on
the enforcement ladder. Only the ≥3 recurrence requirement is waived;
the user's directive substitutes for that evidence.
For genuine add-or-escalate proposals, run this sequence before assigning a disposition. This is the work that retro is forbidden to do; sweep is the only mode allowed to author it.
recusal). If it carries a "suggested axiom" or "proposed gate," strip that from your reasoning — the proposal was authored under prejudicial recency and is evidence of urgency, not of the right answer. Edit the issue to remove the stripped section and leave a comment explaining the recusal split./craft audit, not a mechanism change.defer with a needs-more-recurrences comment, not an escalation.exercise-authority recurrences are L1 propagation failures; assume the same here unless evidence contradicts.halt-on-failure" is.close-as-stale (or consolidate-duplicate to track volume) — not a framework change. Recurrence count is the evidence base; one slip is not.The output of this step feeds the disposition decision in the rubric below (most often fix-epic for L1 propagation work, defer for "needs more recurrences," or close-as-stale for "no change warranted"). Surface every add-or-escalate proposal to the user gate in step 3 with the cost-ladder reasoning visible. Bug-fix dispositions go through the normal user gate without this cost-ladder rationale — they require only the bug description and corrective scope. User-directed dispositions still include cost-ladder reasoning (citing the user's directive as the evidence base in place of recurrence links).
## Cycle <N> — proposed dispatches (open before: <K>; batch: <M>)
### Fix-epic 1: <title>
- Issues: #A, #B - Why grouped: ... - Proposed scope: ... - Estimated effort: S/M/L
- Confirm? [y / edit / defer / split]
### Single-tasks
- #X → "<title>" (XS)
Confirm batch? [y / edit / defer all]
### Close (stale) / Consolidate / Evidence bump
- Close (stale): #P
- Close duplicate: #R → bumps #S
- Evidence bump (leave open): #T → bumps #U
Confirm? [y / edit]
### Needs human triage
- #Z (rubric ambiguous: <reason>)
Directed sweep: lead the plan with a Focal cluster section (the focus issue + its same-root-cause siblings), THEN include the standard sections above for the general batch. The general-batch section is mandatory and must carry its reviewed-count; if it is empty because the focal cluster used the whole budget, say so explicitly and name the unreviewed cursor range (per "Directed vs undirected sweep" → Coverage). Any escalation inside the focal cluster shows its recurrence links or quoted user directive inline (per Judgment standard) — a focal disposition with no shown evidence is not gateable.
Use AskUserQuestion for each gate. Halt cleanly on decline — re-emit and gate again.
Order: consolidate-duplicate → evidence-bump → close-stale → defer → single-task → fix-epic.
All task-creation actions MUST omit severity (or pass severity=0). Severity is a target-node-only signal — see skills/planner/SKILL.md (~lines 608-626). Setting it on a leaf inverts the focus queue.
state: closed post-application, unless explicit carve-out. Workers verify before reporting Done.mcp__pkb__create_task with issue body, AC, and Closes #N instruction. Apply the Trust the Worker doctrine ([[../aops/references/authoring-discipline]]).verify-parent task. Apply the Trust the Worker doctrine ([[../aops/references/authoring-discipline]]) — intent+AC, no mid-stream approval theatre. Leave queued. Do NOT invoke /supervisor.triaged-* label after each confirmed action.# Read template for schema; create a fresh datestamped instance for this cycle
template = mcp__pkb__get_task(id="epic-a0523a25")
cycle_id = "epic-a0523a25-" + YYYYMMDD + "-" + HHMM # e.g. epic-a0523a25-20260125-1430
instance = mcp__pkb__create_task(id=cycle_id, body=template.body)
mcp__pkb__append(id=cycle_id, content="<cycle entry per schema>")
Log must include: cursor=label-based; batch size; issues processed; per-disposition lists; open count after; triaged-* totals; stopping condition met (y/n with evidence).
Skill(skill="qa", args="Verify cycle <N> of /issue-sweep on epic-a0523a25. Sample 20% (min 3). Reviewers: Pauli (cohesion) + RBG (axiom compliance). Add Marsha if single-tasks dispatched.")
Do not start the next cycle. Re-invoke /survey sweep (or /issue-sweep) for cycle N+1.
Loop stops only when ALL open issues are either: < 7 days old, stamped triaged-defer with revisit comment, linked to an in-progress fix-epic, or closed. PLUS: zero criticality:critical open without active fix-epic, zero criticality:high open > 14 days without active fix-epic.
| Anti-pattern | What to do instead |
| -------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Authoring "surface for sign-off" or "review before promoting" prose | Apply the Trust the Worker doctrine ([[../aops/references/authoring-discipline]]). State intent + AC, no mid-stream approval theatre. |
| Skipping user-confirmation gate | Always present and wait |
| Stamping triaged-epic before user y | All stamps live after gate returns y |
| Invoking /supervisor inline | Leave fix-epics queued; user dispatches later |
| Inventing a sixth disposition | Surface under "Needs human triage" |
| Storing numeric cursor in task body | Labels are the cursor |
| Parenting fix-epics under epic-a0523a25 | Parent under relevant component epic |
| Bundling > 5 issues into one fix-epic | Split or surface as human-triage |
| Re-running cycle without halting | Halt; re-invoke for next cycle |
| Adopting a "suggested axiom" from an incident report verbatim | Strip per recusal; redo the cost-ladder reasoning from the detached vantage |
| Proposing escalation from one incident | Need ≥3 cited recurrences (CBA); otherwise defer with needs-more-recurrences |
| Deferring a clear bug fix to wait for more recurrences | ≥3 rule is for add-or-escalate only. A clear bug in an existing surface at the same tier dispatches as fix-epic on a single incident |
| Treating a user-directed architectural change as escalation | User directive substitutes for recurrence count. Dispatch as fix-epic with the user's directive cited; cost-ladder reasoning still applies to where the fix lands |
| "Add a gate" / "add an axiom" without naming the ENFORCEMENT-MAP row | Cite the specific row the fix propagates from or would add; default L0/L1 |
tools
Program / portfolio supervision — the autonomous top loop above /supervisor. "Ready the release" → discover and decompose the constituent epics → run /supervisor on each → surface only escalations + merge-ready PRs. Stateless tick driven by /loop; all cross-tick state lives in the program task body.
development
Mirror PKB tasks onto the Cowork native task list at claim time and sync completion back to PKB. Cowork-only; ships only in the cowork build of aops-core.
testing
Instruction quality gate — reviews agent instructions (task bodies, workflow steps, skill procedures, self-test protocols) for shallow-execution vulnerabilities before deployment. Two modes: author (pre-hoc review) and audit (trace a failure back to the instruction gap). The bar is excellence, not compliance.
content-media
Design-stage fitness rubric — persona immersion, scenario design, dimensions that define what excellence looks like for the people a feature serves. Two modes — author (produce a rubric for a new spec) and critique (red-team an existing spec). Output lives on the spec, not in the verification brief. Owned by pauli.