packages/apm-review-panel/SKILL.md
Use this skill to run a multi-persona expert advisory review on a labelled pull request in microsoft/apm. The panel fans out to five mandatory specialists plus a test-coverage specialist (active on every PR that touches src/) plus three conditional specialists (auth, doc-writer, performance-expert), all running in their own agent threads, and a CEO synthesizer. The orchestrator is the sole writer to the PR: ONE recommendation comment, no verdict labels, no merge gating. The panel is advisory -- it surfaces findings, prioritizes follow-ups, and renders a ship-recommendation that the maintainer and author weigh. Activate when a non-trivial PR needs a cross-cutting recommendation (architecture, CLI logging, DevX UX, supply-chain security, growth/positioning, optionally auth, docs, perf, and test coverage, with CEO arbitration).
npx skillsauth add microsoft/apm apm-review-panelInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
The panel is FAN-OUT + SYNTHESIZER. Each persona runs in its own agent
thread (via the task tool) and returns JSON matching
assets/panelist-return-schema.json. The orchestrator schema-validates
each return, hands all returns to the apm-ceo synthesizer (also a task
thread, returns JSON matching assets/ceo-return-schema.json), then
renders ONE recommendation comment from assets/recommendation-template.md.
This skill is ADVISORY by design. It does not compute a binary verdict, it does not apply verdict labels, and it does not gate merge. The panel surfaces findings; the maintainer and the PR author decide ship.
APPROVE / REJECT,
no panel-approved / panel-rejected label, no deterministic verdict
computation. The CEO returns a ship_recommendation.stance (ship_now
/ ship_with_followups / needs_discussion / needs_rework); this is
prose for the human reviewer, never auto-applied as a label or status
check. This is the architectural fix for the previous regime's
over-strictness: removing the binary gate removes the incentive for
panelists to inflate required[] defensively.severity: blocking | recommended | nit. blocking is the highest
signal a panelist can send and renders prominently in the comment; it
still does not block merge. recommended is the default for substantive
feedback. nit is one-line polish. The orchestrator never reads
severity to gate anything.add-comment and one remove-labels call. The
remove-labels call always sweeps panel-review (trigger
idempotency) AND defensively removes panel-approved /
panel-rejected if present (legacy verdict labels from the
pre-advisory regime; they have no meaning here and would mislead
readers if left on a PR after a fresh advisory pass). NO add-labels
call -- there are no verdict labels to apply. Panelist subagents and
the CEO subagent return JSON only and MUST NOT call any gh write
command, post comments, apply labels, or touch the PR state.assets/recommendation-template.md after all subagents
return.| Agent | Role | Always active? | |-------|------|----------------| | Python Architect | Architectural Reviewer + supplies mermaid diagrams | Yes | | CLI Logging Expert | Output UX Reviewer | Yes | | DevX UX Expert | Package-Manager UX | Yes | | Supply Chain Security Expert | Threat-Model Reviewer | Yes | | OSS Growth Hacker | Adoption Strategist | Yes | | Auth Expert | Auth / Token Reviewer | Conditional (see below) | | Doc Writer | Documentation Reviewer | Conditional (see below) | | Test Coverage Expert | Test-Presence Reviewer (paired with DevX UX) | Yes (skipped only on docs-only PRs -- see below) | | Performance Expert | Package-Manager Performance Reviewer | Conditional (see below) | | APM CEO | Strategic Arbiter / Synthesizer | Yes |
apm-review-panel SKILL (orchestrator thread)
|
FAN-OUT via task tool (panelists in parallel)
|
+-----+-------+-------+-----+-----+------+-----------+----------+
v v v v v v v v v (cond.)
py cli dx-ux sec grw auth doc-writer test-cov
| | | | | | | |
| each returns JSON per panelist-return-schema.json
+-----+-------+-------+-----+-----+------+-----------+----------+
|
v <-- S4 schema-validate
v <-- on malformed: re-spawn that persona
v
task: apm-ceo synthesizer
- aggregates findings across panelists
- resolves dissent
- emits headline + arbitration prose + principle alignment
- emits curated recommended_followups (prioritized)
- emits ship_recommendation (stance + prose)
- returns ceo-return-schema.json
|
v <-- S4 schema-validate
v
orchestrator (sole writer)
| |
v v
add-comment remove-labels
(max:2) [panel-review,
panel-approved,
panel-rejected]
(trigger reset +
legacy verdict sweep)
Three personas are conditional (auth, doc-writer, performance-expert). A
fourth (test-coverage) is mandatory on every PR that touches src/ and
only skipped on documentation-only PRs -- see its section below for why.
The orchestrator ALWAYS spawns ALL four tasks to keep the schema
return shape uniform; the prompt instructs the subagent to set
active: false with an inactive_reason if the condition does not
hold.
Activate when the PR changes any of:
src/apm_cli/core/auth.pysrc/apm_cli/core/token_manager.pysrc/apm_cli/core/azure_cli.pysrc/apm_cli/deps/github_downloader.pysrc/apm_cli/marketplace/client.pysrc/apm_cli/utils/github_host.pysrc/apm_cli/install/validation.pysrc/apm_cli/install/pipeline.pysrc/apm_cli/deps/registry_proxy.pyFallback self-check (when no fast-path file matched): "Does this PR
change authentication behavior, token management, credential resolution,
host classification used by AuthResolver, git or HTTP authorization
headers, or remote-host fallback semantics? If unsure, answer YES."
Activate when the PR changes any of:
README.mdCHANGELOG.mdMANIFESTO.mddocs/src/content/docs/**.apm/skills/**/*.md.apm/agents/**/*.md.github/skills/**/*.md.github/agents/**/*.md.github/instructions/**/*.md.github/workflows/*.md (gh-aw natural-language workflows)packages/apm-guide/**Fallback self-check (when no fast-path file matched): "Does this PR change user-facing documentation, agent or skill prose, instruction files, CHANGELOG entries, README claims, or any natural-language artifact a reader will rely on? If unsure, answer YES."
When the doc-writer is active and the PR includes documentation changes, the persona reviews them for: (a) consistency with the existing voice and structure, (b) accuracy against the code being changed, (c) completeness for the typical reader (no orphan claims, no missing prerequisites), (d) discoverability (cross-links, sidebar order if Starlight content). When the doc-writer is active because of code changes that SHOULD have updated docs but did not, the persona surfaces that gap as a finding.
Activate when the PR changes any of:
src/apm_cli/cache/**src/apm_cli/deps/**src/apm_cli/install/phases/**src/apm_cli/install/pipeline.pysrc/apm_cli/install/resolve.pyscripts/perf/**src/apm_cli/core/command_logger.py (when the diff adds perf-instrumentation logs)Also activate when the PR description claims a performance win (speedup ratio, latency reduction, bytes-on-disk reduction, throughput improvement) or attaches a perf-harness measurement table.
Fallback self-check (when no fast-path file matched): "Does this PR change the hot path for dependency download, materialization, cache layout, transport (git protocol, partial clone, sparse checkout), parallelism, or any user-visible install/update wall-time? If unsure, answer YES."
When active, the performance-expert reviews against the package-manager performance playbook: transport minimization (depth, filter, sparse scope), cache layering and dedup keys, parallelism and lock contention, working-tree materialization cost, perf-harness methodology (cache wipe, warm/cold separation, statistical noise), and pervasive application of the chosen technique across install / update / run surfaces (not just the one path the PR exercises).
Active by default on every PR that touches src/**/*.py. The only
condition that flips this persona to active: false is a
documentation-only PR -- the diff contains zero src/**/*.py files.
In that case set inactive_reason: "documentation-only PR -- no runtime code paths to defend".
The activation rule is intentionally narrow: under the advisory regime,
test outcomes are LOAD-BEARING for CEO arbitration (passed / failed /
missing test evidence outranks opinion-only findings -- see
apm-ceo.agent.md and panelist-return-schema.json evidence block).
A persona whose findings carry that weight cannot be silently skipped
on a heuristic. Better to spawn it on a pure refactor and have it
return a single nit-severity "no behavior surface touched -- no
coverage finding" line than to skip it and leave the CEO without
evidence to weigh. (Earlier revisions of this skill paired test-coverage
with auth and doc-writer as conditional for symmetry; that symmetry
broke when test evidence became load-bearing.)
The test-coverage-expert is paired with the devx-ux-expert lens and
defends the user-promise contracts the DevX persona enumerates (CLI
surface, error wording, install idempotency, lockfile determinism, auth
resolution). It MUST verify "no test exists" claims with view/grep
on the test tree before emitting a finding -- false-positive coverage
findings destroy trust in the field. It does NOT compute coverage
percentages, does NOT flag tests for pure refactors, and does NOT
duplicate python-architect on test-code design.
These routes describe WHICH specialist's findings the CEO weights more heavily for a given PR type. They do NOT change which personas run -- every mandatory persona always runs. Routing is a CEO synthesis hint.
dissent_notes.Work through these steps in order. Do not skip ahead. Do not emit any output to the PR before step 6.
Read PR context (the orchestrating workflow already fetched it
via gh pr view / gh pr diff). Identify changed files for the
conditional panelist routing decisions (auth-expert and doc-writer).
Resolve the conditional panelists using the rules above. Decide
for EACH conditional persona: spawn active OR spawn with
active: false + an inactive_reason. Either way, all three
conditional personas ARE spawned -- the schema requires uniform
return shape.
Fan out panelist tasks. Spawn the following tasks in PARALLEL
via the task tool, one task per persona:
python-architect (also asked to supply extras.diagrams:
class_diagram (mermaid classDiagram), component (mermaid
flowchart TD), and OPTIONAL sequence (mermaid
sequenceDiagram) blocks per the persona's section 1/2/3 contract)cli-logging-expertdevx-ux-expertsupply-chain-security-expertoss-growth-hackerauth-expert (always - active per step 2)doc-writer (always - active per step 2)test-coverage-expert (always - active per step 2)performance-expert (always - active per step 2)Each task prompt MUST:
assets/panelist-return-schema.json and require the subagent
to emit JSON matching that schema as its FINAL message.severity: blocking
ONLY for correctness regressions, security/auth bypasses, or
architectural faults that compound, with explicit rationale.
Default substantive feedback to recommended. Use nit for
one-line polish. The panel is advisory; nothing you return blocks
merge -- pick the severity that honestly matches your signal
strength."gh write commands, NO posting
comments, NO label changes, NO touching PR state. JSON return only.S4 schema gate. When each panelist task returns, parse the JSON
and validate against assets/panelist-return-schema.json. On
validation failure:
{persona: "<slug>", active: true, summary: "Schema failure -- see extras.", findings: [], extras: {schema_failure: "<reason>"}}
and surface the failure in the CEO arbitration prompt.Spawn the CEO synthesizer task. Pass the full set of validated
panelist JSON returns to a task invocation that loads
../../agents/apm-ceo.agent.md. The prompt MUST:
assets/ceo-return-schema.json and require JSON return.ship_recommendation.stance is prose for the
human reviewer, not a gate. NO gh write commands.Validate the CEO return against assets/ceo-return-schema.json. On
failure, re-spawn once with the violation cited.
Resolve the notification audience. The advisory comment must surface in the inboxes of the people who will act on it. Run:
gh pr view <PR_NUMBER> --json author,reviewRequests
Build notify_audience as the deduplicated list:
@login (always included);@login (these are the
CODEOWNERS-resolved reviewers GitHub auto-requested for the
touched paths, plus any explicitly-requested human reviewers);@org/team-slug (CODEOWNERS team
entries).Filter out:
[bot] or matching
dependabot|github-actions|copilot-pull-request-reviewer);Cap the final list at 6 handles to avoid notification noise (PR
author + up to 5 reviewers/teams). If the cap trims, prefer team
handles over individual logins. Pass the resulting list to the
template renderer as notify_audience.
This step replaces the maintainer-notification signal that the pre-advisory verdict labels carried. It is the only mechanism by which a fresh panel pass announces itself.
Render the comment. Load assets/recommendation-template.md,
fill the placeholders from the panelist + CEO JSON, and emit it as
exactly ONE comment.
Filling rules:
summary field.python-architect.extras.diagrams.
If absent, render the placeholder lines from the template (do NOT
invent diagrams).Sweep labels via safe-outputs.remove-labels. The list MUST be
[panel-review, panel-approved, panel-rejected] -- always all three,
regardless of which are currently on the PR. panel-review is the
re-run idempotency reset; the other two are LEGACY VERDICT LABELS
from the pre-advisory regime that have no meaning under the advisory
contract and would mislead readers if left on a freshly-reviewed PR.
safe-outputs.remove-labels is idempotent on missing labels, so
sweeping all three on every run is safe and self-healing. NO
verdict labels are applied.
assets/recommendation-template.md. The safe-outputs.add-comment.max: 2 is a fail-soft ceiling; the discipline lives here.remove-labels call sweeping
[panel-review, panel-approved, panel-rejected].add-labels call. The advisory regime has no verdict to encode.gh pr comment, NEVER call gh pr edit --add-label. They return JSON.
The orchestrator is the sole writer.blocking from recommended. If a
panelist marks everything blocking, the comment becomes noisy and
the maintainer learns to ignore the field. The panelist prompts state
the contract explicitly; the CEO arbitration prose is the safety
valve when a panelist over-flags.extras.diagrams.class_diagram,
extras.diagrams.component, and the OPTIONAL
extras.diagrams.sequence. The template renders nothing when they
are missing -- it does NOT invent diagrams. Real diagrams are
what makes the comment scannable for the human reviewer.classDiagram :::cssClass shorthand gotcha. GitHub's
mermaid renderer rejects :::cssClass appended to relationship
lines (e.g. A *-- B:::touched); use standalone
class Name:::cssClass declarations instead. Authority:
python-architect.agent.md:146-154.AuthResolver. If
a diff changes how a remote host, org, token source, or fallback path
is selected and you are not certain it is auth-neutral, activate
auth-expert as active: true.view/grep on the tests/ tree
before emitting a finding. A false-positive coverage finding (test
exists but persona claimed it does not) destroys maintainer trust in
the field. The persona scope file enforces this; the orchestrator
passes the diff and trusts the persona to probe.gh toolset. The
"subagents must not write" rule is enforced by the prompt contract in
each .agent.md plus the safe-outputs.add-comment.max: 2
fail-soft. If a subagent ever tries to post a comment, the cap
catches it.pr-panel-label-reset.yml that stripped verdict
labels on every push. The advisory regime has no verdict labels to
strip; that workflow is removed.testing
Use only as the composed drive-to-merge stage of an APM batch orchestrator (batch-bug-shepherd, apm-issue-autopilot) that has already selected ONE open pull request in microsoft/apm. Do NOT use for user-facing requests to triage issues, sweep a queue, or open PRs -- the parent orchestrator owns those. Spawn one shepherd-driver subagent per PR: it classifies copilot-pull-request-reviewer[bot] inline review, runs the apm-review-panel, folds (by default) every recommendation inside the PR's stated scope, pushes to the head branch or a superseding PR that preserves authorship via commit trailers, watches CI to green, and iterates under fixed caps until ready-to-merge, advisory-with-deferred, superseded, or blocked. Also provides the cross-PR conflict-resolution and mergeability-gate phase. This is NOT a standalone entrypoint.
testing
Use this skill to write the PR description (PR body) for any pull request opened against microsoft/apm. Produces one self-sufficient GitHub-Flavored Markdown artifact: TL;DR, Problem (WHY), Approach (WHAT), Implementation (HOW), 1-3 validated mermaid diagrams, explicit trade-offs, validation evidence, and a How-to-test section -- with every WHY-claim backed by a verbatim quote from PROSE or Agent Skills. Activate when the user asks to "write a PR description", "draft a PR body", "open a PR", "fill in the PR template", or any equivalent.
testing
Use this skill to triage a single newly opened, reopened, or `status/needs-triage`-labelled issue in microsoft/apm. Emit one synthesized comment with a triage decision, label set, milestone, and suggested next action.
tools
An example skill from the mock plugin