marketplace/bundles/pm-plugin-development/skills/ext-self-review-plan-marshall/SKILL.md
Plan-marshall-domain implementor of the ext-self-review-{domain} extension point. Surfaces deterministic candidates (regexes, user-facing strings, markdown sections, symmetric-pair functions, flag-guard pairs, contract sources, schema-bearing files, keep markers, producer-consumer pairs, source-of-truth duplicates, same-document normative directives, description-vs-body frontmatter, lone-unguarded-boundary calls, stale count-prose, near-identical-hunk touched claims) for pre-submission structural self-review.
npx skillsauth add cuioss/plan-marshall ext-self-review-plan-marshallInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Role: Plan-marshall-domain implementor of the ext-self-review-{domain} extension point (see ../../../plan-marshall/skills/extension-api/standards/ext-point-self-review-surfacing.md). Surfaces concrete candidates from the worktree's staged diff so the LLM cognitive review pass in ../../../plan-marshall/skills/phase-6-finalize/workflow/pre-submission-self-review.md can apply the twelve structural-defect checks (symmetric pair, regex over-fit, wording disambiguation, duplication, contract drift, producer-without-consumer, source-of-truth drift, same-document contradiction, description-vs-body drift, unguarded boundary, stale count-prose, touched-claim re-check) against a bounded surface, not an unbounded read of the whole diff.
Execution mode: Library script; invoked via the standard 3-part executor notation by plan-marshall:phase-6-finalize/workflow/pre-submission-self-review.md Step 1.
Prohibited actions:
git without git -C {project_dir} (per dev-agent-behavior-rules)/tmp/ or any path outside .plan/temp/ when staging intermediate stateConstraints:
status: error and a non-zero exit codefile (repo-relative path) and line (1-based line number in the post-diff file content) — these are the only fields the LLM cognitive review consumes for navigationInvoked exclusively by default:pre-submission-self-review (see plan-marshall:phase-6-finalize/workflow/pre-submission-self-review.md Step 1). No other caller is supported. The script is registered as pm-plugin-development:ext-self-review-plan-marshall:self_review and is NOT user-invocable from a slash command — user-invocable: false per the script-only registration convention; this skill is intentionally absent from pm-plugin-development/.claude-plugin/plugin.json.
Authors can flag identifiers that are load-bearing for grep-based external tests, downstream parsers, or any other consumer that asserts a token's literal presence in the source. The marker tells the LLM cognitive review pass that the identifier MUST remain grep-able in the post-image — prose consolidation, rename, or refactor MUST NOT remove it.
Syntax: <!-- self-review: keep <identifier> -->
<identifier> is a single whitespace-free token (e.g. phase_breakdown_override_content, --body, triage_helpers). Whitespace inside the identifier is not supported — use one marker per identifier.self-review:, around keep, and between keep and the identifier is tolerant.Placement: any line within the file or section that authoritatively owns the identifier. The detector scans the diff's added/post-image lines for marker matches — placing the marker on or near the line that defines or references the protected token gives the LLM review the strongest navigational anchor. Multiple markers on the same file are allowed and each emit an independent candidate.
Semantics: for each recognized marker in an added hunk, the surface scan checks whether the protected identifier still appears anywhere in the file's post-image OUTSIDE the marker line itself (the marker comment's own copy of the identifier is excluded from the grep so it cannot mask a removal):
keep_protected. The identifier joins the surface's protected_identifiers set so the LLM review knows to refuse any consolidation that would drop the token in a subsequent revision.keep_violation. The marker is orphaned — the consolidation under review has already removed the protected token. The LLM review surfaces this as a refusal-grade defect.The marker is a pure structural signal — no LLM call is added to the surface scan; the detector emits a keep_markers candidate list and a protected_identifiers set alongside the other heuristic lists. See the keep_markers[] and protected_identifiers[] shapes under § Output below.
surfaceSurfaces fifteen candidate lists from the worktree's staged diff against the base branch.
| Argument | Required | Description |
|----------|----------|-------------|
| --plan-id PLAN_ID | Yes | Plan identifier (kebab-case). Used to derive the plan footprint on demand from the worktree ({base}...HEAD ∪ porcelain) and (when --project-dir is omitted) to auto-resolve the worktree path via manage-status get-worktree-path. |
| --project-dir PROJECT_DIR | No | Absolute path to the active git worktree (escape hatch). When omitted, the path is auto-resolved from --plan-id. All git calls run as git -C {project_dir} .... |
| --base-branch BRANCH | No | Base branch for diff computation. Defaults to main. |
| --contract-radius N | No | Directory levels to walk up when collecting schema-bearing markdown files (default: 3). |
TOON to stdout. The candidate-list keys are always present (possibly empty):
status: success
plan_id: {plan_id}
project_dir: {project_dir}
base_branch: {base_branch}
counts:
regexes: N1
user_facing_strings: N2
markdown_sections: N3
symmetric_pairs: N4
flag_guard_pairs: N5
contract_sources: N6
schema_bearing_files: N7
keep_markers: N8
protected_identifiers: N9
producer_consumer: N10
source_of_truth: N11
same_document_consistency: N12
description_vs_body: N13
unguarded_boundaries: N14
count_prose: N15
touched_claims: N16
total: N1+N2+N3+N4+N5+N8+N10+N11+N12+N13+N14+N16
regexes[N1]{file,line,pattern}:
{repo-relative-path},{line},{regex-pattern-string}
user_facing_strings[N2]{file,line,context,text}:
{repo-relative-path},{line},{context-tag},{string-text}
markdown_sections[N3]{file,line,heading,siblings}:
{repo-relative-path},{line},{heading},{semicolon-joined-sibling-headings}
symmetric_pairs[N4]{file,line,name,partner,test_present}:
{repo-relative-path},{line},{function-name},{inferred-partner-name},{true|false}
flag_guard_pairs[N5]{file,line,flag,forms_covered}:
{repo-relative-path},{line},{--flag},{space|equals|both}
contract_sources[N6]{file,sources}:
{repo-relative-path},{semicolon-joined-contract-source-paths}
schema_bearing_files[N7]{file,format}:
{repo-relative-path},{json|toon}
keep_markers[N8]{file,line,identifier,kind}:
{repo-relative-path},{line},{identifier},{keep_protected|keep_violation}
protected_identifiers[N9]:
{identifier}
producer_consumer[N10]{file,line,key,consumed}:
{repo-relative-path},{line},{produced-key},false
source_of_truth[N11]{name,files,values}:
{constant-name},{semicolon-joined-declaring-files},{semicolon-joined-distinct-values}
same_document_consistency[N12]{file,line,keyword,text}:
{repo-relative-path},{line},{normative-keyword},{directive-line-text}
description_vs_body[N13]{file,line,key,description}:
{repo-relative-path},{line},{description|summary},{frontmatter-description-text}
unguarded_boundaries[N14]{file,line,boundary,guarded}:
{repo-relative-path},{line},{boundary-call-kind},false
count_prose[N15]{file,line,text}:
{repo-relative-skill-md-path},{line},{matched-count-prose-line}
touched_claims[N16]{file,line,text}:
{repo-relative-path},{line},{added-line-text}
The
totalcount covers the twelve line-level heuristics (regexes,user_facing_strings,markdown_sections,symmetric_pairs,flag_guard_pairs,keep_markers,producer_consumer,source_of_truth,same_document_consistency,description_vs_body,unguarded_boundaries,touched_claims) only.contract_sources,schema_bearing_files, andcount_proseare review-anchor categories with their own counts; they are not summed intototal—contract_sourcesandschema_bearing_filesbecause each modified file contributes at most one entry whose payload is references rather than candidates, andcount_prosebecause it anchors a sibling-SKILL.md re-check rather than flagging an added line.protected_identifiersis a derived index overkeep_markersentries withkind: keep_protected— it does not contribute tototaleither.
Regexes — added lines (+ hunks) in .py and .md files containing one of:
re.compile(...), re.match(...), re.search(...), re.findall(...), re.sub(...), re.fullmatch(...)fnmatch.fnmatch(...), fnmatch.filter(...)r"..." or r'...' containing regex metacharacters (^$.*+?[](){}|\)choices=[...] or --*-globs config arrays
The pattern field captures the literal string between the function call's first quote pair (or the raw-string body), truncated to 120 characters.User-facing strings — added lines containing one of:
def /class print(...) first positional argumentdescription=, help=, epilog= (any string literal directly assigned)raise XxxError("..."), raise XxxError(f"...") first argument^#+\s+)^[-*]\s+)
The context field is one of docstring, print, argparse_description, argparse_help, argparse_epilog, raise_message, markdown_heading, markdown_bullet. The text field is the captured string, truncated to 200 characters.Markdown sections — for each .md file appearing in the diff:
^#+\s+ headings in the post-diff file contentsiblings field is a semicolon-joined list of sibling heading texts (peer headings under the same parent), excluding the entry's own headingSymmetric-pair candidates — added lines in .py files matching ^def\s+(\w+). The captured function name is split on _ and inspected for any of the 6 pair tokens: save/load, init/restore, push/pop, acquire/release, open/close, start/stop. When a match is found, the partner field is the same function name with the matched token swapped to its pair (e.g., save_state → load_state). Each entry also carries a deterministic test_present flag (Tier-2 missing-test heuristic): the test/ tree under --project-dir is searched for a word-boundary occurrence of the function name (same (?<![a-zA-Z0-9_-]) / (?![a-zA-Z0-9_-]) lookaround discipline used for keep-identifier markers). test_present=false is the Tier-2 missing-test signal — a newly added symmetric function with no test surface. The LLM half of the check (deciding whether the missing coverage is a real defect) lives in the consumer's Step 3 check 1; see ../../../plan-marshall/skills/phase-6-finalize/workflow/pre-submission-self-review.md.
Flag-guard pairs — added lines in .py files containing an argument-presence guard over a --flag token. Two guard shapes are recognized: membership/substring tests where a quoted --flag literal is the left operand of an in test (e.g., '--project-dir' in args, '--plan-id=' in argv) and startswith checks over a quoted --flag literal (e.g., arg.startswith('--project-dir')). For each guarded flag the detector classifies which flag forms the guard covers: the bare --flag token guards the space-separated form (--flag value), and the --flag= prefix guards the equals form (--flag=value). Coverage is aggregated per (file, flag) across all guards in the file — space when only the bare token appears, equals when only the --flag= prefix appears, and both when both appear. The line field records the first guard occurrence for the flag in the file. The list anchors the cognitive review's flag-form-coverage comparison: when one guard in a symmetric pair covers both forms while its sibling covers only one, the uncovered form is a defect (e.g., a --project-dir guard covering only the space form risks double-injection that violates the mutually-exclusive-arguments contract).
Contract sources — each modified file's sources field is the union of two origins:
--project-dir) looking for the nearest ancestor containing SKILL.md. When found, that SKILL.md plus every *.md under the same skill's standards/ subdirectory are sources..md files only): when a modified workflow/standards .md doc's added lines contain BOTH an execute-script.py invocation via {bundle}:{skill}:{script} notation AND a TOON-field reference (a {field} interpolation token, e.g. {status} or {error}), the referenced script's SKILL.md — resolved to marketplace/bundles/{bundle}/skills/{skill}/SKILL.md — is added as a source. The two signals need not share a line; the doc's added hunk content as a whole must satisfy both. A notation whose SKILL.md does not exist on disk surfaces nothing. This surfaces a sibling script's output-contract document on the doc that interpolates its TOON fields, even when the doc lives outside that script's skill directory.The sources field is a ; -joined, sorted, deduplicated list of the unioned repo-relative paths. A modified file with neither origin contributes no entry. The list anchors the LLM cognitive review on the contract documents that govern the changed code.
Schema-bearing files — *.md files within --contract-radius directory levels of any modified file (default 3 levels up, bounded by --project-dir) whose content contains a fenced JSON or TOON block (```json or `` ``` ``toon). The list is deduplicated; the format field reports the first fence type found. Schema-bearing files surface schema/contract documents the LLM pass must cross-reference against hunks that touch the same schema (e.g., a helper output schema declared in a markdown reference).
Keep-identifier markers — added lines containing <!-- self-review: keep <identifier> --> (see § Keep-Identifier Markers above for the marker contract). For each match the detector emits an entry with identifier, file, line, and kind. The kind is keep_protected when the identifier is still present elsewhere in the file's post-image (the marker line itself is excluded from the grep) and keep_violation when the identifier is no longer present — the second case is an orphaned marker that signals the consolidation removed a protected token. The deduplicated, sorted set of every identifier whose marker resolved to keep_protected is emitted as protected_identifiers so the LLM cognitive review can refuse a consolidation that would drop one of them.
Producer-consumer pairs — added .py lines that produce a value into a dict-keyed output slot (output['key'] = ..., output["key"] = ...) with NO matching consumer anywhere in the diff's added lines. A consumer is a subscript read (foo['key'] not on the LHS of an assignment) or a .get('key'...) call; the producer line's own LHS key never counts as a consumption of itself. The producer-consumer relation is diff-global (a key produced in one file and consumed in another counts as consumed). Each entry carries file, line (the producer line), key (the produced slot), and consumed (always false — only unconsumed producers are surfaced). A dangling producer surfaces a value emitted but never read by any downstream branch; the cognitive review's check 6 decides whether the missing consumer is a real defect.
Source-of-truth duplicates — added .py lines binding the SAME UPPER_SNAKE_CASE constant (NAME = <literal>) in two or more distinct diff files with NON-identical literal values. This is the source-of-truth drift signal: the diff changed the value in one declared SoT location but not the other. Each entry carries name (the constant), files (a ; -joined sorted list of the declaring files), and values (a ; -joined sorted list of the distinct literal RHS values, each truncated to 80 characters). A constant assigned the same value in two files, or a constant declared in a single file, surfaces nothing. The cognitive review's check 7 adjudicates which declaration is stale.
Same-document normative directives — added .md lines carrying an RFC-2119-style normative keyword (MUST, MUST NOT, SHALL, SHALL NOT, NEVER, ALWAYS, REQUIRED, FORBIDDEN). Each added normative directive is surfaced so the cognitive review can compare it against sibling normative statements ALREADY in the same document — a new normative rule that contradicts an existing one is the same-document-consistency defect. By construction this is a Mode-2 guard: an added normative line MUST surface a candidate, never an empty surface. Each entry carries file, line, keyword (the normative keyword that fired), and text (the directive line, truncated to 200 characters). The cognitive review's check 8 reads the surfaced directive against its document siblings.
Description-vs-body frontmatter — surfaces one candidate per modified .md file that carries a frontmatter description: (or summary:) key in its post-image AND has at least one added line in the document body (any added line below the closing frontmatter --- delimiter). A pure frontmatter-only edit, or a .md with no frontmatter description, surfaces nothing. The frontmatter description summarizes the document's model; when the body changes, the description may now describe a model the body no longer implements. Each entry carries file, line (the frontmatter description line in the post-image), key (description or summary), and description (the description value, truncated to 200 characters). The cognitive review's check 9 reads the description against the changed body and surfaces a drift when they diverge.
Lone unguarded boundaries — added .py lines opening a subprocess.* call (run, Popen, check_output, call, check_call) or a file-I/O call (open(, Path.read_text/write_text/read_bytes/write_bytes) that is unguarded. A subprocess.* call is unguarded when check=True is absent on the same line; a file-I/O call has no check kwarg and is always unguarded by that criterion. The second condition is that there is no enclosing try: block in the same function — tracked by a per-file walk that opens an "inside try" window at a try: opener and closes it at the next def/class header (a try cannot span a function boundary). Network calls (socket., urllib., http.client.) are out of scope and never matched, and the existing sibling-envelope unguarded-pair detection is a separate concern not re-implemented here. Each entry carries file, line, boundary (the matched call kind, e.g. subprocess.run or open), and guarded (always false for a surfaced entry). The cognitive review's check 10 decides whether the missing guard is a real defect.
Stale count-prose — for each modified file nested inside a skill directory (the nearest ancestor containing SKILL.md), every SKILL.md in that same skill directory is scanned for count-prose: a digit OR an English number word (one..twenty) immediately adjacent to one of the cardinality nouns operation, field, step, rule, command. The list is a review anchor (excluded from total), not a line-level defect flag — it surfaces every count phrase that may have gone stale because a sibling file in the directory changed. Each entry carries file (the SKILL.md path), line (the matched line, 1-based), and text (the truncated matched line), deduplicated per (file, line). The cognitive review's check 11 re-counts the referent and surfaces a drift when the prose number no longer matches.
Near-identical-hunk touched claims — for each adjacent removed/added (-/+) line pair within a hunk, both lines are tokenized; the pair fires when the two token sequences are equal in length AND differ in exactly one position (a single-token swap). The + line is surfaced so the cognitive pass re-verifies the REST of the line's claims, not just the swapped token. A whitespace-only difference (identical token sequences) and a multi-token difference are both excluded. Each entry carries file, line (the + line's post-image line number), and text (the truncated + line). The cognitive review's check 12 re-checks the surfaced line's surviving claims.
| Condition | Output |
|-----------|--------|
| Live footprint empty (no {base}...HEAD ∪ porcelain changes) | status: success with empty candidate lists (no diff scope) |
| git -C {project_dir} fails | status: error\nerror: git_unavailable\nmessage: ... (exit 1) |
| Base branch not found | status: error\nerror: base_branch_not_found\nbase_branch: {base} (exit 1) |
| Plan not found | status: error\nerror: plan_not_found (exit 1) |
This script is a worktree-scoped (Bucket B) script (per tools-script-executor/standards/cwd-policy.md): callers MAY identify the working tree via either --plan-id {plan_id} (auto-resolved through manage-status get-worktree-path) or --project-dir {worktree_path} (explicit override / escape hatch). The script does NOT reintroduce a sideways main-anchored resolver to discover its own root — the resolved path from those flags, or the uniform cwd walk-up (ADR-002), is the only authoritative source.
manage-references and manage-status reads inside this script do NOT receive --project-dir; they discover .plan/ via the uniform cwd walk-up (ADR-002) from the script's own cwd — main in phases 1-4, the pinned worktree in phase-5+.
test/plan-marshall/ext-self-review-plan-marshall/test_self_review.py covers:
.py and .md hunks (positive + negative)print(), and argparse help=test_present): true when the test/ tree references the function name, false (Tier-2 missing-test signal) when it does not, and word-boundary discipline (no substring false positives, missing test/ directory → false)both), a guard covering only the space form (space), a guard covering only the equals form (equals), the asymmetric-pair case (one both guard + one single-form sibling), and the negative case (no flag guard → empty list).md doc whose added lines reference a sibling script (execute-script.py {bundle}:{skill}:{script}) AND a TOON field ({status}) surfaces that script's SKILL.md; the doc-referenced source is unioned with any directory-structural sources; a dangling notation (no SKILL.md on disk) surfaces nothing; a notation without a TOON-field token surfaces nothing; a TOON-field token without a notation surfaces nothing--project-dir honoring (script does not discover root from cwd)keep_protected when the identifier is still grep-able in the post-image; keep_violation when the consolidation removed the token; marker syntax variations (whitespace tolerance, multiple markers per file) all recognizedoutput['key'] = ... producer with no consumer surfaces a candidate (positive); the same key read back via subscript or .get() (in the same or another file) suppresses it (negative); the producer line's own LHS key never self-consumes.md line carrying a normative keyword (MUST/NEVER/etc.) surfaces a candidate (positive); a non-normative added line, or an added .py line, surfaces nothing (negative).md with a frontmatter description/summary key AND an added body line surfaces a candidate (positive); a pure frontmatter-only edit, or a .md with no frontmatter description, surfaces nothing (negative)subprocess.run(...) with no check=True outside a try, and a file-I/O call (open(...), Path.read_text(...)) outside a try, each surface a candidate (positive); the same calls with check=True or inside a try/except in the same function surface nothing; a def header resets the try-window across functions; network calls (urllib, socket) surface nothing (out of scope)twelve fields / 5 rules surfaces those count-prose lines (positive); a digit NOT adjacent to a cardinality noun surfaces nothing; a modified file outside any skill directory surfaces nothing; the same skill dir reached via two modified siblings deduplicates per (file, line)_iter_changed_line_pairs) yields (file, post_line, removed, added) for adjacent -/+ pairs and ignores unpaired lines and context-broken pairs; a -/+ pair differing by exactly one token surfaces the + line as a touched_claim (positive); a many-token difference, an identical pair, and a differing-token-count pair each surface nothing (negative)The canonical argparse surface for self_review.py. The plugin-doctor analyzer (_analyze_manage_invocation.py) reads this section as source-of-truth for the manage-invocation-invalid and missing-canonical-block rules. Consuming docs xref this section by name instead of restating the command inline. See pm-plugin-development:plugin-script-architecture cross-skill-integration.md § "Script invocation in documentation".
python3 .plan/execute-script.py pm-plugin-development:ext-self-review-plan-marshall:self_review surface \
--plan-id PLAN_ID [--project-dir PROJECT_DIR] [--base-branch BASE_BRANCH] [--contract-radius CONTRACT_RADIUS]
--project-dir is optional: when omitted, the worktree path is auto-resolved from --plan-id. Supplying both is allowed because --plan-id also drives modified-files lookup.
../../../plan-marshall/skills/extension-api/standards/ext-point-self-review-surfacing.md — extension-point contract this skill implements../../../plan-marshall/skills/phase-6-finalize/workflow/pre-submission-self-review.md — sole consumer of this script's output../../../plan-marshall/skills/manage-execution-manifest/standards/decision-rules.md — pre_submission_self_review_inactive pre-filter that gates dispatch of the consumer step../../../plan-marshall/skills/tools-script-executor/standards/cwd-policy.md — Bucket B cwd contract this script obeysdevelopment
The single append-only change-ledger — one worktree_sha-stamped substrate for kind=build and kind=change entries — plus the first-class worktree-sha freshness API
development
Authoring standards for ASCII box diagrams in skill and doc source — box-drawing conventions, right-border alignment, and a deterministic check/fix validator over fenced/literal code blocks in .md and .adoc files
testing
Recipe for verifying and fixing alignment of ASCII box diagrams across .md skill source and .adoc documentation, one deliverable per offending file
development
Pure platform-agnostic terminal-title composition consumed by platform-runtime via PYTHONPATH