look-before-you-leap/skills/lbyl-digest/SKILL.md
INTERNAL conductor-dispatched digester skill for the look-before-you-leap plugin. Reads raw on-disk artifacts (Codex exploration MD, consensus batch MD files, codex-receipt-step-N.json + diff + modified files) and produces bounded, substance-preserving digests so the main Claude thread never reads raw bytes. Three operating modes: (1) co-exploration digester — merges Claude+Codex exploration outputs into discovery-digest.md; (2) consensus digester — collapses multi-batch consensus outputs into consensus-round-N-digest.md with ACCEPT/MODIFY/REJECT counts and open disagreements; (3) verification digester — reads the codex-receipt-step-N.json evidence artifact, the file changes Codex made, and the modified files themselves, and writes a sibling claude-review file with PASS/FINDINGS verdict + per-finding evidence. Dispatched ONLY by the conductor (look-before-you-leap, codex-dispatch, writing-plans) via the Skill or Agent tool. NEVER assignable as a plan-step skill in plan.json. NEVER write code, NEVER mint HMAC sidecars, NEVER pass --model flags.
INTERNAL skill — conductor-dispatched only.
This skill is invoked exclusively by the conductor (the look-before-you-leap, codex-dispatch, and writing-plans skills) through the Skill tool or as the sub-agent context for an Agent tool dispatch. It MUST NOT appear as the value of step.skill in any plan.json. It is not a plan-routable skill: it has no acceptance criteria, no progress items, and no result template. It is a stateless reader/summarizer that the conductor uses to keep its own context bounded.

If you find yourself reading this SKILL.md because a plan step listed look-before-you-leap:lbyl-digest as its skill, that is a routing error. Stop, log it, fall back to "none" for that step, and notify the conductor.
The codex-first-conductor refactor moves implementation, verification,
and exploration into sub-agents so the main Claude thread reads only
receipts and digests, never raw artifacts. Three classes of raw
artifact recur:
- Co-exploration output: codex-exploration.md (often plus codex-convergence.md). Together these can be hundreds of lines of bullets, half of which restate things the conductor already knows.
- Consensus batches: codex-dispatch runs Codex in batches of 5 and produces codex-consensus-batch-1.md, -batch-2.md, etc. Each one repeats the ACCEPT/MODIFY/REJECT structure. The conductor needs counts and the surviving disagreements, not the full prose.
- Verification evidence: codex-receipt-step-N.json (per the schema in references/codex-receipt-schema.md) plus git diff of the step's files plus the files themselves. The conductor needs to know whether the Codex receipt actually matches what changed on disk.

lbyl-digest reads each class of artifact in a fresh sub-agent context and returns a bounded payload to the conductor.
Include every finding that would change a plan decision or identify a consumer/blast-radius. Drop restatements, tutorials, and file listings that aren't consumed.
This is the central correctness rule. It is not a brevity guideline — it is a substance rule. Trim restatement, never trim substance. There is no hard size limit on any digest. A correct digest of a 500-finding consensus round may legitimately be longer than an incorrect digest of a 20-finding round.
Bad (restatement, would be cut):
"Codex reviewed step 4 and noted that step 4's acceptance criteria include passing tsc. Codex confirmed that tsc passes. Step 4 also requires that the new component renders without errors, and Codex confirmed it does. Codex reviewed each of the 6 acceptance criteria and confirmed all 6 pass."
Why bad: The conductor already has the acceptance criteria from
plan.json. Repeating "Codex confirmed criterion N" 6 times burns
context for zero new information. The PASS verdict alone (with
per-criterion verdict counts pulled from the receipt) carries the same
information.
Good (substance, must be kept):
"PASS, 6/6 criteria. Note: Codex reports that
src/components/Modal.tsx:142-156 was modified to handle the new prop, but the same prop type is also consumed by src/components/Drawer.tsx:88 and src/components/Sheet.tsx:73-80, neither of which Codex modified. Conductor should grep these consumers before marking step done."
Why good: The blast-radius observation (two unmodified consumers of the
same prop type) is exactly the kind of finding that would change a
plan decision. It is not in the receipt's criteria[] array — it is a
cross-cutting observation that only emerges when the digester reads
the receipt and the surrounding files together.
- NEVER pass --model flags when dispatching sub-agents or running codex exec from inside this skill. Inherit the machine defaults.
- NEVER select sonnet, haiku, gpt-5, or any non-default model variant: not "to save tokens", not "because the task seems small", not "because the dispatch is just a digester".

A digester running on a weaker model produces a worse digest, and the conductor cannot detect the degradation because the conductor is deliberately not reading the raw artifacts. Model downgrades here silently destroy the conductor's accuracy. There are zero acceptable exceptions.
If you find yourself typing --model, -m, --effort low, or any
similar override flag, stop. The default is the contract.
- NEVER mint or modify HMAC sidecars. The sidecar under ~/.claude/look-before-you-leap/state/<projectId>/<planId>/ is written ONLY by run-codex-verify.sh / run-codex-implement.sh via receipt_utils.sign(). Touching the sidecar from a digester would invalidate the artifact↔sidecar binding (see Hook/security note below).
- NEVER modify codex-receipt-step-N.json in place. The HMAC sidecar binds data.artifactSha256 to the exact bytes of the artifact on disk; mutating the artifact breaks the strict verifier in references/codex-receipt-schema.md §1.1.
- NEVER treat .codex-result-step-N.txt as authoritative. That file is a human trace only (per references/codex-receipt-schema.md §1 and §6). Authoritative parsing reads codex-receipt-step-N.json only.

For the co-exploration digester, the conductor passes:
- <plan-dir>: absolute path to the active plan directory (e.g., .temp/plan-mode/active/<plan-name>/).
- discovery.md (already contains Claude's exploration notes; may also contain Codex's appended findings).

Files to read:
- <plan-dir>/discovery.md: Claude's exploration notes plus any ## [Codex: …] sections appended via cat codex-exploration.md >> discovery.md.
- <plan-dir>/codex-exploration.md: raw Codex exploration output from Phase 1 (if present and not yet merged).
- <plan-dir>/codex-convergence.md: raw Codex convergence output from Phase 2 (if present).

Fields to extract:
Discard:
Write to: <plan-dir>/discovery-digest.md
Format: one section per topic, each topic begins with the topic name in square brackets and a one-line headline, then bullets with concrete file:line evidence. Example:
# Discovery Digest — <plan-name>
## [Blast radius]
- src/lib/auth-guard.ts is consumed by 14 callers across 3 packages
(apps/web/src/middleware/*.ts:12-89, apps/api/src/handlers/*.ts:7-42,
packages/shared/auth/index.ts:5). Plan must split per package or
serialize.
## [Disagreements]
- Claude says SessionStore is stateful (discovery.md:88-94); Codex says
the writes are debounced via the queue at src/lib/session-queue.ts:34.
Resolution: both correct — store is stateful but writes are async.
Conductor should plan tests that exercise both paths.
## [Open questions]
- Should the new auth flow share the existing rate-limit middleware
(apps/api/src/middleware/rate-limit.ts:18) or get its own? No
precedent in repo.
Payload returned to the conductor:
{
"kind": "co-exploration",
"digestPath": "<plan-dir>/discovery-digest.md",
"topicsCount": <int>,
"openQuestionsCount": <int>,
"summary": "<2-4 sentence prose>: blast radius headline, key
disagreement (if any), open questions count."
}
The conductor reads summary, sees openQuestionsCount, and decides
whether to surface open questions to the user before proceeding to
writing-plans. The conductor does NOT read the digest file unless
the summary indicates it must.
Drop file inventories ("here are the 47 files in src/lib"). Keep consumer counts and disagreement citations — those change planning decisions.
For the consensus digester, the conductor passes:
- <plan-dir>: absolute path to the active plan directory.
- <round-N>: integer round number (1, 2, or 3).

Files to read (glob patterns):
- <plan-dir>/codex-consensus-round<N>.md: single-call consensus output (used when plan has ≤5 steps).
- <plan-dir>/codex-consensus-batch-*.md: multi-batch outputs (used when plan has >5 steps; may be batch-1.md, batch-2.md, etc.).
- <plan-dir>/codex-consensus-cross-cutting.md: optional, present only when the conductor ran a follow-up cross-cutting check across merged batches.

Fields to extract:
- Per-step verdicts (ACCEPT, MODIFY <changes>, or REJECT <reason>).

Discard:
- Restatements of what is already in plan.json; the conductor has those.

Write to: <plan-dir>/consensus-round-<N>-digest.md
Format:
# Consensus Round <N> Digest — <plan-name>
## Counts
- ACCEPT: <int>
- MODIFY: <int>
- REJECT: <int>
- Total steps reviewed: <int>
## Modifications (one bullet per step that needs work)
- Step <id> "<title>": MODIFY — <Codex's concrete change request,
one line>. Conductor action: <accept | counter-propose | escalate>.
## Rejections
- Step <id> "<title>": REJECT — <reason>. Conductor action: <…>.
## Cross-cutting concerns
- <one bullet per cross-cutting issue, with the steps it touches>
## Open disagreements (carry to next round if any)
- <one bullet per disagreement Claude has not yet responded to>
Payload returned to the conductor:
{
"kind": "consensus",
"round": <N>,
"digestPath": "<plan-dir>/consensus-round-<N>-digest.md",
"counts": { "accept": <int>, "modify": <int>, "reject": <int>,
"total": <int> },
"decisions": [
{ "stepId": <int>, "title": "<step title>",
"verdict": "MODIFY" | "REJECT",
"request": "<one-line: Codex's concrete change/reason>",
"conductorAction": "accept" | "counter-propose" | "escalate" }
],
"openDisagreements": [
{ "stepId": <int>, "title": "<step title>",
"summary": "<one-line>" }
],
"summary": "<2-3 sentence prose>: counts, headline disagreements."
}
decisions MUST include EVERY MODIFY and REJECT step (no cap — each
one is an actionable plan decision the conductor must respond to;
ACCEPT items are intentionally omitted because the count alone is
sufficient). openDisagreements carries any disagreement Claude has
not yet responded to (may overlap with decisions when a MODIFY is
also unresolved across rounds).
The conductor reads counts to decide whether the plan can advance to
Orbit (e.g., reject == 0 && openDisagreements.length == 0) or
whether Round 2 is needed. It reads decisions to know which steps
need plan edits and what action to take, and openDisagreements to
know what to respond to — all without opening the digest file.
Drop the per-step ACCEPT prose — a single count is sufficient. Keep every MODIFY/REJECT line because each one is a plan decision the conductor must respond to.
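The counting and gating described above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's code: the function names summarize_verdicts and can_advance are hypothetical, and the gate is only the example condition quoted in the text (reject == 0 and no open disagreements).

```python
from typing import Any


def summarize_verdicts(verdicts: dict[int, str]) -> dict[str, int]:
    """Count ACCEPT/MODIFY/REJECT verdicts keyed by step id (hypothetical helper)."""
    counts = {"accept": 0, "modify": 0, "reject": 0}
    for verdict in verdicts.values():
        # Verdict strings are assumed to be exactly ACCEPT, MODIFY or REJECT.
        counts[verdict.lower()] += 1
    counts["total"] = len(verdicts)
    return counts


def can_advance(payload: dict[str, Any]) -> bool:
    """Example gate from the text: no rejections and no open disagreements."""
    return (
        payload["counts"]["reject"] == 0
        and len(payload["openDisagreements"]) == 0
    )
```

A conductor-side caller would pass the returned consensus payload straight to can_advance and schedule another round when it returns False.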
For the verification digester, the conductor passes:
- <plan-dir>: absolute path to the active plan directory.
- <step-N>: integer step number being verified.
- <project-root>: absolute path to the project root (for git diff scoping).

Files to read:
- <plan-dir>/codex-receipt-step-<N>.json: the authoritative evidence artifact, schema per references/codex-receipt-schema.md §2. This is the only source of truth for what Codex claims it did.
- <project-root> git diff scoped to the step's files array (read via git diff -- <file1> <file2> … against the parent of Codex's working commit). The digester checks that receipt.filesChanged[].path matches the actual diff and that receipt.filesChanged[].sha256After matches the file on disk.
- The modified files, at the line ranges cited in receipt.criteria[].evidence[].file/lineStart/lineEnd. The digester reads only the cited line ranges, not entire files.

Fields to extract from the receipt (per schema):
- schemaVersion, kind, stepId, owner, mode: sanity check.
- codexExitCode: must be 0 for any non-FAIL outcome.
- criteria[].id, criteria[].acceptanceCriterion, criteria[].acceptanceCriterionSha256, criteria[].verdict, criteria[].evidence[]: per-criterion verdicts and evidence.
- filesChanged[].path, filesChanged[].changeType, filesChanged[].sha256After: what Codex says it changed.
- findings[]: Codex's self-reported findings (severity, category, file, line, summary).
- finalVerdict: Codex's mechanically-computed top-level verdict (PASS / FINDINGS / FAIL).
- digestHints (optional): Codex's hints to digesters about which files/findings to surface first.

Independent checks the digester MUST run:
- Diff matches receipt: every path in git diff --name-only for the step's files MUST appear in receipt.filesChanged[] (and vice versa). Mismatches are a FINDINGS-class problem: Codex modified a file it did not declare, or declared a file it did not modify.
- sha256 match: for each receipt.filesChanged[i].path, re-hash the file at <project-root>/<path> and verify it equals receipt.filesChanged[i].sha256After. A mismatch means the file has been touched after Codex finished; escalate.
- Evidence check: for each criteria[i].evidence[] entry of type: "file", read the cited line range and verify it actually contains code that satisfies criteria[i].acceptanceCriterion. This is the qualitative check the receipt cannot self-attest.
- Findings consistency: if receipt.findings[] is non-empty, note that the receipt's own finalVerdict should be FINDINGS (not PASS). If the receipt claims PASS while findings exist, that is a schema violation; escalate.

Write to: <plan-dir>/codex-receipt-step-<N>.claude-review.json
This is a sibling file to the receipt, NOT an in-place update of
the receipt. The receipt JSON is left byte-identical so the HMAC
sidecar binding in ~/.claude/look-before-you-leap/state/… remains
valid (see Hook/security note below).
Sibling-file shape:
{
"schemaVersion": "1.0.0",
"kind": "claude-verification-digest",
"stepId": <N>,
"receiptPath": "<plan-dir>/codex-receipt-step-<N>.json",
"receiptSha256": "<hex sha256 of the receipt file as read>",
"claudeVerified": "PASS" | "FINDINGS",
"findings": [
{
"severity": "HIGH" | "MEDIUM" | "LOW",
"category": "INCOMPLETE_WORK" | "MISSED_CONSUMER" |
"TYPE_SAFETY" | "SILENT_SCOPE_CUT" |
"WRONG_PATTERN" | "MISSING_TEST" |
"MISSING_I18N" | "OTHER",
"summary": "<one-line>",
"rationale": "<why this is a finding>",
"evidence": [
{ "type": "file", "file": "<rel path>",
"lineStart": <int>, "lineEnd": <int> }
],
"criterionId": <int | null>
}
],
"crossChecks": {
"diffMatchesReceipt": true | false,
"sha256AllMatch": true | false,
"findingsReceiptConsistent": true | false
},
"generatedAt": "<ISO 8601 UTC>"
}
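A sibling file of this shape could be produced along the following lines. This is a sketch under stated assumptions: write_review is a hypothetical helper, and the rule that claudeVerified becomes FINDINGS when any finding exists or any cross-check is false is an assumption, not something the schema above states. The receipt is opened for reading only, so its bytes stay unchanged.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def write_review(
    receipt_path: Path,
    step_id: int,
    findings: list[dict],
    cross_checks: dict[str, bool],
) -> Path:
    """Write the sibling review next to the receipt (hypothetical helper)."""
    receipt_bytes = receipt_path.read_bytes()  # read-only: never written back

    # Assumption: any finding or failed cross-check yields FINDINGS.
    passed = not findings and all(cross_checks.values())

    review = {
        "schemaVersion": "1.0.0",
        "kind": "claude-verification-digest",
        "stepId": step_id,
        "receiptPath": str(receipt_path),
        "receiptSha256": hashlib.sha256(receipt_bytes).hexdigest(),
        "claudeVerified": "PASS" if passed else "FINDINGS",
        "findings": findings,
        "crossChecks": cross_checks,
        "generatedAt": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

    # codex-receipt-step-N.json -> codex-receipt-step-N.claude-review.json
    review_path = receipt_path.with_name(receipt_path.stem + ".claude-review.json")
    review_path.write_text(json.dumps(review, indent=2), encoding="utf-8")
    return review_path
```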
Why a sibling file rather than an in-place claudeVerified field on
the artifact: see "Hook/security note" below. The sibling is bound to
the receipt by receiptPath + receiptSha256 — downstream readers
(hook updates in plan steps 5–10) verify the binding by re-hashing
the receipt file and checking equality.
Payload returned to the conductor:
{
"kind": "verification",
"stepId": <N>,
"claudeVerified": "PASS" | "FINDINGS",
"findingCount": <int>,
"reviewPath": "<plan-dir>/codex-receipt-step-<N>.claude-review.json",
"criteria": [
{ "id": <int>, "verdict": "PASS" | "FAIL" | "SKIPPED" }
],
"summary": "<2-4 sentence prose>: PASS or FINDINGS verdict, the
top-severity finding category if FINDINGS, the
cross-check that failed if any."
}
criteria MUST include EVERY entry from the receipt's criteria[]
array (id + verdict only — no rationale/evidence; those live in the
sibling reviewPath file). This mirrors the receipt schema's
criteria[] shape (see references/codex-receipt-schema.md §2,
required: ["id", ..., "verdict", ...]) so the conductor can show
per-criterion status without opening the review file. No size cap is
needed: criterion counts are bounded by the step's
acceptanceCriteria length (typically 1–6).
The conductor reads claudeVerified + findingCount for the gate
decision and criteria to surface per-criterion status. On PASS it
proceeds to mark the step done (subject to the existing strict
receipt gate). On FINDINGS it inspects criteria for the failed ids
and decides whether to re-dispatch Codex, patch via a Claude
sub-agent, or escalate.
Drop the receipt's per-criterion prose if every criterion verdict is
PASS — the verdict counts and claudeVerified: PASS carry the
information. Keep every cross-check failure (diff/receipt mismatch,
sha256 mismatch) because each one is a trust-anchor problem the
conductor must act on.
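The three mechanical cross-checks (diff versus receipt, sha256 re-hash, findings versus finalVerdict) might be computed roughly as below. Treat it as a sketch: cross_check is a hypothetical name, diff_paths is assumed to be the already-collected output of git diff --name-only for the step's files, and every filesChanged entry is assumed to point at a file that still exists (deleted files would need separate handling that the text above does not specify).

```python
import hashlib
from pathlib import Path


def cross_check(receipt: dict, diff_paths: set[str], project_root: Path) -> dict[str, bool]:
    """Return the crossChecks booleans for one parsed receipt (hypothetical helper)."""
    declared = {entry["path"] for entry in receipt["filesChanged"]}

    def hash_matches(entry: dict) -> bool:
        # Re-hash the file on disk and compare with the receipt's sha256After.
        target = project_root / entry["path"]
        if not target.is_file():
            return False
        return hashlib.sha256(target.read_bytes()).hexdigest() == entry["sha256After"]

    return {
        # Both directions: undeclared edits and declared-but-untouched files.
        "diffMatchesReceipt": declared == diff_paths,
        "sha256AllMatch": all(hash_matches(e) for e in receipt["filesChanged"]),
        # A receipt with findings must not claim PASS.
        "findingsReceiptConsistent": not (
            receipt["findings"] and receipt["finalVerdict"] == "PASS"
        ),
    }
```

The qualitative evidence check (does the cited line range satisfy the criterion) is deliberately left out, since it requires reading the code rather than comparing hashes.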
The digester MUST NOT modify the HMAC sidecar. The signed sidecar
in ~/.claude/look-before-you-leap/state/<projectId>/<planId>/codex_verify-step-<N>.json
binds data.artifactSha256 to the exact bytes of
<plan-dir>/codex-receipt-step-<N>.json on disk. Any in-place
mutation of the artifact — including adding a claudeVerified key —
would change the artifact bytes, change the sha256, and break the
binding. The strict verifier defined in
references/codex-receipt-schema.md §1.1 rejects sidecars whose
data.artifactSha256 does not match the on-disk file. So an in-place
update would silently invalidate the receipt the next time the hook
runs, and the step would become un-verifiable.
The digester ALSO MUST NOT re-mint or re-sign the sidecar. Only
run-codex-verify.sh and run-codex-implement.sh are allowed to call
receipt_utils.sign() (the secret key in
~/.claude/look-before-you-leap/state/secret.key is mode 0600 and the
direction-locked scripts are the only sanctioned writers). Calling
receipt_utils.sign() from a digester sub-agent would conceptually
extend the trust boundary in a way the security model does not
permit.
Chosen approach: sibling file. The digester writes
<plan-dir>/codex-receipt-step-<N>.claude-review.json next to the
receipt, with receiptPath + receiptSha256 binding the review back
to the exact receipt bytes it inspected. Downstream consumers
(verify-step-completion.sh updates in plan steps 5–10, the
conductor's done-gate) check both files exist, that
review.receiptSha256 == sha256(open(review.receiptPath).read()), and
that review.claudeVerified == "PASS" before allowing the step to be
marked done.
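The done-gate condition described above can be illustrated in Python. This is only a sketch of the check, not the contents of verify-step-completion.sh; review_allows_done is a hypothetical name, and the receipt location is taken from the review's own receiptPath field.

```python
import hashlib
import json
from pathlib import Path


def review_allows_done(review_path: Path) -> bool:
    """True only if the review exists, binds to the current receipt bytes, and says PASS."""
    if not review_path.is_file():
        return False
    review = json.loads(review_path.read_text(encoding="utf-8"))

    receipt_path = Path(review.get("receiptPath", ""))
    if not receipt_path.is_file():
        return False

    # Re-hash the receipt as it is on disk now; drift since review breaks the binding.
    fresh = hashlib.sha256(receipt_path.read_bytes()).hexdigest()
    return (
        review.get("receiptSha256") == fresh
        and review.get("claudeVerified") == "PASS"
    )
```

Because the hash is recomputed at gate time, a receipt edited after review fails the gate even when the review itself still says PASS.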
This approach was chosen over two alternatives:
- An in-place claudeVerified field on the artifact: rejected because it would invalidate the HMAC sidecar binding (above).

Plan steps 5–10 (which wire the new digest pipeline into hooks) MUST update verify-step-completion.sh to:
- Read the sibling *.claude-review.json if it exists.
- Verify that review.receiptSha256 matches a fresh re-hash of the artifact (catches review-vs-artifact drift).
- Require review.claudeVerified == "PASS" for owner: "codex" steps, replacing the current "result must contain Claude: verified" string check.

The conductor never calls this skill speculatively, only when raw artifacts exist on disk that need digesting before the conductor reads them. Three concrete invocation sites in the existing plugin skills:
| Conductor site | Mode | When |
|---|---|---|
| look-before-you-leap Step 1 (Explore) Phase 1/2 — co-exploration | (1) co-exploration | After both codex-exploration.md and codex-convergence.md exist on disk and Claude's Phase 1 notes are appended to discovery.md. |
| codex-dispatch "Plan Consensus Dispatch" Round 1/3 | (2) consensus | After all codex-consensus-batch-*.md (and optional codex-consensus-cross-cutting.md) for round N have been written by background codex exec calls. |
| look-before-you-leap Step 3 execution loop, codex-impl verification | (3) verification | After run-codex-implement.sh has finished and the conductor has the codex-receipt-step-N.json artifact on disk. |
In all three cases the conductor dispatches a sub-agent (Agent tool,
subagent_type: "general-purpose") and the sub-agent loads this
SKILL.md as its primary guidance. The sub-agent receives the input
contract values (plan-dir, step-N, etc.) in its prompt, performs the
reads, writes the on-disk output, and returns the payload shape
defined for that mode. The conductor consumes only the returned
payload — never the underlying raw files.
If the conductor needs to dispatch this skill to inspect a raw artifact that does not fit one of the three modes (e.g., an ad-hoc debugging request to digest a single Codex stream JSONL), it MUST fall back to a generic "read this file and return a bounded summary" sub-agent dispatch — not pretend it's one of the three modes. The three modes have on-disk output contracts that downstream hooks rely on; do not reuse those output paths for ad-hoc digests.
| Mode | Reads | Writes | Returns |
|---|---|---|---|
| co-exploration | discovery.md, codex-exploration.md, codex-convergence.md | <plan-dir>/discovery-digest.md | { kind, digestPath, topicsCount, openQuestionsCount, summary } |
| consensus | codex-consensus-round<N>.md OR codex-consensus-batch-*.md (+ optional codex-consensus-cross-cutting.md) | <plan-dir>/consensus-round-<N>-digest.md | { kind, round, digestPath, counts, decisions, openDisagreements, summary } |
| verification | codex-receipt-step-<N>.json + git diff of step files + cited line ranges in modified files | <plan-dir>/codex-receipt-step-<N>.claude-review.json (sibling, NOT in-place) | { kind, stepId, claudeVerified, findingCount, reviewPath, criteria, summary } |
Before returning the payload, the digester MUST verify:
- No --model flag was passed in any sub-shell invocation made during the digest.
- In verification mode, the sibling .claude-review.json is the only new file written.

If any check fails, fix before returning. Returning a bad digest silently corrupts every downstream conductor decision.