look-before-you-leap/codex-skills/lbyl-verify/SKILL.md
Verification protocol for reviewing Claude-implemented plan steps. Read plan.json, check every acceptance criterion mechanically, run type checker/linter/tests, check consumers via deps-query, report PASS or structured findings. Never modify source files.
npx skillsauth add miospotdevteam/claude-control lbyl-verifyInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are verifying work done by another AI agent (Claude). Your job is to independently confirm that the changes match the specification. You are a reviewer, not an implementer.
You must NEVER modify project source files. You may only read files, run commands, and write findings logs.
plan.json at the path given in the prompttitle — what the step is aboutdescription — what was supposed to be implementedacceptanceCriteria — the concrete conditions to verifyfiles — which files should have been modifiedprogress — the sub-tasks and their expected statusesdiscovery.md in the same directory for codebase context:
git diff --name-only to see modified tracked filesgit status --short for untracked new filesfiles array — every listed file should
appear as modified or newly createdGo through the acceptanceCriteria string word by word. For each
concrete condition:
Common verification commands:
tsc --noEmit, bun run tsgo, mypy, cargo checkeslint, ruff, clippypackage.json scripts, Makefile, pyproject.toml for
the project's standard test commandbash -n <script>Pre-existing failures are NOT exempt. If the acceptance criteria say "tsc passes" and tsc does not pass, report it as a finding — regardless of whether the failure was introduced by this step or existed before.
These checks run on EVERY step regardless of what the acceptance criteria say. They catch the most common failure patterns from historical findings.
If the step modified or created files with user-visible strings:
packages/i18n/messages/*.json) for
corresponding entriest("key", "English text")) count as missingaccessibilityLabel= "Close", placeholder="Search") count as missing — they bypass the
translation pipelinegrep -rn 't(' <changed-files> to list every
new translation call, then cross-check each key against every locale
file. Flag any key missing from any locale as MISSING_I18NIf the step modified UI code:
The step description often has more detail than the acceptance criteria.
description word by wordIf the step adds new behavior, verify companion artifacts exist:
If the step added conditional UI ({data && ...}, data?.length > 0,
guards):
If the step implements a pattern that already exists elsewhere in the codebase (swipeable rows, modals, steppers, pickers):
If the step modified shared code (types, utilities, API signatures, exports):
.claude/look-before-you-leap.local.md
with a dep_maps sectiondeps-query.py on each modified shared file:
# Find deps-query.py in the plugin
find ~/.claude/plugins -name "deps-query.py" -path "*/look-before-you-leap/*" 2>/dev/null | head -1
# Run it
python3 <path-to-deps-query.py> <project-root> "<modified-file>"
Your output is consumed by run-codex-verify.sh. It is both a human trace
and the source for <plan-dir>/codex-receipt-step-N.json.
You MUST emit:
```codex-receipt-v1
{ ...valid JSON... }
```
The fenced block MUST be the last block in the response. Do not put prose
inside the fence. Do not emit more than one codex-receipt-v1 fence.
The fenced JSON block MUST match look-before-you-leap/references/codex-receipt-schema.md
schema version 1.0.0.
Required top-level fields:
schemaVersion: exactly "1.0.0"kind: exactly "verify"stepId: numeric plan step idowner: step owner from plan.jsonmode: step mode from plan.jsonplanName: plan .namecodexExitCode: 0 when Codex completed normallycriteria: one entry per acceptance criterionfilesChanged: changed files inspected during verificationfindings: [] on PASS, otherwise structured finding objectsfinalVerdict: "PASS" or "FINDINGS"; use "FAIL" only when Codex
itself could not complete verificationgeneratedAt: UTC ISO-8601 timestampOptional but preferred fields:
projectRoot, planPathresultTxtPath, resultTxtSha256streamJsonlPath, streamJsonlSha256commandsdigestHintsEach criteria[] item MUST include:
id: 1-based criterion indexacceptanceCriterion: verbatim criterion text from plan.jsonacceptanceCriterionSha256: sha256 of the normalized criterion textverdict: "PASS", "FAIL", or "SKIPPED"evidence: array of addressable evidenceFor file evidence, use:
type: "file"file: project-relative pathlineStart and lineEnd: the evidence range; these are the schema fields
for the required evidence[].rangesha256: sha of the referenced file or relevant excerpt when availableFor command evidence, use:
type: "command"command: exact command runexitCode: command exit codestdoutSha256 and/or stderrSha256: output shas when availableFor output evidence, use:
type: "output"label: output labelsha256: output shafinalVerdict MUST be:
"PASS" only when every criteria[].verdict is "PASS", findings is
empty, and codexExitCode is 0."FINDINGS" when verification completed but any criterion failed, was
skipped, or any finding exists."FAIL" only when verification could not complete because Codex or the
verification environment failed.Before the fenced JSON, use these exact headings:
VERDICT:
<PASS|FINDINGS|FAIL> — <one sentence>
CRITERIA:
- C<id> <PASS|FAIL|SKIPPED>: <brief result>
FINDINGS:
- none
When findings exist, replace - none with one bullet per finding:
- HIGH INCOMPLETE_WORK path/to/file.ts:42 — <summary>
VERDICT:
PASS — all acceptance criteria verified.
CRITERIA:
- C1 PASS: Verified exact headings and receipt fence in the modified skill.
FINDINGS:
- none
{
"schemaVersion": "1.0.0",
"kind": "verify",
"stepId": 15,
"owner": "codex",
"mode": "codex-impl",
"projectRoot": "/Users/me/Projects/claude-code-setup",
"planPath": ".temp/plan-mode/active/codex-first-conductor/plan.json",
"planName": "codex-first-conductor",
"codexExitCode": 0,
"criteria": [
{
"id": 1,
"acceptanceCriterion": "Both SKILL.md files specify exact headings / fenced-JSON delimiters.",
"acceptanceCriterionSha256": "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef",
"verdict": "PASS",
"evidence": [
{
"type": "file",
"file": "look-before-you-leap/codex-skills/lbyl-verify/SKILL.md",
"lineStart": 176,
"lineEnd": 260,
"sha256": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"note": "Output contract includes exact heading and fence delimiters."
},
{
"type": "command",
"command": "python3 - <<'PY' ... yaml frontmatter validation ... PY",
"exitCode": 0,
"stdoutSha256": "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
}
]
}
],
"filesChanged": [
{
"path": "look-before-you-leap/codex-skills/lbyl-verify/SKILL.md",
"changeType": "modified",
"sha256After": "cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc"
}
],
"commands": [
{
"command": "python3 - <<'PY' ... assert removed mode token absent ... PY",
"exitCode": 0,
"stdoutSha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
}
],
"findings": [],
"finalVerdict": "PASS",
"generatedAt": "2026-04-24T18:32:11Z"
}
VERDICT:
FINDINGS — one acceptance criterion is not satisfied.
CRITERIA:
- C1 FAIL: Required fenced JSON delimiter is missing.
- C2 PASS: Frontmatter remains valid.
FINDINGS:
- HIGH INCOMPLETE_WORK look-before-you-leap/codex-skills/lbyl-verify/SKILL.md:176 — Missing codex-receipt-v1 output fence.
{
"schemaVersion": "1.0.0",
"kind": "verify",
"stepId": 15,
"owner": "codex",
"mode": "codex-impl",
"projectRoot": "/Users/me/Projects/claude-code-setup",
"planPath": ".temp/plan-mode/active/codex-first-conductor/plan.json",
"planName": "codex-first-conductor",
"codexExitCode": 0,
"criteria": [
{
"id": 1,
"acceptanceCriterion": "Both SKILL.md files specify exact headings / fenced-JSON delimiters.",
"acceptanceCriterionSha256": "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef",
"verdict": "FAIL",
"rationale": "The output contract still describes prose-only PASS reporting.",
"evidence": [
{
"type": "file",
"file": "look-before-you-leap/codex-skills/lbyl-verify/SKILL.md",
"lineStart": 176,
"lineEnd": 180,
"sha256": "dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd",
"note": "No codex-receipt-v1 fence is specified."
}
]
},
{
"id": 2,
"acceptanceCriterion": "Frontmatter stays valid.",
"acceptanceCriterionSha256": "eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee",
"verdict": "PASS",
"evidence": [
{
"type": "command",
"command": "python3 - <<'PY' ... yaml frontmatter validation ... PY",
"exitCode": 0,
"stdoutSha256": "ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff"
}
]
}
],
"filesChanged": [
{
"path": "look-before-you-leap/codex-skills/lbyl-verify/SKILL.md",
"changeType": "modified"
}
],
"commands": [
{
"command": "rg -n \"codex-receipt-v1\" look-before-you-leap/codex-skills/lbyl-verify/SKILL.md",
"exitCode": 1,
"stdoutSha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
}
],
"findings": [
{
"severity": "HIGH",
"category": "INCOMPLETE_WORK",
"file": "look-before-you-leap/codex-skills/lbyl-verify/SKILL.md",
"lineStart": 176,
"lineEnd": 180,
"summary": "Missing codex-receipt-v1 output fence.",
"rationale": "The wrapper cannot extract a parseable JSON receipt from prose-only output.",
"suggestedFix": "Add the exact codex-receipt-v1 fenced JSON block required by the schema.",
"criterionId": 1
}
],
"finalVerdict": "FINDINGS",
"generatedAt": "2026-04-24T18:55:09Z"
}
When you find issues (finalVerdict is not "PASS"), write a JSON findings
report to the plugin repo's usage-errors/codex-findings/ directory.
The plugin repo is always at ~/Projects/claude-code-setup — write
findings there regardless of which project the plan runs in. Create the
directory if it does not exist.
Filename: YYYY-MM-DD-{plan-name}-step-{N}.json
Re-verify rounds: YYYY-MM-DD-{plan-name}-step-{N}-reverify-{M}.json
Get the plan name from plan.json's .name field. Use today's date.
{
"plan": "<plan.name>",
"project": "<project root path>",
"step": <step id>,
"stepTitle": "<step.title>",
"acceptanceCriteria": "<step.acceptanceCriteria>",
"date": "YYYY-MM-DD",
"findings": [
{
"severity": "HIGH | MEDIUM | LOW",
"category": "INCOMPLETE_WORK | MISSED_CONSUMER | TYPE_SAFETY | SILENT_SCOPE_CUT | WRONG_PATTERN | MISSING_TEST | MISSING_I18N | OTHER",
"file": "relative/path/to/file",
"line": 0,
"summary": "One-line description",
"detail": "Full explanation with suggested fix",
"preventable": "Which instruction could have prevented this"
}
]
}
Severity guide:
development
Use after discovery to write implementation plans with TDD-granularity steps. Produces plan.json (immutable definition, frozen after approval), progress.json (mutable execution state), and masterPlan.md (user-facing proposal for Orbit review). Every step is one component/feature; TDD rhythm (test, verify fail, implement, verify pass, commit) lives in its progress items. Consumes discovery.md from exploration phase. Make sure to use this skill whenever the user says discovery is done, exploration is finished, discovery.md is ready, or asks to write/create/draft the implementation plan — even if they don't mention plan.json or masterPlan.md by name. Also use when the user references completed exploration findings, blast radius analysis, or consumer mappings and wants them converted into actionable steps. Do NOT use when: the user says 'just do it' or 'no plan', resuming or executing an existing plan, during exploration or brainstorming (discovery not yet complete), debugging, or code review.
tools
End-to-end webapp testing with Playwright MCP integration. Use when: writing Playwright tests, E2E testing, browser testing, webapp testing, visual regression testing, accessibility testing with axe-core, testing user flows through a web UI, verifying frontend behavior in a real browser. Integrates with test-driven-development skill for test-first browser tests and engineering-discipline for verification. Do NOT use when: unit tests only (no browser UI involved), API tests without UI, mobile native testing (use react-native-mobile), testing CLI tools, or writing backend-only integration tests.
development
Test-Driven Development workflow enforcing red-green-refactor cycles. Use when writing new features, adding behavior, or implementing functions where tests should drive design. Requires explicit test-first prompting because Claude naturally writes implementation first. Integrates with writing-plans (TDD rhythm in Progress items) and engineering-discipline (verification). Do NOT use when: fixing a bug in existing tested code (use systematic-debugging), writing tests for existing untested code (characterization tests are a different workflow), refactoring without behavior change (use refactoring), or the project has no test infrastructure.
development
Use when encountering any bug, test failure, or unexpected behavior. Enforces root cause investigation before fixes. Four phases: investigate, analyze patterns, form hypotheses, implement. Prevents guess-and-check thrashing. Use ESPECIALLY when under pressure or when 'just one quick fix' seems obvious. Do NOT use for: learning unfamiliar APIs (use exploration), performance optimization without a specific regression, or code review without a reported bug.