plugins/work/skills/verify-work/SKILL.md
Verifies completed work against validation criteria. Works in two modes: with a plan (runs the plan's Validation section) or ad-hoc (builds checks from git diff, project config, and project docs). Use when the user wants to "verify the work", "validate the work", or "run checks", or after completing all tasks in a plan.
npx skillsauth add bcbeidel/wos verify-workInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Verify that completed work meets validation criteria — either from a plan's Validation section or from a hypothesis built from git diff, project conventions, and project docs.
Announce at start: "I'm using the verify-work skill to verify this work."
Also fires when the user phrases the request as:
Plan files are trusted user input — the user authored the plan (or
approved its generation in /work:plan-work) and the Validation section
is part of that authored content. Commands listed there run with the
agent's normal permissions; the user's review of the plan is the
authorization for those commands. This skill does not execute arbitrary
content from anywhere else.
Plan mode: A plan file path was provided, or the user references a plan. Read the plan file. Locate the Validation section. If it contains criteria, proceed to Step 2 (Plan Preconditions).
If the Validation section is missing or empty, stop and report: "This plan has no validation criteria. Add concrete criteria to the Validation section before running validation."
Ad-hoc mode: No plan provided or referenced. Proceed to Step 1b (Build Hypothesis).
Gather signals from three sources. See adhoc-validation for the full protocol.
Git diff: Run git diff main...HEAD --stat, git diff --stat, and
git diff --cached --stat. Categorize changed files (source, tests,
config, docs).
Config files: Scan for project config files to discover available checks (test runners, linters, type checkers, build tools). Only propose checks for tools actually configured.
Project docs: Read CLAUDE.md, AGENTS.md, README.md, and
CONTRIBUTING.md for explicit test/lint/build commands and conventions.
Present the hypothesis:
Based on your changes and project setup, here's what I'd validate:
Changes detected:
- [N] source files modified ([list key files])
- [N] test files modified
- [N] doc files modified
Proposed checks:
1. [auto] `command` — description
2. [auto] `command` — description
3. [human] Description of qualitative check
Add, remove, or modify any of these? Or confirm to run.
Every proposed check must cite its signal source (git diff, config file, or project doc). Wait for user confirmation before executing.
Check that all task checkboxes are complete:
bash <plugin-skills-dir>/start-work/scripts/check_tasks_complete.sh <path>
The script exits 0 with "OK: all tasks complete" if all boxes are checked. It exits 1 and prints the open task lines if any remain. If tasks remain, report them and stop:
"[N] task(s) incomplete. Complete all tasks before validating."
Do not proceed with partial validation.
Tag each numbered item in the Validation section:
Execute commands from automated and mixed criteria in priority order (numbered list order). Capture exit code and output per criterion.
For mixed criteria, run the automated part first. If it fails, mark the criterion as failed without proceeding to the human component.
Show the full numbered list with results:
Validation Results:
1. [PASS] `python python -m pytest tests/ -v` — 42 passed
2. [FAIL] `ruff check src/` — 3 errors found
3. [PENDING] All API responses use consistent error format
4. [PENDING] Documentation covers all new endpoints
Ask the user to confirm each pending (human) criterion. Default: present the full list and ask for confirmation on all pending items at once. If the user prefers one-by-one, switch to that mode.
See human-validation for presentation patterns, judgment vs. confirmation criteria, and escalation.
If any criterion failed:
Plan mode:
executing stateIf the user adds tasks, insert them into the plan's Tasks section
(before the Validation heading, not after it). This is critical —
assess_plan.py only parses tasks under task-related headings. Tasks
appended after the Validation heading will be invisible to the execution
tooling. Update the plan file and save. The plan returns to active
execution.
Ad-hoc mode:
No plan file to update — suggestions are conversational.
When all criteria pass (automated + human confirmed) and none are marked uncertain:
Plan mode:
status: completedValidation Complete — ALL PASSED
Plan: [plan name]
Criteria: [N] total ([A] automated, [H] human, [M] mixed)
Results:
1. [PASS] criterion description
2. [PASS] criterion description (human-confirmed)
...
Status updated: executing → completed
Ad-hoc mode:
Validation Complete — ALL PASSED
Criteria: [N] total ([A] automated, [H] human)
Results:
1. [PASS] criterion description
2. [PASS] criterion description (human-confirmed)
...
executing on failure (plan mode). Never mark a
plan as failed or completed when validation criteria fail. Add tasks
to address gaps or abandon with a reason.Chainable to: finish-work (on pass), start-work (on fail)
tools
Use when the user wants to "audit a help skill", "review my plugin index", or "verify my help-skill is up to date". Audits a plugins/<plugin>/skills/help/SKILL.md against the help-skill rubric — coverage, freshness, frontmatter fidelity, plus five judgment dimensions and a trigger-collision check.
tools
Use when the user wants to "scaffold a help skill", "add a /<plugin>:help command", or "build a plugin index skill", or wants to give a plugin an orientation surface that lists its skills and common workflows. Produces a SKILL.md at plugins/<plugin>/skills/help/SKILL.md.
tools
Audits pair-level integrity of a primitive-pair (the artifact `/build:build-skill-pair` produces) by walking the four required artifact slots — principles doc, `build-<primitive>/SKILL.md`, `check-<primitive>/SKILL.md`, and the `primitive-routing.md` registration — and reports cross-artifact issues a per-SKILL.md checker cannot see: missing principles doc, divergent principles paths between halves, absent routing registration, missing build→check handoff. Per-half structural compliance with the unified pattern (`check-skill-pattern.md`) is delegated to `plugins/build/_shared/scripts/check_skill_pattern.py`. Use when the user wants to "audit a skill pair", "review a primitive pair", or "validate the skill pair for X". Not for auditing a single SKILL.md — route to `/build:check-skill`. Not for re-distilling a stale principles doc — route to `/build:build-skill-pair`.
testing
Audit a root-level resolver — verify AGENTS.md pointer, managed-region integrity, filing-table coverage against disk, context-table actionability, and trigger-eval pass rate. Use when the user wants to "audit a resolver", "validate routing table", or "find dark capabilities".