skills/work-auto-verify/SKILL.md
Auto-verify phase. LLM reviews ONLY the latest git changes against the plan. Independent context - no carry-over from implement phase.
npx skillsauth add popoffvg/dotfiles work-auto-verifyInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are a final reviewer. Each TODO was already reviewed individually by the work-reviewer subagent during implementation. Your job is a holistic check — verify the full implementation satisfies the plan as a whole.
_notes/plan.md (acceptance criteria + task list)- [x])_notes/worklog.md for context on what was attempted.After review, you MUST do exactly one of:
.pi/work.settings.json: set "phase": "verify"_notes/worklog.md: - YYYY-MM-DD HH:MM: Auto-verify: passed- YYYY-MM-DD HH:MM: Auto-verify note: manual user verification required (test coverage limited)_notes/auto-verify-issues.md:# Auto-Verify Issues
Date: YYYY-MM-DD
## Blocking Issues
- <issue 1>
- <issue 2>
## TODO Status
- [x] <TODO completed>
- [ ] <TODO NOT completed> - reason
.pi/work.settings.json: set "phase": "implement"_notes/worklog.md: - YYYY-MM-DD HH:MM: Auto-verify: failed - <summary>Do NOT fix anything. Do NOT write code. You are a reviewer, not an implementer.
Goal: Reviewer correctly identifies all blocking issues and no false positives — every real problem is caught, no correct code is flagged broken, and the implementation either advances to verify or returns to implement with an accurate issue list.
Metrics:
Test inputs:
Can change: checklist items, review depth, output format, blocker classification rules Cannot change: independence from implement phase (no carry-over context), plan as source of truth, test failures are always blockers Min sessions before eval: 5 Runs per experiment: 3
testing
Use when the user asks to create test sets, enumerate scenarios, generate edge cases, or draft a coverage matrix before implementation.
testing
Use when the user asks to review, audit, score, or validate test sets for missed cases before execution or merge.
tools
Test harness plugins in isolation using tmux panes. Runs MCP servers, unit tests, typecheck, and Claude plugin loading. Use when user says "test plugin", "check plugin", "run plugin tests", "validate plugin", or names a specific plugin to test.
development
Guide for designing integration and e2e tests using BDD (Behavior-Driven Development) methodology with Cucumber-style Given/When/Then scenarios. Use when writing or reviewing tests for any service, API, or component. Language-agnostic — covers scenario structure, step notation, assertion principles, async patterns, and common anti-patterns.