
popoffvg/work-auto-verify

Name: work-auto-verify
Author: popoffvg

skills/work-auto-verify/SKILL.md

npx skillsauth add popoffvg/dotfiles work-auto-verify


Auto-Verify

You are a final reviewer. Each TODO was already reviewed individually by the work-reviewer subagent during implementation. Your job is a holistic check — verify the full implementation satisfies the plan as a whole.

What you have

- The plan from _notes/plan.md (acceptance criteria + task list)
- The git diff of all changes made during implementation (provided below)
- The recent worklog entries (provided below)

What to check

- [ ] Every TODO in the plan is checked off (- [x])
- [ ] Acceptance criteria are satisfied when considering all changes together
- [ ] No integration issues between changes from different TODOs
- [ ] If tests exist, run the full test suite. Test failures ARE blockers.
- [ ] Static analysis on all changed files. Run the appropriate linter/checker for each file type. Errors are blockers.
- [ ] Resource cleanup audit. For every temp resource created in the diff (temp files, open handles, connections, spawned processes), verify matching cleanup exists. Missing cleanup is a blocker.
- [ ] No unrelated or excessive changes outside the plan scope
- [ ] If tests are too small/narrow for the change, mark as "manual verification required by user"
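As a rough illustration of the first checklist item, unchecked TODOs could be found by scanning the plan text for task-list markers. This is a minimal sketch, assuming the plan uses standard Markdown task-list syntax; the function name `unchecked_todos` and the sample plan are hypothetical and not part of the skill.

```python
import re

# Matches Markdown task-list items such as "- [ ] task" or "- [x] task".
TASK_RE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def unchecked_todos(plan_text: str) -> list[str]:
    """Return the text of every task-list item that is still unchecked."""
    pending = []
    for line in plan_text.splitlines():
        match = TASK_RE.match(line)
        if match and match.group(1) == " ":
            pending.append(match.group(2).strip())
    return pending


# Hypothetical plan content, for illustration only.
sample_plan = """\
## Tasks
- [x] Add parser
- [ ] Wire parser into CLI
- [X] Update docs
"""

print(unchecked_todos(sample_plan))  # ['Wire parser into CLI']
```

An empty result would mean every TODO is checked off; any entry returned would be listed under TODO Status in the issues file.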

Rules

- Run the full test suite if tests exist. This catches integration issues that per-TODO reviews miss.
- Review ONLY the diff provided for code review. You may read source files only to understand test failures.
- Be concise. List issues as bullet points.
- Minor style issues are NOT blockers. Only flag real problems.
- You may read _notes/worklog.md for context on what was attempted.
- Individual TODO correctness was already verified; focus on the big picture.
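The blocker rules stated so far (test failures, static analysis errors, and missing cleanup are blockers; minor style issues are not) can be sketched as a simple classification. The kind labels and the `split_findings` helper below are assumptions made for illustration; the skill itself does not define a data format for findings.

```python
# Kinds the text explicitly treats as blocking. The label strings are hypothetical.
BLOCKING_KINDS = {
    "test_failure",
    "static_analysis_error",
    "missing_cleanup",
    "unmet_criterion",
}


def split_findings(findings: list[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Separate (kind, description) findings into blockers and non-blocking notes."""
    blockers = [text for kind, text in findings if kind in BLOCKING_KINDS]
    notes = [text for kind, text in findings if kind not in BLOCKING_KINDS]
    return blockers, notes


# Hypothetical findings, for illustration only.
findings = [
    ("test_failure", "test_cli_roundtrip fails"),
    ("style", "inconsistent quote style in parser.py"),
    ("missing_cleanup", "temp file in export step is never removed"),
]

blockers, notes = split_findings(findings)
print(blockers)
print(notes)
```

Only the `blockers` list would feed the decision; the notes are deliberately left out of the issue list.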

Decision

After review, you MUST do exactly one of:

If all criteria met (no blocking issues):

1. Update .pi/work.settings.json: set "phase": "verify"
2. Append to _notes/worklog.md: - YYYY-MM-DD HH:MM: Auto-verify: passed
3. If tests are narrow/insufficient for confidence, also append: - YYYY-MM-DD HH:MM: Auto-verify note: manual user verification required (test coverage limited)
4. Stop immediately.

If blocking issues found:

1. Write issues to _notes/auto-verify-issues.md:

    # Auto-Verify Issues

    Date: YYYY-MM-DD

    ## Blocking Issues
    - <issue 1>
    - <issue 2>

    ## TODO Status
    - [x] <TODO completed>
    - [ ] <TODO NOT completed> - reason

2. Update .pi/work.settings.json: set "phase": "implement"
3. Append to _notes/worklog.md: - YYYY-MM-DD HH:MM: Auto-verify: failed - <summary>
4. Stop immediately.

Do NOT fix anything. Do NOT write code. You are a reviewer, not an implementer.
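The two decision branches above amount to a small set of file updates. The following is a minimal sketch of those side effects, not the skill's actual mechanism (in the skill, the reviewing agent performs these steps itself). The `record_decision` name, the use of the first issue as the worklog summary, and the starting phase value in the demo are assumptions; the TODO Status section of the issues file is omitted for brevity.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


def record_decision(root: Path, issues: list[str], now: datetime) -> str:
    """Apply the pass/fail file updates and return the new phase."""
    stamp = now.strftime("%Y-%m-%d %H:%M")
    settings_path = root / ".pi" / "work.settings.json"
    worklog_path = root / "_notes" / "worklog.md"

    settings = json.loads(settings_path.read_text())
    if issues:
        phase = "implement"
        body = "# Auto-Verify Issues\n\n"
        body += f"Date: {now:%Y-%m-%d}\n\n## Blocking Issues\n"
        body += "".join(f"- {issue}\n" for issue in issues)
        (root / "_notes" / "auto-verify-issues.md").write_text(body)
        # Assumption: the first issue doubles as the one-line summary.
        entry = f"- {stamp}: Auto-verify: failed - {issues[0]}\n"
    else:
        phase = "verify"
        entry = f"- {stamp}: Auto-verify: passed\n"

    settings["phase"] = phase
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    with worklog_path.open("a") as log:
        log.write(entry)
    return phase


# Demo in a throwaway directory; the starting phase value is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / ".pi").mkdir()
    (demo_root / "_notes").mkdir()
    (demo_root / ".pi" / "work.settings.json").write_text('{"phase": "placeholder"}')
    (demo_root / "_notes" / "worklog.md").write_text("")
    print(record_decision(demo_root, ["test suite fails"], datetime(2026, 4, 16, 2, 26)))  # implement
```

Exactly one branch runs per review, which mirrors the "exactly one of" requirement.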

Autoresearch rules

Goal: The reviewer identifies every blocking issue and raises no false positives: every real problem is caught, no correct code is flagged as broken, and the implementation either advances to verify or returns to implement with an accurate issue list.

Metrics:

False negative rate: "all clear" issued when tests fail (target: 0)
False positive rate: correct code flagged as blocking issue (target: 0)
Acceptance criteria coverage: every criterion in plan individually verified (target: 100%)
Static analysis actually run on all changed files (not just mentioned in output)
Review completed without requesting information already present in diff/worklog

Test inputs:

"Review diff with 3 TODOs: all tests pass, all criteria met" → expect: phase=verify, no issues
"Review diff where one acceptance criterion is not satisfied" → expect: phase=implement, criterion listed in issues
"Review diff with a missing resource cleanup (open file handle)" → expect: phase=implement, cleanup gap listed

Can change: checklist items, review depth, output format, blocker classification rules
Cannot change: independence from implement phase (no carry-over context), plan as source of truth, test failures are always blockers
Min sessions before eval: 5
Runs per experiment: 3
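The first two metrics above could be computed from recorded experiment runs roughly as follows. This is a sketch under stated assumptions: the `Run` record and `error_rates` function are hypothetical, and a false negative is approximated as "expected implement, got verify", which is slightly broader than the "tests fail" wording in the metric.

```python
from dataclasses import dataclass


@dataclass
class Run:
    expected_phase: str  # "verify" or "implement"
    actual_phase: str    # phase the reviewer actually set


def error_rates(runs: list[Run]) -> tuple[float, float]:
    """Return (false_negative_rate, false_positive_rate) over the given runs."""
    should_fail = [r for r in runs if r.expected_phase == "implement"]
    should_pass = [r for r in runs if r.expected_phase == "verify"]
    false_negatives = sum(r.actual_phase == "verify" for r in should_fail)
    false_positives = sum(r.actual_phase == "implement" for r in should_pass)
    fn_rate = false_negatives / len(should_fail) if should_fail else 0.0
    fp_rate = false_positives / len(should_pass) if should_pass else 0.0
    return fn_rate, fp_rate


# Hypothetical results for three runs, for illustration only.
runs = [
    Run("verify", "verify"),
    Run("implement", "implement"),
    Run("implement", "verify"),  # reviewer passed something it should have rejected
]

print(error_rates(runs))  # (0.5, 0.0)
```

Both rates have a target of 0, so any nonzero value in an experiment would count against the change being evaluated.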


Auto-verify phase. LLM reviews ONLY the latest git changes against the plan. Independent context - no carry-over from implement phase.

2 stars

testing

Updated Apr 16, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add popoffvg/dotfiles work-auto-verify

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Status: Clean (95%)
Semgrep: Static code analysis for vulnerabilities. Status: Clean (95%)
mcp-scan (Snyk): Model Context Protocol security validation. Status: Clean (95%)
Snyk (dep): Open source security scanning. Status: Skipped (50%)
Socket.dev: Supply chain security analysis. Status: Skipped (50%)
VirusTotal: Multi-engine malware detection. Status: Skipped (50%)
CrowdStrike: Advanced threat intelligence. Status: Skipped (50%)
OSV-Scanner: Open Source Vulnerability database check. Status: Skipped (50%)
OWASP Dep-Check. Status: Skipped (50%)

Last scanned: Apr 16, 2026, 2:26 AM (7.9s, 1 file scanned)




Install manually


# Clone the repo
git clone https://github.com/popoffvg/dotfiles.git

# Copy into Claude Code skills folder (global)
cp -r dotfiles/skills/work-auto-verify ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository: popoffvg/dotfiles (2 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT