skills/acceptance-criteria-validate/SKILL.md
Use when an agent executing an implementation plan claims to have finished, to validate that all acceptance criteria were actually met. Locates acceptance criteria from the plan file, acceptance_criteria.md, or the request itself, then investigates the codebase and surfaces a pass/fail verdict per criterion.
`npx skillsauth add giladresisi/ai-dev-env acceptance-criteria-validate`

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
If no request, task description, or plan file was provided as context (the skill was invoked with no arguments and nothing in the conversation identifies what was implemented):
Work through the following sources in order, stopping as soon as criteria are found.
Check whether the context references a plan file (a .md path, usually under .agents/plans/).
If a plan file is identified:
- `## ACCEPTANCE CRITERIA`
- `## Acceptance Criteria`
- `## Success Criteria`
- `## Completion Criteria`
- `## Done When`
- `## COMPLETION CHECKLIST` section: individual checklist items there may function as acceptance criteria even if not labeled as such

acceptance_criteria.md file

If no criteria were found in the plan file, search for `acceptance_criteria.md` in:
- `.agents/acceptance_criteria.md`
- `acceptance_criteria.md` (project root)

If found:
If neither of the above yielded criteria, re-read the current request (the message that triggered this skill or the broader conversation context). Look for:
If found in the request → proceed to Step 2 with these.
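The lookup order described above (plan file sections first, then the two `acceptance_criteria.md` locations) can be sketched roughly as follows. This is only an illustration, not the skill's actual implementation: the function names, the case-insensitive heading match, and the rule that a section ends at the next level-1 or level-2 heading are all assumptions of this sketch.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Heading names taken from the list above. Matching is case-insensitive,
# which is an assumption of this sketch.
CRITERIA_HEADINGS = (
    "acceptance criteria",
    "success criteria",
    "completion criteria",
    "done when",
    "completion checklist",
)


def extract_criteria_section(markdown: str) -> Optional[str]:
    """Return the body of the first matching level-2 section, or None."""
    lines = markdown.splitlines()
    for i, line in enumerate(lines):
        match = re.match(r"^##\s+(.*?)\s*$", line)
        if match and match.group(1).lower() in CRITERIA_HEADINGS:
            body = []
            for following in lines[i + 1:]:
                # Assumed rule: the next level-1 or level-2 heading ends the section.
                if re.match(r"^#{1,2}\s", following):
                    break
                body.append(following)
            text = "\n".join(body).strip()
            if text:
                return text
    return None


def find_criteria(
    project_root: Path, plan_file: Optional[Path] = None
) -> Optional[Tuple[str, str]]:
    """Return (source label, criteria text), stopping at the first source that has criteria."""
    if plan_file is not None and plan_file.is_file():
        section = extract_criteria_section(plan_file.read_text(encoding="utf-8"))
        if section:
            return ("Plan file", section)
    for candidate in (
        project_root / ".agents" / "acceptance_criteria.md",
        project_root / "acceptance_criteria.md",
    ):
        if candidate.is_file():
            text = candidate.read_text(encoding="utf-8").strip()
            if text:
                return ("acceptance_criteria.md", text)
    return None
```

A `None` result corresponds to falling through to the request text and, failing that, to asking the user.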
If no criteria were found in any source:
First, derive your own suggested criteria based on the request description and any plan content you have read (apply the same dimension framework as the acceptance-criteria-define skill: functional correctness, error handling, integration/E2E, validation, non-functional, out of scope).
Then use AskUserQuestion:
Question: "I couldn't find any acceptance criteria for this request. Here's what I'd suggest — do these look right, or would you like to define your own?"
Present the suggested criteria in the question context, then offer:
Do not proceed to Step 2 until criteria are confirmed.
For each criterion, investigate whether it has been met. Use a structured approach per criterion:
Code inspection:
Test evidence:
Configuration and wiring:
- [...] (`main.py`, `app.py`, etc.) and verify the wiring is actually present, not just documented

Documentation and artifacts:
- Check `PROGRESS.md` for notes indicating the criterion was addressed or explicitly deferred
- Check `.agents/execution-reports/` for any execution report for this feature

Run validation commands (if safe and appropriate):
- If the plan's `## VALIDATION COMMANDS` section lists commands and it is safe to run them in the current environment, run them and use the results as evidence

Deferred demo/script validation (MANDATORY structural probe): When a demo script or CLI is marked UNVERIFIABLE or deferred due to environmental constraints (live server, external auth, human approval step), do NOT accept the deferral without running a structural probe:
- `python -c "import <module>"`: verifies no import/syntax errors
- [...] (`input()`, etc.): catches encoding errors, misconfigured subprocess invocations, and wiring bugs before the environmental gate
- [...] (`CLAUDECODE`, `PATH`, secret vars)
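The import check in the first probe could be automated along these lines. This is a minimal sketch: the helper name `import_probe`, the timeout, and the module-name validation are assumptions rather than part of the skill. Importing a module executes its top-level code, so the probe should only be run where that is safe.

```python
import subprocess
import sys
from typing import Tuple


def import_probe(module: str, timeout: float = 30.0) -> Tuple[bool, str]:
    """Run `python -c "import <module>"` in a fresh interpreter.

    Returns (ok, stderr). A non-zero exit code means the import failed,
    for example because of a syntax error or a missing dependency.
    """
    # Reject anything that is not a dotted module name before building the command.
    if not module or not all(part.isidentifier() for part in module.split(".")):
        raise ValueError(f"not a valid module name: {module!r}")
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module}"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, f"import of {module} timed out after {timeout}s"
    return result.returncode == 0, result.stderr.strip()
```

A failed probe is concrete evidence for a FAIL or PARTIAL verdict. A passing probe only shows that the module loads, not that the script works end to end.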
Mark as PARTIAL (not UNVERIFIABLE) if these structural checks are skipped. "Covered at unit level" is only a valid deferral for library behavior, never for the executable surface of the deliverable.

Hardcoded HTTP endpoint URLs:
For any criterion involving HTTP calls introduced in the feature, verify at least one representative hardcoded URL against the live server's registered routes (e.g., `curl -I <url>` or check `main.py` router/mount registrations). A 404 on a newly introduced URL is a FAIL, not UNVERIFIABLE: incorrect URLs cause silent tool-call failures that are very hard to diagnose.
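When no live server is available for `curl -I`, the same check can be approximated offline by comparing the URL's path against the route templates read from the router registrations. The sketch below assumes `{param}`-style path templates and a hand-collected list of routes. Both are assumptions, and the helper names are hypothetical.

```python
import re
from urllib.parse import urlsplit


def route_to_regex(route: str):
    """Turn a route template such as /items/{item_id} into an anchored regex."""
    parts = re.split(r"(\{[^/{}]+\})", route)
    pattern = "".join(
        "[^/]+" if part.startswith("{") and part.endswith("}") else re.escape(part)
        for part in parts
    )
    # Assumed behavior: a single trailing slash is tolerated.
    return re.compile(f"^{pattern}/?$")


def url_matches_routes(url: str, registered_routes) -> bool:
    """True if the URL's path matches at least one registered route template."""
    path = urlsplit(url).path or "/"
    return any(route_to_regex(route).match(path) for route in registered_routes)


# Hypothetical routes, as they might be collected from main.py by hand.
routes = ["/api/items", "/api/items/{item_id}", "/health"]
print(url_matches_routes("http://localhost:8000/api/items/42", routes))
print(url_matches_routes("http://localhost:8000/api/item/42", routes))
```

A URL that matches no registered route would be expected to return 404 on the live server, so it is a strong signal for a FAIL verdict. A match only shows the path is registered, not that the handler behaves correctly.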
For each criterion, produce one of:
Format the report as follows:
## Acceptance Criteria Validation Report
**Feature / Request:** <short description>
**Plan File:** <path or "none">
**Criteria Source:** <Plan file / acceptance_criteria.md / Request / User-defined>
**Validated:** <date>
---
### Results
| # | Criterion | Verdict | Evidence |
|---|-----------|---------|----------|
| 1 | <criterion text> | PASS / FAIL / PARTIAL / UNVERIFIABLE | <1–2 sentence evidence summary> |
| 2 | ... | ... | ... |
---
### Summary
**PASS:** <n>
**FAIL:** <n>
**PARTIAL:** <n>
**UNVERIFIABLE:** <n>
**Total:** <n>
**Overall verdict:** ACCEPTED / REJECTED / NEEDS REVIEW
- ACCEPTED — all criteria PASS (UNVERIFIABLE items acknowledged)
- REJECTED — one or more criteria FAIL
- NEEDS REVIEW — one or more criteria PARTIAL or UNVERIFIABLE with no other FAILs
---
### Failures & Gaps
<For each FAIL or PARTIAL criterion:>
#### Criterion <#>: <criterion text>
**Verdict:** FAIL / PARTIAL
**What is missing:** <specific description>
**Where to look / fix:** <file path, function, or area of the codebase>
**Suggested next step:** <concrete action — e.g., "Implement X in file Y", "Add test case for Z">
---
### Notes
<Any UNVERIFIABLE items with explanation of what manual verification would require>
<Any observations about gaps in the criteria themselves>
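The overall-verdict rules in the template can be sketched as a small function. One assumption is worth flagging: the template allows ACCEPTED when UNVERIFIABLE items are acknowledged, but this sketch has no notion of acknowledgement, so it conservatively maps any UNVERIFIABLE item to NEEDS REVIEW. Rejecting an empty list is also an assumption, reflecting that criteria must be confirmed before validation starts.

```python
from collections import Counter
from typing import Iterable

VERDICTS = ("PASS", "FAIL", "PARTIAL", "UNVERIFIABLE")


def overall_verdict(verdicts: Iterable[str]) -> str:
    """Combine per-criterion verdicts into ACCEPTED / REJECTED / NEEDS REVIEW."""
    counts = Counter(verdicts)
    if not counts:
        raise ValueError("no criteria were validated")
    unknown = set(counts) - set(VERDICTS)
    if unknown:
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")
    if counts["FAIL"]:
        return "REJECTED"  # one or more criteria FAIL
    if counts["PARTIAL"] or counts["UNVERIFIABLE"]:
        return "NEEDS REVIEW"  # assumption: unacknowledged UNVERIFIABLE needs review
    return "ACCEPTED"  # every criterion is PASS
```

The same `Counter` supplies the per-verdict totals for the Summary section.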
If invoked as a background subagent (spawned via the Agent tool from another skill, e.g., from the execute skill's post-execution flow):
- Write the report to `.agents/acceptance-validations/<feature-name>-validation.md`
- Derive `<feature-name>` from the plan file name or the request title

If the caller requested output to a file (e.g., "save the validation report", "write to a file"):
- Write the report to `.agents/acceptance-validations/<feature-name>-validation.md`
- Derive `<feature-name>` from the plan file name or the request title
- "[...] `.agents/acceptance-validations/<feature-name>-validation.md`."

If the caller requested output to the CLI (default when invoked interactively with no file destination specified):
In all cases, end with one of these verdicts printed prominently:
Overall: ACCEPTED — all acceptance criteria met, ready for commit/review
Overall: REJECTED — <n> criteria not met, execution is incomplete
Overall: NEEDS REVIEW — <n> criteria need manual confirmation before accepting
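For the file-output cases above, deriving `<feature-name>` and the report path might look like this. The slug rule (lowercase, non-alphanumerics collapsed to hyphens) and the `"feature"` fallback are assumptions of this sketch, since the skill only says the name comes from the plan file name or the request title.

```python
import re
from pathlib import Path
from typing import Optional


def feature_name(plan_file: Optional[str], request_title: str) -> str:
    """Prefer the plan file's stem; otherwise slugify the request title."""
    source = Path(plan_file).stem if plan_file else request_title
    slug = re.sub(r"[^a-z0-9]+", "-", source.lower()).strip("-")
    return slug or "feature"  # hypothetical fallback when nothing usable remains


def report_path(project_root: str, plan_file: Optional[str], request_title: str) -> Path:
    """Build .agents/acceptance-validations/<feature-name>-validation.md under the root."""
    name = feature_name(plan_file, request_title)
    return Path(project_root) / ".agents" / "acceptance-validations" / f"{name}-validation.md"
```

Callers would still need to create the parent directory (for example with `Path.mkdir(parents=True, exist_ok=True)`) before writing the report.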
If verdict is ACCEPTED:
- If running inside the `execute` skill flow, return control to that skill to proceed with the Output Report

If verdict is REJECTED or NEEDS REVIEW:
Do NOT declare execution complete
List the failing or unresolved criteria clearly
Ask the user (or return to the calling skill) with:
"The following criteria were not met. Should I fix them now, or do you want to review first?"
Options:
Do not silently pass failed criteria.