skills/verification/SKILL.md
Use before any completion claim, success statement, or marking a task done. Triggers when about to say "Great!", "Perfect!", "Done", "All set", "Ready to commit", before creating a PR, before moving to the next task, or when code has changed since the last test run.
Claiming work is complete without verification is dishonesty, not efficiency.
Core principle: Evidence before claims, always.
Announce at start: "I'm using gambit:verification to confirm this with evidence."
LOW FREEDOM - NO exceptions. Run verification command, read output, THEN make claim.
Violating the letter of the rules is violating the spirit of the rules.
| Claim | Verification Required | Not Sufficient |
|-------|----------------------|----------------|
| Tests pass | Run full test command, see 0 failures | Previous run, "should pass" |
| Build succeeds | Run build, see exit 0 | Linter passing |
| Bug fixed | Test original symptom, passes | Code changed |
| Linter clean | Linter output: 0 errors | Partial check, extrapolation |
| Regression test works | Red-green cycle verified | Test passes once |
| Task complete | Check ALL success criteria individually | "Implemented the feature" |
| All tasks done | TaskList shows all completed | "All tasks done" |
| Agent completed | VCS diff shows changes | Agent reports "success" |
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
If you haven't run the verification command in THIS message, you cannot claim it passes.
If you catch yourself doing ANY of the following, you are about to claim completion without evidence. STOP and run verification before the next word you say or tool you call.
Language signals:
State signals:
Emotional / fatigue signals (these are the ones that bypass rational guards):
All of these mean: STOP. Run the verification command fresh. Read the output. THEN claim the result.
ALWAYS before:
Don't use for: deciding what to build (use gambit:brainstorming) or how to build it (use gambit:executing-plans).
Before making any completion claim, ask: "What command proves this?"
| Claim Type | Verification |
|------------|-------------|
| Tests pass | Project's test command (go test ./..., npm test, etc.) |
| Build succeeds | Project's build command |
| Linter clean | Project's lint command |
| Task complete | Each success criterion verified individually |
| Epic complete | TaskList + each epic criterion verified |
Execute the FULL command. Not a partial run. Not a cached result.
For verbose output: Dispatch a general-purpose agent:
Task
subagent_type: "general-purpose"
description: "Run verification"
prompt: "Run: [command]. Report pass/fail counts, exit code, and any failures."
For quick commands: Run directly with Bash.
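A minimal Python sketch of running a verification command fresh and keeping its evidence together. The `Evidence` type and `run_fresh` helper are hypothetical names used for illustration and are not part of gambit; only the standard-library `subprocess.run` call is real.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Evidence:
    """What a completion claim should be able to cite (hypothetical type)."""
    command: Sequence[str]
    exit_code: int
    output: str


def run_fresh(command: Sequence[str]) -> Evidence:
    """Run the full command now; never reuse a cached or earlier result."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep stderr with stdout so failures are not lost
        text=True,
        check=False,  # a non-zero exit is evidence to report, not an exception
    )
    return Evidence(command, result.returncode, result.stdout)


if __name__ == "__main__":
    # Stand-in command; replace with the project's real test, build, or lint command.
    evidence = run_fresh([sys.executable, "-c", "print('34 passed')"])
    print(evidence.exit_code, evidence.output.strip())
```

The helper deliberately returns the exit code instead of raising, so a failing run can still be read and reported with its output.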
Critical: If you changed code since the last run, previous results are STALE. Run again.
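One way to make staleness mechanical is to fingerprint the source tree when verification runs and compare it before reusing a result. This is a sketch under stated assumptions: `tree_fingerprint` and `is_stale` are hypothetical helpers, and the `*.py` glob is a placeholder for whatever files the project's verification actually depends on.

```python
import hashlib
from pathlib import Path


def tree_fingerprint(root: str, pattern: str = "*.py") -> str:
    """Hash the relative path and contents of every matching file under root."""
    digest = hashlib.sha256()
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            digest.update(str(path.relative_to(root)).encode())
            digest.update(b"\0")  # separator between path and contents
            digest.update(path.read_bytes())
    return digest.hexdigest()


def is_stale(fingerprint_at_last_run: str, root: str, pattern: str = "*.py") -> bool:
    """True if any matching file changed since the fingerprint was taken."""
    return tree_fingerprint(root, pattern) != fingerprint_at_last_run
```

If `is_stale` returns true, the earlier result no longer counts as evidence and the command has to be run again. The reverse does not hold: an unchanged fingerprint only covers the files the glob matches.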
Don't:
Do:
If verification FAILS: State actual status with evidence.
Tests: 33 passed, 1 failed.
Failure: test_login_with_expired_token still fails.
The fix didn't handle expired tokens.
Investigating...
If verification PASSES: State claim WITH evidence.
Tests pass. [Ran: go test ./..., Output: 34/34 passed, exit 0]
Ready to commit.
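The two reporting shapes above can be produced by one small formatter so that a claim never appears without its evidence. A minimal sketch; `state_result` is a hypothetical helper, and the "NOT verified" wording for the failing case is an assumption rather than wording the skill prescribes.

```python
def state_result(claim: str, command: str, summary: str, exit_code: int) -> str:
    """Attach the command, observed output summary, and exit code to a claim."""
    evidence = f"[Ran: {command}, Output: {summary}, exit {exit_code}]"
    if exit_code != 0:
        # A failing run must not be phrased as success.
        return f"NOT verified: {claim}. {evidence}"
    return f"{claim}. {evidence}"


print(state_result("Tests pass", "go test ./...", "34/34 passed", 0))
```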
Before marking any task complete:
TaskGet: re-read success criteria
TaskGet taskId: "current-task-id"
→ Success criteria from task:
1. POST /auth/login returns valid JWT
2. Invalid password returns 401
3. All tests pass
→ Verification:
1. Ran: curl -X POST localhost:8080/auth/login -d '...'
Output: {"token": "eyJ..."} — VERIFIED
2. Ran: curl with wrong password
Output: 401 {"error": "Invalid credentials"} — VERIFIED
3. Ran: go test ./...
Output: 34/34 passed, exit 0 — VERIFIED
→ All criteria verified.
TaskUpdate taskId: "current-task-id" status: "completed"
For epic completion: Run TaskList first, confirm ALL subtasks show completed, then verify epic-level criteria the same way.
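The per-criterion check can be sketched as a gate that refuses completion while any criterion lacks passing evidence. `Criterion`, `unverified`, and `can_mark_complete` are hypothetical names for illustration; they are not the TaskGet or TaskUpdate tools themselves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Criterion:
    description: str
    evidence: Optional[str] = None  # command run plus observed output; None = not checked
    passed: bool = False


def unverified(criteria: List[Criterion]) -> List[str]:
    """Descriptions of criteria that are unchecked, failing, or lack evidence."""
    return [c.description for c in criteria if not (c.passed and c.evidence)]


def can_mark_complete(criteria: List[Criterion]) -> bool:
    """Complete only when there is at least one criterion and all are verified."""
    return bool(criteria) and not unverified(criteria)


criteria = [
    Criterion("POST /auth/login returns valid JWT", "curl -X POST ... returned a token", True),
    Criterion("Invalid password returns 401", "curl with wrong password returned 401", True),
    Criterion("All tests pass"),  # not yet run, so the task cannot be completed
]
print(can_mark_complete(criteria), unverified(criteria))
```

An empty criteria list is treated as not completable, on the assumption that a task with nothing to verify has not had its success criteria re-read.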
All of these mean: STOP. Run verification.
| Excuse | Reality |
|--------|---------|
| "Should work now" | RUN the verification |
| "I'm confident" | Confidence ≠ evidence |
| "Just this once" | No exceptions |
| "Linter passed" | Linter ≠ compiler ≠ tests |
| "Agent said success" | Verify independently |
| "Partial check is enough" | Partial proves nothing |
| "I already verified earlier" | You changed code since then |
| "Different words so rule doesn't apply" | Spirit over letter |
Before claiming tests pass:
Before claiming build succeeds:
Before marking task complete:
Before marking epic complete:
TaskList
Can't check all boxes? Return to the process.
See REFERENCE.md for detailed good/bad examples including:
This skill is called by:
- gambit:test-driven-development (verify tests pass/fail)
- gambit:executing-plans (verify task success criteria)
- gambit:debugging (confirm the failing test reproduces the bug; verify a fast-path fix is green)

This skill calls:
- Task (subagent_type: "general-purpose") for running verbose commands

Called by:
Workflow:
About to claim completion
↓
Step 1: What command proves this?
↓
Step 2: Run it (fresh)
↓
Step 3: Read output completely
↓
Step 4: State claim WITH evidence
↓
Step 5: For tasks, verify EACH criterion
↓
Evidence confirms → Make claim