skills/verification-loop/SKILL.md
Comprehensive 6-check verification framework for validating implementation quality across build, types, lint, tests, security, and diff review. This skill ensures code meets all quality gates before phase completion. Triggers on "verify implementation", "run verification", "/verification-loop", or automatically as part of implement-phase Step 2.
A systematic 6-check verification framework that validates implementation quality across multiple dimensions: build, types, lint, tests, security, and diff review. It catches issues early by confirming that the code compiles, type-checks, passes linting, passes its tests, has no security issues, and contains no unintended changes.
Terminology: this skill uses "Checks" (Check 1–6). Don't confuse with "Phases" (plan phases) or "Steps" (implement-phase steps).
Defense in depth — each check catches a different category of issue:
| Check | Catches | Why it matters |
|---|---|---|
| Build | Syntax errors, missing deps, bundling issues | Code must compile to run |
| Types | Type mismatches, null safety, interface violations | Type safety prevents runtime errors |
| Lint | Style violations, code smells, potential bugs | Consistent, maintainable code |
| Tests | Logic errors, regressions, broken contracts | Functional correctness |
| Security | Secrets, debug code, vulnerable patterns | Production safety |
| Diff | Unintended changes, scope creep, leftover code | Change discipline |
Fail fast — checks are ordered by detection speed. Build errors appear in seconds; security scans take longer. Fast checks first provide rapid feedback on common issues.
Project-agnostic — detect project type automatically, apply the matching tooling. Node.js, Python, Go, Rust, and mixed-language projects all work.
Idempotent — running the loop multiple times produces the same result. Each check passes or fails deterministically based on codebase state.
Automatically invoked by implement-phase as Step 2 (Exit Condition Verification), before integration testing, after implementation subagents complete.
Manually invoked for:
Don't use for:
Detect project type first, load the matching reference file, then run the 6 checks with those commands.
| Type | Primary indicator | Reference file |
|---|---|---|
| Node.js / TypeScript | package.json exists (with tsconfig.json → TypeScript) | references/nodejs-typescript.md |
| Python | pyproject.toml or setup.py | references/python.md |
| Go | go.mod exists | references/go.md |
| Rust | Cargo.toml exists | references/rust.md |
| Mixed / monorepo | Multiple indicators | Load each relevant reference |
Detection logic:
```shell
detect_project_type() {
  if [ -f "package.json" ]; then
    if [ -f "tsconfig.json" ]; then echo "typescript"; else echo "nodejs"; fi
  elif [ -f "pyproject.toml" ] || [ -f "setup.py" ]; then echo "python"
  elif [ -f "go.mod" ]; then echo "go"
  elif [ -f "Cargo.toml" ]; then echo "rust"
  else echo "unknown"; fi
}
```
After detection, only load the matching reference file — don't load all four, that's wasted context.
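The detection function stops at the first indicator it finds, so on its own it does not cover the "Mixed / monorepo" row of the table. A minimal sketch of one way to handle that case follows, assuming all indicators sit in a single directory. The function name `list_reference_files` is hypothetical and not part of the skill; the reference paths are the ones listed in the table.

```shell
# Hypothetical helper (not part of the skill): print every reference file
# whose indicator exists in the given directory, one path per line.
list_reference_files() {
    local dir="${1:-.}"
    if [ -f "$dir/package.json" ]; then
        echo "references/nodejs-typescript.md"
    fi
    if [ -f "$dir/pyproject.toml" ] || [ -f "$dir/setup.py" ]; then
        echo "references/python.md"
    fi
    if [ -f "$dir/go.mod" ]; then
        echo "references/go.md"
    fi
    if [ -f "$dir/Cargo.toml" ]; then
        echo "references/rust.md"
    fi
    return 0
}
```

An empty result corresponds to the `unknown` branch of the detection function. A real monorepo would likely need this run once per package directory rather than only at the repository root.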
Purpose: ensure the codebase compiles, bundles, and produces valid artifacts.
PASS: build command exits 0, no compilation errors, all artifacts generated, no missing-dependency errors.
FAIL: non-zero exit code, compilation/bundling errors, missing or incomplete artifacts, unresolved deps.
Commands → look up the matching language reference file (references/<lang>.md).
Failure handling: parse the error output → identify root cause (missing import? syntax error? type mismatch?) → spawn fix subagent with error context → re-run build. Max 3 retries.
Output shape:
```text
CHECK_1_BUILD_VERIFICATION:
  STATUS: PASS | FAIL
  COMMAND: [command run]
  EXIT_CODE: 0 | [non-zero]
  DURATION: 12.3s
  ARTIFACTS: [list]
  ERRORS: [] | [list of errors]
```
Purpose: ensure type safety across the codebase with no type errors.
PASS: type checker exits 0, no type errors, all assertions valid, no missing type definitions.
FAIL: type errors, missing type definitions for dependencies, invalid type assertions, unreachable code (in strict mode).
Commands → language reference.
Failure handling: parse errors → categorize (missing types → add annotations; type mismatch → fix types; missing definitions → install @types/* or stubs) → spawn fix subagent → re-run. Max 3 retries.
Output shape:
```text
CHECK_2_TYPE_VERIFICATION:
  STATUS: PASS | FAIL
  COMMAND: [command run]
  EXIT_CODE: 0 | [non-zero]
  DURATION: 8.2s
  FILES_CHECKED: [count]
  ERRORS: [] | [ { file, line, error } ]
```
Purpose: ensure code follows style guidelines and catches potential bugs.
PASS: linter exits 0, no errors (warnings may be acceptable by config), all auto-fixable issues resolved, formatting matches project standards.
FAIL: lint errors, unfixable formatting issues, security-related lint rules violated, complexity thresholds exceeded.
Commands → language reference. Always run auto-fix first, then re-run the check without it and assert clean — this resolves mechanical issues before flagging real ones.
Failure handling: auto-fix → parse remaining errors → spawn fix subagent for non-auto-fixable issues → re-run. Max 3 retries.
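The "auto-fix first, then check without it" order can be sketched as below. This is an illustration only: `lint_fix_then_check` is a hypothetical name, and both commands are passed in as strings because the real ones live in the language reference file. Ignoring the fixer's exit status is an assumption of this sketch, made because the read-only check that follows is the actual gate.

```shell
# Hypothetical helper: run the auto-fix command, then the read-only check.
# $1 = auto-fix command, $2 = check command (both supplied by the caller).
lint_fix_then_check() {
    local fix_cmd="$1" check_cmd="$2"
    # Assumption: the fixer's exit status is ignored, since fixers may exit
    # non-zero when unfixable issues remain. The check below decides the result.
    sh -c "$fix_cmd" || true
    if sh -c "$check_cmd"; then
        echo "STATUS: PASS"
    else
        echo "STATUS: FAIL"
        return 1
    fi
}
```

For an ESLint project the call might look like `lint_fix_then_check "npx eslint . --fix" "npx eslint ."`, though the commands in the matching reference file take precedence.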
Output shape:
```text
CHECK_3_LINT_VERIFICATION:
  STATUS: PASS | FAIL
  COMMANDS: [
    { cmd: [lint-fix], exit_code: 0 },
    { cmd: [lint-check], exit_code: 0 }
  ]
  AUTO_FIXED: [count]
  REMAINING_ERRORS: 0 | [ { rule, file, line } ]
  WARNINGS: [count]
```
Purpose: ensure all tests pass and new code has appropriate coverage.
PASS: all tests pass (exit 0), no unjustified skips, coverage thresholds met (if configured), no flaky failures.
FAIL: any test failure, coverage below threshold, test timeout, test infrastructure errors.
Commands → language reference.
Failure handling: identify failing tests → categorize (test bug → fix the test; implementation bug → fix the code; environmental → fix test setup) → spawn fix subagent → re-run failing tests first (faster feedback), then full suite after fix. Max 3 retries.
Output shape:
```text
CHECK_4_TEST_VERIFICATION:
  STATUS: PASS | FAIL
  COMMAND: [command run]
  EXIT_CODE: 0 | [non-zero]
  DURATION: 45.2s
  TESTS_RUN: [count]
  TESTS_PASSED: [count]
  TESTS_FAILED: [count]
  TESTS_SKIPPED: [count]
  COVERAGE: [percent]
  FAILURES: [] | [ { test, file, error } ]
```
Purpose: detect secrets, debug code, and security vulnerabilities before code reaches production.
PASS: no hardcoded secrets, no console.log/print in production code, no debug flags enabled, no known vulnerable deps (if scanned).
FAIL: secrets detected (API keys, passwords, tokens), debug code in production paths, debug flags left enabled, critical vulnerabilities in deps.
Secrets detection: use whichever tool is installed, picking the first one available:
```shell
# git-secrets
git secrets --scan

# gitleaks
gitleaks detect --source .

# trufflehog
trufflehog filesystem .

# Manual patterns (fallback)
grep -r "API_KEY\|SECRET\|PASSWORD\|TOKEN" src/ --include="*.ts" --include="*.js"
grep -r "sk-\|pk_\|api_key\|secret_key" src/
```
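One way to implement "pick the first available" is to probe `PATH` with `command -v` in the order the tools are listed above. The sketch below only prints the command it would run. `pick_secret_scanner` is a hypothetical name, and it assumes git-secrets is installed as a `git-secrets` executable on `PATH`, which may not hold for every installation method.

```shell
# Hypothetical helper: print the scan command for the first secrets scanner
# found on PATH, or "manual-grep" when none is installed.
pick_secret_scanner() {
    # Assumption: git-secrets is reachable as a `git-secrets` executable.
    if command -v git-secrets >/dev/null 2>&1; then
        echo "git secrets --scan"
    elif command -v gitleaks >/dev/null 2>&1; then
        echo "gitleaks detect --source ."
    elif command -v trufflehog >/dev/null 2>&1; then
        echo "trufflehog filesystem ."
    else
        # No dedicated tool: the caller falls back to the manual grep patterns.
        echo "manual-grep"
    fi
}
```

Printing the command instead of running it keeps the selection step separate from the scan itself, so the choice can be recorded in the check output before anything executes.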
Common secret patterns:
```text
# AWS
AKIA[0-9A-Z]{16}
aws_secret_access_key

# API Keys (generic)
[aA][pP][iI]_?[kK][eE][yY].*=.*['"][a-zA-Z0-9]{20,}['"]

# Private Keys
-----BEGIN (RSA|DSA|EC|OPENSSH) PRIVATE KEY-----

# Tokens
(ghp|gho|ghu|ghs|ghr)_[A-Za-z0-9_]{36,}   # GitHub
xox[baprs]-[0-9a-zA-Z-]+                  # Slack
```
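As a rough illustration, the patterns above can be combined into a single `grep -E` call for the manual fallback. The function name `check_secrets` and its PASS/FAIL output are assumptions of this sketch. It is a coarse filter and does not replace a dedicated scanner.

```shell
# Hypothetical fallback scan: search a directory for the patterns listed
# above. Prints matching lines, then a status line; returns 1 on any finding.
check_secrets() {
    local target="${1:-.}"
    # Each pattern is passed with -e, so the one that begins with "-----"
    # is not mistaken for a command-line option.
    if grep -rnE \
        -e 'AKIA[0-9A-Z]{16}' \
        -e 'aws_secret_access_key' \
        -e "[aA][pP][iI]_?[kK][eE][yY].*=.*['\"][a-zA-Z0-9]{20,}['\"]" \
        -e '-----BEGIN (RSA|DSA|EC|OPENSSH) PRIVATE KEY-----' \
        -e '(ghp|gho|ghu|ghs|ghr)_[A-Za-z0-9_]{36,}' \
        -e 'xox[baprs]-[0-9a-zA-Z-]+' \
        "$target"; then
        echo "STATUS: FAIL"
        return 1
    fi
    echo "STATUS: PASS"
}
```

`grep` exits 0 when it finds a match, which is why a successful `grep` maps to a failed check here.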
Debug-statement detection (console.log, print) is language-specific; see the matching reference file. Allowed exceptions: logger framework calls (logger.info, logger.debug), error handling in catch blocks (if configured), test files, development-only files.
```shell
# Environment checks
grep -rn "NODE_ENV.*development\|DEBUG.*true\|SKIP_AUTH" src/

# Disabled security
grep -rn "verify.*false\|secure.*false\|https.*false" src/

# TODO/FIXME with security implications
grep -rn "TODO.*security\|FIXME.*auth\|HACK.*bypass" src/
```
Dependency vulnerability scanning: the per-language commands are in the matching reference file (npm audit, pip-audit, govulncheck, cargo audit).
Output shape:
```text
CHECK_5_SECURITY_SCAN:
  STATUS: PASS | FAIL
  SCANS_RUN: [
    { scan: "secrets", status: "PASS", findings: 0 },
    { scan: "console_logs", status: "WARN", findings: 3 },
    { scan: "debug_flags", status: "PASS", findings: 0 },
    { scan: "dependencies", status: "PASS", findings: 0 }
  ]
  CRITICAL_FINDINGS: 0 | [count]
  WARNINGS: [count]
  DETAILS: [] | [ { type, file, line } ]
```
Purpose: verify only intended changes are present; no unintended modifications.
PASS: all changes relate to the current phase/task, no unrelated file modifications, file count within expected range, no accidental deletions or additions, no formatting-only changes to unrelated files.
FAIL: changes outside scope, unexpected file additions/deletions, modifications to protected files, large diffs in files that should have minimal changes.
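```shell
# List changed files
git diff --name-only HEAD~1          # list
git diff --name-status HEAD~1        # with status (A/M/D)
git diff --name-only main...HEAD     # compared to base branch
git diff --stat HEAD~1               # with line counts

# Scope check: expected patterns (input to verification)
EXPECTED_PATTERNS=(
  "src/auth/*"
  "tests/auth/*"
  "src/types/auth.ts"
)
CHANGED_FILES=$(git diff --name-only HEAD~1)
# Read line by line so paths containing spaces stay intact
while IFS= read -r file; do
  [ -n "$file" ] || continue
  matches_expected=false
  for pattern in "${EXPECTED_PATTERNS[@]}"; do
    # $pattern is left unquoted on purpose so it is matched as a glob
    if [[ $file == $pattern ]]; then matches_expected=true; break; fi
  done
  if ! $matches_expected; then echo "UNEXPECTED: $file"; fi
done <<< "$CHANGED_FILES"

# File count check. MIN_FILES and MAX_FILES come from expected_file_range;
# the defaults here mirror the example range [5, 15].
MIN_FILES=${MIN_FILES:-5}
MAX_FILES=${MAX_FILES:-15}
COUNT=$(git diff --name-only HEAD~1 | wc -l | tr -d ' ')
if [ "$COUNT" -lt "$MIN_FILES" ] || [ "$COUNT" -gt "$MAX_FILES" ]; then
  echo "WARNING: Changed $COUNT files, expected $MIN_FILES-$MAX_FILES"
fi

# Protected files check
PROTECTED_FILES=(
  "package-lock.json"
  ".env*"
  "*.lock"
  "docker-compose.yml"
  "Dockerfile"
  ".github/workflows/*"
)
while IFS= read -r file; do
  [ -n "$file" ] || continue
  for protected in "${PROTECTED_FILES[@]}"; do
    if [[ $file == $protected ]]; then echo "PROTECTED FILE CHANGED: $file"; fi
  done
done <<< "$CHANGED_FILES"

# Large diff check
git diff --stat HEAD~1
git diff --numstat HEAD~1 | while read -r added deleted file; do
  # Binary files report "-" for both counts; skip them to avoid an arithmetic error
  if [ "$added" = "-" ]; then continue; fi
  total=$((added + deleted))
  if [ "$total" -gt 500 ]; then echo "LARGE DIFF: $file (+$added -$deleted)"; fi
done
```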
Output shape:
```text
CHECK_6_DIFF_REVIEW:
  STATUS: PASS | FAIL
  FILES_CHANGED: [count]
  EXPECTED_RANGE: [min-max]
  SCOPE_VIOLATIONS: 0 | [list]
  PROTECTED_FILES_CHANGED: 0 | [list]
  LARGE_DIFFS: 0 | [list]
  SUMMARY: [ added, modified, deleted counts ]
  DETAILS: [ { file, status, lines } ]
```
As Step 2 of implement-phase, this skill runs all 6 checks and returns a structured result.
Input context:
```yaml
VERIFICATION_INPUT:
  phase_number: 2
  phase_name: "Authentication Service"
  plan_path: docs/plans/auth-implementation.md
  expected_file_patterns: ["src/auth/*", "tests/auth/*"]
  expected_file_range: [5, 15]
  protected_files: ["package-lock.json", ".env"]
  skip_checks: []  # optional
```
Output format:
```text
VERIFICATION_LOOP_STATUS: PASS | FAIL
CHECKS_RUN: 6
CHECKS_PASSED: 6 | [count]
CHECKS_FAILED: 0 | [count]
FAILED_CHECKS: [] | [ { check, name, error, details } ]
REPORT:
  check_1_build: { status, duration }
  check_2_types: { status, duration }
  check_3_lint: { status, auto_fixed, duration }
  check_4_tests: { status, tests_run, tests_passed, coverage, duration }
  check_5_security: { status, secrets_found, console_logs, duration }
  check_6_diff: { status, files_changed, scope_violations, duration }
TOTAL_DURATION: [seconds]
```
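To make the summary fields concrete, here is a small sketch of how they could be derived from individual check results. Everything in it is illustrative: `run_verification_loop` is a hypothetical name, checks are passed as `name=command` strings, and the sketch assumes every check runs even after an earlier one fails. It does not produce the per-check `REPORT` section or durations.

```shell
# Hypothetical aggregator: each argument is "name=command". Runs the checks
# in the order given and prints the top-level summary fields.
run_verification_loop() {
    local run=0 passed=0 failed=0 failed_names="" status entry name cmd
    for entry in "$@"; do
        name="${entry%%=*}"   # text before the first "="
        cmd="${entry#*=}"     # text after the first "="
        run=$((run + 1))
        if sh -c "$cmd" >/dev/null 2>&1; then
            passed=$((passed + 1))
        else
            # Build a comma-separated list of failed check names
            failed_names="${failed_names:+$failed_names, }$name"
        fi
    done
    failed=$((run - passed))
    if [ "$failed" -eq 0 ]; then status="PASS"; else status="FAIL"; fi
    echo "VERIFICATION_LOOP_STATUS: $status"
    echo "CHECKS_RUN: $run"
    echo "CHECKS_PASSED: $passed"
    echo "CHECKS_FAILED: $failed"
    echo "FAILED_CHECKS: [$failed_names]"
    [ "$status" = "PASS" ]
}
```

Passing the checks in the order build, types, lint, tests, security, diff keeps the fast checks first, in line with the fail-fast ordering described earlier.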
When PASS, the caller (implement-phase) proceeds immediately to Step 3 (Integration Testing) — no user pause. See implement-phase/SKILL.md for the continuous-execution contract.
Retry logic (handled by implement-phase):
```text
1. Invoke verification-loop
2. If any check FAILS:
   a. Identify failure type
   b. Spawn fix subagent with failure context
   c. Re-run verification-loop (or just failed checks)
   d. Repeat until PASS (max 3 retries per check)
3. If all checks PASS:
   a. Proceed to Step 3 (Integration Testing)
```
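The retry steps above can be sketched as a small loop. This is a minimal illustration, not the implement-phase implementation: `run_with_retries` is a hypothetical name, and the fix command is a stand-in for "spawn fix subagent with failure context", which a shell function cannot do itself.

```shell
# Hypothetical retry wrapper.
# $1 = check command, $2 = fix command (stand-in for the fix subagent),
# $3 = max retries (defaults to 3, matching "max 3 retries per check").
run_with_retries() {
    local check_cmd="$1" fix_cmd="$2" max_retries="${3:-3}" attempt=0
    while true; do
        if sh -c "$check_cmd"; then
            echo "PASS after $attempt retries"
            return 0
        fi
        if [ "$attempt" -ge "$max_retries" ]; then
            echo "FAIL after $attempt retries"
            return 1
        fi
        attempt=$((attempt + 1))
        # A failing fix step is tolerated; the next check run decides.
        sh -c "$fix_cmd" || true
    done
}
```

With the default of 3 retries, a check runs at most four times: the initial attempt plus three re-runs.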
```shell
/verification-loop

# With specific checks
/verification-loop checks:build,types,tests

# Skip checks
/verification-loop skip:security,diff

# Specific project path
/verification-loop path:./packages/auth
```
Auto-detects project type and uses appropriate defaults. Override via project config:
```jsonc
// package.json or .verification.json
{
  "verification": {
    "checks": {
      "build": { "command": "<command>", "timeout": "<ms>" },
      "types": { "command": "<command>", "timeout": "<ms>" },
      "lint": { "command": "<command>", "autoFix": true },
      "tests": { "command": "<command>", "coverage": { "threshold": 80 } },
      "security": { "scanSecrets": true, "scanConsoleLogs": true, "excludePatterns": [] },
      "diff": { "protectedFiles": [], "maxFileCount": 50 }
    }
  }
}
```
Environment variables:
```shell
VERIFICATION_SKIP_CHECKS=<comma-separated>
VERIFICATION_BUILD_TIMEOUT=<ms>
VERIFICATION_TEST_TIMEOUT=<ms>
VERIFICATION_SECRETS_SCAN=<bool>
VERIFICATION_CONSOLE_LOG_SCAN=<bool>
```
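A runner script could honor `VERIFICATION_SKIP_CHECKS` with a membership test like the one below. The helper name `should_skip_check` is hypothetical, and the sketch assumes the variable uses the same check names as the `skip:` argument (for example `security,diff`) with no spaces around the commas.

```shell
# Hypothetical helper: succeed if check name $1 appears in the
# comma-separated VERIFICATION_SKIP_CHECKS variable.
should_skip_check() {
    # Wrapping both sides in commas makes the match exact, so "lint"
    # does not match a longer name that merely contains it.
    case ",${VERIFICATION_SKIP_CHECKS:-}," in
        *",$1,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A runner would call it before each check, for example `should_skip_check security || run_security_scan`, where `run_security_scan` is likewise a placeholder.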
Check authors:
Verification consumers:
CI/CD integration:
| Issue | Cause | Solution |
|---|---|---|
| Build fails with missing deps | node_modules out of sync | Run npm install first |
| Type errors in node_modules | Wrong @types/* versions | Update or remove conflicting @types |
| Lint auto-fix causes more errors | Conflicting rules | Review ESLint/Prettier config |
| Tests timeout | Slow tests or hanging processes | Increase timeout or fix test |
| Security scan false positive | Valid use of flagged pattern | Add to exclude list with comment |
| Diff scope violation | Unrelated file touched | Review and revert or expand scope |
Debug mode:
```shell
/verification-loop --verbose

# Or
VERIFICATION_DEBUG=true /verification-loop
```
| Check | Purpose | Blocks on |
|---|---|---|
| 1. Build | Compilation | Any error |
| 2. Types | Type safety | Any error |
| 3. Lint | Code quality | Errors (not warnings) |
| 4. Tests | Correctness | Any failure |
| 5. Security | Production safety | Secrets, critical vulns |
| 6. Diff | Change discipline | Scope violations |
```shell
# Fast path: build + types + lint
npm run build && npx tsc --noEmit && npm run lint

# All 6 checks
npm run build && \
  npx tsc --noEmit && \
  npm run lint && \
  npm test && \
  npm run security:scan && \
  ./scripts/verify-diff.sh
```
- implement-phase: parent skill that invokes verification-loop in Step 2
- code-review: follows verification-loop in the implement-phase pipeline
- security-review: deep security analysis (beyond Check 5)