skills/strict-enforcement/SKILL.md
Use when running claudikins-kernel:verify, checking implementation quality, deciding pass/fail verdicts, or enforcing cross-command gates — requires actual evidence of code working, not just passing tests
npx skillsauth add elb-pr/claudikins-kernel strict-enforcementInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Use this skill when you need to:
claudikins-kernel:verify command"Evidence before assertions. Always." - Verification philosophy
Never claim code works without seeing it work. Tests passing is not enough. Claude must SEE the output.
Run the automated checks first. Fast feedback.
| Check | Command Pattern | What It Catches |
| ----- | ------------------------------------ | ----------------------------------- |
| Tests | npm test / pytest / cargo test | Logic errors, regressions |
| Lint | npm run lint / ruff / clippy | Style issues, common bugs |
| Types | tsc / mypy / cargo check | Type mismatches, interface drift |
| Build | npm run build / cargo build | Compilation errors, bundling issues |
Flaky Test Detection (C-12):
Test fails?
├── Re-run failed tests
├── Pass 2nd time?
│ └── Yes → STOP: [Accept flakiness] [Fix tests] [Abort]
└── Fail 2nd time?
├── Run isolated
└── Still fail? → STOP: [Fix] [Skip] [Abort]
This is the feedback loop that makes Claude's code actually work.
| Project Type | Verification Method | Evidence | | ------------ | ------------------------------------ | ------------------------------ | | Web app | Start server, screenshot, test flows | Screenshots, console logs | | API | Curl endpoints, check responses | Status codes, response bodies | | CLI | Run commands, capture output | stdout, stderr, exit codes | | Library | Run examples, check results | Output values, test coverage | | Service | Check logs, verify health endpoint | Log patterns, health responses |
Fallback Hierarchy (A-3):
If primary method unavailable, fall back:
Timeout: 30 seconds per verification method (CMD-30).
After verification passes, optionally run cynic for polish.
Prerequisites:
cynic Rules:
If tests fail after simplification:
See cynic-rollback.md for recovery patterns.
If stuck during verification:
Is mcp__claudikins-klaus available? (E-16)
├── No →
│ Offer: [Manual review] [Ask Claude differently] (E-17)
│ Fallback: [Accept with uncertainty] [Max retries, abort] (E-18)
└── Yes →
Spawn klaus via SubagentStop hook
The final gate. Present comprehensive evidence.
Verification Report
-------------------
Tests: ✓ 47/47 passed
Lint: ✓ 0 issues
Types: ✓ 0 errors
Build: ✓ success
Evidence:
- Screenshot: .claude/evidence/login-flow.png
- API test: POST /api/auth → 200 OK
- CLI test: mycli --help → exit 0
[Ready to Ship] [Needs Work] [Accept with Caveats]
Human decides. If approved, set unlock_ship = true.
Agents under pressure find excuses. These are all violations:
| Excuse | Reality | | ------------------------------------------ | --------------------------------------------------------------------- | | "Tests pass, that's good enough" | Tests aren't enough. SEE it working. Screenshots, curl, output. | | "I'll verify after shipping" | Verify BEFORE ship. That's the whole point. | | "The type checker caught everything" | Types don't catch runtime issues. Get evidence. | | "Screenshot failed but it probably works" | "Probably" isn't evidence. Fix the screenshot or use fallback. | | "Human checkpoint is just a formality" | Human checkpoint is the gate. No auto-shipping. | | "Code review is enough for this change" | Code review is last resort fallback. Try harder. | | "Tests are flaky, I'll ignore the failure" | Flaky tests hide real failures. Fix or explicitly accept with caveat. | | "Exit code 2 is too strict" | Exit code 2 exists to block bad ships. Pass properly. |
All of these mean: Get evidence. Human decides. No shortcuts.
If you're thinking any of these, you're about to violate the methodology:
All of these mean: STOP. Get evidence. Present to human. Let them decide.
The verify-gate.sh hook enforces the gate:
# Both conditions MUST be true
ALL_PASSED=$(jq -r '.all_checks_passed' "$STATE")
HUMAN_APPROVED=$(jq -r '.human_checkpoint.decision' "$STATE")
if [ "$ALL_PASSED" != "true" ]; then
exit 2 # Blocks claudikins-kernel:ship
fi
if [ "$HUMAN_APPROVED" != "ready_to_ship" ]; then
exit 2 # Blocks claudikins-kernel:ship
fi
File Manifest (C-6):
At verification completion, generate SHA256 hashes of all source files:
find . \( -name '*.ts' -o -name '*.py' -o -name '*.rs' \) \
| xargs sha256sum > .claude/verify-manifest.txt
This lets claudikins-kernel:ship detect if code was modified after verification.
claudikins-kernel:verify requires claudikins-kernel:execute to have completed:
if [ ! -f "$EXECUTE_STATE" ]; then
echo "ERROR: claudikins-kernel:execute has not been run"
exit 2
fi
This enforces the claudikins-kernel:outline → claudikins-kernel:execute → claudikins-kernel:verify → claudikins-kernel:ship flow.
| Agent | Role | When | | -------------- | ---------------- | ---------------------------------- | | catastrophiser | See code working | Phase 2: Output verification | | cynic | Polish pass | Phase 3: Simplification (optional) |
Both agents run with context: fork and background: true.
See agent-integration.md for coordination patterns.
{
"session_id": "verify-2026-01-16-1100",
"execute_session_id": "execute-2026-01-16-1030",
"branch": "execute/task-1-auth-middleware",
"phases": {
"test_suite": { "status": "PASS", "count": 47 },
"lint": { "status": "PASS", "issues": 0 },
"type_check": { "status": "PASS", "errors": 0 },
"output_verification": { "status": "PASS", "agent": "catastrophiser" },
"code_simplification": { "status": "PASS", "agent": "cynic" }
},
"all_checks_passed": true,
"human_checkpoint": {
"decision": "ready_to_ship",
"caveats": []
},
"unlock_ship": true,
"verified_manifest": "sha256:...",
"verified_commit_sha": "abc123..."
}
Don't do these:
| Situation | Reference | | -------------------------- | ----------------------------------------------------------------------------- | | Tests hang or timeout | test-timeout-handling.md | | Auto-fix breaks code | lint-fix-validation.md | | Primary verification fails | verification-method-fallback.md | | Type-check results unclear | type-check-confidence.md | | cynic breaks tests | cynic-rollback.md | | Large project state | verify-state-compression.md |
Full documentation in this skill's references/ folder:
development
Use when running claudikins-kernel:ship, preparing PRs, writing changelogs, deciding merge strategy, or handling CI failures — enforces GRFP-style iterative approval, code integrity validation, and human-gated merges
testing
Use when running claudikins-kernel:execute, decomposing plans into tasks, setting up two-stage review, deciding batch sizes, or handling stuck agents — enforces isolation, verification, and human checkpoints; prevents runaway parallelization and context death
testing
Use when running claudikins-kernel:outline, brainstorming implementation approaches, gathering requirements iteratively, structuring complex technical plans, or facing analysis paralysis with too many options — provides iterative human-in-the-loop planning with explicit checkpoints and trade-off presentation
development
Maintainer-only workflow for handling GitHub Secret Scanning alerts on OpenClaw. Use when Codex needs to triage, redact, clean up, and resolve secret leakage found in issue comments, issue bodies, PR comments, or other GitHub content.