Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

nyldn/skill-verification-gate

Name: skill-verification-gate
Author: nyldn

skills/skill-verification-gate/SKILL.md

npx skillsauth add nyldn/claude-octopus skill-verification-gate

Host: Codex CLI — This skill was designed for Claude Code and adapted for Codex. Cross-reference commands use installed skill names in Codex rather than /octo:* slash commands. Use the active Codex shell and subagent tools. Do not claim a provider, model, or host subagent is available until the current session exposes it. For host tool equivalents, see skills/blocks/codex-host-adapter.md.

Verification Gate

The Iron Law

<HARD-GATE> NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE </HARD-GATE>

NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE

If you haven't run the verification command in this turn, you cannot claim it passes.

The Gate

Before claiming any success or expressing satisfaction:

1. IDENTIFY — What command proves this claim?
2. RUN — Execute the full command (fresh, not cached)
3. READ — Full output, check exit code, count failures
4. VERIFY — Does output actually confirm the claim?
5. ONLY THEN — State the claim WITH evidence

Skip any step = the claim is unverified.
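The gate steps above can be sketched as a small shell function. This is an illustrative sketch, not part of the skill: `verify_claim`, its argument order, and the `VERIFIED`/`UNVERIFIED` labels are all hypothetical names chosen for this example. It runs the command fresh, prints the full output and exit code, and states the claim only when the exit code is 0 and the output matches a pattern the caller supplies.

```sh
# verify_claim CLAIM PATTERN COMMAND [ARGS...]
# Hypothetical helper (an assumption of this sketch, not shipped with the skill).
verify_claim() {
  vc_claim=$1
  vc_pattern=$2
  shift 2

  # RUN: execute the full command now and capture stdout and stderr.
  vc_output=$("$@" 2>&1)
  vc_status=$?

  # READ: print the complete output and the exit code.
  printf '%s\n' "$vc_output"
  printf 'exit code: %s\n' "$vc_status"

  # VERIFY: exit 0 alone is not enough; the output must also match the pattern.
  if [ "$vc_status" -eq 0 ] &&
     printf '%s\n' "$vc_output" | grep -Eq -- "$vc_pattern"; then
    # ONLY THEN: state the claim, with the evidence printed just above it.
    printf 'VERIFIED: %s\n' "$vc_claim"
  else
    printf 'UNVERIFIED: %s\n' "$vc_claim"
    return 1
  fi
}
```

A call might look like `verify_claim "tests pass" 'Tests: [0-9]+ passed' npm test`, where the pattern assumes a test runner that prints a summary line in that form. Reading the output yourself is still required; the pattern only guards against an exit code of 0 with no real result behind it.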

What Counts as Evidence

| Claim | Requires | NOT Sufficient |
|-------|----------|----------------|
| Tests pass | Test command output showing 0 failures | Previous run, "should pass" |
| Build succeeds | Build command exit 0 | Linter passing |
| Bug fixed | Reproduce original symptom: now passes | "Code changed, should work" |
| Regression test works | Red (fail without fix) → Green (pass with fix) | Test passes once |
| Subagent completed task | git diff shows expected changes | Subagent says "done" |
| Requirements met | Line-by-line checklist against spec | Tests passing |
| Provider dispatch worked | Output contains expected content | No error ≠ success |

Red Flags — STOP and Verify

If you catch yourself thinking any of these, STOP:

| Thought | What to do instead |
|---------|-------------------|
| "Should work now" | Run the verification |
| "I'm confident" | Confidence ≠ evidence |
| "Just this once" | No exceptions |
| "The linter passed" | Linter ≠ tests ≠ build |
| "The agent said it worked" | Verify independently |
| "It's a small change" | Small changes cause big bugs |

Multi-Provider Context

In Claude Octopus workflows, verification is especially critical because:

Provider outputs can be hallucinated — Codex/Gemini/Copilot may claim success without evidence
Consensus ≠ correctness — three models agreeing doesn't mean they're right
Synthesis files may be stale — check timestamps, don't assume freshness
orchestrate.sh exit code 0 ≠ quality — the script ran, but did it produce good output?

After any multi-provider workflow:

# Verify the newest synthesis file exists and is recent (ls -t sorts newest first)
ls -lt ~/.claude-octopus/results/*-synthesis-*.md | head -1

# Verify that file has content (not just headers)
wc -l "$(ls -t ~/.claude-octopus/results/*-synthesis-*.md | head -1)"
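The two manual checks can also be combined into one scripted check that fails loudly. The following is a minimal sketch under stated assumptions: `check_synthesis` is a hypothetical helper, the 10-minute freshness window is an arbitrary default, and "content" is taken to mean any line that is neither blank nor a markdown heading. It relies on `find -mmin`, which GNU and BSD `find` provide but POSIX does not require.

```sh
# check_synthesis DIR [MAX_AGE_MINUTES]
# Hypothetical helper; the default window of 10 minutes is an assumption.
check_synthesis() {
  cs_dir=$1
  cs_max_age=${2:-10}

  # Pick the most recently modified synthesis file, if any exists.
  cs_file=$(ls -t "$cs_dir"/*-synthesis-*.md 2>/dev/null | head -n 1)
  if [ -z "$cs_file" ]; then
    echo "FAIL: no synthesis file in $cs_dir"
    return 1
  fi

  # find prints the path only if it was modified less than N minutes ago.
  if [ -z "$(find "$cs_file" -mmin "-$cs_max_age")" ]; then
    echo "FAIL: $cs_file is older than $cs_max_age minutes"
    return 1
  fi

  # Count lines that are neither blank nor markdown headings.
  cs_body=$(grep -vcE '^(#|[[:space:]]*$)' "$cs_file" || true)
  if [ "$cs_body" -eq 0 ]; then
    echo "FAIL: $cs_file contains only headers"
    return 1
  fi

  echo "OK: $cs_file ($cs_body content lines)"
}
```

For the layout used in this skill, the call would be `check_synthesis ~/.claude-octopus/results`. A passing result shows the file is present, recent, and non-empty; it says nothing about whether the synthesis is any good, which still needs to be read.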

When to Apply

ALWAYS before:

Committing code
Creating PRs
Marking tasks complete
Moving to next workflow phase
Reporting results to user
Claiming a bug is fixed
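For the "Committing code" case, one way to make the gate hard to skip is a Git pre-commit hook, since Git aborts a commit when that hook exits with a non-zero status. The sketch below is an assumption-laden illustration: `pre_commit_gate` and the `VERIFY_CMD` variable are hypothetical names, and `npm test` is only a default borrowed from the example in this document.

```sh
# pre_commit_gate: run the project's verification command before a commit.
# VERIFY_CMD is a hypothetical variable; it defaults to "npm test" here.
pre_commit_gate() {
  pg_cmd=${VERIFY_CMD:-npm test}
  echo "Running verification: $pg_cmd"
  if sh -c "$pg_cmd"; then
    echo "Verification passed; commit may proceed."
  else
    echo "Verification failed; commit blocked." >&2
    return 1
  fi
}
```

To use it, a `.git/hooks/pre-commit` script would define the function and end with `pre_commit_gate || exit 1`. The hook covers only the commit step; the other items in the list above still need the gate applied by hand.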

In orchestrate.sh workflows:

After probe (discover) — verify synthesis file exists
After grasp (define) — verify consensus score meets threshold
After tangle (develop) — verify tests pass, not just that code was written
After ink (deliver) — verify review actually ran, not just that it was dispatched

Examples

Correct: Evidence-Based Claim

$ npm test
  ✓ user.create() saves to database (45ms)
  ✓ user.create() validates email (12ms)
  Tests: 2 passed, 2 total

All 2 tests pass. ← Claim backed by output.

Incorrect: Claim Without Evidence

I've implemented the feature. It should work now. The tests should pass.
← No test was run. "Should" is not evidence.

Correct: Regression Test Red-Green

1. Write test → run → FAIL (expected, proves test detects the bug)
2. Implement fix → run → PASS (proves fix works)
3. Revert fix → run → FAIL (proves test isn't false-positive)
4. Restore fix → run → PASS (final confirmation)
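The four-step cycle above can be automated when the fix can be applied and reverted by command. This is a sketch, not the skill's own tooling: `expect_result` and `red_green` are hypothetical helpers, each argument is a command string run with `sh -c`, and the function assumes the fix is not applied when it starts.

```sh
# expect_result WANT COMMAND_STRING
# Runs the command, reports pass or fail, and succeeds only if it matches WANT.
expect_result() {
  er_want=$1
  if sh -c "$2" >/dev/null 2>&1; then er_got=pass; else er_got=fail; fi
  printf '%s: expected %s, got %s\n' "$2" "$er_want" "$er_got"
  [ "$er_got" = "$er_want" ]
}

# red_green TEST_CMD APPLY_FIX_CMD REVERT_FIX_CMD
# Assumes the fix is NOT applied when the function starts.
red_green() {
  rg_test=$1; rg_apply=$2; rg_revert=$3
  expect_result fail "$rg_test" || return 1   # 1. red: test detects the bug
  sh -c "$rg_apply" || return 1
  expect_result pass "$rg_test" || return 1   # 2. green: fix works
  sh -c "$rg_revert" || return 1
  expect_result fail "$rg_test" || return 1   # 3. red again: not a false positive
  sh -c "$rg_apply" || return 1
  expect_result pass "$rg_test"               # 4. green: final confirmation
}
```

With the fix saved as a patch, a call could be `red_green 'npm test' 'git apply fix.patch' 'git apply -R fix.patch'`, where `fix.patch` is a hypothetical file name. A test that passes at step 1 is rejected immediately, which is the case the "Test passes once" row of the evidence table warns about.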

Integration with Other Skills

This skill is referenced by:

flow-develop.md — verification gate after implementation
flow-deliver.md — verification gate before delivery
skill-code-review.md — verify review findings before reporting
skill-tdd.md — red-green cycle requires evidence at each step
skill-factory.md — autonomous pipeline must verify at every phase

Evidence before claims — run verification commands before declaring work complete, fixed, or passing

3,455 stars

data-ai

Updated Jun 3, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add nyldn/claude-octopus skill-verification-gate

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | % |
|---------|-------------|--------|---|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Jun 3, 2026, 3:15 AM · 26.7s · 2 files scanned

Related Skills

nyldn/skill-council

testing

Verified · Trusted · Community

Run a configurable multi-LLM council with personas, budget caps, synthesis, veto gates, and optional implementation handoff.

3,455 stars · SKILL.md · Updated May 23, 2026

nyldn/skill-verify

testing

Verified · Trusted · Community

Evidence before claims — run verification commands before declaring work complete, fixed, or passing

3,455 stars · SKILL.md · Updated Apr 16, 2026

nyldn/skill-debate

development

Verified · Trusted · Community

Structured four-way AI debates between Claude, Sonnet, Gemini, and Codex — use for critical decisions

3,455 stars · SKILL.md · Updated Apr 16, 2026

nyldn/octopus-architecture

development

Verified · Trusted · Community

System architecture and API design with multi-AI consensus — use for design reviews and new subsystems

3,455 stars · SKILL.md · Updated Apr 16, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

# Clone the repo
git clone https://github.com/nyldn/claude-octopus.git

# Copy into Claude Code skills folder (global)
cp -r claude-octopus/skills/skill-verification-gate ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

nyldn/claude-octopus

3,455 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT