skills/cleanexpo/verification-protocol/SKILL.md
Independent verification of task completion - eliminates self-attestation
npx skillsauth add aiskillstore/marketplace verification-protocol
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
The Problem: Agents were verifying their own work and always returning success: true by default.
The Solution: Independent verification by a DIFFERENT agent that does NOT trust the original agent's claims.
The Rule: verified=true ONLY when EVIDENCE proves all completion criteria are met.
NEVER verify your own work.
ALWAYS verify with independent evidence.
ASSUME claims are false until proven true.
Block completion without proof.
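The first and last rules can be enforced mechanically before any verdict is accepted. The following is a minimal sketch, not the skill's actual implementation: the field names follow the JSON used in this document, while `VerificationAttempt` and `checkIndependence` are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape: field names mirror the JSON in this document,
// but this guard is an illustration, not the skill's own code.
interface VerificationAttempt {
  requesting_agent_id: string;
  verifier_agent_id: string;
  evidence: { criterion: string; result: "pass" | "fail"; proof: string }[];
}

// Returns a list of rule violations; an empty list means the attempt may proceed.
function checkIndependence(attempt: VerificationAttempt): string[] {
  const violations: string[] = [];
  if (attempt.verifier_agent_id === attempt.requesting_agent_id) {
    violations.push("Self-verification: verifier must be a different agent");
  }
  if (attempt.evidence.length === 0) {
    violations.push("No evidence supplied: completion is blocked without proof");
  }
  return violations;
}
```

An orchestrator could reject any verdict for which this list is non-empty, regardless of what the `verified` flag claims.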
Agent claims task is complete and provides:
{
"task_id": "task-123",
"claimed_outputs": ["/path/to/file.ts", "/path/to/test.ts"],
"completion_criteria": [
"file_exists:/path/to/file.ts",
"no_placeholders:/path/to/file.ts",
"typescript_compiles:/path/to/file.ts",
"lint_passes:/path/to/file.ts",
"tests_pass:/path/to/test.ts"
]
}
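Each completion criterion in the request above is a `kind:target` string. A verifier needs to split these before dispatching checks. The sketch below assumes that format holds for every criterion and only recognizes the five kinds shown in the request; `parseCriterion` and its types are hypothetical names, not part of the skill.

```typescript
// The five kinds that appear in the example request; anything else is rejected.
type CriterionKind =
  | "file_exists"
  | "no_placeholders"
  | "typescript_compiles"
  | "lint_passes"
  | "tests_pass";

interface Criterion {
  kind: CriterionKind;
  target: string;
}

const KNOWN_KINDS: string[] = [
  "file_exists",
  "no_placeholders",
  "typescript_compiles",
  "lint_passes",
  "tests_pass",
];

// Splits on the FIRST colon only, so targets that contain colons stay intact.
function parseCriterion(raw: string): Criterion {
  const sep = raw.indexOf(":");
  if (sep <= 0 || sep === raw.length - 1) {
    throw new Error(`Malformed criterion: ${raw}`);
  }
  const kind = raw.slice(0, sep);
  if (KNOWN_KINDS.indexOf(kind) === -1) {
    throw new Error(`Unknown criterion kind: ${kind}`);
  }
  return { kind: kind as CriterionKind, target: raw.slice(sep + 1) };
}
```

Rejecting unknown kinds, rather than ignoring them, keeps an unrecognized criterion from silently counting as satisfied.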
Orchestrator sends to Independent Verifier Agent (different agent).
Independent Verifier checks EVERY criterion with actual evidence:
file_exists → fs.stat(path) && size > 0
Proof: /path/to/file.ts, 1,247 bytes, modified 2025-12-02T14:30:00Z
no_placeholders → Scan for TODO, TBD, FIXME, [INSERT]
Proof: 0 placeholders found
typescript_compiles → npx tsc --noEmit [file]
Proof: Compilation successful, 0 errors
lint_passes → npx eslint [file]
Proof: 0 linting errors
tests_pass → npm test -- [file]
Proof: 15 tests passed, 0 failed
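As an illustration of the `no_placeholders` check, here is a minimal sketch of a scanner that reports the line number of each marker. It works on file contents already read into a string; the function name and the exact regular expression are assumptions, and the marker list combines those named in this document.

```typescript
interface PlaceholderHit {
  line: number;   // 1-based line number
  marker: string; // the marker text that matched
}

// Word boundaries keep longer identifiers such as TODO_LIST from matching.
const PLACEHOLDER_PATTERN = /\b(?:TODO|TBD|FIXME)\b|\[INSERT\]|\[IMPLEMENT\]/g;

function scanForPlaceholders(source: string): PlaceholderHit[] {
  const hits: PlaceholderHit[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    // With a global regex, match() returns every match on the line, or null.
    const found = text.match(PLACEHOLDER_PATTERN) || [];
    found.forEach((marker) => hits.push({ line: index + 1, marker }));
  });
  return hits;
}

const hits = scanForPlaceholders("const a = 1;\n// TODO: finish\nreturn [INSERT];");
// hits → [{ line: 2, marker: "TODO" }, { line: 3, marker: "[INSERT]" }]
```

The criterion passes only when the returned list is empty; the list itself serves as the proof in either case.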
{
"verified": true,
"evidence": [
{
"criterion": "file_exists:/path/to/file.ts",
"method": "fs.stat(path) && size > 0",
"result": "pass",
"proof": "File: /path/to/file.ts, Size: 1247 bytes"
},
// ... more evidence ...
],
"failures": [],
"verifier_agent_id": "independent-verifier-1",
"timestamp": "2025-12-02T14:30:00Z"
}
verified=true → Task marked COMPLETE, evidence logged
verified=false → Task returned to agent with failure list
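The rule that `verified` is true only when every criterion is backed by evidence can be sketched as a pure function. The version below is a sketch under stated assumptions: the `Evidence` fields follow the response JSON above, while `buildVerdict`, the `"none"` method label, and the treatment of blank proofs as failures are illustrative choices.

```typescript
interface Evidence {
  criterion: string;
  method: string;
  result: "pass" | "fail";
  proof: string;
}

interface Verdict {
  verified: boolean;
  evidence: Evidence[];
  failures: Evidence[];
}

function buildVerdict(criteria: string[], evidence: Evidence[]): Verdict {
  // A criterion with no evidence entry is a failure: absence of proof is not proof.
  const missing: Evidence[] = criteria
    .filter((c) => !evidence.some((e) => e.criterion === c))
    .map((c) => ({
      criterion: c,
      method: "none",
      result: "fail" as const,
      proof: "No evidence collected",
    }));
  // A "pass" with a blank proof is also treated as a failure.
  const failures = evidence
    .filter((e) => e.result === "fail" || e.proof.trim() === "")
    .concat(missing);
  // An empty criteria list never verifies; there is nothing to prove.
  return { verified: criteria.length > 0 && failures.length === 0, evidence, failures };
}
```

Because the verdict is derived from the evidence list, a verifier cannot return `verified: true` without first producing an entry for every criterion.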
Method: fs.existsSync(path) && fs.statSync(path).size > 0
Evidence: File path, size in bytes, last modified timestamp
Failure Triggers:
Method: Regex scan for TODO, TBD, FIXME, [INSERT], [IMPLEMENT]
Evidence: Count and line numbers of placeholders found
Failure Triggers:
Method: npx tsc --noEmit [file]
Evidence: Compiler output, error count, error details
Failure Triggers:
Method: npx eslint [file] --format json
Evidence: Lint output, error/warning counts
Failure Triggers:
Method: npm test -- [file] --run
Evidence: Test output, pass/fail counts, coverage
Failure Triggers:
Method: HTTP request to endpoint, check status code and response shape
Evidence: Status code, response time, response body sample
Failure Triggers:
| Criterion | Evidence Type | Example |
|-----------|---------------|---------|
| file_exists | File path, size, timestamp | /src/lib/file.ts, 2,541 bytes, 2025-12-02 14:30:00 |
| no_placeholders | Scan results | 0 placeholders found or Found 2: Line 15, Line 42 |
| compiles | Compiler output | 0 TypeScript errors |
| lint_passes | Linter output | 0 errors, 2 warnings |
| tests_pass | Test results | 15 passed, 0 failed |
| endpoint_responds | Status code + response | Status 200, response time 45ms |
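To show how raw tool output becomes a proof string like the `lint_passes` row above, here is a minimal sketch that summarizes the output of `npx eslint [file] --format json`. ESLint's JSON formatter emits an array of per-file results carrying `errorCount` and `warningCount`. Treating warnings as non-failing and an empty result list as a failure are assumptions made here, and `summarizeEslintJson` is a hypothetical helper.

```typescript
// Subset of the per-file result object emitted by ESLint's json formatter.
interface EslintFileResult {
  filePath: string;
  errorCount: number;
  warningCount: number;
}

function summarizeEslintJson(stdout: string): { result: "pass" | "fail"; proof: string } {
  const results = JSON.parse(stdout) as EslintFileResult[];
  if (results.length === 0) {
    // Nothing was linted, so there is no evidence that lint passes.
    return { result: "fail", proof: "No files linted" };
  }
  const errors = results.reduce((sum, r) => sum + r.errorCount, 0);
  const warnings = results.reduce((sum, r) => sum + r.warningCount, 0);
  // Assumption: only errors fail the criterion; warnings are reported but tolerated.
  return {
    result: errors === 0 ? "pass" : "fail",
    proof: `${errors} errors, ${warnings} warnings`,
  };
}
```

The same pattern applies to the other criteria: parse the tool's structured output, then derive both the pass/fail result and the proof from the same numbers.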
// WRONG - Agent grades its own homework
return { verified: true, message: "I completed it" };
// WRONG - Doesn't actually check
if (claimedFile) {
return { verified: true }; // No evidence!
}
// WRONG - "This check is slow, skip it for now"
if (criterion === 'tests_pass') {
return { verified: true }; // NEVER skip checks
}
// WRONG - "Looks about right"
if (output.includes('success')) {
return { verified: true }; // No proof!
}
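// RIGHT - Actual evidence collected
// Assumes: import { promises as fs } from 'fs';
// stat() rejects for a missing file, so map that to null instead of crashing
const result = await fs.stat(filePath).catch(() => null);
if (result && result.size > 0) {
  return {
    verified: true,
    evidence: [{
      criterion: 'file_exists',
      proof: `File size: ${result.size} bytes`
    }]
  };
}
// Missing or empty file: fail explicitly, with proof of the failure
return {
  verified: false,
  failures: [{
    criterion: 'file_exists',
    proof: result ? 'File is empty (0 bytes)' : 'File not found'
  }]
};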
Agent receives detailed failure report:
{
"verified": false,
"failures": [
{
"criterion": "tests_pass:/tests/unit/feature.test.ts",
"reason": "Test execution failed",
"proof": "Expected 10 tests to pass, 3 failed"
}
],
"retry_count": 1,
"max_retries": 3
}
Task escalates to human review:
{
"status": "escalated_to_human",
"reason": "Failed verification 3 times",
"failures_history": [...]
}
Task: Agent claims file was created and is ready for deployment
Evidence Collected:
✓ file_exists:/src/lib/agents/new-agent.ts
Size: 3,847 bytes, Created: 2025-12-02 14:30:00
✓ no_placeholders:/src/lib/agents/new-agent.ts
Scan found 0 TODO/TBD/FIXME markers
✓ typescript_compiles:/src/lib/agents/new-agent.ts
tsc --noEmit completed successfully
✓ lint_passes:/src/lib/agents/new-agent.ts
eslint: 0 errors, 0 warnings
✓ tests_pass:/tests/new-agent.test.ts
npm test: 12 passed, 0 failed
Result: verified: true ✓ All evidence confirms completion
Task: Agent claims feature is complete
Evidence Collected:
✗ file_exists:/src/lib/features/new-feature.ts
File not found: ENOENT: no such file or directory
✗ tests_pass:/tests/features/new-feature.test.ts
Test file not found: ENOENT: no such file or directory
✗ typescript_compiles:/src/lib/features/incomplete.ts
Compilation failed: Missing return type (line 42)
Result: verified: false ✗ Multiple criteria failed, agent must fix
import { independentVerifier } from '@/lib/agents/independent-verifier';
// DO NOT return success directly
// DO call Independent Verifier
const result = await independentVerifier.verify({
task_id: 'my-task-123',
claimed_outputs: ['/path/to/file.ts'],
completion_criteria: [
'file_exists:/path/to/file.ts',
'no_placeholders:/path/to/file.ts',
'typescript_compiles:/path/to/file.ts'
],
requesting_agent_id: this.agent_id
});
// Return the verification result (not your own assessment)
return result;
// Before marking task complete:
const verification = await independentVerifier.verify({
task_id: task.id,
claimed_outputs: task.outputs,
completion_criteria: task.criteria,
requesting_agent_id: task.agent_id
});
if (!verification.verified) {
// Return task to agent for fixes
task.status = 'verification_failed';
task.failures = verification.failures;
task.retry_count++;
if (task.retry_count >= 3) {
task.status = 'escalated_to_human';
}
return;
}
// Only mark complete with verification proof
task.status = 'complete';
task.verification = verification;
Endpoint: GET /api/health
Status: ✓ Working
Use: Basic system health check
Endpoint: GET /api/health/deep
Status: ✓ Working
Use: Comprehensive dependency checks
Endpoint: GET /api/health/routes
Status: ✓ Working
Use: Verify all API routes are accessible
All health endpoints return verifiable evidence of system state.
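An `endpoint_responds` check against these endpoints could evaluate the observed response as in the sketch below. The response body of the health endpoints is not specified in this document, so the `requiredKeys` parameter, the `EndpointObservation` type, and the 2xx success range are all assumptions made for illustration.

```typescript
// What the verifier observed after making the HTTP request.
interface EndpointObservation {
  status: number;
  elapsedMs: number;
  body: unknown; // parsed JSON body, shape unknown in advance
}

function evaluateEndpoint(
  obs: EndpointObservation,
  requiredKeys: string[], // hypothetical: top-level keys the body must contain
): { result: "pass" | "fail"; proof: string } {
  const isObject = typeof obs.body === "object" && obs.body !== null;
  const missing = requiredKeys.filter((k) => !isObject || !(k in (obs.body as object)));
  const statusOk = obs.status >= 200 && obs.status < 300;
  let proof = `Status ${obs.status}, response time ${obs.elapsedMs}ms`;
  if (missing.length > 0) {
    proof += `, missing keys: ${missing.join(", ")}`;
  }
  return { result: statusOk && missing.length === 0 ? "pass" : "fail", proof };
}

const check = evaluateEndpoint({ status: 200, elapsedMs: 45, body: { status: "ok" } }, ["status"]);
// check → { result: "pass", proof: "Status 200, response time 45ms" }
```

Keeping the evaluation separate from the HTTP call means the same function can be applied to a recorded observation when auditing past verifications.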
After implementing Verification Protocol:
| Metric | Before | After |
|--------|--------|-------|
| Tasks verified without evidence | 100% | 0% |
| False completions accepted | Unknown | 0% |
| Completion claims with evidence | 0% | 100% |
| Automatic escalation to human | N/A | Happens after 3 failures |
| Audit trail completeness | Partial | Full with evidence |
1. NEVER verify your own work
2. ALWAYS use Independent Verifier
3. ALWAYS provide EVIDENCE
4. NEVER assume success
5. BLOCK completion without proof
6. ESCALATE after 3 failures
Status: Production Ready (v1.0.0)
Last Updated: 2025-12-02
Critical: Yes - Blocks all task completions without proof