SKILL.md

name:: spec-compliance-audit
description:: Trigger HAS_DOCS flag in template_recommendations.md (recon detects non-empty DOCS_PATH - whitepaper, spec, or design doc provided) - Agent Type general-purpose (standalone nich...

Niche Agent: Spec-to-Code Compliance

Trigger: HAS_DOCS flag in template_recommendations.md (recon detects non-empty DOCS_PATH - whitepaper, spec, or design doc provided)
Agent Type: general-purpose (standalone niche agent, NOT injected into another agent)
Budget: 1 depth budget slot in Phase 4b iteration 1
Finding prefix: [SPEC-N]

When This Agent Spawns

Recon Agent 1B processes DOCS_PATH (whitepaper, spec, or design doc). If the docs are non-empty and contain protocol behavior claims (fee structures, token distribution, thresholds, permissions, state transitions), recon sets the HAS_DOCS flag in the BINDING MANIFEST under ## Niche Agents.

The orchestrator spawns this agent in Phase 4b iteration 1 alongside the standard agents, using 1 budget slot. The agent gets a CLEAN context window containing ONLY the docs and code, so its attention is not diluted by other findings.
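
The spawn condition described above can be sketched as a small predicate. This is only an illustration: `ReconResult`, its fields, `BEHAVIOR_CLAIM_KINDS`, and `has_docs_flag` are hypothetical names, not part of the actual recon or orchestrator code, and how recon detects claim kinds is assumed to happen elsewhere.

```python
from dataclasses import dataclass, field

# Claim kinds listed in the text above. Representing them as a set of
# strings is an assumption made for this sketch.
BEHAVIOR_CLAIM_KINDS = {
    "fee structure",
    "token distribution",
    "threshold",
    "permission",
    "state transition",
}


@dataclass
class ReconResult:
    """Hypothetical summary of what recon found at DOCS_PATH."""
    docs_text: str = ""
    claim_kinds_found: set = field(default_factory=set)


def has_docs_flag(recon: ReconResult) -> bool:
    """HAS_DOCS requires non-empty docs AND at least one behavior claim."""
    non_empty = bool(recon.docs_text.strip())
    has_claims = bool(recon.claim_kinds_found & BEHAVIOR_CLAIM_KINDS)
    return non_empty and has_claims
```

Under these assumptions, whitespace-only docs or docs with only marketing language would not set the flag, so the agent would not spawn.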

Why a Dedicated Agent

Spec compliance requires reading two large artifacts (documentation + code) and systematically comparing them. Injecting this into a breadth agent would cause severe attention dilution - the agent would either skim the docs or skip compliance checks in favor of vulnerability hunting. A dedicated agent ensures every spec claim is verified.

Agent Prompt Template

Task(subagent_type="general-purpose", prompt="
You are the Spec Compliance Agent. You compare documentation claims against actual code behavior.

## Your Inputs
Read:
- The documentation file(s) at {DOCS_PATH}
- {SCRATCHPAD}/design_context.md (extracted trust assumptions)
- {SCRATCHPAD}/function_list.md (all functions)
- {SCRATCHPAD}/state_variables.md (all state variables)
- Source files in scope

## Processing Protocol (MANDATORY)

For each analysis step below, execute in order:
1. **ENUMERATE targets**: List every entity the step applies to (claims, functions, parameters) as a numbered list before analysis begins.
2. **PROCESS exhaustively**: Analyze each numbered entity. Mark each "DONE" or "N/A (reason)" before moving to the next.
3. **COVERAGE GATE**: Count enumerated vs processed. If any entity lacks a marker, process it before proceeding to the next step.

## STEP 1: Extract Spec Claims

Read the documentation thoroughly. Extract every CONCRETE, TESTABLE claim into a structured list:

| # | Claim | Source Section | Claim Type | Testable? |
|---|-------|---------------|------------|-----------|

**Claim Types**:
- PARAMETER: Specific numeric value (fee = 0.3%, max supply = 1M, cooldown = 7 days)
- FLOW: Token/value flow description (fees go to treasury, rewards distributed proportionally)
- PERMISSION: Access control claim (only admin can pause, anyone can liquidate)
- INVARIANT: Protocol-wide guarantee (total shares == total assets, no negative balances)
- SEQUENCE: Operational ordering (must stake before claiming, lock before unlock)
- THRESHOLD: Boundary condition (liquidation at 80% LTV, quorum at 50%+1)

Skip vague/marketing claims ('secure', 'efficient', 'battle-tested'). Only extract claims that can be verified against code.

**Target**: 10-30 claims depending on doc depth. If docs are thin (<10 claims), note coverage gap and proceed.

## STEP 2: Verify Each Claim Against Code

For EACH extracted claim, find the corresponding code and verify:

| # | Claim | Code Location | Match? | Details |
|---|-------|---------------|--------|---------|

**Match types**:
- MATCH: Code implements exactly what spec says
- MISMATCH: Code contradicts spec (wrong value, wrong logic, wrong recipient)
- PARTIAL: Code partially implements (some cases match, some don't)
- MISSING: Spec describes feature that code does not implement
- STRONGER: Code has stricter constraints than spec requires (usually safe)
- WEAKER: Code has looser constraints than spec states (usually a finding)

For each non-MATCH result, read the actual code and quote the specific lines.

## STEP 3: Classify Divergences

For each MISMATCH, MISSING, or WEAKER result:

1. **Impact**: What goes wrong if users trust the spec but code behaves differently?
2. **Severity**: Use standard matrix (Impact x Likelihood). Likelihood is HIGH if users/integrators would reasonably rely on the spec claim.
3. **Root cause**: Is this a doc bug (code is correct, doc is wrong) or code bug (doc is correct, code is wrong)? Report BOTH - the audit team decides.

## STEP 4: Check Inverse - Code Without Spec

Scan function_list.md for significant functions that the documentation does NOT mention:
- State-changing functions with no doc coverage
- Fee/reward mechanisms not described in docs
- Emergency/admin functions not in the trust model

These are not vulnerabilities per se, but document them as INFO findings - undocumented behavior is a trust risk.

## STEP 5: Enforcement-Gap Check (L1 and Cross-Chain)

For each PARAMETER, THRESHOLD, INVARIANT, or SEQUENCE claim marked MATCH
in STEP 2, verify there is an ACTIVE CHECK in code — not merely a stated
assumption. A claim like 'data producers must upload every partition' is
NOT satisfied by a constant or comment; there must be a code path that
slashes / rejects / alarms when the claim is violated. The bug class is
'spec claims X; code relies on honest actors to volunteer X'.

Concretely, for each matched claim:

| Claim | Documented Obligation | Code Enforcement Site | Penalty on Violation |
|-------|-----------------------|-----------------------|----------------------|

If the "Code Enforcement Site" column is empty or points to a comment,
promote to a finding tagged `[SPEC-NO-ENFORCEMENT:{claim}]`. Severity is
High when violation is silently profitable for the actor (e.g., validator
gets block reward without performing the claimed work); Medium when it
degrades service quality without direct economic gain.

This step exists because the DA-commitment class of bug — where a validator
commits to data availability but nothing downstream samples / verifies /
challenges the commitment — is invisible to STEP 2 (the spec says X, the
code says X, both agree X is the contract; neither enforces X).

## STEP 6: Implicit-Assumption Extraction

Re-scan the documentation for statements of the form 'we assume that',
'trusted to', 'it is expected that', 'relayers / validators / operators
will' — these are implicit trust statements that look like design
commentary but are actually unverified preconditions. For each such
statement, add a row to:

| Assumed Behavior | Who | Check in Code? | What Breaks if False |
|------------------|-----|----------------|----------------------|

Any row with "Check in Code? = NO" and a blast radius greater than 'a
single actor's own reward' is a finding tagged
`[SPEC-IMPLICIT-TRUST:{actor}:{behavior}]`.

**Coverage assertion**: Before returning, verify every entity enumerated under each step has been processed. Report enumerated vs analyzed counts in your return message.

## Output Requirements
Write to {SCRATCHPAD}/niche_spec_compliance_findings.md
Use finding IDs: [SPEC-1], [SPEC-2]...
Use standard finding format with Verdict, Severity, Location, Description, Impact, Evidence.

For each finding, include:
- **Spec Claim**: Exact quote from documentation
- **Code Reality**: Exact code behavior with file:line reference
- **Divergence Type**: MISMATCH / MISSING / WEAKER

Maximum 10 findings - prioritize by severity.

## Quality Gate
Every finding MUST cite both the spec source (section/page) AND the code location (file:line).
Findings without both references will be discarded.

Return: 'DONE: {N} spec divergences - {M} MISMATCH, {P} MISSING, {W} WEAKER, {I} undocumented behaviors'
")
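
The coverage gate, the quality gate, and the return message in the prompt above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the `Finding` type, the `"INFO"` label for undocumented behaviors, and all function names are hypothetical, and it assumes `{N}` in the return message is the sum of the MISMATCH, MISSING, and WEAKER counts, which the prompt does not state explicitly.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical in-memory form of one finding."""
    divergence: str     # "MISMATCH", "MISSING", "WEAKER", or "INFO"
    spec_source: str    # section/page in the documentation
    code_location: str  # file:line


def coverage_gap(markers: dict) -> list:
    """Return enumerated entity numbers that lack a DONE or N/A marker."""
    return [
        n for n, mark in sorted(markers.items())
        if not (mark == "DONE" or mark.startswith("N/A"))
    ]


def passes_quality_gate(f: Finding) -> bool:
    """A finding must cite both the spec source and the code location."""
    return bool(f.spec_source.strip()) and bool(f.code_location.strip())


def return_message(findings: list) -> str:
    """Build the DONE line; N is assumed to be M + P + W."""
    counts = Counter(f.divergence for f in findings)
    n = counts["MISMATCH"] + counts["MISSING"] + counts["WEAKER"]
    return (
        f"DONE: {n} spec divergences - {counts['MISMATCH']} MISMATCH, "
        f"{counts['MISSING']} MISSING, {counts['WEAKER']} WEAKER, "
        f"{counts['INFO']} undocumented behaviors"
    )
```

For example, `coverage_gap({1: "DONE", 2: "", 3: "N/A (no code path)"})` returns `[2]`, meaning entity 2 must be processed before the agent moves to the next step.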

Integration Point

This agent's output (niche_spec_compliance_findings.md) is read by:

- Phase 4a inventory merge (after Phase 4b iteration 1)
- Phase 4c chain analysis (enabler enumeration - spec mismatches can enable other attacks)
- Phase 6 report writers (findings appear in the report like any other finding)

