agents/skills/aptos/verification-protocol/SKILL.md
How to prove a hypothesis is TRUE or FALSE using Move unit tests.
npx skillsauth add plamentsv/plamen verification-protocolInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
How to prove a hypothesis is TRUE or FALSE using Move unit tests.
CRITICAL: For EVERY piece of evidence used in verification, you MUST tag its source. Evidence from mocks or unverified external modules CANNOT support a REFUTED verdict.
| Tag | Meaning | Valid for REFUTED? | |-----|---------|-------------------| | [PROD-ONCHAIN] | Production module verified on Aptos Explorer | YES | | [PROD-SOURCE] | Source code verified on-chain (Aptos Explorer source verification) | YES | | [CODE] | Audited codebase (in-scope) | YES | | [MOCK] | Mock/test module | NO | | [EXT-UNV] | External module, unverified behavior | NO | | [DOC] | Documentation/spec only | NO (needs verification) |
Before ANY verdict, fill this table:
### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|
| "External module returns X" | Mock module | [MOCK] | NO |
| "State changes to Y" | protocol_module.move:123 | [CODE] | YES |
| "Coin transfer triggers Z" | Aptos Explorer source | [PROD-ONCHAIN] | YES |
AUTOMATIC OVERRIDE: If ANY evidence supporting REFUTED has tag [MOCK] or [EXT-UNV]:
Example:
## Verdict: REFUTED -> CONTESTED (mock evidence override)
### Evidence Audit
| Claim | Source | Tag | Valid? |
|-------|--------|-----|--------|
| "Staking returns shares" | test_staking.move:45 | [MOCK] | NO |
**Override reason**: REFUTED verdict relies on mock behavior at test_staking.move:45.
Production module behavior is UNVERIFIED. Must verify against on-chain source.
Before writing ANY test code, you MUST answer:
NOT: "Something is inconsistent"
NOT: "State is wrong"
NOT: "Capability leak possible"
YES: "[Variable/resource] is [read/written/moved] at [location] but should be
[read/written/moved] at [other location] because [specific reason]"
NOT: "Values are different"
NOT: "State changed"
YES: "Before operation: [resource/value] = [expected value]
After operation: [resource/value] = [actual value]
Expected: [what it should be]"
NOT: assert!(bug_exists, 0)
NOT: assert!(!is_secure, 0)
YES: assert!(actual_value == expected_value, ERROR_CODE)
OR: assert!(before != after, ERROR_CODE) // "value changed when it shouldn't"
OR: assert!(error > threshold, ERROR_CODE) // "error exceeds acceptable threshold"
If you cannot answer all three -> ASK FOR CLARIFICATION
Before writing test code, verify these two gates. If either FAILS, adjust the hypothesis.
Trace a call path from a permissionless entry point to the vulnerable code.
If NO entry point reaches the vulnerable code → UNREACHABLE → FALSE_POSITIVE. If reachable only through a restricted path → document the restriction, adjust likelihood.
Substitute real-world value domains into the expression that triggers the bug.
If the bug requires values outside feasible domains → INFEASIBLE → FALSE_POSITIVE. If feasible only at extreme but realistic parameters → document the threshold, proceed with adjusted severity.
Both gates PASS → proceed to PoC. Either gate FAILS → document and stop.
See
templates.mdin this directory for all Move test file templates and Move-specific test patterns.
The assertion that "proves the bug" succeeded.
assert!(after != before, 0) passes -> values ARE different (bug exists)assert!(error > threshold, 0) passes -> error IS above threshold (bug exists)| Failure | Meaning | Action |
|---------|---------|--------|
| Assertion failed (abort code) | Bug doesn't exist as hypothesized | Re-examine hypothesis |
| Abort in setup | Module initialization wrong | Fix setup (check init order, missing resources) |
| Abort in action | Operation blocked (access control, precondition) | Check preconditions, signer requirements |
| ARITHMETIC_ERROR (0x20001) | Overflow/underflow or division by zero | Check calculations, validate inputs |
| RESOURCE_NOT_FOUND | Missing move_to in setup | Ensure all required resources are initialized |
| ALREADY_EXISTS | Duplicate resource creation | Check init called only once |
| Issue | Cause | Fix |
|-------|-------|-----|
| ENOT_FOUND on coin operations | Account not registered for coin type | Add coin::register<CoinType>(user) before operations |
| Timestamp not available | timestamp module not initialized | Add timestamp::set_time_has_started_for_testing(aptos_framework) |
| Object not found | Object created at unexpected address | Use object::create_named_object with deterministic seed |
| Module not published | Test module can't import protocol module | Check Move.toml dependencies and test address mapping |
| Signer mismatch | @protocol_addr doesn't match expected | Verify #[test(...)] signer addresses match module publish address |
Attempt 1: Direct implementation of test strategy from hypothesis
Attempt 2: Adjust parameters
Attempt 3: Re-examine assumptions
After 5 attempts:
## Verdict: CONFIRMED
### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|
### Bug Mechanism Verified
{Explain what the test proves in 2-3 sentences}
### Test File
`tests/audit/test_hypothesis_N.move`
### Test Output
{Paste relevant aptos move test output}
### Key Evidence
| Metric | Value |
|--------|-------|
| Before | {value} |
| After | {value} |
| Expected | {value} |
| Difference | {calculation} |
### Severity: {LEVEL}
{Justification in 1-2 sentences}
### RAG Evidence
- **Attack Vectors Consulted**: [list bug classes queried]
- **Similar Exploits Found**: [count and brief descriptions]
- **PoC Template Used**: [yes/no, which template]
- **Historical Precedent**: [describe any matching historical vulnerabilities]
## Verdict: FALSE_POSITIVE
### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|
### Attempts Made
**Attempt 1:**
- Approach: {description}
- Result: {what happened}
- Learning: {insight}
**Attempt 2:**
- Approach: {description}
- Result: {what happened}
- Learning: {insight}
**Attempt 3:**
- Approach: {description}
- Result: {what happened}
- Learning: {insight}
### Why It's Not a Bug
{Explain the actual behavior and why hypothesis was wrong in 2-3 sentences}
## Verdict: CONTESTED
### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|
### Evidence Status
| Checkpoint | Status | Details |
|------------|--------|---------|
| External behavior verified against PRODUCTION | NO | Used mock behavior as evidence |
| All callers checked | YES | Checked A, B, C |
| Ref access paths fully traced | NO | Friend module re-export not analyzed |
| Profit calculated with attacker holding | NO | Only analyzed donation loss |
### Why This Cannot Be REFUTED
{Explain what evidence is missing to definitively rule out the bug}
### Escalation Required
- [ ] Fetch production module source from Aptos Explorer for {external dep}
- [ ] Re-analyze with attacker holding shares/tokens
- [ ] Check additional caller paths: {list}
- [ ] Trace Ref access through friend modules: {list}
### Current Assessment
Likely: {TRUE_POSITIVE / FALSE_POSITIVE / UNKNOWN}
Confidence: {LOW / MEDIUM}
MANDATORY: You MUST check ALL boxes before returning REFUTED. If ANY checkbox is NO -> Return CONTESTED, not REFUTED.
Before marking REFUTED, check:
public fun and public entry fun that reach the vulnerable codepublic(friend) fun callers via friend module analysis{scratchpad}/findings_inventory.md for CONFIRMED/PARTIAL findingsFor findings from Validation Sweep ([VS-]) or Blind Spot Scanner ([BLIND-]): apply Rule 13's 5-question test BEFORE any downgrade. HALT: If test shows users harmed AND unavoidable AND undocumented -> you CANNOT return FALSE_POSITIVE. Minimum verdict: CONTESTED. Defense parity gaps (Module A has protection X, Module B lacks it for same action) are NEVER "by design" -> minimum severity: Medium, minimum verdict: CONTESTED. Violating this halt is a workflow error equivalent to using [MOCK] evidence for REFUTED.
A finding is NEVER truly REFUTED until chain analysis completes.
If you mark a finding as REFUTED but document a missing precondition, the chain analyzer will search for other findings whose postconditions match your missing precondition. If found, the finding will be escalated to CONTESTED and combined into a chain hypothesis.
Example:
Advanced Protocol Reference: See
advanced.mdfor RAG queries before PoC, exchange rate finding severity, design flaw escalation, bidirectional role analysis, chain hypothesis, and Aptos-specific verification considerations.
development
Prepare Solidity projects for a security audit — test coverage, test quality, NatSpec docs, code hygiene, dependency health, best-practice enforcement, deployment readiness, and project documentation checks. Generates a scored Audit Readiness Report and optionally runs static analysis. Trigger on: "prepare for audit", "audit readiness", "pre-audit check", "audit prep", "NatSpec check", or any request to review a Solidity codebase before a security review.
development
Launch the Plamen deterministic Web3 security audit pipeline
development
Run the Plamen smart-contract audit wizard in Codex
testing
Launch the Plamen deterministic L1 infrastructure audit pipeline