security-testing-as-code: Assessment as Executable Project

Transform security assessment deliverables from static documents into version-controlled, executable projects. Findings become reproducible code; evidence becomes artifacts; knowledge becomes inheritable structure.

Core Thesis

"A diagnosis is a document" is the fundamental flaw. A diagnosis should be a project.

Traditional security reports (Word, Excel, portal entries) produce ephemeral knowledge that dies upon publication. This skill converts assessment outputs into living, version-controlled projects where:

PoC code replaces narrative claims
Saved HTTP requests replace "verified" checkboxes
Commit hashes enable exact state reproduction
Handoff plans replace tribal knowledge

When to Use

After completing any producer skill (sec-audit-static, sec-audit-dast, external-software-analysis)
When packaging findings for developer handoff
When building a reproducible evidence chain for compliance
When multiple assessment cycles target the same system (inheritable structure)

Inputs

Findings from any producer skill (JSON, SARIF, markdown)
PoC code or exploit scripts developed during assessment
HTTP request/response captures
Threat intelligence or preliminary research notes
Previous assessment artifacts (for delta/handoff)

Project Structure

Every assessment produces a self-contained project directory:

assessment/
├── README.md                  ← Overall context, progress status, how to run
├── handoff-plan.md            ← Gap analysis, inheritance specs for next assessor
├── analysis/
│   ├── attack-surface/        ← Endpoint/integration/asset inventory
│   ├── findings/              ← Structured finding records (JSON/markdown)
│   └── threat-model/          ← DFD, trust boundaries, attack scenarios
├── artifacts/
│   ├── poc/                   ← Reproducible PoC code per finding
│   │   ├── <finding-id>/      ← One directory per PoC
│   │   │   ├── README.md      ← Setup, run instructions, expected output
│   │   │   ├── exploit.*      ← Exploit code (any language)
│   │   │   ├── Dockerfile     ← (optional) Reproducible environment
│   │   │   └── evidence/      ← Screenshots, logs from successful run
│   │   └── ...
│   └── runtime/               ← HTTP evidence, config snapshots
│       ├── requests/          ← Saved HTTP request/response pairs
│       ├── configs/           ← Captured service configurations
│       └── scans/             ← Tool output (Semgrep, nuclei, etc.)
├── inputs/
│   ├── threat-intel/          ← Advisory research, CVE/KEV context
│   ├── scope/                 ← Engagement scope, target definitions
│   └── prior-assessments/     ← Previous cycle artifacts (for delta)
└── outputs/
    ├── report.md              ← Final narrative report
    ├── finding_summary.json   ← Machine-readable finding index
    └── reporting_summary.json ← Cross-skill reporting summary

Workflow

Phase 1: Initialize Project Structure

Create the directory tree above.
Populate README.md with:
- Assessment scope and target description
- Date, assessor, engagement ID
- How to reproduce findings (prerequisites, environment setup)
- Current status (in-progress / complete / handed-off)
Populate inputs/scope/ with engagement boundaries.
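
As a rough illustration of Phase 1, the tree can be created by a small script instead of by hand. This is a minimal sketch, not part of the skill itself: the function name `init_assessment`, the `.gitkeep` placeholders, and the README wording are assumptions. Only the directory names come from the project structure above. Copying templates/assessment/ is the documented way to start; this simply shows the same idea as code.

```python
from pathlib import Path

# Directory names are taken from the project tree in this document.
ASSESSMENT_DIRS = [
    "analysis/attack-surface",
    "analysis/findings",
    "analysis/threat-model",
    "artifacts/poc",
    "artifacts/runtime/requests",
    "artifacts/runtime/configs",
    "artifacts/runtime/scans",
    "inputs/threat-intel",
    "inputs/scope",
    "inputs/prior-assessments",
    "outputs",
]

# Hypothetical starter README; the real template may differ.
README_TEMPLATE = """\
# Assessment: {target}

- Engagement ID: {engagement_id}
- Assessor: {assessor}
- Date: {date}
- Status: in-progress

## Scope

(describe the assessment scope and target here)

## How to reproduce findings

(prerequisites and environment setup)
"""


def init_assessment(root, target, engagement_id, assessor, date):
    """Create the directory tree and starter files under `root`."""
    root = Path(root)
    for rel in ASSESSMENT_DIRS:
        directory = root / rel
        directory.mkdir(parents=True, exist_ok=True)
        # Git does not track empty directories, so add a placeholder file.
        (directory / ".gitkeep").touch()
    readme = root / "README.md"
    if not readme.exists():  # never overwrite an assessor's notes
        readme.write_text(
            README_TEMPLATE.format(
                target=target,
                engagement_id=engagement_id,
                assessor=assessor,
                date=date,
            ),
            encoding="utf-8",
        )
    handoff = root / "handoff-plan.md"
    if not handoff.exists():
        handoff.write_text("# Handoff Plan\n", encoding="utf-8")
    return root
```

Calling `init_assessment("assessment", "example-app", "ENG-0001", "alice", "2024-01-01")` (all values hypothetical) is safe to repeat: existing files are left untouched.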

Phase 2: Capture Attack Surface

Run discovery tools and save raw output to artifacts/runtime/scans/.
Synthesize into analysis/attack-surface/ inventory:
- Endpoint list with auth requirements
- Integration points (external services, APIs)
- Technology stack and versions
Cross-reference with prior assessments if available in inputs/prior-assessments/.
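
The cross-reference step can be sketched as a diff between two endpoint inventories. The inventory format here is an assumption made for illustration (a JSON list of objects with `method`, `path`, and `auth` keys); the skill does not prescribe one, and the function names are hypothetical.

```python
import json
from pathlib import Path


def load_inventory(path):
    """Load an endpoint inventory and key it by (METHOD, path)."""
    entries = json.loads(Path(path).read_text(encoding="utf-8"))
    return {(e["method"].upper(), e["path"]): e for e in entries}


def diff_inventories(prior, current):
    """Return endpoints that were added, removed, or changed auth requirements."""
    added = sorted(set(current) - set(prior))
    removed = sorted(set(prior) - set(current))
    auth_changed = sorted(
        key
        for key in set(prior) & set(current)
        if prior[key].get("auth") != current[key].get("auth")
    )
    return {"added": added, "removed": removed, "auth_changed": auth_changed}
```

New endpoints and endpoints whose auth requirement changed are natural starting points for the current cycle, while removed endpoints can be noted in handoff-plan.md as no longer applicable.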

Phase 3: Evidence-Driven Finding Documentation

For each finding:

Create finding record in analysis/findings/:
```json
{
  "id": "FINDING-001",
  "title": "XSS in search parameter",
  "severity": "High",
  "category": "XSS",
  "provenance": "source-confirmed",
  "impacted_flow": ["F1"],
  "evidence_path": "artifacts/poc/FINDING-001/",
  "status": "confirmed"
}
```
Note on schema scope: this per-finding file format is the methodology view of a finding and is intentionally distinct from the producer finding_schema.json, which is a task-wrapper schema ({task_id, status, findings[], metadata}). The two sit at different pipeline positions: producers emit task-level outputs, while methodology files capture the per-finding assessor view with pointers to PoC and runtime evidence. Do NOT validate per-finding files in analysis/findings/ directly against the producer finding_schema.json. The formal contract between the two views is tracked in PR #4 / ADR-0006; the canonical schema and the producer↔methodology mapping will be defined in a follow-up ADR. The status, severity, and provenance enums in the template match the closed enums in ADR-0003 and ADR-0005, so values are interchangeable across the two views.
Build PoC in artifacts/poc/FINDING-001/:
- Executable code that demonstrates the vulnerability
- README.md with exact reproduction steps
- evidence/ with captured output from successful execution
- Optional Dockerfile for environment isolation
Save runtime evidence under artifacts/runtime/:
- HTTP request/response pairs (curl commands, .http files, or HAR) in requests/
- Configuration snapshots showing vulnerable settings in configs/
- Tool scan output confirming the finding in scans/
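
A finding record is only useful if its evidence pointer resolves. The following is a minimal sketch of a structural check, assuming the key set shown in the template above; the function name `check_finding` is hypothetical. It deliberately does not check enum values, because the closed enums live in ADR-0003 and ADR-0005 and are not reproduced here.

```python
import json
from pathlib import Path, PurePosixPath, PureWindowsPath

# Keys taken from the finding record template in this document.
REQUIRED_KEYS = {
    "id", "title", "severity", "category",
    "provenance", "impacted_flow", "evidence_path", "status",
}


def check_finding(record_path, project_root):
    """Return a list of problems in one finding record (empty list = OK)."""
    problems = []
    record = json.loads(Path(record_path).read_text(encoding="utf-8"))

    missing = REQUIRED_KEYS - record.keys()
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))

    evidence = record.get("evidence_path", "")
    if evidence:
        is_absolute = (
            PurePosixPath(evidence).is_absolute()
            or PureWindowsPath(evidence).is_absolute()
        )
        if is_absolute or ".." in PurePosixPath(evidence).parts:
            # Paths must stay portable and inside the project.
            problems.append("evidence_path must be relative to the project root")
        elif not (Path(project_root) / evidence).is_dir():
            problems.append("evidence_path does not exist: " + evidence)
    return problems
```

Running this over every file in analysis/findings/ before committing catches records that claim evidence the repository does not contain.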

Phase 4: PoC Quality Standards

Every PoC must meet these criteria:

| Criterion | Requirement |
|---|---|
| Reproducible | Another assessor can run it and get the same result |
| Self-contained | All dependencies documented or containerized |
| Non-destructive | Safe to run against target (no data loss, no DoS) |
| Documented | README explains what it proves and what "success" looks like |
| Versioned | Tied to a specific commit/state of the target |

PoC types by finding category:

| Category | PoC Format | Example |
|---|---|---|
| Injection (XSS/SQLi) | Payload + request capture | curl command + response showing injection |
| Fuzzing discovery | Fuzzer config + corpus + crash log | Jazzer/AFL harness + evidence |
| Auth bypass | Test script with two scenarios | With auth vs without auth comparison |
| Deserialization | Gadget chain + trigger | Payload generator + server response |
| Config weakness | Config diff + impact demo | Vulnerable vs hardened config comparison |
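
To make the "Auth bypass" row concrete, here is a minimal sketch of a two-scenario test script using only the Python standard library. The function names and the decision rule are assumptions for illustration: comparing status codes is a heuristic, and a real PoC would usually also compare response bodies. It sends GET requests only, in keeping with the non-destructive criterion.

```python
import urllib.error
import urllib.request


def fetch_status(url, headers, timeout=10):
    """Send a single GET request and return the HTTP status code."""
    request = urllib.request.Request(url, headers=headers, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 401/403 are expected outcomes, not failures


def compare_auth(url, auth_headers, fetch=fetch_status):
    """Request `url` with and without credentials and compare the outcomes."""
    with_auth = fetch(url, auth_headers)
    without_auth = fetch(url, {})
    return {
        "url": url,
        "with_auth": with_auth,
        "without_auth": without_auth,
        # Heuristic: the authenticated baseline succeeds AND the
        # unauthenticated request succeeds as well.
        "bypass_suspected": (200 <= with_auth < 300) and (200 <= without_auth < 300),
    }
```

In an actual PoC the token would be read from the environment or a credential store rather than written into the script, and the returned dictionary would be saved under the finding's evidence/ directory.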

Phase 5: Handoff Plan

Create handoff-plan.md documenting:

Completed scope: What was tested, what evidence exists
Open gaps: What was NOT tested and why
- Time constraints
- Access limitations
- Environment unavailability
Recommended next steps: Prioritized by risk
Environment notes: How to set up the test environment
Credential/access notes: What access is needed (without storing actual credentials)
Known false positive patterns: Save future assessors from re-investigating

Phase 6: Output Generation

Generate outputs/report.md — narrative report with links to artifacts
Generate outputs/finding_summary.json — machine-readable index
Generate outputs/reporting_summary.json — compatible with sec-audit-static reporting schema
Ensure all artifact paths are relative (portable across machines)
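
The machine-readable index can be derived from the per-finding records rather than maintained by hand. This is a minimal sketch under stated assumptions: the layout of finding_summary.json (a top-level `findings` list with the fields below) is hypothetical, since the document does not define it, and `build_finding_summary` is an illustrative name.

```python
import json
from pathlib import Path


def build_finding_summary(project_root):
    """Index analysis/findings/*.json into outputs/finding_summary.json."""
    root = Path(project_root)
    index = []
    for record_file in sorted((root / "analysis" / "findings").glob("*.json")):
        record = json.loads(record_file.read_text(encoding="utf-8"))
        index.append({
            "id": record["id"],
            "title": record["title"],
            "severity": record["severity"],
            "status": record["status"],
            # Store paths relative to the project root so the index is portable.
            "record": record_file.relative_to(root).as_posix(),
            "evidence_path": record["evidence_path"],
        })
    out = root / "outputs" / "finding_summary.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps({"findings": index}, indent=2) + "\n", encoding="utf-8")
    return index
```

Because the index is regenerated from the records, the report and the summary cannot drift apart from the findings they describe.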

Medical Records Metaphor

| Concept | Medical Certificate (bad) | Medical Records (good) |
|---|---|---|
| Nature | One-time result snapshot | Progressive, inheritable documentation |
| Reproduction | "Trust me, I checked" | "Run this command, see this output" |
| Handoff | Start from scratch | Continue from documented state |
| Version | None | Git commit = exact state |
| Knowledge | Dies with the author | Lives in the repository |

Integration with Other Skills

| Skill | Integration Point |
|---|---|
| sec-audit-static | Findings become analysis/findings/, tool outputs go to artifacts/runtime/scans/ |
| sec-audit-dast | SARIF outputs go to artifacts/runtime/scans/, probes become PoCs |
| external-software-analysis | Binary analysis notes go to analysis/, decompilation artifacts preserved |
| sec-cluster | Cluster definitions inform analysis/attack-surface/ organization |
| security-architecture-review | DFD/attack flow go to analysis/threat-model/, SPRs reference finding IDs |

Anti-Patterns to Avoid

| Anti-Pattern | Why It's Bad | Do This Instead |
|---|---|---|
| Screenshot-only evidence | Not reproducible, not searchable | Save the actual request/response + command |
| "Verified" without artifacts | Unverifiable claim | Link to PoC directory with run instructions |
| Findings in report only | Lost when report format changes | Structured JSON + markdown + PoC code |
| Hardcoded absolute paths | Breaks on another machine | Use relative paths from project root |
| Credentials in artifacts | Security risk | Reference credential store, never embed |
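
Two of these anti-patterns, hardcoded absolute paths and embedded credentials, lend themselves to an automated pre-commit check. The sketch below is a rough heuristic, not a substitute for a dedicated secret scanner: both regular expressions and the function name `lint_project` are assumptions, and matches should be treated as prompts for manual review because false positives are likely.

```python
import re
from pathlib import Path

# Heuristic patterns (assumptions): common absolute path prefixes, and
# "keyword = value" or "keyword: value" pairs that look like credentials.
RULES = {
    "absolute-path": re.compile(
        r'(?<![\w./])(?:/(?:home|Users|root|tmp|var|opt)/|[A-Za-z]:\\)\S+'
    ),
    "credential": re.compile(
        r"""(?i)\b(?:password|passwd|secret|api[_-]?key|token)\b["']?\s*[:=]\s*\S+"""
    ),
}


def lint_project(project_root):
    """Yield (relative_path, line_number, rule) for each suspicious line."""
    root = Path(project_root)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = path.relative_to(root)
        if ".git" in relative.parts:
            continue  # skip repository metadata
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except UnicodeDecodeError:
            continue  # skip binary evidence such as screenshots
        for number, line in enumerate(lines, start=1):
            for rule, pattern in RULES.items():
                if pattern.search(line):
                    yield relative.as_posix(), number, rule
```

Saved HTTP requests deserve particular attention here, since captured headers and cookies often carry live session material that should be redacted before the evidence is committed.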

Resources

references/project_structure.md — Detailed directory layout reference
references/poc_standards.md — PoC quality and safety guidelines
templates/assessment/ — Starter project template (copy to begin)
templates/finding.json — Finding record template
templates/poc-readme.md — PoC README template
templates/handoff-plan.md — Handoff plan template

