Pentest Orchestrator

Manage end-to-end penetration test engagements across 7 structured phases with decision gates, agent coordination, and adaptation logic.

When to Use

✅ USE this skill when:

"Run a pentest on [target]"
"Start a penetration test engagement"
"Orchestrate the recon → enum → exploit workflow"
"What should I do next in this engagement?"
A pentest phase completed and you need the next step
A phase failed and you need to pivot

When NOT to Use

❌ DON'T use this skill when:

Running a single nmap scan (use enum skill directly)
Writing a standalone report without engagement context (use reporting skill)
User wants a quick vulnerability check, not a full engagement — use the quick-scan path in scripts/quick-scan/DISPATCH.md
No authorization/scope has been confirmed

Scripts-First Reuse Rule

Treat scripts/ as the default reusable operations layer for full pentests.

Required behavior:

prefer scripts/orchestration/*.py planning/runners before hand-writing repeated phase logic
prefer scripts/shared/manifests/ and target-family planning before manual phase planning when the target traits are known or inferable
prefer existing helpers under scripts/recon/, scripts/enum/, scripts/vuln/, scripts/exploit/, and scripts/post-exploit/ before inventing new one-off command chains
only fall back to fully manual flows when the script layer does not fit, lacks coverage, or needs troubleshooting
keep quick-scan and full-pentest separate; do not substitute quick-scan profiles for full-pentest target-family/manifests

When a real engagement reveals a repeatable command sequence, parser need, checklist, or validation pattern that would improve future pentests:

promote it into scripts/ as a reusable helper, manifest, parser, or docs update
do not leave reusable logic trapped only inside the engagement folder
keep evidence and target-specific outputs in engagements/<target>/, but keep reusable operational logic in scripts/
prefer to capture the reusable upgrade during or immediately after the engagement once the pattern is verified live

OpenCode Integration

Load skills/opencode-utility/SKILL.md whenever the orchestrator or a phase agent hits a coding or scripting bottleneck.

Use it for:

building or refactoring parsers
quick automation helpers
evidence formatting utilities
reusable wrappers that reduce repetitive terminal work
improving phase scripts and manifests when a reusable upgrade is justified
turning repeatable discoveries from a live pentest into maintainable helpers under scripts/

Default behavior:

start in plan mode when the utility shape is unclear
use build mode when the implementation target is already clear
bias toward scripts/opencode/reusable/ for utilities likely to recur
use scripts/opencode/session/ for engagement-specific helpers
use scripts/opencode/throwaway/ only for urgent one-offs

Phase 0: Engagement Setup

Before any agent spawns, load skills/preengagement-essentials/SKILL.md when the engagement is real or authorization/scope/ROE are not already explicit.

When the user says pentest <target> or otherwise asks to start a real pentest engagement, do not jump straight into active testing.

Use this pre-engagement chat flow:

first ask only for the Assigned Penetration Tester fields:
- Organization name
- Assigned Tester Name
- Email address
do not spawn the Google Docs pre-engagement form until those three fields are answered
once those fields are provided, spawn the pre-engagement form and return its reference plus the engagement naming prompt
do not immediately ask in chat for the full authorization/scope block if the spawned pre-engagement form is intended to collect that intake
still enforce that active testing must not begin until authorization and scope are explicit

Before any active testing, collect and document this intake in the pre-engagement artifacts:

engagement title
target
test type
dates
rules of engagement
scope in
scope out
credentials provided
constraints
success criteria
approval / authorization reference

If any of these are missing, mark them TBD in the documentation and clearly state that active testing should not proceed until authorization and scope are explicit.
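The intake rule above can be sketched as a small completeness check. This is a minimal illustration, not part of the scripts/ layer: the function names are hypothetical, and treating any missing field as blocking is a conservative assumption (the text only requires authorization and scope to be explicit).

```python
from typing import Dict, List, Tuple

# Intake fields listed above.
REQUIRED_INTAKE_FIELDS = [
    "engagement title",
    "target",
    "test type",
    "dates",
    "rules of engagement",
    "scope in",
    "scope out",
    "credentials provided",
    "constraints",
    "success criteria",
    "approval / authorization reference",
]


def normalize_intake(intake: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Return a copy with missing fields marked TBD, plus the list of missing fields."""
    normalized: Dict[str, str] = {}
    missing: List[str] = []
    for name in REQUIRED_INTAKE_FIELDS:
        value = (intake.get(name) or "").strip()
        if not value or value.upper() == "TBD":
            normalized[name] = "TBD"
            missing.append(name)
        else:
            normalized[name] = value
    return normalized, missing


def cleared_for_active_testing(missing: List[str]) -> bool:
    """Assumption: block active testing while any intake field is still TBD."""
    return not missing
```

A caller would write the normalized values into the pre-engagement artifacts and refuse to spawn a phase agent while `cleared_for_active_testing` returns False.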

Before any agent spawns, confirm:

Authorization — Written permission exists for the target
Scope — Specific IPs, domains, and networks are defined
Rules of engagement — What's off-limits? Time windows?
Third-party / provider approvals — Hosted or cloud constraints are addressed
Documentation structure — Initialize the engagement with python3 scripts/orchestration/init_engagement_docs.py <target-name> ...
Target-family planning baseline — When target traits are known or reasonably inferable, prefer python3 scripts/orchestration/plan_target_family.py before hand-writing phase plans. Use --family <name> when the family is already known, or --hint "<target description>" --target <host-or-url> --engagement <engagement-path> to recommend and expand one automatically. Treat the output as the default reusable baseline for a full pentest, not for quick-scan.
Central registers — Maintain registers/master-activity-log.md, findings-register.md, evidence-register.md, attack-path-register.md, and asset-register.md

Preferred full-pentest planning flow when the target type is known or inferable:

optionally preview the recommendation with python3 scripts/orchestration/recommend_target_family.py --hint "<target description>"
inspect the composed rationale with python3 scripts/orchestration/describe_target_family.py --family <recommended-family> when you need the why
generate the reusable phase baseline with python3 scripts/orchestration/plan_target_family.py --family <family> --target <host-or-url> --engagement <engagement-path> or the equivalent --hint form
use that plan to guide which manifests, wrappers, and manual branches should anchor recon, enum, vuln, exploit, and post-exploit
if the target type truly is not inferable yet, fall back to ordinary phase planning and update the family choice once evidence improves
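As a sketch of the planning step, the planner invocation can be assembled as an argument list before it is run. The flags are the ones named above; the helper name and the rule that exactly one of --family or --hint is passed are assumptions drawn from the wording, not confirmed behavior of the script.

```python
from typing import List, Optional

PLANNER = "scripts/orchestration/plan_target_family.py"


def build_plan_command(target: str, engagement: str,
                       family: Optional[str] = None,
                       hint: Optional[str] = None) -> List[str]:
    """Build the argv for the planner; hypothetical helper, flags taken from the text."""
    if bool(family) == bool(hint):
        # Assumption: the planner expects either a known family or a free-text hint.
        raise ValueError("provide exactly one of family or hint")
    cmd = ["python3", PLANNER]
    cmd += ["--family", family] if family else ["--hint", hint]
    cmd += ["--target", target, "--engagement", engagement]
    return cmd
```

Passing the resulting list to `subprocess.run` without a shell avoids quoting problems when the hint contains spaces.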

Read these references before kickoff:

references/engagement-documentation-protocol.md
references/engagement-doc-templates.md
references/phase-handoff.md

Use engagements/<target-name>/pre-engagement/engagement-charter.md and engagements/<target-name>/pre-engagement/scope-and-roe.md as the authoritative intake artifacts.

For compatibility with existing engagements, you may also create or refresh engagements/<target-name>/SCOPE_<target-name>_<YYYY-MM-DD>.md, but the charter and ROE files are now primary.

File naming convention: Generated handoffs and versioned phase outputs MUST include a datetime stamp in the filename: <TOPIC>_SUMMARY_<YYYY-MM-DD_HHMM>.md (see References below for the full format). This lets multiple versions coexist and makes the latest one easy to identify.
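A minimal sketch of the naming convention, assuming a 24-hour HHMM stamp and upper-case topic names. The helper names are hypothetical and not part of the scripts/ layer.

```python
import re
from datetime import datetime
from typing import List, Optional

STAMP_FORMAT = "%Y-%m-%d_%H%M"
SUMMARY_RE = re.compile(
    r"^(?P<topic>[A-Z_]+)_SUMMARY_(?P<stamp>\d{4}-\d{2}-\d{2}_\d{4})\.md$"
)


def summary_filename(topic: str, when: datetime) -> str:
    """Build <TOPIC>_SUMMARY_<YYYY-MM-DD_HHMM>.md for the given moment."""
    return f"{topic.upper()}_SUMMARY_{when.strftime(STAMP_FORMAT)}.md"


def latest_summary(filenames: List[str], topic: str) -> Optional[str]:
    """Pick the newest summary for a topic; the stamp sorts chronologically as text."""
    candidates = []
    for name in filenames:
        match = SUMMARY_RE.match(name)
        if match and match.group("topic") == topic.upper():
            candidates.append((match.group("stamp"), name))
    return max(candidates)[1] if candidates else None
```

Because the stamp is zero-padded and ordered year to minute, a plain string comparison is enough to find the latest version.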

Spawn the first agent only after the intake is recorded, the engagement is cleared for active testing, and any available target-family baseline has been reviewed.

If the target is a recognizable family, include the relevant planning output in the first spawn and keep it as the preferred default baseline for later phases.

First, plan the family baseline when possible:
  python3 scripts/orchestration/plan_target_family.py --hint "<target description>" --target <host-or-url> --engagement engagements/<target-name>

Spawn specter-recon with:
  task: "Perform passive recon on <target>. Before ad-hoc planning, read the charter, scope/ROE, documentation protocol, and the target-family baseline from scripts/orchestration/plan_target_family.py when one exists. Save all findings to engagements/<target-name>/recon/ and update the phase docs plus shared registers. Use the family plan as the default reusable baseline for full-pentest work, then branch manually from live evidence."
  engagement: <target-name>

Phase 1: Reconnaissance (specter-recon)

Objective: Build a target profile and map the attack surface without touching the target.

Activities:

OSINT gathering (DNS records, WHOIS, Shodan, certificate transparency)
Subdomain enumeration
Technology fingerprinting from public sources
Employee/organizational reconnaissance (if in scope)
Physical location mapping (if physical testing authorized)

Web search queries: See references/web-search-queries.md for phase-specific queries.

Completion criteria:

[ ] Target profile written (OS, services, technologies from public sources)
[ ] Attack surface map created (network ranges, domains, physical locations)
[ ] Vector candidates identified (network, physical, application, wireless)
[ ] Handoff drafted with python3 scripts/orchestration/generate_phase_summary.py --engagement <target-name> --phase recon when standardized artifacts exist
[ ] RECON_SUMMARY_<YYYY-MM-DD_HHMM>.md written — see references/phase-handoff.md for template

DECISION GATE: What vectors look promising?

Read the latest RECON_SUMMARY_*.md. Based on findings:

| Finding | Next Phase | Agent |
|---------|-----------|-------|
| Network services discoverable via public sources | Network Enumeration | specter-enum |
| Physical location identified, physical testing authorized | Physical Enumeration | specter-enum (physical mode) |
| Web applications / APIs discovered | Application Enumeration | specter-enum (app mode) |
| Wireless networks identified | Wireless Enumeration | specter-enum (wireless mode) |
| No promising vectors | Request operator input: provide what was found and ask for guidance | |

Adaptation: If network recon yields nothing useful but physical access is available, pivot to physical. Do not force a dead-end vector.
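The recon decision gate can be sketched as a lookup from promising vectors to the next phase and agent. The vector keys and the function are hypothetical; the phase and agent names are taken from the gate table.

```python
from typing import List, Optional, Tuple

# Keys are illustrative labels; values mirror the decision gate rows.
RECON_GATE = {
    "network": ("Network Enumeration", "specter-enum"),
    "physical": ("Physical Enumeration", "specter-enum (physical mode)"),
    "application": ("Application Enumeration", "specter-enum (app mode)"),
    "wireless": ("Wireless Enumeration", "specter-enum (wireless mode)"),
}
OPERATOR_INPUT: Tuple[str, Optional[str]] = ("Request operator input", None)


def next_steps(promising_vectors: List[str],
               physical_authorized: bool = False) -> List[Tuple[str, Optional[str]]]:
    """Map promising vectors to (next phase, agent); fall back to the operator."""
    steps = []
    for vector in promising_vectors:
        if vector == "physical" and not physical_authorized:
            continue  # physical work is only valid when explicitly authorized
        if vector in RECON_GATE:
            steps.append(RECON_GATE[vector])
    return steps or [OPERATOR_INPUT]
```

Returning every viable option, rather than a single winner, leaves room for the pivot described in the adaptation note: if the network path stalls, the physical or application entry is already on the list.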

Phase 2: Enumeration (specter-enum)

Objective: Actively probe the target to discover open services, versions, and attack surface details.

Input from Recon: Read engagements/<target-name>/recon/RECON_SUMMARY_*.md for the target profile and recommended vector.

Activities (vary by vector):

Network: If a target-family plan exists, treat its enum manifests and listed steps as the default baseline first. Otherwise prefer reusable wrappers like scripts/orchestration/run_enum_profile.py --profile enum-windows-host --target <target> --engagement <target-name> when they fit, then scripts/enum/ports/scan_ports_fast.sh, scripts/enum/ports/scan_ports_service.sh, and service-specific wrappers like SMB/RDP/WinRM/Web before custom manual scanning
Physical: Badge cloning attempts, lock assessment, network jack enumeration, dumpster diving assessment, social engineering prep
Application: Directory busting, parameter discovery, API endpoint mapping, technology version detection; prefer scripts/enum/web/enum_web_basic.sh for baseline web coverage before custom app enumeration
Wireless: SSID discovery, encryption assessment, handshake capture, evil twin detection
When repeated parsing, result normalization, or helper scripting appears, load skills/opencode-utility/SKILL.md and prefer reusable upgrades under scripts/opencode/reusable/ when they will improve current or future enum work
When stronger enum-phase methodology is needed, load skills/enum-phase-essentials/SKILL.md to reinforce fast-then-accurate workflows, validation gates, protocol-triggered deep dives, and clean service inventory handoffs

Completion criteria:

[ ] All open ports/services documented with versions
[ ] Service versions cross-referenced for known vulnerabilities
[ ] ENUM_SUMMARY_<YYYY-MM-DD_HHMM>.md written with service inventory

DECISION GATE: What did we find?

| Finding | Next Action |
|---------|------------|
| Open services with version numbers | Proceed to Vulnerability Analysis |
| Services found but versions unclear | Run additional fingerprinting, then proceed |
| No services found on network vector | Pivot to alternative vector (physical, app-layer) |
| Complete dead end (no services, no physical access, no apps) | Report findings to operator, request scope adjustment |

Phase 3: Vulnerability Analysis (specter-vuln)

Objective: Match discovered services against known vulnerabilities and assess exploitability.

Input from Enum: Read engagements/<target-name>/enum/ENUM_SUMMARY_*.md for service inventory.

Activities:

If a target-family plan exists, start from its vuln manifests and sub-surface notes before narrowing into manual validation
CVE matching against service versions (searchsploit, NVD, vendor advisories)
Web search for latest exploits and PoCs — see references/web-search-queries.md
Configuration weakness analysis (default credentials, open shares, misconfigurations)
Exploitability scoring (CVSS, active exploitation in the wild)
Chain analysis — can multiple low-risk vulns combine into a high-risk path?
When repeated CVE triage formatting, evidence conversion, or report-helper scripting is needed, load skills/opencode-utility/SKILL.md and prefer reusable or session utilities instead of hand-writing one-off glue each time
When stronger vuln-phase methodology is needed, load skills/vuln-phase-essentials/SKILL.md to reinforce validation discipline, CVE/CWE/CVSS handling, KEV/EPSS-aware prioritization, and report-ready evidence standards

Completion criteria:

[ ] CVE list with exploitability assessment
[ ] Exploit plan written (ordered list of attempts with expected outcomes)
[ ] VULN_SUMMARY_<YYYY-MM-DD_HHMM>.md written with CVEs, exploit plan, and confidence levels

DECISION GATE: Is there an exploitable path?

| Finding | Next Action |
|---------|------------|
| Confirmed exploitable vulnerability with PoC | Proceed to Exploitation |
| Potential vulnerability, needs verification | Run additional enumeration on the specific service |
| Vulnerability exists but no known exploit | Research manual exploitation techniques, then proceed or report |
| No exploitable path found | Report findings, recommend remediation, consider engagement complete |

Phase 4: Exploitation (specter-exploit)

Objective: Demonstrate impact by exploiting vulnerabilities in a controlled manner.

Input from Vuln: Read engagements/<target-name>/vuln/VULN_SUMMARY_*.md for the exploit plan.

Activities:

If a target-family plan exists, use its exploit baseline and notes to keep attempts evidence-first and aligned to the verified attack path
Execute exploits in the order specified by the exploit plan
Document success/failure for each attempt with evidence
Capture screenshots, command outputs, proof-of-concept
If initial access gained, establish a stable foothold
If a safe, authorized validation helper or exploit-lab parser is needed, load skills/opencode-utility/SKILL.md for coding support, but do not use it to create unsafe or unauthorized offensive tooling
Do NOT destroy data or cause denial of service (unless explicitly authorized)
When stronger exploit-phase methodology is needed, load skills/exploit-phase-essentials/SKILL.md to reinforce precondition checks, validation ladders, candidate selection discipline, and evidence/cleanup standards

Completion criteria:

[ ] Each exploit attempt documented with outcome
[ ] Evidence captured for successful exploits
[ ] Access level clearly documented
[ ] EXPLOIT_SUMMARY_<YYYY-MM-DD_HHMM>.md written with access details and credentials

DECISION GATE: What level of access was gained?

| Access Level | Next Action |
|-------------|------------|
| Root / SYSTEM / full admin | Proceed to Post-Exploitation |
| Limited user access | Attempt privilege escalation, document attempts |
| Service-level access only | Attempt escalation or report what was achieved |
| Exploitation failed | Report what was attempted, what was partially achieved |

Phase 5: Post-Exploitation (specter-post)

Objective: Demonstrate the full business impact of compromised access.

Input from Exploit: Read engagements/<target-name>/exploit/EXPLOIT_SUMMARY_*.md for access level and credentials.

Activities:

If a target-family plan exists, use its post-exploit baseline and notes to structure impact capture instead of improvising the starting checklist
Credential harvesting (memory, files, databases, config files)
Lateral movement (pivot to other systems on the network)
Persistence mechanisms (scheduled tasks, authorized keys, registry)
Data exfiltration assessment (what sensitive data is accessible?)
Impact demonstration (read access to PII, financial data, intellectual property)
Document everything — every command, every finding
If note conversion, evidence summarization, or impact-formatting helpers would save time, load skills/opencode-utility/SKILL.md and place phase-specific code in scripts/opencode/session/ unless broader reuse is likely
When stronger post-exploitation methodology is needed, load skills/post-phase-essentials/SKILL.md to reinforce impact assessment, access-path discipline, telemetry-aware evidence, and cleanup/residual-risk reporting

Completion criteria:

[ ] Discovered credentials documented (hashed, never plaintext, in reports)
[ ] Lateral movement paths mapped
[ ] Impact assessment written (what could an attacker do?)
[ ] POST_EXPLOIT_SUMMARY_<YYYY-MM-DD_HHMM>.md written with evidence and impact summary

Phase 6: Reporting (specter-report)

Objective: Compile all findings into actionable deliverables.

Input: Read ALL *_SUMMARY_*.md files from phases 0–5.

Activities:

Compile findings from all phases into a structured report
Generate executive summary (business risk, not technical jargon)
Technical findings with evidence (screenshots, command outputs, CVEs)
Score findings with the workspace CVSS house standard: CVSS v4.0 Base by default, CVSS v3.1 additionally when required for compatibility with public CVEs, NVD, vendor advisories, scanners, or client workflows
Include CVSS version, vector, numeric score, and short metric rationale for every scored finding; mark incomplete cases as provisional or unscored with explanation
Remediation guide with prioritized recommendations
Keep technical severity separate from final remediation priority by also considering exploit evidence, KEV/EPSS, exposure, asset criticality, and attack chaining
Include explicit cleanup / restoration status, tester-created artifacts, residual risk, and retest guidance
Generate presentation slides — see references/examples.md for engagement context
If report assembly, finding normalization, or markdown conversion becomes repetitive, load skills/opencode-utility/SKILL.md and prefer reusable reporting helpers over ad-hoc formatting
When stronger report-phase methodology is needed, load skills/report-phase-essentials/SKILL.md to reinforce multi-audience structure, QA gates, secure handling, remediation quality, and cleanup/restoration reporting
For real finalized engagements, automatically tell specter-report to create a native Google Doc from the final report and return the Docs link
For real finalized engagements, automatically tell specter-report to publish a PDF link for the final report
For real finalized engagements, automatically tell specter-report to create styled Google Slides from a generated PPTX and return the presentation link
Optionally keep a raw markdown upload in Drive as an archive copy
Skip publishing only for dry runs, mock engagements, or explicit no-publish instructions
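The scoring requirements above can be sketched as a finding record that carries the CVSS version, vector, score, and rationale, with remediation priority held as a separate field. The class and field names are hypothetical; the severity bands follow the qualitative rating scale published in the CVSS v3.1 and v4.0 specifications.

```python
from dataclasses import dataclass


def cvss_severity(score: float) -> str:
    """Map a CVSS base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


@dataclass
class ScoredFinding:
    title: str
    cvss_version: str          # e.g. "4.0" or "3.1"
    vector: str                # full vector string as scored
    score: float
    rationale: str             # short metric rationale
    remediation_priority: str  # set separately from technical severity

    @property
    def severity(self) -> str:
        return cvss_severity(self.score)
```

Keeping `remediation_priority` as its own field makes it explicit that exploit evidence, KEV/EPSS, exposure, and asset criticality can raise or lower priority without rewriting the technical score.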

Deliverables:

engagements/<target-name>/reporting/REPORT_FINAL_<YYYY-MM-DD_HHMM>.md
engagements/<target-name>/reporting/EXECUTIVE_SUMMARY_<YYYY-MM-DD_HHMM>.md
engagements/<target-name>/reporting/REMEDIATION_GUIDE_<YYYY-MM-DD_HHMM>.md
engagements/<target-name>/reporting/PROCESS_OVERVIEW_<YYYY-MM-DD_HHMM>.md
Presentation slides
Drive / Slides share links

Reporting spawn instruction template:

Spawn specter-report with:
  task: "Read all *_SUMMARY_*.md files under engagements/<target-name>/. Build the structured findings input needed by the production report generator. Generate REPORT_FINAL_<YYYY-MM-DD_HHMM>.md in engagements/<target-name>/reporting/ using the real branded implementation at reporting/scripts/generate_report.py. Also generate PROCESS_OVERVIEW_<YYYY-MM-DD_HHMM>.md as a stakeholder-friendly process narrative that explains what was actually done in each phase, what was observed, and why the next step happened. Include remediation and security enhancement recommendations for every finding. Because the user asked for the report, automatically publish it with reporting/scripts/generate_report.py --create-doc --create-slides --upload-drive --gdrive-account <gdrive-account-email> --slides-title 'Pentest Report — <target-name>'. Return the local output path plus the Google Doc link, PDF link, and Slides link. Use raw gog docs/slides-from-markdown only as fallback if the branded generator path fails."
  engagement: <target-name>

Agent Coordination

Sequential Phases (Default)

Phases 0→1→2→3→4→5→6 run in sequence. Each phase's *_SUMMARY_*.md is the handoff to the next.

Parallel Execution

Only use parallel agents when:

Multiple targets in scope (spawn one agent per target for the same phase)
Multiple vectors from a decision gate are viable (run them in parallel, pick the best result)
Time is constrained and phases can be safely overlapped (e.g., web search for CVEs during enum)

Spawning Pattern

Spawn specter-<phase> with:
  task: "<Specific instructions based on decision gate output>"
  engagement: "<target-name>"
  input: "Read engagements/<target-name>/<previous-phase>/<PREV_PHASE>_SUMMARY_*.md"

For specter-report, default to publishing for any real engagement that reaches a final/completed report state:

Spawn specter-report with:
  task: "Read all prior *_SUMMARY_*.md files for <target-name>. Generate final reporting deliverables in engagements/<target-name>/reporting/, including PROCESS_OVERVIEW_<YYYY-MM-DD_HHMM>.md as the non-technical process narrative. Then create a Google Doc, publish/export a PDF, create Google Slides, and upload the raw markdown archive using reporting/scripts/generate_report.py --create-doc --create-slides --upload-drive --gdrive-account <gdrive-account-email>. Return the local file path plus the Google Doc link, PDF link, and Slides link in the handoff. Skip publishing only for dry runs or explicit no-publish instructions."
  engagement: "<target-name>"

Always include:

The engagement target name
The specific path to save output
The path to the previous phase's handoff document
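A minimal sketch of the spawning pattern, assuming spawn requests are assembled as plain dictionaries before being handed to whatever mechanism launches the agent. The helper name, the `agent` key, and the way the output path is appended to the task are illustrative choices; only the task, engagement, and input fields come from the pattern above.

```python
from typing import Dict, Optional


def build_spawn(agent: str, target_name: str, task: str, output_dir: str,
                previous_phase: Optional[str] = None) -> Dict[str, str]:
    """Assemble a spawn request that always names target, output path, and handoff."""
    if not (agent and target_name and task and output_dir):
        raise ValueError("agent, target_name, task, and output_dir are all required")
    spawn = {
        "agent": agent,
        "task": f"{task} Save output to {output_dir}.",
        "engagement": target_name,
    }
    if previous_phase:
        # Assumption: the handoff prefix is the upper-cased phase directory name.
        prefix = previous_phase.upper()
        spawn["input"] = (
            f"Read engagements/{target_name}/{previous_phase}/{prefix}_SUMMARY_*.md"
        )
    return spawn
```

Raising on a missing field keeps an incomplete spawn from reaching an agent that would then have to guess where to read or write.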

Adaptation Logic

When a Phase Fails or Times Out

Check the agent's output for partial findings
If partial data exists, proceed with what was found (document the gap)
If nothing useful, retry once with adjusted parameters
If retry fails, consult the decision gate for alternative vectors
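The failure-handling steps above reduce to a short decision function. This is a sketch only; the action labels are hypothetical strings chosen for illustration.

```python
def handle_phase_failure(has_partial_findings: bool, retries_used: int) -> str:
    """Choose the next action after a phase fails or times out."""
    if has_partial_findings:
        # Proceed with what was found and document the gap.
        return "proceed_with_partial"
    if retries_used < 1:
        # Nothing useful yet: retry once with adjusted parameters.
        return "retry_with_adjusted_parameters"
    # The single retry is spent: look for an alternative vector.
    return "consult_decision_gate"
```

Checking partial findings first means a retry is never spent when usable data already exists.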

When a Vector is a Dead End

Do NOT retry the same approach indefinitely
Check the decision gate table for pivot options
If all vectors exhausted, report to operator with a summary of what was tried
Example: Network recon on a Raspberry Pi with no open services → pivot to physical access or app-layer (web interface)

When Web Search is Unavailable

Fall back to training knowledge for CVE databases and exploit frameworks
Use searchsploit and local exploit databases
Manually research vendor security advisories
Document that web search was unavailable (affects confidence in "latest" findings)

When Target is Offline

Retry up to 3 times with increasing intervals (5min, 15min, 30min)
If still offline, document and report to operator
Proceed with analysis of any data already gathered
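The offline retry schedule can be sketched as follows, assuming one initial check followed by up to three retries at the stated intervals. The `is_online` callable is a hypothetical stand-in for whatever reachability check the engagement permits, and the sleep function is injectable so the logic can be exercised without waiting.

```python
import time
from typing import Callable, Sequence

# 5, 15, and 30 minutes, expressed in seconds.
RETRY_INTERVALS_SECONDS = (5 * 60, 15 * 60, 30 * 60)


def wait_for_target(is_online: Callable[[], bool],
                    intervals: Sequence[int] = RETRY_INTERVALS_SECONDS,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once the target responds, False after all retries are spent."""
    if is_online():
        return True
    for delay in intervals:
        sleep(delay)
        if is_online():
            return True
    return False
```

When this returns False, the caller documents the outage, reports to the operator, and continues with analysis of data already gathered.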

Context Handoff Protocol

Every phase MUST produce *_SUMMARY_<YYYY-MM-DD_HHMM>.md in its engagement directory. See references/phase-handoff.md for the template.

That handoff is required, but it is not sufficient by itself. Every phase must also keep these current:

phase summary
activity log
evidence index
findings delta
next actions
shared registers when new assets, findings, evidence, or attack paths appear

Mandatory Documentation on Every Meaningful Run

Do not treat documentation as end-of-phase cleanup.

For every meaningful run, command batch, operator-supplied result set, or sub-agent handoff, update documentation in the same working turn before considering the run complete.

Minimum required updates per run:

append the action/result to the phase activity log
register any new evidence IDs in the evidence register and phase evidence index
update the phase summary and findings delta when the understanding changed
update next actions when the recommended path changed
update shared registers when new findings, assets, or attack paths appeared

A phase is incomplete if work happened but the engagement docs still describe an older state.

When delegating to sub-agents, explicitly instruct them to update their phase docs and registers before handing off. If they do not, the orchestrator must normalize the docs immediately after receiving results.

Key fields:

Found: What was discovered (specific, actionable data)
Not Found: What was checked but yielded nothing (prevents re-checking)
Recommended Next: Which phase and vector to pursue
Key Data: IPs, versions, credentials, CVEs, file paths — anything the next agent needs
Confidence: High / Medium / Low — how confident are we in these findings?
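The key fields above can be sketched as a small record with a validation step. The class, its method names, and the rendered layout are hypothetical; the authoritative template remains references/phase-handoff.md.

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_LEVELS = ("High", "Medium", "Low")


@dataclass
class Handoff:
    found: List[str]
    not_found: List[str]
    recommended_next: str
    key_data: List[str]
    confidence: str

    def problems(self) -> List[str]:
        """List issues that would make the handoff unusable for the next agent."""
        issues = []
        if not self.recommended_next.strip():
            issues.append("Recommended Next is empty")
        if self.confidence not in CONFIDENCE_LEVELS:
            issues.append("Confidence must be High, Medium, or Low")
        return issues

    def to_markdown(self) -> str:
        """Render the fields in an illustrative layout, one bullet per item."""
        def bullets(items: List[str]) -> str:
            return "\n".join(f"- {item}" for item in items) or "- (none)"

        return "\n".join([
            "Found:", bullets(self.found),
            "Not Found:", bullets(self.not_found),
            f"Recommended Next: {self.recommended_next}",
            "Key Data:", bullets(self.key_data),
            f"Confidence: {self.confidence}",
        ])
```

Rendering an explicit "(none)" under Not Found distinguishes "nothing was checked" from "checks ran and found nothing", which is what prevents re-checking.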

Web Search Integration

See references/web-search-queries.md for curated queries per phase. Key principles:

Always try web search for CVE lookups and latest exploit research
Fall back gracefully if search is unavailable (training knowledge + local tools)
Validate web search results against local databases (searchsploit, NVD)
Document when web search was used vs. unavailable

Quality Checklist

Before moving to the next phase, and after every meaningful run within a phase, verify:

[ ] *_SUMMARY_<YYYY-MM-DD_HHMM>.md exists and has all required fields
[ ] Phase summary, activity log, evidence index, findings delta, and next actions files are updated
[ ] Evidence files are saved in the engagement directory and registered
[ ] Findings, evidence, asset, and attack-path registers are updated where applicable
[ ] Decision gate has been evaluated and documented
[ ] Next agent (if applicable) has been given the correct input path
[ ] No sensitive data (passwords, keys) is stored in plaintext in documentation
[ ] The engagement docs reflect the latest run, not the previous one

Pentest Orchestrator

Manage end-to-end penetration test engagements across 7 structured phases with decision gates, agent coordination, and adaptation logic.

When to Use

✅ USE this skill when:

"Run a pentest on [target]"
"Start a penetration test engagement"
"Orchestrate the recon → enum → exploit workflow"
"What should I do next in this engagement?"
A pentest phase completed and you need the next step
A phase failed and you need to pivot

When NOT to Use

❌ DON'T use this skill when:

Running a single nmap scan (use enum skill directly)
Writing a standalone report without engagement context (use reporting skill)
User wants a quick vulnerability check, not a full engagement — use the quick-scan path in scripts/quick-scan/DISPATCH.md
No authorization/scope has been confirmed

Scripts-First Reuse Rule

Treat scripts/ as the default reusable operations layer for full pentests.

Required behavior:

prefer scripts/orchestration/*.py planning/runners before hand-writing repeated phase logic
prefer scripts/shared/manifests/ and target-family planning before manual phase planning when the target traits are known or inferable
prefer existing helpers under scripts/recon/, scripts/enum/, scripts/vuln/, scripts/exploit/, and scripts/post-exploit/ before inventing new one-off command chains
only fall back to fully manual flows when the script layer does not fit, lacks coverage, or needs troubleshooting
keep quick-scan and full-pentest separate; do not substitute quick-scan profiles for full-pentest target-family/manifests

When a real engagement reveals a repeatable command sequence, parser need, checklist, or validation pattern that would improve future pentests:

promote it into scripts/ as a reusable helper, manifest, parser, or docs update
do not leave reusable logic trapped only inside the engagement folder
keep evidence and target-specific outputs in engagements/<target>/, but keep reusable operational logic in scripts/
prefer to capture the reusable upgrade during or immediately after the engagement once the pattern is verified live

OpenCode Integration

Load skills/opencode-utility/SKILL.md whenever the orchestrator or a phase agent hits a coding or scripting bottleneck.

Use it for:

building or refactoring parsers
quick automation helpers
evidence formatting utilities
reusable wrappers that reduce repetitive terminal work
improving phase scripts and manifests when a reusable upgrade is justified
turning repeatable discoveries from a live pentest into maintainable helpers under scripts/

Default behavior:

start in plan mode when the utility shape is unclear
use build mode when the implementation target is already clear
bias toward scripts/opencode/reusable/ for utilities likely to recur
use scripts/opencode/session/ for engagement-specific helpers
use scripts/opencode/throwaway/ only for urgent one-offs

Phase 0: Engagement Setup

Before any agent spawns, load skills/preengagement-essentials/SKILL.md when the engagement is real or authorization/scope/ROE are not already explicit.

When the user says pentest <target> or otherwise asks to start a real pentest engagement, do not jump straight into active testing.

Use this pre-engagement chat flow:

first ask only for the Assigned Penetration Tester fields:
- Organization name
- Assigned Tester Name
- Email address
do not spawn the Google Docs pre-engagement form until those three fields are answered
once those fields are provided, spawn the pre-engagement form and return its reference plus the engagement naming prompt
do not immediately ask in chat for the full authorization/scope block if the spawned pre-engagement form is intended to collect that intake
still enforce that active testing must not begin until authorization and scope are explicit

Before any active testing, collect and document this intake in the pre-engagement artifacts:

engagement title
target
test type
dates
rules of engagement
scope in
scope out
credentials provided
constraints
success criteria
approval / authorization reference

If any of these are missing, mark them TBD in the documentation and clearly state that active testing should not proceed until authorization and scope are explicit.

Before any agent spawns, confirm:

Authorization — Written permission exists for the target
Scope — Specific IPs, domains, and networks are defined
Rules of engagement — What's off-limits? Time windows?
Third-party / provider approvals — Hosted or cloud constraints are addressed
Documentation structure — Initialize the engagement with python3 scripts/orchestration/init_engagement_docs.py <target-name> ...
Target-family planning baseline — When target traits are known or reasonably inferable, prefer python3 scripts/orchestration/plan_target_family.py before hand-writing phase plans. Use --family <name> when the family is already known, or --hint "<target description>" --target <host-or-url> --engagement <engagement-path> to recommend and expand one automatically. Treat the output as the default reusable baseline for a full pentest, not for quick-scan.
Central registers — Maintain registers/master-activity-log.md, findings-register.md, evidence-register.md, attack-path-register.md, and asset-register.md

Preferred full-pentest planning flow when the target type is known or inferable:

optionally preview the recommendation with python3 scripts/orchestration/recommend_target_family.py --hint "<target description>"
inspect the composed rationale with python3 scripts/orchestration/describe_target_family.py --family <recommended-family> when you need the why
generate the reusable phase baseline with python3 scripts/orchestration/plan_target_family.py --family <family> --target <host-or-url> --engagement <engagement-path> or the equivalent --hint form
use that plan to guide which manifests, wrappers, and manual branches should anchor recon, enum, vuln, exploit, and post-exploit
if the target type truly is not inferable yet, fall back to ordinary phase planning and update the family choice once evidence improves

Read these references before kickoff:

references/engagement-documentation-protocol.md
references/engagement-doc-templates.md
references/phase-handoff.md

Use engagements/<target-name>/pre-engagement/engagement-charter.md and engagements/<target-name>/pre-engagement/scope-and-roe.md as the authoritative intake artifacts.

For compatibility with existing engagements, you may also create or refresh engagements/<target-name>/SCOPE_<target-name>_<YYYY-MM-DD>.md, but the charter and ROE files are now primary.

Spawn the first agent only after the intake is recorded, the engagement is cleared for active testing, and any available target-family baseline has been reviewed.
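
The gate above could be checked mechanically before the first spawn. The following is a minimal sketch under two assumptions: the charter and scope/ROE files live at the pre-engagement paths named in this section, and unresolved items are marked with the literal string TBD. The function name intake_blockers is hypothetical.

```python
from pathlib import Path
from typing import List

# Intake artifacts named in this document, relative to engagements/<target-name>/.
INTAKE_FILES = (
    "pre-engagement/engagement-charter.md",
    "pre-engagement/scope-and-roe.md",
)


def intake_blockers(engagement_dir: str) -> List[str]:
    """List reasons active testing should not start yet (empty list = none found)."""
    blockers = []
    for rel in INTAKE_FILES:
        path = Path(engagement_dir) / rel
        if not path.is_file():
            blockers.append(f"missing: {rel}")
        elif "TBD" in path.read_text(encoding="utf-8"):
            # Assumption: open questions are marked with the literal text TBD.
            blockers.append(f"unresolved TBD in: {rel}")
    return blockers
```

An empty result only shows that the files exist and carry no TBD markers. It does not confirm that written authorization is valid, which still needs a human check.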

If the target is a recognizable family, include the relevant planning output in the first spawn and keep it as the preferred default baseline for later phases.

First, plan the family baseline when possible:
  python3 scripts/orchestration/plan_target_family.py --hint "<target description>" --target <host-or-url> --engagement engagements/<target-name>

Spawn specter-recon with:
  task: "Perform passive recon on <target>. Before ad-hoc planning, read the charter, scope/ROE, documentation protocol, and the target-family baseline from scripts/orchestration/plan_target_family.py when one exists. Save all findings to engagements/<target-name>/recon/ and update the phase docs plus shared registers. Use the family plan as the default reusable baseline for full-pentest work, then branch manually from live evidence."
  engagement: <target-name>

Phase 1: Reconnaissance (specter-recon)

Objective: Build a target profile and map the attack surface without touching the target.

Activities:

OSINT gathering (DNS records, WHOIS, Shodan, certificate transparency)
Subdomain enumeration
Technology fingerprinting from public sources
Employee/organizational reconnaissance (if in scope)
Physical location mapping (if physical testing authorized)

Web search queries: See references/web-search-queries.md for phase-specific queries.

Completion criteria:

[ ] Target profile written (OS, services, technologies from public sources)
[ ] Attack surface map created (network ranges, domains, physical locations)
[ ] Vector candidates identified (network, physical, application, wireless)
[ ] Handoff drafted with python3 scripts/orchestration/generate_phase_summary.py --engagement <target-name> --phase recon when standardized artifacts exist
[ ] RECON_SUMMARY_<YYYY-MM-DD_HHMM>.md written — see references/phase-handoff.md for template

DECISION GATE: What vectors look promising?

Read the recon RECON_SUMMARY_*.md and choose the vector to pursue based on its findings.

Adaptation: If network recon yields nothing useful but physical access is available, pivot to physical. Do not force a dead-end vector.

Phase 2: Enumeration (specter-enum)

Objective: Actively probe the target to discover open services, versions, and attack surface details.

Input from Recon: Read engagements/<target-name>/recon/RECON_SUMMARY_*.md for the target profile and recommended vector.
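
Because several timestamped summaries can accumulate in one phase directory, the wildcard above may match more than one file. A minimal sketch of picking the newest one, assuming file names follow the <PREFIX>_SUMMARY_<YYYY-MM-DD_HHMM>.md pattern used in this document (that timestamp layout sorts correctly as plain text). The helper name latest_summary is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_summary(phase_dir: str, prefix: str) -> Optional[Path]:
    """Return the newest <prefix>_SUMMARY_<YYYY-MM-DD_HHMM>.md in phase_dir, or None."""
    # YYYY-MM-DD_HHMM timestamps sort chronologically when compared as strings.
    matches = sorted(Path(phase_dir).glob(f"{prefix}_SUMMARY_*.md"),
                     key=lambda p: p.name)
    return matches[-1] if matches else None
```

For example, latest_summary("engagements/<target-name>/recon", "RECON") would return the most recent recon handoff, or None when the phase has not produced one yet.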

Activities (vary by vector):

Network: If a target-family plan exists, treat its enum manifests and listed steps as the default baseline. Otherwise prefer reusable wrappers such as scripts/orchestration/run_enum_profile.py --profile enum-windows-host --target <target> --engagement <target-name> when they fit, then scripts/enum/ports/scan_ports_fast.sh and scripts/enum/ports/scan_ports_service.sh, then service-specific wrappers (SMB, RDP, WinRM, web), before resorting to custom manual scanning
Physical: Badge cloning attempts, lock assessment, network jack enumeration, dumpster diving assessment, social engineering prep
Application: Directory busting, parameter discovery, API endpoint mapping, technology version detection; prefer scripts/enum/web/enum_web_basic.sh for baseline web coverage before custom app enumeration
Wireless: SSID discovery, encryption assessment, handshake capture, evil twin detection
When repeated parsing, result normalization, or helper scripting appears, load skills/opencode-utility/SKILL.md and prefer reusable upgrades under scripts/opencode/reusable/ when they will improve current or future enum work
When stronger enum-phase methodology is needed, load skills/enum-phase-essentials/SKILL.md to reinforce fast-then-accurate workflows, validation gates, protocol-triggered deep dives, and clean service inventory handoffs

Completion criteria:

[ ] All open ports/services documented with versions
[ ] Service versions cross-referenced for known vulnerabilities
[ ] ENUM_SUMMARY_<YYYY-MM-DD_HHMM>.md written with service inventory

DECISION GATE: What did we find?

Phase 3: Vulnerability Analysis (specter-vuln)

Objective: Match discovered services against known vulnerabilities and assess exploitability.

Input from Enum: Read engagements/<target-name>/enum/ENUM_SUMMARY_*.md for service inventory.

Activities:

If a target-family plan exists, start from its vuln manifests and sub-surface notes before narrowing into manual validation
CVE matching against service versions (searchsploit, NVD, vendor advisories)
Web search for latest exploits and PoCs — see references/web-search-queries.md
Configuration weakness analysis (default credentials, open shares, misconfigurations)
Exploitability scoring (CVSS, active exploitation in the wild)
Chain analysis — can multiple low-risk vulns combine into a high-risk path?
When repeated CVE triage formatting, evidence conversion, or report-helper scripting is needed, load skills/opencode-utility/SKILL.md and prefer reusable or session utilities instead of hand-writing one-off glue each time
When stronger vuln-phase methodology is needed, load skills/vuln-phase-essentials/SKILL.md to reinforce validation discipline, CVE/CWE/CVSS handling, KEV/EPSS-aware prioritization, and report-ready evidence standards
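
One way to turn the exploitability signals above into an ordered exploit plan is a simple sort. This is only a sketch: the Candidate fields and the ordering rule (known-exploited first, then EPSS, then CVSS) are assumptions for illustration, not a rule stated in this document or in skills/vuln-phase-essentials/SKILL.md.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    cve_id: str
    cvss: float    # base score, 0.0 to 10.0
    epss: float    # exploit-prediction probability, 0.0 to 1.0
    in_kev: bool   # listed as known-exploited


def order_exploit_plan(candidates: List[Candidate]) -> List[Candidate]:
    """Assumed ordering: known-exploited first, then higher EPSS, then higher CVSS."""
    return sorted(candidates,
                  key=lambda c: (c.in_kev, c.epss, c.cvss),
                  reverse=True)
```

The sort yields a starting order only. Chain analysis and live validation can still move a low-scoring item ahead when it unlocks a higher-impact path.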

Completion criteria:

[ ] CVE list with exploitability assessment
[ ] Exploit plan written (ordered list of attempts with expected outcomes)
[ ] VULN_SUMMARY_<YYYY-MM-DD_HHMM>.md written with CVEs, exploit plan, and confidence levels

DECISION GATE: Is there an exploitable path?

Phase 4: Exploitation (specter-exploit)

Objective: Demonstrate impact by exploiting vulnerabilities in a controlled manner.

Input from Vuln: Read engagements/<target-name>/vuln/VULN_SUMMARY_*.md for the exploit plan.

Activities:

If a target-family plan exists, use its exploit baseline and notes to keep attempts evidence-first and aligned to the verified attack path
Execute exploits in the order specified by the exploit plan
Document success/failure for each attempt with evidence
Capture screenshots, command outputs, proof-of-concept
If initial access gained, establish a stable foothold
If a safe, authorized validation helper or exploit-lab parser is needed, load skills/opencode-utility/SKILL.md for coding support, but do not use it to create unsafe or unauthorized offensive tooling
Do NOT destroy data or cause denial of service (unless explicitly authorized)
When stronger exploit-phase methodology is needed, load skills/exploit-phase-essentials/SKILL.md to reinforce precondition checks, validation ladders, candidate selection discipline, and evidence/cleanup standards

Completion criteria:

[ ] Each exploit attempt documented with outcome
[ ] Evidence captured for successful exploits
[ ] Access level clearly documented
[ ] EXPLOIT_SUMMARY_<YYYY-MM-DD_HHMM>.md written with access details and credentials

DECISION GATE: What level of access was gained?

Phase 5: Post-Exploitation (specter-post)

Objective: Demonstrate the full business impact of compromised access.

Input from Exploit: Read engagements/<target-name>/exploit/EXPLOIT_SUMMARY_*.md for access level and credentials.

Activities:

If a target-family plan exists, use its post-exploit baseline and notes to structure impact capture instead of improvising the starting checklist
Credential harvesting (memory, files, databases, config files)
Lateral movement (pivot to other systems on the network)
Persistence mechanisms (scheduled tasks, authorized keys, registry)
Data exfiltration assessment (what sensitive data is accessible?)
Impact demonstration (read access to PII, financial data, intellectual property)
Document everything — every command, every finding
If note conversion, evidence summarization, or impact-formatting helpers would save time, load skills/opencode-utility/SKILL.md and place phase-specific code in scripts/opencode/session/ unless broader reuse is likely
When stronger post-exploitation methodology is needed, load skills/post-phase-essentials/SKILL.md to reinforce impact assessment, access-path discipline, telemetry-aware evidence, and cleanup/residual-risk reporting

Completion criteria:

[ ] Discovered credentials documented (hashed, never plaintext in reports)
[ ] Lateral movement paths mapped
[ ] Impact assessment written (what could an attacker do?)
[ ] POST_EXPLOIT_SUMMARY_<YYYY-MM-DD_HHMM>.md written with evidence and impact summary

Phase 6: Reporting (specter-report)

Objective: Compile all findings into actionable deliverables.

Input: Read ALL *_SUMMARY_*.md files from phases 0–5.

Activities:

Compile findings from all phases into a structured report
Generate executive summary (business risk, not technical jargon)
Present technical findings with evidence (screenshots, command outputs, CVEs)
Score findings with the workspace CVSS house standard: CVSS v4.0 Base by default, CVSS v3.1 additionally when required for compatibility with public CVEs, NVD, vendor advisories, scanners, or client workflows
Include CVSS version, vector, numeric score, and short metric rationale for every scored finding; mark incomplete cases as provisional or unscored with explanation
Write a remediation guide with prioritized recommendations
Keep technical severity separate from final remediation priority by also considering exploit evidence, KEV/EPSS, exposure, asset criticality, and attack chaining
Include explicit cleanup / restoration status, tester-created artifacts, residual risk, and retest guidance
Generate presentation slides — see references/examples.md for engagement context
If report assembly, finding normalization, or markdown conversion becomes repetitive, load skills/opencode-utility/SKILL.md and prefer reusable reporting helpers over ad-hoc formatting
When stronger report-phase methodology is needed, load skills/report-phase-essentials/SKILL.md to reinforce multi-audience structure, QA gates, secure handling, remediation quality, and cleanup/restoration reporting
For real finalized engagements, automatically tell specter-report to create a native Google Doc from the final report and return the Docs link
For real finalized engagements, automatically tell specter-report to publish a PDF link for the final report
For real finalized engagements, automatically tell specter-report to create styled Google Slides from a generated PPTX and return the presentation link
Optionally keep a raw markdown upload in Drive as an archive copy
Skip publishing only for dry runs, mock engagements, or explicit no-publish instructions
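
The CVSS completeness rule in the list above (version, vector, numeric score, and rationale for every scored finding) can be sketched as a small classifier. The record type and function name are hypothetical, and the split between provisional (some fields present) and unscored (none present) is an assumption, since this document does not define where that boundary lies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CvssRecord:
    version: Optional[str] = None    # e.g. "4.0" or "3.1"
    vector: Optional[str] = None
    score: Optional[float] = None
    rationale: Optional[str] = None


def scoring_status(rec: CvssRecord) -> str:
    """Classify a finding's CVSS record as scored, provisional, or unscored."""
    fields = (rec.version, rec.vector, rec.score, rec.rationale)
    # A score of 0.0 counts as present; only None or an empty string is missing.
    present = [v is not None and v != "" for v in fields]
    if all(present):
        return "scored"
    # Assumption: partially filled records are provisional, empty ones unscored.
    return "provisional" if any(present) else "unscored"
```

Anything other than "scored" still needs the written explanation that the list above asks for.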

Deliverables:

engagements/<target-name>/reporting/REPORT_FINAL_<YYYY-MM-DD_HHMM>.md
engagements/<target-name>/reporting/EXECUTIVE_SUMMARY_<YYYY-MM-DD_HHMM>.md
engagements/<target-name>/reporting/REMEDIATION_GUIDE_<YYYY-MM-DD_HHMM>.md
engagements/<target-name>/reporting/PROCESS_OVERVIEW_<YYYY-MM-DD_HHMM>.md
Presentation slides
Drive / Slides share links
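
The four markdown deliverables share one naming pattern, so their paths can be derived from a single timestamp. A minimal sketch, assuming local time is acceptable for the <YYYY-MM-DD_HHMM> stamp; the helper name deliverable_paths is hypothetical.

```python
from datetime import datetime
from pathlib import Path
from typing import Dict, Optional

# Markdown deliverable prefixes listed in this document.
DELIVERABLES = ("REPORT_FINAL", "EXECUTIVE_SUMMARY",
                "REMEDIATION_GUIDE", "PROCESS_OVERVIEW")


def deliverable_paths(target_name: str,
                      when: Optional[datetime] = None) -> Dict[str, Path]:
    """Map each markdown deliverable to its timestamped path under reporting/."""
    stamp = (when or datetime.now()).strftime("%Y-%m-%d_%H%M")
    base = Path("engagements") / target_name / "reporting"
    return {name: base / f"{name}_{stamp}.md" for name in DELIVERABLES}
```

Using one timestamp for all four files keeps a reporting run's outputs grouped when the directory is sorted by name.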

Reporting spawn instruction template:

Spawn specter-report with:
  task: "Read all *_SUMMARY_*.md files under engagements/<target-name>/. Build the structured findings input needed by the production report generator. Generate REPORT_FINAL_<YYYY-MM-DD_HHMM>.md in engagements/<target-name>/reporting/ using the real branded implementation at reporting/scripts/generate_report.py. Also generate PROCESS_OVERVIEW_<YYYY-MM-DD_HHMM>.md as a stakeholder-friendly process narrative that explains what was actually done in each phase, what was observed, and why the next step happened. Include remediation and security enhancement recommendations for every finding. Because the user asked for the report, automatically publish it with reporting/scripts/generate_report.py --create-doc --create-slides --upload-drive --gdrive-account [email protected] --slides-title 'Pentest Report — <target-name>'. Return the local output path plus the Google Doc link, PDF link, and Slides link. Use raw gog docs/slides-from-markdown only as fallback if the branded generator path fails."
  engagement: <target-name>

Agent Coordination

Sequential Phases (Default)

Phases 0→1→2→3→4→5→6 run in sequence. Each phase's *_SUMMARY_*.md is the handoff to the next.

Parallel Execution

Only use parallel agents when:

Multiple targets in scope (spawn one agent per target for the same phase)
Multiple vectors from a decision gate are viable (run them in parallel, pick the best result)
Time is constrained and phases can be safely overlapped (e.g., web search for CVEs during enum)

Spawning Pattern

Spawn specter-<phase> with:
  task: "<Specific instructions based on decision gate output>"
  engagement: "<target-name>"
  input: "Read engagements/<target-name>/<previous-phase>/<PREV_PHASE>_SUMMARY_*.md"

For specter-report, default to publishing for any real engagement that reaches a final/completed report state:

Spawn specter-report with:
  task: "Read all prior *_SUMMARY_*.md files for <target-name>. Generate final reporting deliverables in engagements/<target-name>/reporting/, including PROCESS_OVERVIEW_<YYYY-MM-DD_HHMM>.md as the non-technical process narrative. Then create a Google Doc, publish/export a PDF, create Google Slides, and upload the raw markdown archive using reporting/scripts/generate_report.py --create-doc --create-slides --upload-drive --gdrive-account [email protected]. Return the local file path plus the Google Doc link, PDF link, and Slides link in the handoff. Skip publishing only for dry runs or explicit no-publish instructions."
  engagement: "<target-name>"

Always include:

The engagement target name
The specific path to save output
The path to the previous phase's handoff document
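
The spawning pattern and its three always-include items can be sketched as a builder that refuses incomplete requests. This is an illustration only: the function name, the dictionary shape, and the check that the output path appears inside the task text are assumptions, not the agent runtime's actual interface.

```python
from typing import Dict


def build_spawn_request(phase: str, target_name: str, task: str,
                        output_dir: str, prev_phase_dir: str,
                        prev_prefix: str) -> Dict[str, str]:
    """Assemble a spawn request for specter-<phase> (hypothetical helper)."""
    if not target_name.strip():
        raise ValueError("engagement target name is required")
    # Assumption: the save location is communicated inside the task text.
    if output_dir not in task:
        raise ValueError("task must name the specific path to save output")
    handoff = (f"engagements/{target_name}/{prev_phase_dir}/"
               f"{prev_prefix}_SUMMARY_*.md")
    return {
        "agent": f"specter-{phase}",
        "task": task,
        "engagement": target_name,
        "input": f"Read {handoff}",
    }
```

The previous phase's directory and summary prefix are passed in explicitly rather than derived, because this document does not state the engagement directory name for every phase.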

Adaptation Logic

When a Phase Fails or Times Out

Check the agent's output for partial findings
If partial data exists, proceed with what was found (document the gap)
If nothing useful, retry once with adjusted parameters
If retry fails, consult the decision gate for alternative vectors
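
The four steps above reduce to a small decision function. A minimal sketch; the enum and function names are hypothetical, and the caller is assumed to track how many retries have already been used.

```python
from enum import Enum


class NextStep(Enum):
    PROCEED_WITH_GAP = "proceed with partial findings and document the gap"
    RETRY_ADJUSTED = "retry once with adjusted parameters"
    CONSULT_GATE = "consult the decision gate for alternative vectors"


def after_phase_failure(has_partial_findings: bool, retries_used: int) -> NextStep:
    """Pick the next step after a failed or timed-out phase."""
    if has_partial_findings:
        return NextStep.PROCEED_WITH_GAP
    if retries_used < 1:  # only a single retry is allowed
        return NextStep.RETRY_ADJUSTED
    return NextStep.CONSULT_GATE
```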

When a Vector is a Dead End

Do NOT retry the same approach indefinitely
Check the decision gate table for pivot options
If all vectors exhausted, report to operator with a summary of what was tried
Example: Network recon on a Raspberry Pi with no open services → pivot to physical access or app-layer (web interface)

When Web Search is Unavailable

Fall back to training knowledge for CVE databases and exploit frameworks
Use searchsploit and local exploit databases
Research vendor security advisories manually
Document that web search was unavailable (affects confidence in "latest" findings)

When Target is Offline

Retry up to 3 times with increasing intervals (5min, 15min, 30min)
If still offline, document and report to operator
Proceed with analysis of any data already gathered
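
The retry schedule above can be sketched as follows. The reachability check is supplied by the caller, since this document does not say how to probe the target, and the sleep function is injectable so the schedule can be exercised without waiting. The function name wait_for_target is hypothetical.

```python
import time
from typing import Callable, Sequence

# 5, 15, and 30 minutes, as listed above.
RETRY_DELAYS_SECONDS = (5 * 60, 15 * 60, 30 * 60)


def wait_for_target(is_online: Callable[[], bool],
                    delays: Sequence[int] = RETRY_DELAYS_SECONDS,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Check once, then retry after each delay; False means report to the operator."""
    if is_online():
        return True
    for delay in delays:
        sleep(delay)
        if is_online():
            return True
    return False
```

A False result corresponds to the "document and report to operator" step. Analysis of already-gathered data can continue either way.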

Context Handoff Protocol

Every phase MUST produce *_SUMMARY_<YYYY-MM-DD_HHMM>.md in its engagement directory. See references/phase-handoff.md for the template.

That handoff is required, but it is not sufficient by itself. Every phase must also keep these current:

phase summary
activity log
evidence index
findings delta
next actions
shared registers when new assets, findings, evidence, or attack paths appear

Mandatory documentation on every meaningful run

Do not treat documentation as end-of-phase cleanup.

For every meaningful run, command batch, operator-supplied result set, or sub-agent handoff, update documentation in the same working turn before considering the run complete.

Minimum required updates per run:

append the action/result to the phase activity log
register any new evidence IDs in the evidence register and phase evidence index
update the phase summary and findings delta when the understanding changed
update next actions when the recommended path changed
update shared registers when new findings, assets, or attack paths appeared

A phase is incomplete if work happened but the engagement docs still describe an older state.
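
The first minimum update, appending the action and result to the activity log, might look like the sketch below. The line format, the UTC timestamp, and the helper name append_activity are all assumptions. The real layout is defined by references/engagement-doc-templates.md, which is not reproduced here, so the log path is left to the caller.

```python
from datetime import datetime, timezone
from pathlib import Path


def append_activity(log_path: str, action: str, result: str,
                    evidence_id: str = "") -> str:
    """Append one timestamped line to an activity log and return that line."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    # Assumed entry format; adapt it to the engagement's own template.
    line = f"- {stamp} | action: {action} | result: {result}"
    if evidence_id:
        line += f" | evidence: {evidence_id}"
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line
```

Opening the file in append mode keeps earlier entries intact, which matches the expectation that the log grows with every meaningful run.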

Key fields:

Found: What was discovered (specific, actionable data)
Not Found: What was checked but yielded nothing (prevents re-checking)
Recommended Next: Which phase and vector to pursue
Key Data: IPs, versions, credentials, CVEs, file paths — anything the next agent needs
Confidence: High / Medium / Low — how confident are we in these findings?
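
A handoff can be checked for these key fields before the next agent is spawned. A minimal sketch, assuming each field appears as a "<Field>:" label at the start of a line, optionally preceded by list or emphasis markers. The actual template lives in references/phase-handoff.md and may use a different layout; the function name is hypothetical.

```python
from typing import List

REQUIRED_FIELDS = ("Found", "Not Found", "Recommended Next",
                   "Key Data", "Confidence")


def missing_handoff_fields(summary_text: str) -> List[str]:
    """Return required fields that never appear as a line-leading '<Field>:' label."""
    seen = set()
    for line in summary_text.splitlines():
        # Drop leading list/heading/emphasis markers before matching labels.
        cleaned = line.lstrip("#*- \t").lower()
        for field in REQUIRED_FIELDS:
            if cleaned.startswith(field.lower() + ":"):
                seen.add(field)
    return [f for f in REQUIRED_FIELDS if f not in seen]
```

Matching at the start of the line keeps "Not Found:" from being counted as "Found:".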

Web Search Integration

See references/web-search-queries.md for curated queries per phase. Key principles:

Always try web search for CVE lookups and latest exploit research
Fallback gracefully if search is unavailable (training knowledge + local tools)
Validate web search results against local databases (searchsploit, NVD)
Document when web search was used vs. unavailable

Quality Checklist

Before moving to the next phase, and after every meaningful run within a phase, verify:

[ ] *_SUMMARY_<YYYY-MM-DD_HHMM>.md exists and has all required fields
[ ] Phase summary, activity log, evidence index, findings delta, and next actions files are updated
[ ] Evidence files are saved in the engagement directory and registered
[ ] Findings, evidence, asset, and attack-path registers are updated where applicable
[ ] Decision gate has been evaluated and documented
[ ] Next agent (if applicable) has been given the correct input path
[ ] No sensitive data (passwords, keys) is stored in plaintext in documentation
[ ] The engagement docs reflect the latest run, not the previous one

