Pentest → Parallel PRs

You are running an authorized penetration test on a codebase, then shipping the fixes as a fan-out of independent, non-conflicting pull requests. The loop has two halves: find (parallel attacker-framed investigation) and fix (parallel TDD-driven implementation in isolated worktrees). The discipline that makes the whole thing work is generator-evaluator separation in the find half, and file-occupation analysis in the fix half.

Core Principles

Authorization first. Only proceed on the user's own code, a CTF, an engagement they have written authorization for, or other clearly defensive contexts. If unclear, ask once before any tool calls.
Generator-evaluator separation. Subagents that generate findings must not be the only judges of those findings. Verify every CRITICAL claim yourself by reading the actual cited code before acting on it.
Pessimism about your own subagents. Expect ~10-30% false positives from pentest subagents. Common shapes: "IDOR" that's actually safe via check-then-act with UUIDs, "missing sender validation" that requires externally_connectable to be exploitable, generic "this looks risky" with no concrete vector.
File-occupation analysis before parallelizing. Build an explicit table of which source AND test files each fix touches. Two tasks may not share a single file. Group fixes that overlap into the same worktree.
One concern per worktree, unless files force grouping. Don't artificially bundle unrelated fixes; that destroys the per-PR review experience.
TDD inside subagent prompts. Spell out the Red → Green discipline and list the exact test cases each subagent must add. Don't leave it implicit.
Subagents commit, the main thread publishes. Subagents create local commits only. The main thread reviews git log main..<branch>, renames awkward branches, then pushes and opens PRs. This keeps a human checkpoint before anything becomes public.
Recoverable, not perfect. Worktree isolation occasionally fails (a subagent ends up committing in the main repo). Reconcile via git operations rather than panicking — git checkout main, rename, push, done.

Phase 0: Confirm Authorization & Scope

Goal: Make sure this is a defensive engagement.

Verify context: is this the user's own project, a CTF, a contracted engagement, a security research target with permission, or a clearly defensive review? If you can't tell, ask once and proceed only after the user confirms.
Capture scope from $ARGUMENTS or the prior conversation. If the user said "skip the extension" or "auth only", honor it. If no scope given, default to the full project.
Briefly state the authorization you're relying on so the user can correct you. Example: "Treating this as authorized work on your own paveg/tuck repo."

If the request smells offensive (targeting third parties, evading detection, mass exploitation, supply-chain compromise), stop and refuse. The skill exists to harden software, not attack others.

Phase 1: Map the Attack Surface

Goal: Build a mental map of trust boundaries and entry points before dispatching anything.

Glob/ls the repo to identify:
- HTTP route handlers and middleware
- DB schema and ORM call sites
- URL fetching / file fetching / file uploading code
- Auth / session / token / cookie code
- Background workers, cron jobs, durable objects, queues
- Frontend client code (where auth tokens live, CSP headers, raw-HTML rendering sinks)
- Browser extension manifests and message handlers (if present)
- CI / deploy / wrangler / IaC config files
Group these by trust boundary — the points where data crosses from untrusted (user input, fetched HTML, extension messages) to trusted (database, internal API, LLM prompt). Each boundary becomes a candidate parallel investigation domain.
Pick 3-5 domains for parallel pentest. More than 5 wastes context with diminishing returns. Typical domains:
- Auth & authorization: login flows, session tokens, IDOR, admin gates
- SSRF & URL fetching: any code that fetches a user-supplied URL
- Multi-tenant data: ownership scoping, mass-assignment, public shares
- Quota / billing / rate limit / secrets / CSP / extension surface
- Injection sinks: SQL, command, template, prompt injection, stored XSS

Adapt domains to the actual codebase. A read-only static site has none of these; an LLM-backed chatbot has prompt injection and tool-use abuse.
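
As a rough illustration of the mapping step above, the sketch below buckets file paths into candidate surfaces by name. The patterns, surface names, and the `mapSurface` helper are all hypothetical and deliberately coarse; they assume a conventional layout and are only a first pass before reading the code.

```typescript
// Hypothetical path patterns; adjust them to the repo's actual layout.
const SURFACE_PATTERNS: { surface: string; pattern: RegExp }[] = [
  { surface: 'auth', pattern: /(auth|session|token|cookie)/i },
  { surface: 'url-fetch', pattern: /(fetch|url|upload)/i },
  { surface: 'routes', pattern: /(routes?|handlers?|middleware)/i },
  { surface: 'data', pattern: /(schema|models?)/i },
  { surface: 'extension', pattern: /manifest\.json$/i },
  { surface: 'deploy', pattern: /(wrangler\.toml|\.github\/workflows)/i },
]

// Bucket paths by surface. A path may land in several buckets, which is
// expected: a route file that handles login belongs to both 'routes' and 'auth'.
function mapSurface(paths: string[]): Map<string, string[]> {
  const bySurface = new Map<string, string[]>()
  paths.forEach((path) => {
    SURFACE_PATTERNS.forEach(({ surface, pattern }) => {
      if (pattern.test(path)) {
        const hits = bySurface.get(surface) || []
        hits.push(path)
        bySurface.set(surface, hits)
      }
    })
  })
  return bySurface
}

const surfaceMap = mapSurface([
  'src/routes/auth.ts',
  'src/lib/url-utils.ts',
  'wrangler.toml',
  'README.md',
])
```

Name matching produces false hits and misses, so treat the buckets as a reading list for each domain, not as the domain definition itself.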

Phase 2: Parallel Pentest Dispatch

Goal: Dispatch one focused subagent per domain, in parallel, with attacker framing.

Use the Agent tool with subagent_type: "Explore". Send all dispatches in a single message so they run concurrently. Each subagent prompt MUST contain:

Authorization context ("authorized work on the owner's own project") so the subagent doesn't refuse.
Specific files to read (absolute paths). Don't make the subagent re-discover the structure.
Attacker framing, not reviewer framing. Ask for "exploits a real attacker would use", not "code quality issues". Demand:
- Severity tier (CRITICAL / HIGH / MEDIUM)
- File:line reference
- Concrete exploit payload (curl command, request body, JS snippet)
- Why the current code fails
- A fix sketch
Length cap ("under 1500 words") and language (match the user's).
Negative instruction: don't report findings without code citations, don't pad with generic "consider HTTPS" advice.

Example prompt sketch (for an SSRF domain on a Cloudflare Workers app):

You are a security researcher on an authorized pentest of <project>, a
<one-line description>. Focus ONLY on SSRF, URL-fetch attacks, and content
processing vulnerabilities. The owner has authorized this engagement.

Read these files:
- /abs/path/to/articles.ts
- /abs/path/to/url-utils.ts
- /abs/path/to/article-fetcher.ts

This app fetches user-supplied URLs server-side. Hunt concrete bypasses for:
1. SSRF to private IP ranges (10/8, 127/8, 169.254/16, ::1, fc00::/7,
   IPv4-mapped IPv6 like ::ffff:127.0.0.1)
2. SSRF via redirect-following without re-validation
3. SSRF via DNS rebinding or obscure encodings (decimal, hex, 127.1)
4. Protocol smuggling (file:, gopher:, data:)
5. Response-size DoS, slowloris
6. Stored content rendering issues from parsed HTML
7. Prompt injection if fetched content reaches an LLM

For each finding: severity, file:line, exact exploit payload, why the current
code fails, fix. No generic advice. Report in <user's language>, under 1500
words. Skip any category where you find nothing actionable.

Tune the file list and the hunt list to the actual domain.
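
If you dispatch several domains, it can help to assemble each prompt from structured input so that no required element is forgotten. The following is a minimal sketch under stated assumptions: the `PentestDomain` shape and `buildPentestPrompt` are hypothetical names, and the absolute-path check assumes POSIX-style paths.

```typescript
// Hypothetical input shape for one pentest domain.
interface PentestDomain {
  project: string
  description: string
  focus: string
  files: string[] // absolute paths the subagent must read
  huntList: string[]
  language: string
  wordCap: number
}

// Assemble a dispatch prompt and refuse inputs that break the rules above.
function buildPentestPrompt(d: PentestDomain): string {
  const relative = d.files.filter((f) => f.charAt(0) !== '/')
  if (relative.length > 0) {
    throw new Error('file paths must be absolute: ' + relative.join(', '))
  }
  if (d.huntList.length === 0) {
    throw new Error('hunt list is empty; give the subagent concrete targets')
  }
  const lines: string[] = [
    `You are a security researcher on an authorized pentest of ${d.project}, ${d.description}.`,
    `Focus ONLY on ${d.focus}. The owner has authorized this engagement.`,
    '',
    'Read these files:',
  ]
  d.files.forEach((f) => lines.push('- ' + f))
  lines.push('', 'Hunt concrete bypasses for:')
  d.huntList.forEach((item, i) => lines.push(`${i + 1}. ${item}`))
  lines.push(
    '',
    'For each finding: severity, file:line, exact exploit payload, why the current code fails, fix.',
    `No generic advice. Report in ${d.language}, under ${d.wordCap} words.`,
    'Skip any category where you find nothing actionable.'
  )
  return lines.join('\n')
}
```

The two guards mirror the rules above: absolute paths so the subagent does not re-discover the structure, and a non-empty hunt list so the prompt keeps its attacker framing.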

Phase 3: Verify Findings (Generator-Evaluator Separation)

Goal: Filter false positives before treating anything as actionable.

After all subagents return, read the actual code for every CRITICAL and HIGH finding before believing it. This is not optional. Subagents make systematic errors that look authoritative. Examples seen in real runs:

"CRITICAL IDOR: PATCH handler updates with eq(id) only, no userId" — but the preceding SELECT scopes by (id, userId) and 404s if not found. Check-then-act with UUIDs is safe today; the finding is at most a defense-in-depth concern (MEDIUM), not CRITICAL.
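"HIGH: extension _sender parameter is unused, web pages can hijack the background script", but chrome.runtime.onMessage only receives messages from the extension's own contexts (including its content scripts). A web page can message the extension only if externally_connectable is set in the manifest, and those messages arrive on chrome.runtime.onMessageExternal. Check the manifest first.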
"CRITICAL admin bypass via X header" — but the route also runs through a middleware the subagent didn't read.

When you spot a false positive, demote it (often to LOW or drop) and note it in your report. When you spot something the subagent missed while reading the code yourself, promote it. Both directions matter.

Output of this phase: a clean, severity-classified list of findings with PoCs, with clear notes on which subagent claims you rejected and why (transparency builds trust with the user).

Present this list to the user. Pause for direction before fixing — they may want to triage, defer some, or add scope.
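
One way to keep the verification step honest is to record a verdict for each claim you checked and refuse to produce a report while any CRITICAL or HIGH claim is still unread. This is a minimal sketch; the `Finding` and `Verdict` shapes and `triageFindings` are hypothetical names, not part of any tool.

```typescript
type Severity = 'CRITICAL' | 'HIGH' | 'MEDIUM' | 'LOW'

interface Finding {
  id: string
  severity: Severity // as claimed by the subagent
  claim: string
  citation: string // file:line
}

// What the main thread concluded after reading the cited code.
interface Verdict {
  findingId: string
  verified: Severity | 'DROPPED'
  note: string
}

const SEVERITY_RANK: Record<Severity, number> = { CRITICAL: 0, HIGH: 1, MEDIUM: 2, LOW: 3 }

function triageFindings(
  findings: Finding[],
  verdicts: Verdict[]
): { accepted: Finding[]; notes: string[] } {
  const verdictById = new Map<string, Verdict>()
  verdicts.forEach((v) => verdictById.set(v.findingId, v))
  const accepted: Finding[] = []
  const notes: string[] = []
  findings.forEach((f) => {
    const v = verdictById.get(f.id)
    if (v === undefined) {
      // CRITICAL and HIGH claims may not pass through unread.
      if (f.severity === 'CRITICAL' || f.severity === 'HIGH') {
        throw new Error(`unverified ${f.severity} finding: ${f.id}`)
      }
      accepted.push(f)
      return
    }
    // Record every promotion, demotion, or drop so the user can see why.
    if (v.verified !== f.severity) {
      notes.push(`${f.id}: ${f.severity} -> ${v.verified} (${v.note})`)
    }
    if (v.verified !== 'DROPPED') {
      accepted.push({ id: f.id, severity: v.verified, claim: f.claim, citation: f.citation })
    }
  })
  accepted.sort((a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity])
  return { accepted, notes }
}
```

The `notes` array is what backs the "which claims you rejected and why" part of the report; the `accepted` list is what goes to the user for triage.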

Phase 4: Plan Fixes & File-Occupation Analysis

Goal: Decide which fixes ship now, and how to parallelize them without merge conflicts.

Drop fixes you can't perform with current information, and document each as "deferred, requires X from user". Examples:
- "Allowlist Chrome extension IDs": needs the production extension IDs.
- "Fix the OAuth client_secret leak": needs new secrets rotated first.

Build an occupation table. For each remaining fix, list every file it touches — source AND test files. Test files matter because two parallel agents editing the same *.test.ts will conflict just as badly as editing the same source.

| Task | Source files | Test files |
|---|---|---|
| C-1 | index.ts, auth-dev.ts | auth-dev.test.ts |
| C-2 | url-utils.ts, ai.ts, robots.ts | url-utils.test.ts, ai.test.ts, robots.test.ts |
| H-1 | rate-limit.ts | rate-limit.test.ts |
| M-3 | highlights.ts | highlights.test.ts |

Group tasks that share files. If two fixes both touch ai.ts, they become one worktree task. Don't try to be clever with diff merging.
Verify zero overlap in the final grouping. If you can't get to zero overlap, run the conflicting groups sequentially (one worktree, then the next), not in parallel.
Show the table to the user and confirm before dispatching N worktrees. This is also a chance for them to drop fixes or change priority.
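
The grouping rule can be checked mechanically. Below is a minimal sketch that merges tasks sharing any file into one group using union-find. The `FixTask` shape and `groupByFileOverlap` are hypothetical names, and task M-4 is an invented extra row added only to show two fixes colliding on ai.ts.

```typescript
// One row of the occupation table. The shape is a hypothetical illustration.
interface FixTask {
  id: string
  files: string[] // source AND test files the fix touches
}

// Merge tasks that share any file into one group (union-find over task indices).
function groupByFileOverlap(tasks: FixTask[]): string[][] {
  const parent = tasks.map((_, i) => i)
  const find = (i: number): number => {
    while (parent[i] !== i) i = parent[i]
    return i
  }
  const firstOwner = new Map<string, number>()
  tasks.forEach((task, i) => {
    task.files.forEach((file) => {
      const owner = firstOwner.get(file)
      if (owner === undefined) firstOwner.set(file, i)
      else parent[find(i)] = find(owner)
    })
  })
  const groups = new Map<number, string[]>()
  tasks.forEach((task, i) => {
    const root = find(i)
    const members = groups.get(root) || []
    members.push(task.id)
    groups.set(root, members)
  })
  return Array.from(groups.values())
}

const exampleTasks: FixTask[] = [
  { id: 'C-1', files: ['index.ts', 'auth-dev.ts', 'auth-dev.test.ts'] },
  { id: 'C-2', files: ['url-utils.ts', 'ai.ts', 'robots.ts', 'url-utils.test.ts', 'ai.test.ts', 'robots.test.ts'] },
  { id: 'H-1', files: ['rate-limit.ts', 'rate-limit.test.ts'] },
  { id: 'M-3', files: ['highlights.ts', 'highlights.test.ts'] },
  { id: 'M-4', files: ['ai.ts', 'ai.test.ts'] }, // hypothetical extra fix, not in the table above
]

console.log(groupByFileOverlap(exampleTasks))
// [['C-1'], ['C-2', 'M-4'], ['H-1'], ['M-3']]
```

Each inner array becomes one worktree. Because overlap is transitive (A shares a file with B, B with C), a union-find pass is safer than comparing tasks pairwise by eye.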

Phase 5: Dispatch Parallel Implementation Subagents

Goal: One subagent per non-overlapping group, each in its own worktree, each producing one local commit.

Use Agent with isolation: "worktree" and subagent_type: "general-purpose". Send all dispatches in a single message for true parallelism.

Each subagent prompt MUST include:

Why this matters — the exploit being closed, in 2-3 sentences. The subagent will make better judgment calls if it understands the threat.
Authorization so it doesn't refuse to work on security-sensitive code.
Exact files and approximate line numbers. Tell it where to look. Don't make it re-discover.
TDD workflow: Red, then Green. List the specific test cases the subagent must add (not "add tests"). Require the subagent to confirm RED (the new tests fail against the current code) before implementing.
Full project test/lint commands (e.g., pnpm test:api, pnpm typecheck, pnpm lint). Don't assume the subagent knows.
Code style rules (single quotes, no semis, no unrelated changes, no refactors outside the scope).
Conventional commit message drafted in the prompt — the subagent should commit with this message verbatim or close to it.
Suggested branch name. If you can't suggest one, plan to rename in Phase 6.
CRITICAL: "Commit locally on a feature branch. Do NOT push. Do NOT create a PR." This keeps the publish step centralized.
Return contract: branch name, files changed, before/after test counts, deviations from the plan, blockers. Cap response length.
Stop condition: "If you get stuck for more than 2 retries on any step, STOP and report what blocked you. Do not force through."
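
A return contract is most useful when you check it against the plan. The sketch below compares a subagent's report with its row in the occupation table. `SubagentReport` and `reviewReport` are hypothetical names, and the fields simply mirror the contract items listed above.

```typescript
// Hypothetical structured form of the return contract.
interface SubagentReport {
  branch: string
  filesChanged: string[]
  testsBefore: number
  testsAfter: number
  blockers: string[]
}

// Return a list of problems; an empty list means the report matches the plan.
function reviewReport(report: SubagentReport, allowedFiles: string[]): string[] {
  const problems: string[] = []
  const allowed = new Set(allowedFiles)
  report.filesChanged.forEach((file) => {
    // Anything outside the occupation-table row can collide with another worktree.
    if (!allowed.has(file)) problems.push('off-scope file: ' + file)
  })
  if (report.testsAfter <= report.testsBefore) {
    problems.push(`test count did not grow (${report.testsBefore} -> ${report.testsAfter})`)
  }
  if (report.blockers.length > 0) {
    problems.push('blockers reported: ' + report.blockers.join('; '))
  }
  return problems
}
```

An off-scope file is the important signal here: it means the zero-overlap guarantee from Phase 4 may no longer hold for that branch.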

Worth knowing: isolation: "worktree" automatically creates a worktree under .claude/worktrees/agent-<hash>/. The worktree persists if changes are made and is auto-cleaned if not. Worktree paths and branch names come back in the agent result. Don't try to manage worktrees manually for this phase; the tool handles it.

Phase 6: Verify Branches & Recover Mishaps

Goal: Confirm each subagent left a clean, mainline-ready branch. Recover from any isolation hiccups before pushing.

List worktrees and branches:
```
git worktree list
```
Each subagent should appear with its worktree path and branch.
Per-branch sanity check:
```
for b in <branch1> <branch2> ...; do
  echo "=== $b ==="
  git log --oneline main..$b
done
```
Each branch should show one commit (or a small number of commits) ahead of main, all from the subagent's work.
Recover if isolation didn't take. Sometimes a subagent commits in the main repo working directory instead of a worktree. Symptoms: git status in the main repo shows you're on a feature branch, not main. Recovery:
```
git checkout main   # restores main in the main repo; the feature branch is still saved
```
The subagent's commits remain on the feature branch — nothing is lost.
Rename auto-generated branches. Subagents that didn't create their own feature branch end up on something like worktree-agent-abc12345. Rename to a meaningful slug:
```
git branch -m worktree-agent-abc12345 fix/<descriptive-slug>
```
Note: if tag.gpgSign is set globally, branch ops are unaffected, but tag creation in other workflows may need git -c tag.gpgSign=false.
Optional: run the full test suite once more in the main repo if you want extra confidence before publishing. Each subagent should already have done this, but a sanity check is cheap.
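
The per-branch checks above can be expressed as a small function over the text that `git log --oneline main..<branch>` prints. This is a sketch only: `checkBranch` and the `maxCommits` threshold are hypothetical, and the function takes the command output as a string and does not run git itself.

```typescript
// Inspect one branch given the output of `git log --oneline main..<branch>`.
// Returns a list of problems; an empty list means the branch looks ready.
function checkBranch(branch: string, onelineLog: string, maxCommits: number): string[] {
  const problems: string[] = []
  const commits = onelineLog.split('\n').filter((line) => line.trim() !== '')
  if (commits.length === 0) {
    problems.push('no commits ahead of main')
  }
  if (commits.length > maxCommits) {
    problems.push(`${commits.length} commits ahead of main, expected at most ${maxCommits}`)
  }
  if (/^worktree-agent-/.test(branch)) {
    problems.push('auto-generated branch name; rename before pushing')
  }
  return problems
}
```

A branch with no commits ahead of main usually means the subagent committed somewhere else, which is the cue to look for the isolation failure described above.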

If a subagent reported a blocker, decide: re-dispatch with a tighter prompt, do that fix yourself, or defer it. Don't push half-done work.

Phase 7: Push & Create One PR Per Fix

Goal: Publish each branch and open one PR per fix, with a body the user can act on.

Push all branches in parallel (one Bash tool call per branch, all sent in one message):
```
git push -u origin <branch1>
```

Create one PR per branch, also in parallel. Use the project's PR template if there is one (check .github/PULL_REQUEST_TEMPLATE.md). Otherwise default to:

## Summary
- <what changed, in 2-4 bullets>
- <why — the exploit this closes, with severity tag for security fixes>

## Test plan
- [x] <project test command> — <count> passing
- [x] <typecheck command>
- [x] <lint command>
- [x] <new test cases added, briefly>
- [ ] <anything that requires manual verification, e.g., production env vars>

Use a HEREDOC for the body to preserve formatting:

gh pr create --base main --head <branch> --title "<title under 70 chars>" --body "$(cat <<'EOF'
## Summary
...
EOF
)"

For security PRs, lead the title with the conventional commit type (fix(...)/refactor(...)) and put severity in the body, not the title. Keep titles factual; security details belong in the body where they can be redacted from the public timeline if needed.
Recommend a merge order to the user, based on:
- Risk: lowest-risk PRs first to catch CI surprises
- Dependencies: any PR that needs an env var or config change should be called out separately
- Blast radius: smaller diffs first
Summarize the result:
- Table of PRs with severity, branch, title, link
- Test count deltas per PR
- Any deferred items and why
- Long-term recommendations the fixes don't address
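
The merge-order criteria above (risk, dependencies, blast radius) can be turned into a simple sort. The following is a sketch under assumptions: the `PullRequest` shape, the three-level risk scale, and `recommendMergeOrder` are hypothetical, and using diff size as the tie-breaker within a risk level is one reasonable reading of "smaller diffs first".

```typescript
type Risk = 'low' | 'medium' | 'high'

// Hypothetical summary of one PR for ordering purposes.
interface PullRequest {
  title: string
  risk: Risk
  diffLines: number
  needsConfigChange: boolean // e.g. a new env var must exist before merging
}

const RISK_RANK: Record<Risk, number> = { low: 0, medium: 1, high: 2 }

// Lowest risk first, smaller diffs first within a risk level.
// PRs that need an env var or config change are listed separately.
function recommendMergeOrder(prs: PullRequest[]): { mergeNow: string[]; needsConfigFirst: string[] } {
  const ordered = prs.slice().sort((a, b) => {
    const byRisk = RISK_RANK[a.risk] - RISK_RANK[b.risk]
    return byRisk !== 0 ? byRisk : a.diffLines - b.diffLines
  })
  return {
    mergeNow: ordered.filter((pr) => !pr.needsConfigChange).map((pr) => pr.title),
    needsConfigFirst: ordered.filter((pr) => pr.needsConfigChange).map((pr) => pr.title),
  }
}
```

Treat the result as a recommendation to present to the user, not an automatic merge plan.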

Common Pitfalls

Trusting subagent CRITICAL claims without verification → false-positive PRs that waste reviewer time and reduce future trust in the process.
Test file overlap missed in occupation analysis → silent merge conflicts when the user merges PRs sequentially. Always include test files in the table.
Subagents pushing or creating PRs themselves → no human review checkpoint, hard to roll back. Always say "commit only" in the prompt.
Vague prompts to implementation subagents → wasted iterations and off-scope refactors. Always include exact files, line numbers, test cases, and the project's test/lint commands.
Trying to fix everything at once → context exhaustion and rushed reviews. Defer LOW findings or batch them as a single follow-up.
Forgetting authorization framing → subagents may refuse to engage with security-sensitive code. Always include a one-line authorization context.
Branch name pollution → leaving subagent-named branches like worktree-agent-abc12345 makes the PR list ugly and breaks conventional branch naming. Rename in Phase 6.
Skipping the finding-verification pause → user loses the chance to triage before fixes are dispatched, ends up with PRs they didn't want. Always pause after Phase 3 to confirm scope.

When To Skip Phases

Solo bug fix, no security context: this skill is overkill. Use a simpler workflow.
User already has the finding list: skip Phases 0-3, jump straight to Phase 4 (occupation analysis) using their list.
Single fix only: skip the Phase 4-5 parallelism. The benefit of this skill is parallelism; for one fix, use a single worktree or just edit directly.
Read-only static site, no server: most of the attack surface domains don't apply. Focus on dependency vulnerabilities and CSP only.

Output Language

Match the user's input language for narrative reports and PR descriptions unless they specify otherwise. Code identifiers, commit messages, and PR titles stay in English regardless. SKILL output (this file's contents) is always English; conversational responses follow the user.

