Cybersecurity Review

Audit a codebase for security risks. Operates in two modes:

full-audit — scan the entire codebase. Used by do-init at adoption time and on demand. Produces docs/security-audit.md. Flips cybersecurity_reviewed milestone.
update — scoped to a diff, plus the code that could be impacted by the diff. Used by do-issue-solo and do-issue-guided as a mandatory pre-push gate before opening a PR. Returns a structured findings list to the orchestrator; does not flip the milestone (it's per-PR work, not a one-time milestone).

Refer to mav-bp-application-security for the standards each finding should be measured against. The skill surfaces risks; it does not modify code.

Preflight (mandatory)

Run this first. If it exits non-zero, halt and report the stderr output to the user verbatim. Do not proceed.

uv run maverick preflight do-cybersecurity-review

The check verifies the project is initialised and uv is on PATH.
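
The halt-on-failure rule can be sketched in Python as below. This is a minimal illustration, not part of Maverick itself: the `run_preflight` name and its `cmd` parameter are assumptions, added so the behaviour can be exercised without `uv` installed.

```python
import subprocess
import sys
from typing import Sequence

# The preflight command from this section.
PREFLIGHT_CMD = ["uv", "run", "maverick", "preflight", "do-cybersecurity-review"]


def run_preflight(cmd: Sequence[str] = PREFLIGHT_CMD) -> None:
    """Run the preflight check; on a non-zero exit, relay stderr verbatim and halt."""
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    if result.returncode != 0:
        # Report stderr unchanged, then stop with the same exit code.
        sys.stderr.write(result.stderr)
        raise SystemExit(result.returncode)
```

Raising `SystemExit` with the original return code keeps the failure visible to whatever invoked the skill.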

Mode Selection

If the caller specifies a mode (full-audit or update), use it. If update is selected, the caller must also pass a diff (via stdin or a file path); halt and ask for one if it is missing.

If no mode is specified, default to full-audit.
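
A small sketch of the selection rules above. The function name `resolve_mode` is hypothetical, and rejecting an unknown mode is an assumption: the text only defines behaviour for full-audit, update, and an unspecified mode.

```python
from typing import Optional

VALID_MODES = ("full-audit", "update")


def resolve_mode(mode: Optional[str], diff: Optional[str]) -> str:
    """Apply the mode-selection rules: default to full-audit, require a diff for update."""
    if mode is None:
        return "full-audit"
    if mode not in VALID_MODES:
        # Assumption: unknown modes are rejected rather than silently defaulted.
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "update" and not diff:
        # The caller must supply a diff; halt and ask for one.
        raise ValueError("update mode requires a diff (stdin or file path)")
    return mode
```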

Full Audit Mode

1. Detect the project stack

Identify language, framework, and runtime so subsequent checks know what to look for. Use the same detectors as do-maverick-alignment: package.json, pyproject.toml, Dockerfile, etc.
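
As an illustration only, marker-file detection might look like the sketch below. The `MARKERS` mapping and its labels are assumptions for this example; the actual detectors live in do-maverick-alignment and are not reproduced here.

```python
from pathlib import Path
from typing import List, Union

# Hypothetical marker-file map; extend per stack.
MARKERS = {
    "package.json": "Node.js",
    "pyproject.toml": "Python",
    "Dockerfile": "Docker",
}


def detect_stack(root: Union[str, Path]) -> List[str]:
    """Return a label for every marker file found at the project root."""
    root = Path(root)
    return [label for name, label in MARKERS.items() if (root / name).is_file()]
```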

2. Walk the audit categories

For each category below, search the codebase and assign one of:

PASS — no concerns surfaced
WARN — partial coverage or non-critical issues
FAIL — material risk; needs human attention
N/A — category does not apply to this project

2.1 Secret exposure

Scan tracked files for committed credentials, tokens, private keys, and connection strings. Patterns to check (extend per stack):

AKIA[0-9A-Z]{16} (AWS access key id), aws_secret_access_key\s*=
-----BEGIN (RSA |EC |OPENSSH )?PRIVATE KEY-----
ghp_, gho_, ghs_, github_pat_ (GitHub tokens)
xox[baprs]- (Slack tokens), sk- followed by 20+ alphanumerics (OpenAI/Anthropic-style)
Generic password\s*=\s*['"][^'"]+['"], api[_-]?key\s*=\s*['"][^'"]+['"]

Also check .env* files (any tracked) and history (git log --all --full-history -- .env).
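
The patterns above can be applied line by line, as in this sketch. The pattern labels and the `scan_text` helper are hypothetical names; the regular expressions are taken from the list above, with "sk- followed by 20+ alphanumerics" written out as a regex. Matches are candidates for human review, not confirmed secrets.

```python
import re
from typing import List, Tuple

SECRET_PATTERNS = {
    "aws-access-key-id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "aws-secret-access-key": re.compile(r"aws_secret_access_key\s*="),
    "private-key": re.compile(r"-----BEGIN (RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "github-token": re.compile(r"ghp_|gho_|ghs_|github_pat_"),
    "slack-token": re.compile(r"xox[baprs]-"),
    "sk-token": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "generic-password": re.compile(r"password\s*=\s*['\"][^'\"]+['\"]"),
    "generic-api-key": re.compile(r"api[_-]?key\s*=\s*['\"][^'\"]+['\"]"),
}


def scan_text(path: str, text: str) -> List[Tuple[str, str]]:
    """Return (path:line, pattern-label) pairs for every line matching a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((f"{path}:{lineno}", label))
    return hits
```

Each hit carries a `path:line` citation, which matches the one-line citation style the report asks for.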

2.2 Dependency hygiene

| What | Where to look |
| --- | --- |
| Lock file present | package-lock.json, pnpm-lock.yaml, yarn.lock, uv.lock, poetry.lock, Cargo.lock, go.sum |
| Vulnerability scanning configured | npm audit in CI, pip-audit, safety, cargo audit, trivy, Dependabot, Renovate |
| Pinned versions | Direct deps pinned to a specific version or range |
| No supply-chain red flags | typosquats, abandoned packages, unfamiliar registries |

2.3 Authentication and authorisation

Where is auth implemented? Identify the library / pattern (Passport, NextAuth, Django auth, Spring Security, custom).
Are passwords hashed with a modern algorithm (bcrypt, argon2, scrypt), never plain SHA-x or MD5?
Is session management correct (signed cookies, rotation on privilege change, secure + httponly + samesite flags)?
Are routes/endpoints protected by middleware/decorators, or do they require per-handler checks (the latter is error-prone)?
Are admin / privileged paths gated separately?

2.4 Input validation and output encoding

Are inputs validated at the boundary (schema validation: zod, pydantic, joi, Bean Validation)?
Are templates auto-escaped (Jinja, JSX, Razor default-on)?
Are SQL queries parameterised? Any string-concatenated SQL?
Are file paths validated (no untrusted concat into open() / fs.readFile)?
Are command invocations safe (no shell=True with untrusted input, no eval)?
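
To make the SQL question concrete, here is a sketch contrasting the pattern the audit should flag with the parameterised form, using Python's built-in `sqlite3`. The table, function names, and payload are illustrative assumptions, not taken from any audited project.

```python
import sqlite3
from typing import List, Tuple


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two users, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> List[Tuple[int, str]]:
    # Anti-pattern to flag: untrusted input concatenated into the SQL string.
    query = f"SELECT id, name FROM users WHERE name = '{name}' ORDER BY id"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> List[Tuple[int, str]]:
    # Parameterised query: the driver binds `name` as data, never as SQL.
    query = "SELECT id, name FROM users WHERE name = ? ORDER BY id"
    return conn.execute(query, (name,)).fetchall()
```

With the input `x' OR '1'='1`, the concatenated version returns every row, while the parameterised version returns none.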

2.5 Transport, headers, and CORS

Is HTTPS enforced (HSTS, redirect from HTTP, secure cookies)?
Are baseline security headers set (CSP, X-Frame-Options or frame-ancestors, X-Content-Type-Options, Referrer-Policy, Permissions-Policy)?
Is CORS scoped explicitly, not * for credentialed endpoints?
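
Given a captured set of response headers, the baseline-header question can be checked mechanically. A minimal sketch follows; the `missing_security_headers` helper is a hypothetical name, and it only tests for presence, not for whether each value is well chosen.

```python
from typing import List, Mapping

# Headers that must each be present by name.
BASELINE_HEADERS = (
    "content-security-policy",
    "x-content-type-options",
    "referrer-policy",
    "permissions-policy",
)


def missing_security_headers(headers: Mapping[str, str]) -> List[str]:
    """Return the baseline headers absent from a response (names are case-insensitive)."""
    lowered = {key.lower(): value for key, value in headers.items()}
    missing = [name for name in BASELINE_HEADERS if name not in lowered]
    # Framing protection may come from X-Frame-Options or CSP frame-ancestors.
    csp = lowered.get("content-security-policy", "").lower()
    if "x-frame-options" not in lowered and "frame-ancestors" not in csp:
        missing.append("x-frame-options or CSP frame-ancestors")
    return missing
```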

2.6 Data at rest

Is sensitive data encrypted at rest (DB-level, app-level)?
Are backups encrypted?
Is logging redacting PII / credentials? Search log calls for unredacted user input or auth headers.

2.7 Logging, monitoring, and rate limiting

Are auth failures logged with enough context to detect brute-force?
Is there a rate-limit on login / password-reset / signup?
Are alerts wired up for repeated auth failures, privilege escalations, or 5xx spikes?

2.8 Container / infrastructure (if applicable)

Does the Dockerfile run as a non-root user?
Is the base image pinned by digest or just by tag?
Is .dockerignore keeping .env, secrets, and .git out of layers?
For IaC: are S3 buckets / blob stores private by default, encrypted, and versioned?
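
The first two Dockerfile questions lend themselves to a simple text check, sketched below. The `audit_dockerfile` helper is hypothetical and deliberately naive: it does not resolve build arguments, and it would also flag `FROM scratch` or a reference to an earlier build stage, so treat its output as leads to verify.

```python
from typing import List


def audit_dockerfile(text: str) -> List[str]:
    """Flag base images not pinned by digest and images that end up running as root."""
    findings = []
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line and not line.startswith("#")]

    for line in lines:
        if line.upper().startswith("FROM "):
            # Skip flags such as --platform=... to reach the image reference.
            image = next((tok for tok in line.split()[1:] if not tok.startswith("--")), "")
            if "@sha256:" not in image:
                findings.append(f"base image not pinned by digest: {image}")

    # The last USER instruction decides which user the container runs as.
    users = [line.split()[1] for line in lines
             if line.upper().startswith("USER ") and len(line.split()) > 1]
    if not users or users[-1].split(":")[0] in ("root", "0"):
        findings.append("container runs as root (no non-root USER)")
    return findings
```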

3. Write the report

Create docs/security-audit.md (create docs/ if it doesn't exist). Use this structure:

# Security Audit

**Generated:** <ISO timestamp>
**Stack detected:** <e.g., Node.js + Express, Python + Django>

## Summary

<1-2 sentences. Highest-severity finding. Headline risk.>

| Category | Status |
| --- | --- |
| Secret exposure | PASS / WARN / FAIL / N/A |
| Dependency hygiene | ... |
| Authentication / authorisation | ... |
| Input validation / output encoding | ... |
| Transport, headers, CORS | ... |
| Data at rest | ... |
| Logging, monitoring, rate limiting | ... |
| Container / infrastructure | ... |

## Details

### <Category> — <STATUS>

<Evidence: file paths, snippets, dependency names. Be concrete.>

<If WARN/FAIL: a one-paragraph recommendation with a concrete next step.>

<repeat per category>

## Recommendations (prioritised)

1. **<Severity: high/medium/low>** — <action>. <File or area>. <Why it matters>.
2. ...

Use one-line citations (path/to/file:42) so a human (or a follow-up agent) can jump straight to evidence.

4. Surface the findings to the user

After writing the report, print:

The number of findings at each severity.
The path to the report.
A one-line top-recommendation if any FAIL exists.

5. Record the milestone

Once the report is written, record that the cybersecurity review has run on this project:

uv run maverick integration set cybersecurity_reviewed true

This commits the milestone into .maverick/config.json so other Maverick skills (and maverick integration get) can see it.

Update Mode

Diff-scoped review used as a mandatory pre-push gate by do-issue-solo and do-issue-guided. Reviews only the changed code and the code that could be impacted by it — not the whole codebase. Returns findings to the orchestrator as a structured outcome.

This mode does not flip the cybersecurity_reviewed milestone — it runs on every PR, not once per project lifetime.

1. Read the diff

Caller passes the diff via stdin or as a file path. Parse it to get:

the list of changed files
the changed line ranges per file (so subsequent checks can be scoped)
whether any of the changes touch dependency manifests (package.json, pyproject.toml, Cargo.toml, lock files), env / config files, IaC, or CI workflows — those carry security weight beyond the line count

If no diff was provided, halt and ask the caller for one. Do not silently fall back to a full-audit scan.
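
The parsing step might be sketched as follows for a unified diff. The helper names are hypothetical, the code assumes Python 3.9+ (for `str.removeprefix`), and it has two known limits: files that are purely deleted are skipped, and an added line whose content itself starts with `++ ` would be misread as a file header.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Captures the new-file start line and optional line count of a hunk header.
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")

MANIFEST_NAMES = {"package.json", "pyproject.toml", "Cargo.toml"}


def parse_diff(diff_text: str) -> Dict[str, List[Tuple[int, int]]]:
    """Map each changed file to its changed (start, end) line ranges in the new file."""
    changed = defaultdict(list)
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            target = line[4:].split("\t")[0].strip()
            current = None if target == "/dev/null" else target.removeprefix("b/")
        elif current and (match := HUNK.match(line)):
            start = int(match.group(1))
            count = int(match.group(2)) if match.group(2) is not None else 1
            if count:  # a count of 0 means the hunk only removed lines
                changed[current].append((start, start + count - 1))
    return dict(changed)


def touches_manifest(paths: Iterable[str]) -> bool:
    """True if any changed path is one of the dependency manifests named above."""
    return any(path.rsplit("/", 1)[-1] in MANIFEST_NAMES for path in paths)
```

Lock files, env / config files, IaC, and CI workflows would need their own name checks alongside `MANIFEST_NAMES`.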

2. Identify impacted code

A change to a function, type, schema, or config can introduce security risk in code that wasn't itself edited. For each changed entity, identify the impact set:

| Change kind | Impact set to audit |
| --- | --- |
| Function signature / body | All callers (use grep / IDE-equivalent symbol search) |
| Exported type / schema | All importers; also serialisation / persistence sites |
| Auth / authz primitive (middleware, decorator, role) | Every route or handler protected by it |
| Public API surface (route, endpoint, GraphQL resolver) | Clients of that API; rate-limits and input validation around it |
| Config or env variable | Every reader of that config; consider whether the new value needs to be a secret |
| Dependency added or upgraded | The added/upgraded package itself: licence, known CVEs, transitive deps |
| Dockerfile / IaC | The deployed surface that uses it |

The impact set is bounded — do not transitively trace until the entire codebase is included. Stop at one or two hops; if the impact is wider than that, surface it as a finding ("this change has wide reach; recommend a fuller review") rather than try to audit everything.
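
The hop limit can be expressed as a bounded breadth-first walk over a caller map. In this sketch the `callers` mapping (symbol to its direct callers) is assumed to have been built already, for example from a symbol search; the function name and the `truncated` flag are illustrative.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def impact_set(
    changed: Iterable[str],
    callers: Dict[str, List[str]],
    max_hops: int = 2,
) -> Tuple[Set[str], bool]:
    """Return (impacted symbols within max_hops, whether the trace was cut short)."""
    depth = {name: 0 for name in changed}
    queue = deque(depth)
    truncated = False
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            if caller in depth:
                continue
            if depth[node] == max_hops:
                # More callers exist beyond the hop limit: surface as a finding.
                truncated = True
                continue
            depth[caller] = depth[node] + 1
            queue.append(caller)
    impacted = {name for name, hops in depth.items() if hops > 0}
    return impacted, truncated
```

When `truncated` is true, the change has wider reach than the limit covers, which is the case the text says to report as a finding rather than audit exhaustively.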

3. Run the audit categories on the scoped set

Apply the same eight categories from Full Audit Mode (Secret exposure, Dependency hygiene, Authentication / authorisation, Input validation / output encoding, Transport / headers / CORS, Data at rest, Logging / monitoring / rate-limit, Container / IaC) — but only against the changed lines and the impact set, not the whole repo.

Most categories will be N/A on any given diff. That is fine. Returning "N/A" with a one-line justification is informative; returning empty findings without saying which categories were considered is not.

4. Return a structured outcome

The orchestrator wires this output into the PR description or a comment. Format:

{
  "verdict": "PASS" | "FINDINGS" | "BLOCKING",
  "summary": "<one sentence: what was reviewed and the headline result>",
  "categories_considered": ["secret-exposure", "auth", ...],
  "findings": [
    {
      "severity": "critical|high|medium|low",
      "category": "<one of the eight>",
      "location": "<path/to/file:line>",
      "description": "<concrete what + why>",
      "recommendation": "<concrete next step>"
    }
  ]
}

Verdict semantics:

PASS — nothing of concern. Findings list may still contain low severity items but none are actionable.
FINDINGS — one or more medium / high items. The PR may proceed; findings are surfaced to the human reviewer in the PR body.
BLOCKING — at least one critical finding (e.g., a secret committed to the diff, an auth bypass introduced). The orchestrator must halt the push and surface this to the user. Do not return BLOCKING lightly — the bar is "this PR cannot land safely as-is".
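
Read literally, the verdict follows from the highest severity present. A sketch under that reading, where `verdict_for` is a hypothetical helper and findings are dicts shaped like the format above:

```python
from typing import Dict, List


def verdict_for(findings: List[Dict[str, str]]) -> str:
    """Derive the verdict from the severities present in the findings list."""
    severities = {finding["severity"] for finding in findings}
    if "critical" in severities:
        return "BLOCKING"  # the push must halt
    if severities & {"high", "medium"}:
        return "FINDINGS"  # the PR may proceed; findings go in the PR body
    return "PASS"  # empty, or low-severity items only
```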

5. Do not write to docs/security-audit.md

Update mode produces transient findings, not a snapshot of the whole codebase. Writing to the audit doc would either overwrite valid full-audit content or accumulate noise. Findings stay in the structured output; the orchestrator decides where they end up.

Rules

Surface, do not fix. Report findings; do not modify code as part of this skill. Fixes are tracked work that goes through do-issue-solo after the user prioritises them.
Cite evidence. Every WARN / FAIL must reference a file path or dependency name. Vague findings ("auth could be stronger") are not actionable.
Be honest about coverage. If a category requires runtime testing or production data the audit cannot reach, mark it N/A and note what would be needed for a thorough review.
Defer to mav-bp-application-security for what "good" looks like in each category. This skill is the audit; that one is the standard.
Follow mav-scope-boundaries — do not run anything that would modify production systems, change auth/permissions, or take destructive action.

Return Protocol (update mode)

When invoked from do-issue-solo Phase 7 (or any other caller) in update mode, do-cybersecurity-review is a subroutine of the calling workflow — not a terminal action. Returning the structured verdict is a hand-back to the caller's next numbered step, not a phase-complete signal (#106).

When you return from this skill, do not post a closing summary and do not stop. The calling workflow still owns, in order:

Closing the skill-dispatch interval that wrapped this invocation (uv run maverick report end skill-dispatch … --outcome <success|failure|blocked>). The outcome maps from the verdict: PASS → success, FINDINGS → success (with the findings folded into the PR body draft), BLOCKING → blocked.
Acting on the verdict: halting on BLOCKING, folding findings into the PR body draft on FINDINGS, or recording "Security review: no concerns." on PASS.
Advancing to the next phase (open the PR) once the verdict has been folded in.
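
The verdict-to-outcome mapping in the first step can be captured as a lookup table. This is a sketch of the caller's side; `outcome_for` is a hypothetical name, and only the three mappings stated above are encoded.

```python
# Verdict -> value for the --outcome flag, as stated in the first step above.
OUTCOME_BY_VERDICT = {
    "PASS": "success",
    "FINDINGS": "success",  # findings are folded into the PR body draft
    "BLOCKING": "blocked",
}


def outcome_for(verdict: str) -> str:
    """Translate a review verdict into the skill-dispatch outcome."""
    try:
        return OUTCOME_BY_VERDICT[verdict]
    except KeyError:
        raise ValueError(f"unknown verdict: {verdict!r}") from None
```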

If you find yourself drafting a final summary after returning here, that is the signal: scroll back to the calling workflow and resume from the step immediately after the /do-cybersecurity-review dispatch.

