skills/sentinel/SKILL.md
Orchestrates security scanning combining AI-driven OWASP analysis with Semgrep SAST and CodeQL taint analysis. Cross-validates findings, calculates a risk score, and produces prioritised security audit reports. Invoke with /sentinel or when the user asks to "run security audit", "audit this project", "security scan", or "scan for vulnerabilities".
Install this skill globally with one command: `npx skillsauth add 0x1337c0d3/claude-security sentinel`. Works with Claude Code, Cursor, and Windsurf.
Runs a full security audit of a target project combining AI-driven analysis (OWASP Top 10, injection, auth, secrets, config, etc.) with Semgrep SAST scanning, then cross-validates and consolidates both sets of findings into a single prioritised report.
You are a security audit specialist. Your mission: systematically identify security vulnerabilities, assess risks, and recommend security improvements. Defense in depth requires multiple layers of validation — this skill combines AI-driven context-aware analysis with automated SAST pattern detection to achieve maximum coverage.
Key principles:
| Flag | Behaviour |
|------|-----------|
| --path <dir> | Target project directory to audit. Default: current working directory. |
| --severity <low\|medium\|high\|critical> | Minimum severity to include in report. Default: low. |
| --skip-semgrep | Skip the Semgrep SAST scan (Phase 2). Useful when semgrep is not installed. |
| --skip-codeql | Skip the CodeQL taint analysis (Phase 2b). Useful when codeql is not installed or the project has no build environment. |
| --skip-crossval | Skip cross-validation (Phase 3). Implies --skip-semgrep and --skip-codeql. |
| --skip-secrets | Skip the GitLeaks analysis (Phase 3.6). |
| --quiet | Suppress progress messages; output findings only. |
| --format <md\|json> | Output format for consolidated report. Default: md. |
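The flag table above can be sketched as a small parser. This is only an illustration of the documented defaults and of the rule that `--skip-crossval` implies both scanner skips; the function name `parse_flags` and the use of `argparse` are assumptions, since the skill itself reads flags from the user's invocation text.

```python
import argparse

SEVERITIES = ["low", "medium", "high", "critical"]


def parse_flags(argv):
    """Parse /sentinel-style flags and apply the documented implications."""
    parser = argparse.ArgumentParser(prog="sentinel")
    parser.add_argument("--path", default=".")  # documented default: current working directory
    parser.add_argument("--severity", choices=SEVERITIES, default="low")
    parser.add_argument("--skip-semgrep", action="store_true")
    parser.add_argument("--skip-codeql", action="store_true")
    parser.add_argument("--skip-crossval", action="store_true")
    parser.add_argument("--skip-secrets", action="store_true")
    parser.add_argument("--quiet", action="store_true")
    parser.add_argument("--format", choices=["md", "json"], default="md")
    args = parser.parse_args(argv)
    if args.skip_crossval:
        # Per the table: skipping cross-validation implies skipping both scanners
        args.skip_semgrep = True
        args.skip_codeql = True
    return args


args = parse_flags(["--skip-crossval", "--severity", "high"])
print(args.skip_semgrep, args.skip_codeql, args.severity)
```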
The audit runs in 5 phases. The final phase is reference material used when writing recommendations; it does not produce output files of its own.
Parallelisation: Phase 1, Phase 2, and Phase 2b are fully independent — run all three in parallel using the Agent tool (3 subagents). Phase 3 (cross-validation) must wait for all three to complete.
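The dependency structure described above (three independent phases, then a join before cross-validation) can be pictured with a thread pool. The skill itself uses Agent tool subagents rather than Python threads, so treat this purely as a sketch; `run_audit` and its four callables are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_audit(phase1, phase2, phase2b, phase3):
    """Run the three independent phases concurrently, then cross-validate."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(phase) for phase in (phase1, phase2, phase2b)]
        # result() blocks, so Phase 3 starts only after all three have finished
        ai, semgrep, codeql = [f.result() for f in futures]
    return phase3(ai, semgrep, codeql)


combined = run_audit(
    lambda: ["ai finding"],
    lambda: ["semgrep finding"],
    lambda: ["codeql finding"],
    lambda ai, semgrep, codeql: ai + semgrep + codeql,
)
print(combined)
```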
NO TERMINAL OUTPUT PROMPTS. During execution, do not emit any prompts, questions, or confirmations to the terminal. Make all decisions autonomously and proceed without pausing. Do not ask "Should I proceed?" or "Does this look right?" or any other question.
NO OUTPUT PROMPTS. Do not ask the user anything during execution. Do not pause for confirmation. Do not output "Should I proceed?" or "Does this look right?". Execute silently from start to finish.
If you are about to output a prompt or question to the terminal — stop. Do not do it. Make a decision and proceed.
NO EDIT TERMINAL OUTPUT. Do not print any output to the terminal during execution. Do not print "Created file X", "Ran command Y", "Edit file Z", etc. The only output is the final completion message after everything is done.
NO BASH TERMINAL OUTPUT. Do not print any output to the terminal from bash commands. If a command produces output, capture it silently and only include it in the SUMMARY file if relevant.
NO TASKOUTPUT TERMINAL OUTPUT. Do not print any TaskOutput to the terminal during execution. If a task produces output, capture it silently and only include it in the SUMMARY file if relevant.
FULL REPO EXECUTION RIGHTS. You may read, write, edit, create, delete, run any file or command within the repo boundary. No permission check needed.
COMPLETE OR FAIL — NO PARTIAL. Either finish every step in the plan or stop and write a failure note. Never leave the repo in a half-executed state without a failure record.
Parse --path <dir> from the user's invocation. If not supplied, use $PWD.
Validate the path exists:
TARGET="${FLAG_PATH:-$PWD}"
if [[ ! -d "$TARGET" ]]; then
  # Report the error on stderr so it is not mistaken for findings output
  echo "Error: target directory not found: $TARGET" >&2
  exit 1
fi
cd "$TARGET" || exit 1
Use AiDex MCP as the primary tool for project exploration — it understands code structure (methods, types, properties) and is far faster than filesystem traversal for large codebases.
2a. Initialise the index
Call aidex_session({ path: TARGET }). If .aidex/ does not yet exist, call
aidex_init({ path: TARGET }) first (no need to ask — just do it). The session
call detects externally-modified files and auto-reindexes them.
2b. Project overview
aidex_summary({ path: TARGET }) → entry points, main types, detected languages
aidex_tree({ path: TARGET, depth: 3 }) → directory structure at a glance
Use the summary's entry points and language list to focus the audit on the most relevant files and skip generated/vendored code.
2c. Security-surface signatures
Retrieve method/type signatures for all security-critical file groups — no full file reads needed at this stage:
aidex_signatures({ path: TARGET, pattern: "**/*auth*" }) → auth layer
aidex_signatures({ path: TARGET, pattern: "**/*route*" }) → route handlers
aidex_signatures({ path: TARGET, pattern: "**/*controller*" }) → controllers
aidex_signatures({ path: TARGET, pattern: "**/*middleware*" }) → middleware
aidex_signatures({ path: TARGET, pattern: "**/*handler*" }) → request handlers
aidex_signatures({ path: TARGET, pattern: "**/*model*" }) → data models
aidex_signatures({ path: TARGET, pattern: "**/*db*" }) → database layer
aidex_signatures({ path: TARGET, pattern: "**/*crypto*" }) → crypto utilities
From these signatures, identify which specific files and methods require a full
Read for deeper inspection.
2d. Fallback: manifest-based tech stack detection
For details not captured by AiDex (package versions, lock file presence), use targeted reads rather than broad finds:
# Count lines of code (rough); -z / -0 keep filenames containing spaces intact
git ls-files -z 2>/dev/null | xargs -0 wc -l 2>/dev/null | tail -1 || true
Produce a Security Inventory:
## Security Inventory
### Authentication
- Type: [JWT/Session/OAuth/API Key/None — detected from code]
- Password hashing: [bcrypt/argon2/scrypt/plaintext/none]
- MFA: [Yes/No]
### Authorization
- Type: [RBAC/ABAC/ACL/None]
- Coverage: [Fine/Coarse/Missing]
### Data Protection
- Encryption at rest: [Yes/No/Unknown]
- Encryption in transit: [Yes/No/Partial]
- PII handling: [Proper/Needs review/Unknown]
### Secrets Management
- Method: [Env vars/Secrets manager/Hardcoded — detected from grep]
### Infrastructure
- HTTPS enforced: [Yes/No/Unknown]
- Security headers present: [Yes/No/Partial]
- Rate limiting: [Implemented/None]
Use AiDex semantic queries as the primary scanner — they match against parsed identifiers (method names, types, properties) and are therefore more precise than regex grep. Follow up with bash-only tools for things AiDex cannot cover (dependency CVEs, secret scanners, raw string literals).
3a. Semantic queries via AiDex
Run all queries against the indexed target; each returns file locations and line numbers for the matching identifier:
# Injection-prone APIs
aidex_query({ path: TARGET, term: "eval", mode: "contains" })
aidex_query({ path: TARGET, term: "exec", mode: "contains" })
aidex_query({ path: TARGET, term: "system", mode: "contains" })
aidex_query({ path: TARGET, term: "shell", mode: "contains" })
aidex_query({ path: TARGET, term: "popen", mode: "contains" })
aidex_query({ path: TARGET, term: "deserializ", mode: "contains" })
aidex_query({ path: TARGET, term: "unpickle", mode: "contains" })
aidex_query({ path: TARGET, term: "fromXml", mode: "contains" })
# Database / query construction
aidex_query({ path: TARGET, term: "query", mode: "contains" })
aidex_query({ path: TARGET, term: "execute", mode: "contains" })
aidex_query({ path: TARGET, term: "rawQuery", mode: "contains" })
aidex_query({ path: TARGET, term: "format", mode: "contains", type_filter: ["method"] })
# Authentication & secrets
aidex_query({ path: TARGET, term: "password", mode: "contains" })
aidex_query({ path: TARGET, term: "secret", mode: "contains" })
aidex_query({ path: TARGET, term: "token", mode: "contains" })
aidex_query({ path: TARGET, term: "apiKey", mode: "contains" })
aidex_query({ path: TARGET, term: "credential", mode: "contains" })
aidex_query({ path: TARGET, term: "hash", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "verify", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "jwt", mode: "contains" })
aidex_query({ path: TARGET, term: "session", mode: "contains" })
# Cryptography
aidex_query({ path: TARGET, term: "md5", mode: "contains" })
aidex_query({ path: TARGET, term: "sha1", mode: "contains" })
aidex_query({ path: TARGET, term: "encrypt", mode: "contains" })
aidex_query({ path: TARGET, term: "decrypt", mode: "contains" })
aidex_query({ path: TARGET, term: "random", mode: "contains", type_filter: ["method"] })
# Network / SSRF surface
aidex_query({ path: TARGET, term: "fetch", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "request", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "http", mode: "contains" })
aidex_query({ path: TARGET, term: "url", mode: "contains" })
aidex_query({ path: TARGET, term: "redirect", mode: "contains" })
# Authorization / access control
aidex_query({ path: TARGET, term: "permission", mode: "contains" })
aidex_query({ path: TARGET, term: "role", mode: "contains" })
aidex_query({ path: TARGET, term: "isAdmin", mode: "contains" })
aidex_query({ path: TARGET, term: "authorize", mode: "contains" })
aidex_query({ path: TARGET, term: "middleware", mode: "contains" })
# File system / path traversal
aidex_query({ path: TARGET, term: "readFile", mode: "contains" })
aidex_query({ path: TARGET, term: "writeFile", mode: "contains" })
aidex_query({ path: TARGET, term: "path", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "upload", mode: "contains" })
# Output / rendering (XSS)
aidex_query({ path: TARGET, term: "render", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "innerHTML", mode: "contains" })
aidex_query({ path: TARGET, term: "dangerously", mode: "contains" })
aidex_query({ path: TARGET, term: "sanitize", mode: "contains" })
aidex_query({ path: TARGET, term: "escape", mode: "contains" })
# Logging (sensitive data in logs)
aidex_query({ path: TARGET, term: "log", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "debug", mode: "contains", type_filter: ["method"] })
aidex_query({ path: TARGET, term: "print", mode: "contains", type_filter: ["method"] })
For each hit, note the file and line number. Use aidex_signature on the
containing file to understand the method's full context before deciding whether
to Read the implementation.
3b. Dependency and secrets scanners (bash)
These operate on package metadata and raw file content — areas outside AiDex's identifier index:
# Dependency vulnerabilities
npm audit --audit-level=high 2>/dev/null || true
pip-audit 2>/dev/null || safety check 2>/dev/null || true
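go mod verify 2>/dev/null || true       # Go: verifies module checksums only, not known CVEs
govulncheck ./... 2>/dev/null || true   # Go: known-vulnerability scan, if govulncheck is installed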
# Private key material (raw string, not an identifier)
grep -rn -E "BEGIN (RSA|EC|DSA|OPENSSH|PGP) PRIVATE KEY" \
--exclude-dir=.git . 2>/dev/null | head -10 || true
# Secret / credential scanners
gitleaks detect --source=. 2>/dev/null || true
trufflehog filesystem . --no-update 2>/dev/null || true
# Language-specific SAST
bandit -r . -ll 2>/dev/null || true # Python
eslint --plugin security . 2>/dev/null || true # JS/TS (if configured)
snyk test 2>/dev/null || true # all ecosystems
Use the AiDex query results from Step 1.3 and the signatures from Step 1.2 to
identify which files and methods to read. Only call Read on files that
contain suspicious identifiers — do not bulk-read entire directories.
Prioritised reading order:
- Files matching `*auth*`, `*login*`, `*session*`, `*jwt*`

For each file flagged by AiDex, use aidex_signature first to confirm the method exists and understand its signature, then Read only the relevant method body and its immediate callsite context.
Assess each OWASP category based on what the code actually does (as revealed by the semantic index), not just filename heuristics. For each finding, record:
- `VULN-NNN` (sequential, zero-padded to 3 digits)

Cover all 10 categories at minimum:

| Category | Key Checks |
|----------|-----------|
| A01 Broken Access Control | IDOR, missing authz on endpoints, metadata manipulation, CORS, path traversal |
| A02 Cryptographic Failures | HTTP data transmission, weak algos (MD5/SHA1 for passwords), hardcoded keys, data in logs |
| A03 Injection | SQL, NoSQL, command, LDAP injection; parameterised query usage |
| A04 Insecure Design | Missing rate limiting, no account lockout, weak password policy, undefined trust boundaries |
| A05 Security Misconfiguration | Default creds, unnecessary features, verbose errors, missing security headers, debug mode |
| A06 Vulnerable Components | Outdated deps with CVEs; unmaintained packages |
| A07 Authentication Failures | Brute-force protection, session tokens in URLs, sessions not invalidated on logout |
| A08 Software Integrity Failures | Missing lock files, insecure CI/CD, unsafe deserialization |
| A09 Logging Failures | No security event logging, sensitive data in logs, insufficient audit trail |
| A10 SSRF | User-controlled URLs in server requests, missing URL allowlist validation |
Also check for secrets, privilege escalation, and AWS/cloud-specific misconfigurations if infrastructure code is present.
JavaScript/Node.js
// ❌ SQL Injection
db.query("SELECT * FROM users WHERE id = " + req.params.id);
// ✅ Parameterised
db.query("SELECT * FROM users WHERE id = $1", [req.params.id]);
// ❌ Command Injection
exec("ls " + userInput);
// ✅ Safe
spawn("ls", [userInput]);
Python
# ❌ SQL Injection
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
# ✅ Parameterised
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
# ❌ Command Injection
os.system(f"ls {user_input}")
# ✅ Safe
subprocess.run(["ls", user_input], check=True)
Go
// ❌ SQL Injection
db.Query(fmt.Sprintf("SELECT * FROM users WHERE id = %s", userId))
// ✅ Parameterised
db.Query("SELECT * FROM users WHERE id = $1", userId)
C#
// ❌ SQL Injection
command.CommandText = $"SELECT * FROM users WHERE id = {userId}";
// ✅ Parameterised
var cmd = new SqlCommand("SELECT * FROM users WHERE id = @Id", conn);
cmd.Parameters.AddWithValue("@Id", userId);
PHP
// ❌ SQL Injection
$query = "SELECT * FROM users WHERE id = $userId";
// ✅ Prepared statement
$stmt = $pdo->prepare("SELECT * FROM users WHERE id = ?");
$stmt->execute([$userId]);
Verify all of these are present in HTTP responses:
| Header | Recommended Value |
|--------|-------------------|
| Content-Security-Policy | default-src 'self'; script-src 'self' |
| X-Frame-Options | DENY or SAMEORIGIN |
| X-Content-Type-Options | nosniff |
| Strict-Transport-Security | max-age=31536000; includeSubDomains |
| Referrer-Policy | strict-origin-when-cross-origin |
| Permissions-Policy | restrict unnecessary browser features |
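A minimal sketch of the header check, assuming the response headers have already been collected into a mapping (for example from a test client). The helper name `missing_security_headers` is hypothetical, and the sketch only checks presence, not whether each value matches the recommendation.

```python
REQUIRED_HEADERS = [
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Strict-Transport-Security",
    "Referrer-Policy",
    "Permissions-Policy",
]


def missing_security_headers(response_headers):
    """Return the required headers absent from a response (names are case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]


headers = {"content-security-policy": "default-src 'self'", "X-Frame-Options": "DENY"}
print(missing_security_headers(headers))
```

Each missing header is a candidate A05 (Security Misconfiguration) finding; whether it is reportable still depends on how the application is deployed.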
Before writing any report files, resolve the output directory and generate a timestamp so all reports from this run share the same timestamp suffix:
OUTPUT_DIR="$(git rev-parse --show-toplevel 2>/dev/null || echo "$PWD")/reports"
mkdir -p "$OUTPUT_DIR"
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
REPORT_FILE="${OUTPUT_DIR}/security-audit-${TIMESTAMP}.md"
CONSOLIDATED_FILE="${OUTPUT_DIR}/security-audit-consolidated-${TIMESTAMP}.md"
SEMGREP_JSON="${OUTPUT_DIR}/semgrep-results.json"
CODEQL_SARIF="${OUTPUT_DIR}/codeql-results.sarif"
# Normalized JSON files for consolidate.sh (written at end of each phase)
AI_FINDINGS_JSON="${OUTPUT_DIR}/ai-findings.json"
SEMGREP_NORMALIZED="${OUTPUT_DIR}/semgrep-normalized.json"
CODEQL_NORMALIZED="${OUTPUT_DIR}/codeql-normalized.json"
CONSOLIDATED_JSON="${OUTPUT_DIR}/consolidated-findings.json"
Save the full findings to `${REPORT_FILE}`:
cat > "${REPORT_FILE}" << 'REPORTEOF'
# Security Audit Report
## Executive Summary
- **Project**: [detected project name]
- **Audit Date**: [today's date]
- **Auditor**: Claude (AI-driven) via /sentinel
- **Overall Risk Level**: [Critical / High / Medium / Low]
## Security Inventory
[from Step 1.2]
## Findings Summary
| Severity | Count | Fixed | Remaining |
|----------|-------|-------|-----------|
| 🔴 Critical | X | 0 | X |
| 🟠 High | X | 0 | X |
| 🟡 Medium | X | 0 | X |
| 🟢 Low | X | 0 | X |
## Detailed Findings
[one section per VULN-NNN]
### [VULN-001] [Title]
**Severity**: [Critical/High/Medium/Low]
**Category**: [OWASP A0X]
**File**: `path/to/file.ext:line`
**Description**: [what it is and why it matters]
**Evidence**:
[code snippet or grep output]
**Recommendation**: [specific fix with code example]
## Positive Observations
[good security practices found in the codebase]
## Quick Remediation Commands
```bash
[dependency upgrade commands, config fixes, etc.]
```
REPORTEOF
Report **must** be written to `${REPORT_FILE}` (inside `reports/`, never a temp directory).
After writing the report, also emit a machine-readable findings file for Phase 3 consolidation:
```bash
# Write AI findings in consolidate.sh-compatible format
jq -n --argjson findings '[
  { "title": "VULN-001 title", "file": "path/file.py", "line": 42,
    "severity": "HIGH", "message": "description", "source_tool": "claude",
    "cwe": "CWE-89" }
]' '{tool: "claude", findings: $findings}' > "${AI_FINDINGS_JSON}"
```
Replace the placeholder array with the actual findings array built during the audit.
Each finding object: {title, file, line, severity, message, source_tool: "claude", cwe}.
Severity values: CRITICAL, HIGH, MEDIUM, LOW.
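The finding object described above can be checked before it is written out. The sketch below assumes the `{tool, findings}` envelope and the four severity values stated in this section; `build_ai_findings` is a hypothetical helper, not part of the skill or of consolidate.sh.

```python
import json

ALLOWED_SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW"}
REQUIRED_FIELDS = {"title", "file", "line", "severity", "message", "cwe"}


def build_ai_findings(findings):
    """Wrap findings in the {tool, findings} envelope, validating each entry."""
    out = []
    for finding in findings:
        missing = REQUIRED_FIELDS - finding.keys()
        if missing:
            raise ValueError(f"finding is missing fields: {sorted(missing)}")
        if finding["severity"] not in ALLOWED_SEVERITIES:
            raise ValueError(f"invalid severity: {finding['severity']}")
        # source_tool is always "claude" for Phase 1 findings
        out.append({**finding, "source_tool": "claude"})
    return {"tool": "claude", "findings": out}


doc = build_ai_findings([
    {"title": "VULN-001 title", "file": "path/file.py", "line": 42,
     "severity": "HIGH", "message": "description", "cwe": "CWE-89"},
])
print(json.dumps(doc, indent=2))
```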
Skip this phase if --skip-semgrep or --skip-crossval was supplied.
SEMGREP_AVAILABLE=true
if ! command -v semgrep >/dev/null 2>&1; then
  echo "[appsec] Warning: semgrep not found; skipping SAST scan."
  echo "[appsec] Install with: pip install semgrep"
  # Later steps should check this flag and continue to the Phase 3 note
  SEMGREP_AVAILABLE=false
fi
echo "[appsec] Running Semgrep SAST scan..."
SKILL_DIR="${CLAUDE_SKILL_DIR:-$(dirname "$0")}"
CUSTOM_RULES_DIR="${SKILL_DIR}/configs/semgrep-rules"
# Run with both auto (default ruleset) and custom rules layered on top.
# Use --config twice: once for the community rules, once for the local rules directory.
if [[ -d "${CUSTOM_RULES_DIR}" ]] && ls "${CUSTOM_RULES_DIR}"/*.yaml >/dev/null 2>&1; then
semgrep scan --json \
--config=auto \
--config="${CUSTOM_RULES_DIR}" \
--output="${SEMGREP_JSON}" . || true
else
semgrep scan --json --config=auto --output="${SEMGREP_JSON}" . || true
fi
# Exit code 1 from semgrep means findings were detected — not a fatal error
Output is written to ${SEMGREP_JSON} (inside the reports/ directory).
The Semgrep JSON schema:
{
"results": [
{
"check_id": "rule-id",
"path": "file/path",
"start": {"line": 10, "col": 5},
"end": {"line": 10, "col": 20},
"extra": {
"message": "Finding description",
"severity": "ERROR|WARNING|INFO",
"metadata": {
"category": "security",
"cwe": ["CWE-79"],
"owasp": ["A03:2021"]
}
}
}
],
"errors": []
}
Severity mapping from Semgrep to standard scale:
| Semgrep | Standard |
|---------|----------|
| ERROR | Critical / High |
| WARNING | Medium |
| INFO | Low |
Categorise results:
- (`semgrep-supply-chain` or `r2c-security-audit`)

Count findings per category and severity for use in Phase 3.
Then normalize to the consolidate.sh input format:
# Normalize Semgrep JSON → consolidate.sh format
jq '{
tool: "semgrep",
findings: [.results[] | {
title: .check_id,
file: .path,
line: (.start.line // 0),
severity: (
if .extra.severity == "ERROR" then "HIGH"
elif .extra.severity == "WARNING" then "MEDIUM"
else "LOW" end
),
message: .extra.message,
source_tool: "semgrep",
cwe: (.extra.metadata.cwe // null)
}]
}' "${SEMGREP_JSON}" > "${SEMGREP_NORMALIZED}"
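For readers more comfortable with Python than jq, the same normalization can be sketched as follows. It mirrors the jq filter above (ERROR to HIGH, WARNING to MEDIUM, anything else to LOW); the function name `normalize_semgrep` is an assumption and the skill does not ship this code.

```python
SEMGREP_TO_STANDARD = {"ERROR": "HIGH", "WARNING": "MEDIUM"}


def normalize_semgrep(semgrep_doc):
    """Convert parsed Semgrep JSON output to the {tool, findings} shape."""
    findings = []
    for result in semgrep_doc.get("results", []):
        extra = result.get("extra") or {}
        findings.append({
            "title": result.get("check_id"),
            "file": result.get("path"),
            # Missing or null start line falls back to 0, like `// 0` in jq
            "line": (result.get("start") or {}).get("line") or 0,
            "severity": SEMGREP_TO_STANDARD.get(extra.get("severity"), "LOW"),
            "message": extra.get("message"),
            "source_tool": "semgrep",
            "cwe": (extra.get("metadata") or {}).get("cwe"),
        })
    return {"tool": "semgrep", "findings": findings}


sample = {"results": [{
    "check_id": "rule-id", "path": "file/path", "start": {"line": 10, "col": 5},
    "extra": {"message": "Finding description", "severity": "ERROR",
              "metadata": {"cwe": ["CWE-79"]}},
}]}
print(normalize_semgrep(sample))
```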
Skip this phase if --skip-codeql or --skip-crossval was supplied.
Use gh codeql (the GitHub CLI extension) as the preferred runner. Fall back
to the standalone codeql binary only if gh is unavailable.
if gh codeql --version >/dev/null 2>&1; then
CODEQL_CMD="gh codeql"
elif command -v codeql >/dev/null 2>&1; then
CODEQL_CMD="codeql"
else
echo "[appsec] Warning: neither 'gh codeql' nor 'codeql' found — skipping taint analysis."
echo "[appsec] Install with: gh extension install github/gh-codeql"
echo "[appsec] OR: brew install codeql"
CODEQL_AVAILABLE=false
fi
Map every supported language present in the project to CodeQL language identifiers. Unlike phase 1 (which picks only the dominant language), CodeQL should run for every language found so that polyglot codebases receive full coverage.
| Detected file(s) | CodeQL language |
|------------------|----------------|
| *.py | python |
| *.js, *.ts, *.jsx, *.tsx | javascript |
| *.java, pom.xml, *.gradle | java |
| *.go, go.mod | go |
| *.cs, *.csproj | csharp |
| *.rb, Gemfile | ruby |
| *.cpp, *.c, *.h | cpp |
| *.swift | swift |
# Collect ALL languages present (not just the dominant one)
CODEQL_LANGS=()
declare -A seen_langs # de-duplicate (e.g. js + ts both map to javascript); associative arrays require bash 4+
for ext_lang in "py:python" "js:javascript" "jsx:javascript" "ts:javascript" "tsx:javascript" \
"java:java" "go:go" "cs:csharp" "rb:ruby" \
"cpp:cpp" "c:cpp" "h:cpp" "swift:swift"; do
ext="${ext_lang%%:*}"; lang="${ext_lang##*:}"
if [[ -n "${seen_langs[$lang]+x}" ]]; then continue; fi
count=$(find . -name "*.${ext}" \
-not -path "*/node_modules/*" \
-not -path "*/.git/*" \
-not -path "*/vendor/*" \
-not -path "*/dist/*" \
2>/dev/null | wc -l)
if [[ $count -gt 0 ]]; then
CODEQL_LANGS+=("$lang")
seen_langs[$lang]=1
fi
done
# Also detect via manifest files for languages with few source files
[[ -f go.mod ]] && [[ -z "${seen_langs[go]+x}" ]] && CODEQL_LANGS+=("go") && seen_langs[go]=1
[[ -f Gemfile ]] && [[ -z "${seen_langs[ruby]+x}" ]] && CODEQL_LANGS+=("ruby") && seen_langs[ruby]=1
# Group the -name tests so that -not -path applies to both of them
[[ -n "$(find . \( -name 'pom.xml' -o -name '*.gradle' \) -not -path '*/.git/*' 2>/dev/null | head -1)" ]] \
  && [[ -z "${seen_langs[java]+x}" ]] && CODEQL_LANGS+=("java") && seen_langs[java]=1
if [[ ${#CODEQL_LANGS[@]} -eq 0 ]]; then
echo "[appsec] Could not detect any supported CodeQL language — skipping."
CODEQL_AVAILABLE=false
else
echo "[appsec] Detected CodeQL languages: ${CODEQL_LANGS[*]}"
fi
For each detected language, create a database and run the security-extended query suite. Compiled languages (Java, C#, C++) require a working build environment; skip gracefully if database creation fails.
CODEQL_SARIF_FILES=() # collect per-language SARIF paths for Phase 3
for CODEQL_LANG in "${CODEQL_LANGS[@]}"; do
DB_DIR=".codeql-db-${CODEQL_LANG}"
LANG_SARIF="${OUTPUT_DIR}/codeql-results-${CODEQL_LANG}.sarif"
echo "[appsec] Creating CodeQL database (language: ${CODEQL_LANG})..."
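  $CODEQL_CMD database create "${DB_DIR}" \
    --language="${CODEQL_LANG}" \
    --source-root=. \
    --overwrite \
    2>&1 | tail -5
  # Check codeql's own exit status; a trailing "|| { ... }" would only see tail's status
  if [[ ${PIPESTATUS[0]} -ne 0 ]]; then
    echo "[appsec] Warning: CodeQL database creation failed for ${CODEQL_LANG}; skipping."
    continue
  fi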
echo "[appsec] Running CodeQL security-extended analysis (${CODEQL_LANG})..."
$CODEQL_CMD database analyze "${DB_DIR}" \
"codeql/${CODEQL_LANG}-queries:codeql-suites/${CODEQL_LANG}-security-extended.qls" \
--format=sarif-latest \
--output="${LANG_SARIF}" \
2>&1 | tail -5 || true
if [[ -f "${LANG_SARIF}" ]]; then
CODEQL_SARIF_FILES+=("${LANG_SARIF}")
echo "[appsec] CodeQL results written: ${LANG_SARIF}"
fi
done
# Backward-compat alias: point CODEQL_SARIF at the first result file (used in Phase 3 template)
CODEQL_SARIF="${CODEQL_SARIF_FILES[0]:-${OUTPUT_DIR}/codeql-results.sarif}"
Database creation requires a working build environment for compiled languages (Java, C#, C++). For interpreted languages (Python, JS, Ruby) it works without a build step.
CodeQL produces SARIF 2.1.0. Key fields:
{
"runs": [{
"results": [{
"ruleId": "py/sql-injection",
"message": { "text": "..." },
"locations": [{ "physicalLocation": {
"artifactLocation": { "uri": "app/db.py" },
"region": { "startLine": 42 }
}}],
"properties": { "severity": "error", "precision": "high" }
}],
"tool": { "driver": { "rules": [{
"id": "py/sql-injection",
"properties": { "tags": ["security","correctness","external/cwe/cwe-089"] }
}] }}
}]
}
Severity mapping:
| CodeQL severity | Standard |
|-------------------|----------|
| error + precision: high/very-high | Critical / High |
| error + precision: medium | High / Medium |
| warning | Medium |
| recommendation | Low |
For taint-flow findings, extract the full source → sink path from codeFlows if present — this is CodeQL's differentiating value over Semgrep. Store taint paths in a separate variable for use during Step 3.2 cross-validation commentary (not in the normalized findings file).
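A sketch of pulling the source-to-sink path out of one SARIF result, assuming the standard SARIF 2.1.0 layout (`codeFlows` → `threadFlows` → `locations`, each entry wrapping a `location`). The function name `taint_paths` is hypothetical, and results without `codeFlows` simply yield an empty list.

```python
def taint_paths(result):
    """Return one list of 'file:line' steps per threadFlow in a SARIF result."""
    paths = []
    for flow in result.get("codeFlows", []):
        for thread in flow.get("threadFlows", []):
            steps = []
            for entry in thread.get("locations", []):
                phys = entry.get("location", {}).get("physicalLocation", {})
                uri = phys.get("artifactLocation", {}).get("uri", "unknown")
                line = phys.get("region", {}).get("startLine", 0)
                steps.append(f"{uri}:{line}")
            if steps:
                paths.append(steps)
    return paths


result = {"codeFlows": [{"threadFlows": [{"locations": [
    {"location": {"physicalLocation": {
        "artifactLocation": {"uri": "app/routes.py"}, "region": {"startLine": 12}}}},
    {"location": {"physicalLocation": {
        "artifactLocation": {"uri": "app/db.py"}, "region": {"startLine": 42}}}},
]}]}]}
for path in taint_paths(result):
    print(" → ".join(path))
```

The first step of each path is the taint source and the last is the sink, which is the form the `Taint Flow` field of the consolidated report expects.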
Count findings per severity for use in Phase 3.
Then normalize all per-language SARIF files into a single consolidate.sh input file:
# Normalize all CodeQL SARIF files → consolidate.sh format
# Build a rules map (ruleId → CWE tag) from the driver rules section
# Guard: with no input files, jq -s would block reading stdin
if [[ ${#CODEQL_SARIF_FILES[@]} -gt 0 ]]; then
  jq -s '{
    tool: "codeql",
    findings: [
      .[] | .runs[]? |
      (.tool.driver.rules // [] | map({key: .id, value: (.properties.tags // [] | map(select(startswith("external/cwe/"))) | first // null)}) | from_entries) as $rules |
      .results[]? | {
        title: .ruleId,
        file: (.locations[0].physicalLocation.artifactLocation.uri // "unknown"),
        line: (.locations[0].physicalLocation.region.startLine // 0),
        severity: (
          if (.properties.severity // "warning") == "error" then "HIGH"
          elif (.properties.severity // "warning") == "warning" then "MEDIUM"
          else "LOW" end
        ),
        message: .message.text,
        source_tool: "codeql",
        cwe: ($rules[.ruleId] // null)
      }
    ]
  }' "${CODEQL_SARIF_FILES[@]}" > "${CODEQL_NORMALIZED}"
fi
Skip this phase if --skip-crossval was supplied.
If semgrep or codeql was skipped or produced no output, produce a note in the
consolidated report explaining the gap.
Run consolidate.sh on the normalized files written at the end of each phase.
This deduplicates findings, assigns SENTINEL-XXX IDs, and produces a compact
structured summary — avoiding the need to cat raw JSON/SARIF files into context.
SKILL_DIR="${CLAUDE_SKILL_DIR:-$(dirname "$0")}"
CONSOLIDATE="${SKILL_DIR}/scripts/consolidate.sh"
# Collect whichever normalized files exist
NORM_FILES=()
[[ -f "${AI_FINDINGS_JSON}" ]] && NORM_FILES+=("${AI_FINDINGS_JSON}")
[[ -f "${SEMGREP_NORMALIZED}" ]] && NORM_FILES+=("${SEMGREP_NORMALIZED}")
[[ -f "${CODEQL_NORMALIZED}" ]] && NORM_FILES+=("${CODEQL_NORMALIZED}")
if [[ ${#NORM_FILES[@]} -gt 0 ]]; then
bash "${CONSOLIDATE}" "${NORM_FILES[@]}" > "${CONSOLIDATED_JSON}"
else
echo '{"findings":[],"summary":{"total":0},"metadata":{}}' > "${CONSOLIDATED_JSON}"
fi
# Read the compact summary — findings with id/title/file/line/severity/source_tool only
jq '{
summary,
metadata,
findings: [.findings[] | {id, title, file, line, severity, source_tool, cwe}]
}' "${CONSOLIDATED_JSON}"
This output is what Phase 3 analysis operates on. Do not cat the raw
SEMGREP_JSON or CODEQL_SARIF files — all relevant data is already extracted
into the normalized files. The source_tool field on each finding drives the
cross-validation table in Step 3.2.
Build a comparison table:
| Finding | Claude Audit | Semgrep | CodeQL | Final Severity | Status |
|---------|-------------|---------|--------|----------------|--------|
| SQL Injection in auth.py:45 | ✅ | ✅ | ✅ | Critical | CONFIRMED (all 3) |
| CVE-2023-12345 in requests | ❌ | ✅ | ❌ | High | NEW (Semgrep) |
| IDOR in user endpoint | ✅ | ❌ | ❌ | High | NEW (AI) |
| Taint flow: req→db.query | ❌ | ❌ | ✅ | Critical | NEW (CodeQL) |
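The Status column can be derived from the set of `source_tool` values attached to a deduplicated finding. The sketch below reproduces the labels shown in the table; the label for a two-tool match, `CONFIRMED (2 of 3)`, is an assumption (the table has no such row), and `crossval_status` is a hypothetical helper.

```python
TOOL_LABELS = {"claude": "AI", "semgrep": "Semgrep", "codeql": "CodeQL"}


def crossval_status(source_tools):
    """Map the tools that reported a finding to a Status label."""
    tools = {t for t in source_tools if t in TOOL_LABELS}
    if len(tools) == 3:
        return "CONFIRMED (all 3)"
    if len(tools) == 2:
        return "CONFIRMED (2 of 3)"  # assumed wording; not shown in the table
    if len(tools) == 1:
        return f"NEW ({TOOL_LABELS[next(iter(tools))]})"
    raise ValueError("finding has no recognised source tool")


print(crossval_status(["claude", "semgrep", "codeql"]))
print(crossval_status(["semgrep"]))
```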
Confidence tiers:
Deduplication rules:
Severity reconciliation — when tools disagree, use the highest across all tools and document each tool's assessment:
| Semgrep severity | Standard |
|-----------------|----------|
| ERROR | Critical / High |
| WARNING | Medium |
| INFO | Low |
For Semgrep-only CVE findings, map CVSS score: ≥9.0 → Critical, ≥7.0 → High, ≥4.0 → Medium, <4.0 → Low.
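Both rules in this step are simple enough to write down directly: take the highest severity any tool assigned, and bucket CVSS scores by the thresholds just given. The sketch uses the uppercase severity values from the normalized findings files; the function names are hypothetical.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def reconcile(severities):
    """Return the highest severity among the tools' assessments."""
    return max(severities, key=SEVERITY_ORDER.index)


def cvss_to_severity(score):
    """Map a CVSS base score to the standard scale (>=9.0, >=7.0, >=4.0, else Low)."""
    if score >= 9.0:
        return "CRITICAL"
    if score >= 7.0:
        return "HIGH"
    if score >= 4.0:
        return "MEDIUM"
    return "LOW"


print(reconcile(["MEDIUM", "CRITICAL", "HIGH"]))
print(cvss_to_severity(7.5))
```

When reconciling, keep each tool's original assessment alongside the final value so the consolidated report can document the disagreement.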
Categorise findings into:
For each tool-only finding, note why the others likely missed it:
| Category | Why Claude missed | Why Semgrep missed | Why CodeQL missed |
|----------|------------------|--------------------|-------------------|
| CVEs / SCA | no CVE DB lookup | n/a | n/a |
| Pattern injection | possible, check context | n/a | n/a |
| Taint flow (cross-file) | possible | single-file / no dataflow | n/a |
| Business logic | n/a | no semantic understanding | rule-based only |
| Multi-file chains | n/a | single-file analysis | may catch if taint-reachable |
| Architecture flaws | n/a | rule-based only | rule-based only |
| Unsupported language | n/a | broad language support | limited language support |
False Positives Analysis
Before including a finding in the consolidated report, assess whether it is a false positive:
Document any discarded false positives and the reasoning in the consolidated report (Part 4 — False Positives Analysis).
cat > "${CONSOLIDATED_FILE}" << 'CONSOLIDATEDEOF'
# Consolidated Security Audit Report
## Executive Summary
**Audit Date**: [today]
**Project**: [name]
**Audit Methods**: AI-Driven (Claude /sentinel) + Semgrep SAST + CodeQL Taint Analysis
### Key Findings
- **Total Unique Vulnerabilities**: [N]
- **Confirmed by All 3 Tools**: [N]
- **Confirmed by 2 Tools**: [N]
- **AI-Only**: [N]
- **Semgrep-Only**: [N]
- **CodeQL-Only**: [N]
### Severity Breakdown
| Severity | Count |
|----------|-------|
| 🔴 Critical | X |
| 🟠 High | X |
| 🟡 Medium | X |
| 🟢 Low | X |
---
## Part 1: Confirmed Vulnerabilities
> Issues detected by multiple tools — highest confidence
### [VULN-CONF-001] [Title]
- **File**: `path/to/file:line`
- **Severity**: [Critical/High/Medium/Low]
- **CWE**: CWE-XX
- **OWASP**: A0X:2021
- **Detection**:
- ✅ Claude: [original finding ID]
- ✅ Semgrep: rule `rule-id`
- ✅ CodeQL: rule `codeql/lang-queries:path/to/Rule.ql` (taint path: [source → sink])
**Description**: [what it is and why it matters]
**Evidence**:
[code snippet]
**Recommendation**: [specific fix with code example]
---
## Part 2: Semgrep-Specific Findings
> Issues detected by Semgrep (SAST/SCA) but not in AI audit
### [VULN-SEM-001] [Title]
- **File**: `path/to/file:line`
- **Severity**: [severity]
- **Semgrep Rule**: `rule-id`
- **Category**: [SCA/SAST/Secrets]
- **CWE**: CWE-XX
**Why Claude missed this**: [likely reason: specific CVE, pattern-based, etc.]
**Description**: [from Semgrep message]
**Recommendation**: [how to fix]
---
## Part 2b: CodeQL-Specific Findings
> Issues detected by CodeQL taint analysis but not found by AI audit or Semgrep
### [VULN-CQL-001] [Title]
- **File**: `path/to/file:line`
- **Severity**: [severity]
- **CodeQL Rule**: `rule-id`
- **CWE**: CWE-XX
- **Taint Flow**: `[source location] → [sanitizer skip / call chain] → [sink location]`
**Why other tools missed this**: [e.g. cross-file dataflow requires inter-procedural analysis]
**Description**: [from CodeQL message]
**Recommendation**: [how to fix — break the taint chain]
---
## Part 3: AI-Specific Findings
> Issues detected by Claude but not flagged by Semgrep or CodeQL
### [VULN-COP-001] [Title]
- **File**: `path/to/file:line`
- **Severity**: [severity]
- **Original ID**: [VULN-NNN from the Phase 1 AI audit report]
**Why Semgrep and CodeQL missed this**: [likely reason: business logic, multi-file chain, design flaw]
**Description**: [from AI audit]
**Recommendation**: [from AI audit]
---
## Part 4: False Positives Analysis
### Semgrep False Positives
- [Any Semgrep findings discarded after manual review]
### AI False Positives
- [Any Claude findings where surrounding context neutralises the risk]
---
## Part 5: Dependency Vulnerabilities (SCA)
| Dependency | Version | CVE | Severity | CVSS | Fix Version |
|------------|---------|-----|----------|------|-------------|
### Remediation Commands
```bash
# Python
pip install --upgrade <package>==<fixed_version>
# Node.js
npm audit fix
# Go
go get <module>@<fixed_version>
| Type | File | Line | Severity | Action | |------|------|------|----------|--------|
Immediate actions:
gitleaks detect --source=. --log-opts="HEAD~50..HEAD"Total Files Scanned: [N]
Total Lines of Code: [N]
Files with Vulnerabilities: [N]
Vulnerability Density: [vulns per 1000 LOC]
Claude Detections: [N]
Semgrep Detections: [N]
Overlapping: [N] ([%]%)
Unique to Claude: [N]
Unique to Semgrep: [N]
Total Unique Issues: [N]
Add `semgrep scan --config=auto` to the CI/CD pipeline.

- `reports/security-audit-${TIMESTAMP}.md`
- `reports/semgrep-results.json`
- `reports/codeql-results-<lang>.sarif` (one file per detected language)
The file **must** be written to `${CONSOLIDATED_FILE}` (inside `reports/`).
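
The derived figures in the statistics block of the template (unique counts, overlap percentage, vulnerability density) follow from a few raw counts. Below is a minimal Python sketch. It assumes the overlap percentage is taken against the total number of unique issues and that density counts unique issues per 1000 lines of code. The function name and dictionary keys are hypothetical and not part of the skill.

```python
def crossval_stats(claude: int, semgrep: int, overlapping: int, total_loc: int) -> dict:
    """Derive the template's statistics from raw detection counts.

    Assumption: a finding reported by both tools is counted once in the
    total, so total unique = claude + semgrep - overlapping.
    """
    total_unique = claude + semgrep - overlapping
    # Guard against division by zero on empty projects or clean scans.
    overlap_pct = round(100 * overlapping / total_unique, 1) if total_unique else 0.0
    density = round(1000 * total_unique / total_loc, 2) if total_loc else 0.0
    return {
        "unique_to_claude": claude - overlapping,
        "unique_to_semgrep": semgrep - overlapping,
        "total_unique": total_unique,
        "overlap_pct": overlap_pct,
        "density_per_kloc": density,
    }


stats = crossval_stats(claude=12, semgrep=8, overlapping=4, total_loc=8000)
print(stats["total_unique"], stats["overlap_pct"], stats["density_per_kloc"])
# 16 25.0 2.0
```

If CodeQL findings are included in the totals, the same inclusion-exclusion idea applies, but the pairwise and three-way overlaps would have to be tracked separately.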
---
### Phase 4: Risk Score Calculation
After completing the consolidated report, calculate a numeric security score from the
finding counts recorded in the Severity Breakdown table.
Scoring formula (100 = perfect security, 0 = critical risk):
- Start at 100
- Each Critical finding: −15 points
- Each High finding: −8 points
- Each Medium finding: −3 points
- Each Low finding: −1 point
- Minimum score: 0
Score thresholds:
| Score | Risk Level |
|-------|------------|
| 90–100 | 🟢 LOW RISK |
| 70–89 | 🟡 MEDIUM RISK |
| 40–69 | 🟠 HIGH RISK |
| 0–39 | 🔴 CRITICAL RISK |
Render the score as a 20-block progress bar (filled blocks = `score ÷ 5`, rounded down):
Security Score: 72/100 [██████████████░░░░░░] MEDIUM RISK
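
The scoring rules above can be sketched in Python as follows. This is an illustrative helper, not part of the skill itself: the names `security_score`, `risk_level`, `progress_bar`, and `scorecard` are hypothetical, and the input is assumed to be a dict of finding counts keyed by lowercase severity.

```python
# Penalty per finding and risk thresholds, taken from the tables above.
PENALTIES = {"critical": 15, "high": 8, "medium": 3, "low": 1}
THRESHOLDS = [(90, "LOW RISK"), (70, "MEDIUM RISK"), (40, "HIGH RISK"), (0, "CRITICAL RISK")]


def security_score(counts: dict) -> int:
    """Start at 100, subtract the per-severity penalties, clamp at 0."""
    deducted = sum(PENALTIES[sev] * counts.get(sev, 0) for sev in PENALTIES)
    return max(0, 100 - deducted)


def risk_level(score: int) -> str:
    """Return the label of the first threshold the score reaches."""
    for floor, label in THRESHOLDS:
        if score >= floor:
            return label
    return "CRITICAL RISK"  # unreachable for scores >= 0; kept as a safe default


def progress_bar(score: int, width: int = 20) -> str:
    """Filled blocks = score / 5, rounded down, for the default 20-block bar."""
    filled = score * width // 100
    return "█" * filled + "░" * (width - filled)


def scorecard(counts: dict) -> str:
    score = security_score(counts)
    return f"Security Score: {score}/100 [{progress_bar(score)}] {risk_level(score)}"


print(scorecard({"critical": 1, "high": 1, "medium": 1, "low": 2}))
# Security Score: 72/100 [██████████████░░░░░░] MEDIUM RISK
```

In this example the deductions are 15 + 8 + 3 + 2 = 28, which gives 72 and 14 filled blocks, matching the sample line above.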
Append the scorecard to `${CONSOLIDATED_FILE}` before the Appendix:
```markdown
---
## Security Score
**Score: [SCORE]/100** — [RISK LEVEL]
Security Score: [SCORE]/100 [████████████████░░░░] [RISK LEVEL]
| Severity | Count | Penalty |
|----------|-------|---------|
| 🔴 Critical | [N] | −[N×15] |
| 🟠 High | [N] | −[N×8] |
| 🟡 Medium | [N] | −[N×3] |
| 🟢 Low | [N] | −[N×1] |
| **Total deducted** | | **−[total]** |
```
This phase is reference material — use it when writing recommendations in the reports above. Do not execute these examples; adapt them to the actual vulnerable code found in the target project.
Password hashing — use bcrypt (min cost 12) or argon2id:
```python
# Python
import bcrypt
hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12))
valid = bcrypt.checkpw(password.encode(), hashed)
```
```javascript
// Node.js (the await calls must run inside an async function)
const bcrypt = require('bcrypt');
const hash = await bcrypt.hash(password, 12);
const valid = await bcrypt.compare(password, hash);
```
Rate limiting — cap auth endpoints:
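```javascript
// Node.js / Express
const rateLimit = require('express-rate-limit');
app.post('/login', rateLimit({ windowMs: 15*60*1000, max: 5 }), loginHandler);
```
```python
# Flask (Flask-Limiter 3.x: key_func comes first, app is keyword-only)
from flask import Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)
limiter = Limiter(get_remote_address, app=app)

@app.route('/login', methods=['POST'])
@limiter.limit('5 per 15 minutes')
def login(): ...
```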
Never hardcode secrets. Use environment variables at minimum; prefer a secrets manager:
```python
# ❌ Bad
API_KEY = 'sk-1234567890'

# ✅ Good: env var
import os
API_KEY = os.environ['API_KEY']  # raises KeyError if missing

# ✅ Better: AWS Secrets Manager
import boto3, json

def get_secret(name):
    client = boto3.client('secretsmanager', region_name='ap-southeast-2')
    return json.loads(client.get_secret_value(SecretId=name)['SecretString'])
```
```python
# Python: Pydantic (EmailStr requires the email-validator package)
from pydantic import BaseModel, EmailStr, Field

class UserInput(BaseModel):
    email: EmailStr
    name: str = Field(min_length=2, max_length=100)
```
```javascript
// Node.js: Zod
import { z } from 'zod';
const schema = z.object({ email: z.string().email(), name: z.string().min(2).max(100) });
// The two lines below belong inside a request handler where req and res are in scope
const result = schema.safeParse(req.body);
if (!result.success) return res.status(400).json(result.error);
```
```python
# Python: markupsafe / bleach
from markupsafe import escape
safe = escape(user_input)
```
```javascript
// Node.js: sanitize-html
const sanitizeHtml = require('sanitize-html');
const clean = sanitizeHtml(userInput, { allowedTags: ['b','i','em','strong'] });
```
```javascript
const helmet = require('helmet');
app.use(helmet({
  contentSecurityPolicy: { directives: { defaultSrc: ["'self'"], scriptSrc: ["'self'"] } },
  hsts: { maxAge: 31536000, includeSubDomains: true }
}));
```
After writing both report files, output a brief terminal summary:
```
[sentinel] Security audit complete.
[sentinel] ─────────────────────────────────────────────────────
[sentinel] Target: <absolute path>
[sentinel] AI findings: X critical, X high, X medium, X low
[sentinel] Semgrep: X findings (or: skipped)
[sentinel] CodeQL: X findings across N language(s): [lang1, lang2, ...] (or: skipped / unsupported language)
[sentinel] Total unique: X issues (X confirmed by multiple tools)
[sentinel] ─────────────────────────────────────────────────────
[sentinel] Security Score: [SCORE]/100 [████████████░░░░░░░░] [RISK LEVEL]
[sentinel] ─────────────────────────────────────────────────────
[sentinel] Reports (./reports/):
[sentinel] ${REPORT_FILE}
[sentinel] ${CONSOLIDATED_FILE}
[sentinel] ${SEMGREP_JSON}
[sentinel] ${CODEQL_SARIF}
[sentinel] ─────────────────────────────────────────────────────
[sentinel] Next: address Critical and High findings first.
[sentinel] Run /sentinel --path . to re-audit after fixes.
```
| Severity | Description | Response Time |
|----------|-------------|---------------|
| 🔴 Critical | RCE, auth bypass, exposed secrets | Immediate |
| 🟠 High | Data breach, privilege escalation | Within 24 h |
| 🟡 Medium | Limited-impact exploits | Within 1 week |
| 🟢 Low | Minor concerns, hygiene issues | Next sprint |

Every audit should: