plugins/static-analysis/skills/semgrep/SKILL.md
Run Semgrep static analysis scan on a codebase using parallel subagents. Supports two scan modes — "run all" (full ruleset coverage) and "important only" (high-confidence security vulnerabilities). Automatically detects and uses Semgrep Pro for cross-file taint analysis when available. Use when asked to scan code for vulnerabilities, run a security audit with Semgrep, find bugs, or perform static analysis. Spawns parallel workers for multi-language codebases.
npx skillsauth add trailofbits/skills semgrepInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Run a Semgrep scan with automatic language detection, parallel execution via Task subagents, and merged SARIF output.
--metrics=off — Semgrep sends telemetry by default; --config auto also phones home. Every semgrep command must include --metrics=off to prevent data leakage during security audits.semgrep-rule-creator skillsemgrep-rule-variant-creator skillAll scan results, SARIF files, and temporary data are stored in a single output directory.
OUTPUT_DIR../static_analysis_semgrep_1. If that already exists, increment to _2, _3, etc.In both cases, always create the directory with mkdir -p before writing any files.
# Resolve output directory
if [ -n "$USER_SPECIFIED_DIR" ]; then
OUTPUT_DIR="$USER_SPECIFIED_DIR"
else
BASE="static_analysis_semgrep"
N=1
while [ -e "${BASE}_${N}" ]; do
N=$((N + 1))
done
OUTPUT_DIR="${BASE}_${N}"
fi
mkdir -p "$OUTPUT_DIR/raw" "$OUTPUT_DIR/results"
The output directory is resolved once at the start of Step 1 and used throughout all subsequent steps.
$OUTPUT_DIR/
├── rulesets.txt # Approved rulesets (logged after Step 3)
├── raw/ # Per-scan raw output (unfiltered)
│ ├── python-python.json
│ ├── python-python.sarif
│ ├── python-django.json
│ ├── python-django.sarif
│ └── ...
└── results/ # Final merged output
└── results.sarif
Required: Semgrep CLI (semgrep --version). If not installed, see Semgrep installation docs.
Optional: Semgrep Pro — enables cross-file taint tracking, inter-procedural analysis, and additional languages (Apex, C#, Elixir). Check with:
semgrep --pro --validate --config p/default 2>/dev/null && echo "Pro available" || echo "OSS only"
Limitations: OSS mode cannot track data flow across files. Pro mode uses -j 1 for cross-file analysis (slower per ruleset, but parallel rulesets compensate).
Select mode in Step 2 of the workflow. Mode affects both scanner flags and post-processing.
| Mode | Coverage | Findings Reported | |------|----------|-------------------| | Run all | All rulesets, all severity levels | Everything | | Important only | All rulesets, pre- and post-filtered | Security vulns only, medium-high confidence/impact |
Important only applies two filter layers:
--severity MEDIUM --severity HIGH --severity CRITICAL (CLI flag)category=security, confidence∈{MEDIUM,HIGH}, impact∈{MEDIUM,HIGH}See scan-modes.md for metadata criteria and jq filter commands.
┌──────────────────────────────────────────────────────────────────┐
│ MAIN AGENT (this skill) │
│ Step 1: Detect languages + check Pro availability │
│ Step 2: Select scan mode + rulesets (ref: rulesets.md) │
│ Step 3: Present plan + rulesets, get approval [⛔ HARD GATE] │
│ Step 4: Spawn parallel scan Tasks (approved rulesets + mode) │
│ Step 5: Merge results and report │
└──────────────────────────────────────────────────────────────────┘
│ Step 4
▼
┌─────────────────┐
│ Scan Tasks │
│ (parallel) │
├─────────────────┤
│ Python scanner │
│ JS/TS scanner │
│ Go scanner │
│ Docker scanner │
└─────────────────┘
Follow the detailed workflow in scan-workflow.md. Summary:
| Step | Action | Gate | Key Reference | |------|--------|------|---------------| | 1 | Resolve output dir, detect languages + Pro availability | — | Use Glob, not Bash | | 2 | Select scan mode + rulesets | — | rulesets.md | | 3 | Present plan, get explicit approval | ⛔ HARD | AskUserQuestion | | 4 | Spawn parallel scan Tasks | — | scanner-task-prompt.md | | 5 | Merge results and report | — | Merge script (below) |
Task enforcement: On invocation, create 5 tasks with blockedBy dependencies (each step blocks the previous). Step 3 is a HARD GATE — mark complete ONLY after user explicitly approves.
Merge command (Step 5):
uv run {baseDir}/scripts/merge_sarif.py $OUTPUT_DIR/raw $OUTPUT_DIR/results/results.sarif
| Agent | Tools | Purpose |
|-------|-------|---------|
| static-analysis:semgrep-scanner | Bash | Executes parallel semgrep scans for a language category |
Use subagent_type: static-analysis:semgrep-scanner in Step 4 when spawning Task subagents.
| Shortcut | Why It's Wrong |
|----------|----------------|
| "User asked for scan, that's approval" | Original request ≠ plan approval. Present plan, use AskUserQuestion, await explicit "yes" |
| "Step 3 task is blocking, just mark complete" | Lying about task status defeats enforcement. Only mark complete after real approval |
| "I already know what they want" | Assumptions cause scanning wrong directories/rulesets. Present plan for verification |
| "Just use default rulesets" | User must see and approve exact rulesets before scan |
| "Add extra rulesets without asking" | Modifying approved list without consent breaks trust |
| "Third-party rulesets are optional" | Trail of Bits, 0xdea, Decurity catch vulnerabilities not in official registry — REQUIRED |
| "Use --config auto" | Sends metrics; less control over rulesets |
| "One Task at a time" | Defeats parallelism; spawn all Tasks together |
| "Pro is too slow, skip --pro" | Cross-file analysis catches 250% more true positives; worth the time |
| "Semgrep handles GitHub URLs natively" | URL handling fails on repos with non-standard YAML; always clone first |
| "Cleanup is optional" | Cloned repos pollute the user's workspace and accumulate across runs |
| "Use . or relative path as target" | Subagents need absolute paths to avoid ambiguity |
| "Let the user pick an output dir later" | Output directory must be resolved at Step 1, before any files are created |
| File | Content | |------|---------| | rulesets.md | Complete ruleset catalog and selection algorithm | | scan-modes.md | Pre/post-filter criteria and jq commands | | scanner-task-prompt.md | Template for spawning scanner subagents |
| Workflow | Purpose | |----------|---------| | scan-workflow.md | Complete 5-step scan execution process |
$OUTPUT_DIRsemgrep command used --metrics=off$OUTPUT_DIR/rulesets.txt$OUTPUT_DIR/raw/results.sarif exists in $OUTPUT_DIR/results/ and is valid JSONraw/$OUTPUT_DIR/repos/testing
Draws the 12 Houses of the Zodiac Tarot spread to inject entropy into planning when prompts are vague, ambiguous, or casually delegated. Interprets the spread to guide next steps. Use when the user says 'let fate decide', 'YOLO', 'whatever', 'idk', or other nonchalant phrases, makes Yu-Gi-Oh references, or when you are about to arbitrarily pick between multiple reasonable approaches. Prefer over ask-questions-if-underspecified when the user's tone is casual or playful rather than precision-seeking.
testing
Scans Solana programs for 6 critical vulnerabilities including arbitrary CPI, improper PDA validation, missing signer/ownership checks, and sysvar spoofing. Use when auditing Solana/Anchor programs.
development
Performs comprehensive C/C++ security review for memory corruption, integer overflows, race conditions, and platform-specific vulnerabilities. Use when auditing native C/C++ applications, reviewing daemons or services for memory safety, or hunting integer overflow / use-after-free / race conditions in userspace code.
development
Detects missing zeroization of sensitive data in source code and identifies zeroization removed by compiler optimizations, with assembly-level analysis, and control-flow verification. Use for auditing C/C++/Rust code handling secrets, keys, passwords, or other sensitive data.