skills/semgrep/SKILL.md
Run Semgrep static analysis for fast security scanning and pattern matching. Use when asked to scan code with Semgrep, write custom YAML rules, find vulnerabilities quickly, use taint mode, or set up Semgrep in CI/CD pipelines.
Ideal scenarios:
Complements other tools:
Consider CodeQL instead when:
Do NOT use this skill for:
# pip
python3 -m pip install semgrep
# pipx (recommended)
pipx install semgrep
# Homebrew
brew install semgrep
# Docker
docker pull returntocorp/semgrep:latest
docker run --rm -v "${PWD}:/src" returntocorp/semgrep semgrep --config auto /src
# Update
pip install --upgrade semgrep
# Verify
semgrep --version
semgrep --config auto . # Auto-detect rules
semgrep --config auto --metrics=off . # Disable telemetry for proprietary code
semgrep --config p/<RULESET> . # Single ruleset
semgrep --config p/security-audit --config p/trailofbits . # Multiple
| Ruleset | Description |
|---------|-------------|
| p/default | General security and code quality |
| p/security-audit | Comprehensive security rules |
| p/owasp-top-ten | OWASP Top 10 vulnerabilities |
| p/cwe-top-25 | CWE Top 25 vulnerabilities |
| p/r2c-security-audit | r2c security audit rules |
| p/trailofbits | Trail of Bits security rules |
| p/python | Python-specific |
| p/javascript | JavaScript-specific |
| p/golang | Go-specific |
# SARIF output (for CI/CD)
semgrep --config p/security-audit --sarif -o results.sarif .
# JSON output
semgrep --config p/security-audit --json -o results.json .
# Text output with dataflow traces
semgrep --config p/security-audit --dataflow-traces .
# JUnit XML
semgrep --config p/security-audit --junit-xml -o results.xml .
# GitLab SAST format
semgrep --config p/security-audit --gitlab-sast -o gl-sast-report.json .
# Vim quickfix
semgrep --config p/security-audit --vim .
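Once a scan has been written out with `--json`, a short script can summarize it. The sketch below assumes the report layout Semgrep emits for JSON output (a top-level `results` list whose entries carry `check_id`, `path`, `start.line`, and `extra.severity`). The function name `summarize_findings` and the sample data are hypothetical and only serve as illustration.

```python
import json
from collections import Counter


def summarize_findings(report: dict) -> Counter:
    """Count findings per severity in a parsed Semgrep JSON report."""
    counts = Counter()
    for finding in report.get("results", []):
        # Severity lives under "extra"; fall back to UNKNOWN if it is absent.
        severity = finding.get("extra", {}).get("severity", "UNKNOWN")
        counts[severity] += 1
    return counts


# Hypothetical sample shaped like `semgrep --json` output.
sample_report = json.loads("""
{
  "results": [
    {"check_id": "hardcoded-password", "path": "app.py",
     "start": {"line": 12}, "extra": {"severity": "ERROR"}},
    {"check_id": "sql-injection", "path": "db.py",
     "start": {"line": 40}, "extra": {"severity": "ERROR"}},
    {"check_id": "debug-enabled", "path": "app.py",
     "start": {"line": 3}, "extra": {"severity": "WARNING"}}
  ],
  "errors": []
}
""")

summary = summarize_findings(sample_report)
print(dict(summary))
```

To run it against a real scan, load `results.json` with `json.load` in place of the inline sample.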
# Single file
semgrep --config p/python app.py
# Specific directory
semgrep --config p/javascript src/
# Scan only paths under test directories (--include is a filter; it does not override ignore rules)
semgrep --config auto --include='**/test/**' .
# Exclude paths
semgrep --config auto --exclude='vendor' --exclude='node_modules' .
# Multiple languages
semgrep --config p/python --config p/javascript .
# Enable Pro Engine features (requires license)
semgrep --config p/security-audit --pro .
# Pro Engine cross-function analysis within a single file (--pro alone enables cross-file analysis)
semgrep --config p/security-audit --pro-intrafile .
# Disable telemetry
semgrep --config auto --metrics=off .
# Verbose output
semgrep --config p/security-audit --verbose .
# Quiet mode (only show findings)
semgrep --config p/security-audit --quiet .
rules:
  - id: hardcoded-password
    languages: [python]
    message: "Hardcoded password detected: $PASSWORD"
    severity: ERROR
    pattern: password = "$PASSWORD"
| Syntax | Description | Example |
|--------|-------------|---------|
| ... | Match anything | func(...) |
| $VAR | Capture metavariable | $FUNC($INPUT) |
| <... ...> | Deep expression match | <... user_input ...> |
| Operator | Description |
|----------|-------------|
| pattern | Match exact pattern |
| patterns | All must match (AND) |
| pattern-either | Any matches (OR) |
| pattern-not | Exclude matches |
| pattern-inside | Match only inside context |
| pattern-not-inside | Match only outside context |
| pattern-regex | Regex matching |
| metavariable-regex | Regex on captured value |
| metavariable-comparison | Compare values |
rules:
  - id: sql-injection
    languages: [python]
    message: "Potential SQL injection"
    severity: ERROR
    patterns:
      - pattern-either:
          - pattern: cursor.execute($QUERY)
          - pattern: db.execute($QUERY)
      - pattern-not: cursor.execute("...", (...))
      - metavariable-regex:
          metavariable: $QUERY
          regex: .*\+.*|.*\.format\(.*|.*%.*
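The `metavariable-regex` clause in the rule above decides which `$QUERY` arguments get reported. A rough way to see its effect is to run the same regex over the source text of a few candidate arguments. This sketch assumes the regex is tested against the text bound to the metavariable and anchored at its start, which `re.match` approximates here. The sample strings are made up for illustration.

```python
import re

# Same regex as the rule's metavariable-regex clause.
QUERY_REGEX = re.compile(r".*\+.*|.*\.format\(.*|.*%.*")

# Hypothetical source text of the argument captured as $QUERY,
# mapped to whether the rule is expected to flag it.
candidates = {
    '"SELECT * FROM users WHERE id = " + user_id': True,          # concatenation
    '"SELECT * FROM users WHERE id = {}".format(user_id)': True,  # str.format
    '"SELECT * FROM users WHERE id = %s" % user_id': True,        # % formatting
    '"SELECT * FROM users"': False,                                # constant query
    "query": False,  # built on an earlier line: the regex only sees the name
}

matches = {text: bool(QUERY_REGEX.match(text)) for text in candidates}
for text, matched in matches.items():
    print(f"{'flagged' if matched else 'ignored'}: {text}")
```

The last case is the one to watch: a query assembled on an earlier line reaches `execute` as a plain variable, so a text-based regex stays silent on it.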
Simple pattern matching finds obvious cases:
# Pattern `os.system($CMD)` catches this:
os.system(user_input) # Found
But misses indirect flows:
# Same pattern misses this:
cmd = user_input
processed = cmd.strip()
os.system(processed) # Missed - no direct match
Taint mode tracks data through assignments and transformations:
- Source: where untrusted data enters (`user_input`)
- Propagators: how the data flows (`cmd = ...`, `processed = ...`)
- Sanitizers: what makes the data safe (`shlex.quote()`)
- Sink: where the data becomes dangerous (`os.system()`)

rules:
  - id: command-injection
    languages: [python]
    message: "User input flows to command execution"
    severity: ERROR
    mode: taint
    pattern-sources:
      - pattern: request.args.get(...)
      - pattern: request.form[...]
      - pattern: request.json
    pattern-sinks:
      - pattern: os.system($SINK)
      - pattern: subprocess.call($SINK, shell=True)
      - pattern: subprocess.run($SINK, shell=True, ...)
    pattern-sanitizers:
      - pattern: shlex.quote(...)
      - pattern: int(...)
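To make taint propagation concrete, here is a toy tracker written with Python's `ast` module (Python 3.9+ for `ast.unparse`). It only illustrates the source, propagation, sanitizer, and sink model used by the rule above; it is not how Semgrep is implemented. The names `SOURCES`, `SINKS`, `SANITIZERS`, `is_tainted`, and `find_tainted_sinks` are invented for this sketch, and it handles only straight-line, top-level assignments.

```python
import ast

SOURCES = {"request.args.get"}       # calls that return untrusted data
SINKS = {"os.system"}                # calls that must not receive tainted data
SANITIZERS = {"shlex.quote", "int"}  # calls whose result is treated as safe


def is_tainted(expr: ast.AST, tainted: set) -> bool:
    """Decide whether an expression carries taint, given tainted variable names."""
    if isinstance(expr, ast.Call):
        name = ast.unparse(expr.func)
        if name in SANITIZERS:
            return False
        if name in SOURCES:
            return True
    if isinstance(expr, ast.Name):
        return expr.id in tainted
    # Taint propagates through any sub-expression (method calls, concatenation).
    return any(is_tainted(child, tainted) for child in ast.iter_child_nodes(expr))


def find_tainted_sinks(source: str) -> list:
    """Return line numbers where tainted data reaches a sink call."""
    tainted = set()
    hits = []
    for stmt in ast.parse(source).body:
        # Propagation: assigning a tainted expression taints the target name.
        if isinstance(stmt, ast.Assign):
            names = [t.id for t in stmt.targets if isinstance(t, ast.Name)]
            if is_tainted(stmt.value, tainted):
                tainted.update(names)
            else:
                tainted.difference_update(names)
        # Sink check: any sink call with a tainted argument is reported.
        for node in ast.walk(stmt):
            if isinstance(node, ast.Call) and ast.unparse(node.func) in SINKS:
                if any(is_tainted(arg, tainted) for arg in node.args):
                    hits.append(node.lineno)
    return hits


vulnerable = """
cmd = request.args.get("cmd")
processed = cmd.strip()
os.system(processed)
"""

sanitized = """
cmd = request.args.get("cmd")
safe = shlex.quote(cmd)
os.system(safe)
"""

print(find_tainted_sinks(vulnerable))  # the os.system call on line 4 is reported
print(find_tainted_sinks(sanitized))   # nothing is reported
```

The first snippet is reported even though `os.system` never receives the source expression directly, which is the case plain pattern matching misses.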
rules:
  - id: flask-sql-injection
    languages: [python]
    message: "SQL injection: user input flows to query without parameterization"
    severity: ERROR
    metadata:
      cwe: "CWE-89: SQL Injection"
      owasp: "A03:2021 - Injection"
      confidence: HIGH
    mode: taint
    pattern-sources:
      - pattern: request.args.get(...)
      - pattern: request.form[...]
      - pattern: request.json
    pattern-sinks:
      - pattern: cursor.execute($QUERY)
      - pattern: db.execute($QUERY)
    pattern-sanitizers:
      - pattern: int(...)
    fix: cursor.execute($QUERY, (params,))
# test_rule.py
def test_vulnerable():
    user_input = request.args.get("id")
    # ruleid: flask-sql-injection
    cursor.execute("SELECT * FROM users WHERE id = " + user_input)

def test_safe():
    user_input = request.args.get("id")
    # ok: flask-sql-injection
    cursor.execute("SELECT * FROM users WHERE id = ?", (user_input,))
semgrep --test rules/
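The `# ruleid:` and `# ok:` comments in the test file mark the line that follows them: `ruleid` means the rule is expected to report that line, and `ok` means it must not. The helper below is a hypothetical sketch (the name `expected_lines` is not part of Semgrep) that extracts those expectations from a test file, which can help when reviewing what a test suite actually asserts.

```python
import re

# Matches annotations such as "# ruleid: my-rule" or "# ok: my-rule".
ANNOTATION = re.compile(r"#\s*(ruleid|ok):\s*([\w.-]+)")


def expected_lines(source: str) -> dict:
    """Map (kind, rule_id) to the line numbers the annotations apply to."""
    expectations = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = ANNOTATION.search(line)
        if match:
            kind, rule_id = match.groups()
            # The annotation refers to the line after the comment.
            expectations.setdefault((kind, rule_id), []).append(lineno + 1)
    return expectations


test_file = '''\
def test_vulnerable():
    user_input = request.args.get("id")
    # ruleid: flask-sql-injection
    cursor.execute("SELECT * FROM users WHERE id = " + user_input)

def test_safe():
    user_input = request.args.get("id")
    # ok: flask-sql-injection
    cursor.execute("SELECT * FROM users WHERE id = ?", (user_input,))
'''

print(expected_lines(test_file))
```

A rule file with `ruleid` annotations but no `ok` annotations only checks for true positives, so it is worth confirming both kinds appear.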
name: Semgrep
on:
  push:
    branches: [main]
  pull_request:
  schedule:
    - cron: '0 0 1 * *' # Monthly
jobs:
  semgrep:
    runs-on: ubuntu-latest
    container:
      image: returntocorp/semgrep
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Required for diff-aware scanning
      - name: Run Semgrep
        run: |
          if [ "${{ github.event_name }}" = "pull_request" ]; then
            semgrep ci --baseline-commit ${{ github.event.pull_request.base.sha }}
          else
            semgrep ci
          fi
        env:
          SEMGREP_RULES: >-
            p/security-audit
            p/owasp-top-ten
            p/trailofbits
tests/fixtures/
**/testdata/
generated/
vendor/
node_modules/
password = get_from_vault() # nosemgrep: hardcoded-password
dangerous_but_safe() # nosemgrep
semgrep --config rules/ --time . # Check rule performance
ulimit -n 4096 # Increase file descriptors for large codebases
rules:
  - id: my-rule
    paths:
      include: [src/]
      exclude: [src/generated/]
# Multi-ruleset scan with SARIF output
semgrep scan \
--config p/security-audit \
--config p/owasp-top-ten \
--config p/cwe-top-25 \
--sarif -o security-audit.sarif \
.
# Python with taint mode
semgrep scan \
--config p/python \
--config p/flask \
--config p/django \
--dataflow-traces \
--sarif -o python-security.sarif \
./backend
# JavaScript/TypeScript
semgrep scan \
--config p/javascript \
--config p/typescript \
--config p/react \
--sarif -o js-security.sarif \
./frontend
# Combine custom and community rules
semgrep scan \
--config ./custom-rules \
--config p/security-audit \
--sarif -o combined-scan.sarif \
.
# Scan only changed files (PR context)
git diff --name-only origin/main...HEAD | \
xargs semgrep scan --config p/security-audit --sarif -o diff-scan.sarif
Semgrep SARIF v2.1.0 includes:
| Severity | Meaning |
|----------|---------|
| ERROR | High-confidence security vulnerability |
| WARNING | Potential security issue requiring review |
| INFO | Code smell or best practice violation |
# Show available fixes
semgrep scan --config p/security-audit --autofix --dryrun .
# Apply fixes automatically
semgrep scan --config p/security-audit --autofix .
# Review fixes before applying
semgrep scan --config p/security-audit --autofix --dryrun . | less
# Trail of Bits rules
git clone https://github.com/trailofbits/semgrep-rules.git
semgrep scan -f semgrep-rules/rules --sarif -o results.sarif .
# Semgrep Registry
semgrep scan --config "r/trailofbits" .
# Custom remote rules
semgrep scan --config https://example.com/custom-rules.yaml .
rules:
  - id: context-aware-xss
    languages: [javascript]
    message: "XSS: User input flows to innerHTML"
    severity: ERROR
    mode: taint
    pattern-sources:
      - pattern: req.query.$PARAM
    pattern-propagators:
      # from and to must each name a metavariable bound by the pattern
      - pattern: $Y = $X.toString()
        from: $X
        to: $Y
      # Quoted because a YAML plain scalar cannot contain a leading backtick
      - pattern: '$Y = `${$X}`'
        from: $X
        to: $Y
    pattern-sinks:
      - pattern: $ELEMENT.innerHTML = $DATA
    pattern-sanitizers:
      - pattern: DOMPurify.sanitize($X)
rules:
  - id: sql-injection-advanced
    languages: [python]
    message: "SQL injection via string formatting"
    severity: ERROR
    # focus-metavariable and metavariable-regex only apply inside a patterns list
    patterns:
      - pattern: $CURSOR.execute($QUERY)
      - focus-metavariable: $QUERY
      - metavariable-regex:
          metavariable: $QUERY
          regex: .*(\+|format|%).*
# Limit to specific file types
semgrep scan --include='*.py' --include='*.js' .
# Increase timeout for large files
semgrep scan --timeout 60 .
# Use baseline for faster incremental scans
semgrep scan --baseline-commit HEAD~1 .
# Parallel processing (default uses all CPUs)
semgrep scan --jobs 4 .
# Disable expensive rules
semgrep scan --config p/security-audit --exclude-rule 'expensive-rule-id' .
Semgrep supports 30+ languages:
| Feature | Community | Pro |
|---------|-----------|-----|
| Pattern matching | ✓ | ✓ |
| Intra-file taint | ✓ | ✓ |
| Custom rules | ✓ | ✓ |
| SARIF output | ✓ | ✓ |
| Cross-file analysis | ✗ | ✓ |
| Interfile taint | ✗ | ✓ |
| Supply chain | ✗ | ✓ |
| Secrets detection | ✗ | ✓ |
| Assistant (AI) | ✗ | ✓ |
# Rule parsing errors
semgrep scan --validate --config custom-rules.yaml
# Timeout on large files
semgrep scan --timeout 120 .
# Memory issues
semgrep scan --max-memory 4000 . # MB
# Debug mode
semgrep scan --debug --config p/security-audit .
# Test rules against test files
semgrep scan --test rules/
# Validate rule syntax
semgrep scan --validate --config rules/my-rule.yaml
# Benchmark rules
semgrep scan --time --config rules/ test-codebase/
| Shortcut | Why It's Wrong |
|----------|----------------|
| "Semgrep found nothing, code is clean" | Semgrep is pattern-based; it can't track complex data flow across functions |
| "I wrote a rule, so we're covered" | Rules need testing with semgrep --test; false negatives are silent |
| "Taint mode catches injection" | Only if you defined all sources, sinks, AND sanitizers correctly |
| "Pro rules are comprehensive" | Pro rules are good but not exhaustive; supplement with custom rules for your codebase |
| "Too many findings = noisy tool" | High finding count often means real problems; tune rules, don't disable them |