.claude/skills/spl-optimizer/SKILL.md
Optimize detection queries for performance across Splunk (SPL), Microsoft Sentinel (KQL), and Elastic Security (EQL/ES|QL). Covers search pipeline internals, common anti-patterns, and optimization techniques for detection rules on each platform.
npx skillsauth add mhaggis/security-detections-mcp Detection Query OptimizerInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Every SIEM processes queries differently. Understanding the execution model is essential for writing detections that run fast enough for scheduled execution without excessive resource consumption. This skill covers optimization for the three major query languages.
Set $SIEM_PLATFORM to focus guidance: splunk, sentinel, elastic
| Platform | Query Language | Section | |----------|---------------|---------| | Splunk Enterprise / Cloud | SPL | SPL Optimization | | Microsoft Sentinel / Defender | KQL | KQL Optimization | | Elastic Security | EQL / ES|QL | EQL/ES|QL Optimization | | Sigma | N/A | Optimization happens at the backend/compiler level. Write clean Sigma; let pySigma optimize for the target. |
Every SPL search flows through these phases:
[Input] → [Parsing] → [Search (Map/Reduce)] → [Reporting] → [Output]
The search head determines which indexes and buckets to read.
Optimizations:
index= explicitly (never rely on default index)sourcetype= to narrow bucket scanningearliest= and latest= to limit time rangehost= or source= if applicable// BAD — scans all default indexes
sourcetype=WinEventLog EventCode=4688
// GOOD — targets specific index
index=wineventlog sourcetype=WinEventLog:Security EventCode=4688
Indexers scan buckets and return raw events matching search terms.
Key concept: Bloom filters and tsidx. Splunk uses bloom filters for fast negative lookups and tsidx files for field indexing. Raw text terms in the search string hit bloom filters; field=value pairs hit tsidx if the field is indexed.
Optimizations:
index, sourcetype, source, host) before extracted fields// BAD — extracted field first, no index specification
| search process_name="powershell.exe" index=endpoint
// GOOD — indexed fields first, raw term for bloom filter
index=endpoint sourcetype=sysmon EventCode=1 powershell.exe
| where process_name="powershell.exe"
Transforming commands (stats, timechart, chart, top, etc.) aggregate results.
Optimizations:
stats as early as possible to reduce event volumeby clauses to let indexers do partial aggregationstats over transaction (10x+ faster for most use cases)tstats queries accelerated data models and is dramatically faster than raw search.
// SLOW — raw search with field extraction
index=endpoint sourcetype=sysmon EventCode=1
| stats count by Image, ParentImage
// FAST — tstats against accelerated data model
| tstats count from datamodel=Endpoint.Processes
where Processes.process_name="powershell.exe"
by Processes.process_name, Processes.parent_process_name
Leading wildcards (*something) defeat bloom filter optimization.
// SLOW — leading wildcard, full scan required
index=endpoint process_name=*powershell*
// FAST — trailing wildcard, bloom filter can help
index=endpoint process_name=powershell*
transaction is expensive because it must group events by time ordering.
// SLOW — transaction groups events by session
index=web sourcetype=access_combined
| transaction clientip maxspan=30m
// FAST — stats can achieve similar results
index=web sourcetype=access_combined
| stats earliest(_time) as start, latest(_time) as end,
values(uri_path) as pages, count
by clientip
| eval duration=end-start
Every command in the pipeline processes all events passed to it. Reduce event count as early as possible.
// SLOW — extracts fields then filters
index=endpoint sourcetype=sysmon
| rex field=_raw "CommandLine=(?<cmdline>[^\r\n]+)"
| search cmdline="*-encoded*"
// FAST — filter with raw text first, then extract
index=endpoint sourcetype=sysmon EventCode=1 "-encoded"
| rex field=_raw "CommandLine=(?<cmdline>[^\r\n]+)"
| search cmdline="*-encoded*"
// SLOW — lookup on every event
index=endpoint sourcetype=sysmon EventCode=1
| lookup malware_hashes hash AS SHA256 OUTPUT is_malicious
| where is_malicious="true"
// FASTER — drop unneeded fields first, then lookup
index=endpoint sourcetype=sysmon EventCode=1
| fields + SHA256, Image, CommandLine, _time
| lookup malware_hashes hash AS SHA256 OUTPUT is_malicious
| where is_malicious="true"
Subsearches have a result limit (default 10K) and timeout (default 60s). Use | lookup alternatives when possible.
// RISKY — subsearch may hit limits
index=endpoint [search index=threat_intel | fields hash]
// BEST — use lookup command (streaming, no limits)
index=endpoint sourcetype=sysmon EventCode=1
| lookup threat_intel_hashes.csv hash AS SHA256 OUTPUT is_threat
| where is_threat="true"
| tstats `security_content_summariesonly`
count min(_time) as firstTime max(_time) as lastTime
from datamodel=Endpoint.Processes
where Processes.process_name IN ("powershell.exe","pwsh.exe")
Processes.process IN ("*-EncodedCommand*","*-enc *")
by Processes.dest Processes.user Processes.parent_process_name
Processes.process_name Processes.process Processes.process_id
| `drop_dm_object_name(Processes)`
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
Why this pattern is fast:
tstats hits accelerated data model (pre-indexed)where clause reduce events before aggregationby clause| Anti-Pattern | Problem | Fix |
|-------------|---------|-----|
| No index= specification | Scans all default indexes | Always specify index= |
| | search as first command | Bypasses index-time optimization | Put terms before first pipe |
| Leading wildcards *term | Defeats bloom filters | Use trailing wildcards or raw text |
| transaction for simple grouping | Extremely expensive | Use stats with by clause |
| Unnecessary | table in saved search | Adds formatting overhead | Only use in dashboards |
| | eval before | where | Processes all events | Filter first, compute later |
| Subsearch hitting 10K limit | Silent result truncation | Use lookups instead |
| NOT without positive filter | Scans everything then excludes | Add a positive match first |
| Cost | Commands | Notes |
|------|----------|-------|
| Free | search (inline terms) | Hits bloom filter / tsidx |
| Cheap | where, fields, rename, eval | Streaming, per-event |
| Medium | stats, chart, timechart | Aggregation, but parallelizable |
| Expensive | transaction, join, append | Memory-intensive, serialized |
| Very Expensive | | map, nested subsearches | Sequential execution |
After running a search, use Job Inspector (Job menu → Inspect Job) to see:
scanCount — Total events scannedresultCount — Events returnedexecution_time — Wall clock timeKQL queries in Sentinel/Defender are executed by the Kusto engine (Azure Data Explorer). The engine:
Unlike Splunk, Kusto has a built-in query optimizer, but you can still help it significantly.
where ClausesPut the most selective where clauses first. The optimizer pushes predicates down, but explicit early filtering helps.
// SLOW — processes all events, then filters
DeviceProcessEvents
| extend CommandLower = tolower(ProcessCommandLine)
| where CommandLower has "encodedcommand"
| where Timestamp > ago(1h)
// FAST — time and table filters first, compute later
DeviceProcessEvents
| where Timestamp > ago(1h)
| where ProcessCommandLine has "encodedcommand"
has Instead of containshas uses the term index (inverted index) and is orders of magnitude faster than contains, which does a substring scan.
// SLOW — substring scan on every row
DeviceProcessEvents
| where ProcessCommandLine contains "mimikatz"
// FAST — term index lookup
DeviceProcessEvents
| where ProcessCommandLine has "mimikatz"
// ALSO FAST — for exact prefix/suffix matching
DeviceProcessEvents
| where FileName =~ "powershell.exe"
When to use which:
| Operator | Index-Backed | Use When |
|----------|-------------|----------|
| has | Yes | Searching for whole terms |
| has_any | Yes | Searching for any of several terms |
| has_all | Yes | All terms must be present |
| contains | No | Substring match needed (e.g., partial path) |
| startswith | Partial | Prefix matching |
| matches regex | No | Complex pattern matching (slowest) |
let Statement Overheadlet statements are materialized when referenced. Avoid creating large intermediate tables.
// SLOW — materializes full table then filters
let AllProcesses = DeviceProcessEvents | where Timestamp > ago(1d);
AllProcesses
| where FileName == "powershell.exe"
// FAST — filter inline
DeviceProcessEvents
| where Timestamp > ago(1d)
| where FileName == "powershell.exe"
Use let when you need to reference the same dataset multiple times (e.g., join both sides).
in Over Multiple or Conditions// VERBOSE
DeviceProcessEvents
| where FileName == "powershell.exe" or FileName == "pwsh.exe" or FileName == "cmd.exe"
// CLEAN AND FAST
DeviceProcessEvents
| where FileName in ("powershell.exe", "pwsh.exe", "cmd.exe")
// CASE-INSENSITIVE
DeviceProcessEvents
| where FileName in~ ("powershell.exe", "pwsh.exe", "cmd.exe")
project to Drop Unneeded ColumnsReducing column count reduces data transferred between nodes.
// SLOW — carries all columns through the pipeline
DeviceProcessEvents
| where FileName == "powershell.exe"
| join kind=inner (DeviceNetworkEvents) on DeviceId
// FAST — project early to reduce join payload
DeviceProcessEvents
| where FileName == "powershell.exe"
| project DeviceId, Timestamp, ProcessCommandLine, AccountName
| join kind=inner (
DeviceNetworkEvents
| project DeviceId, RemoteIP, RemotePort, Timestamp
) on DeviceId
Joins are expensive. Always put the smaller table on the left.
// FAST — small result set on left
DeviceProcessEvents
| where FileName == "powershell.exe"
| where ProcessCommandLine has "-enc"
| join kind=inner (
DeviceNetworkEvents
| where RemotePort == 443
) on DeviceId, $left.Timestamp == $right.Timestamp
// ALTERNATIVE — use lookup for reference data
let MaliciousIPs = externaldata(IP: string) [@"https://..."] with (format="txt");
DeviceNetworkEvents
| where RemoteIP in (MaliciousIPs)
summarize Efficiently// SLOW — high cardinality summarize
DeviceProcessEvents
| summarize count() by ProcessCommandLine, DeviceName, AccountName
// FAST — reduce cardinality first
DeviceProcessEvents
| where ProcessCommandLine has_any ("encoded", "hidden", "bypass")
| summarize count() by FileName, DeviceName, AccountName
| Anti-Pattern | Problem | Fix |
|-------------|---------|-----|
| contains for whole words | Full substring scan | Use has (term-indexed) |
| matches regex for simple patterns | No index acceleration | Use has, startswith, endswith |
| No time filter | Scans entire retention | Always add where Timestamp > ago(Xh) |
| * in project (all columns) | Excess data transfer | project only needed columns |
| Large table on left of join | Excessive shuffle | Put smaller table on left |
| Nested toscalar() calls | Sequential execution | Materialize with let |
| mv-expand on large arrays | Row explosion | Filter before expanding |
Use the Query performance pane in Log Analytics:
In Advanced Hunting (Defender):
// Suspicious Encoded PowerShell Execution
// MITRE: T1059.001
let timeframe = 1h;
DeviceProcessEvents
| where Timestamp > ago(timeframe)
| where FileName in~ ("powershell.exe", "pwsh.exe")
| where ProcessCommandLine has_any ("-EncodedCommand", "-enc ", "-e ")
| project Timestamp, DeviceName, AccountName, FileName, ProcessCommandLine,
InitiatingProcessFileName, InitiatingProcessCommandLine
| extend AccountName = tolower(AccountName)
Elastic Security supports multiple query languages:
| Language | Use Case | Best For | |----------|----------|----------| | EQL (Event Query Language) | Ordered event correlation | Sequences, parent-child, multi-step attacks | | ES|QL (Elasticsearch Query Language) | Aggregation and exploration | Hunting, aggregations, complex transforms | | KQL (Kibana Query Language) | Simple filtering | Dashboard filters, quick searches | | Lucene | Full-text search | Free-text, complex boolean queries |
EQL queries are fastest when scoped to a specific event category.
// SLOW — searches all event types
any where process.name == "powershell.exe"
// FAST — scoped to process events only
process where process.name == "powershell.exe"
Event categories: process, file, network, registry, dns, library
// SLOW — broad condition first
process where process.parent.name == "explorer.exe"
and process.args : "*-EncodedCommand*"
// FAST — rare condition first
process where process.args : "*-EncodedCommand*"
and process.parent.name == "explorer.exe"
Sequence queries correlate events across time. They are powerful but expensive.
// SLOW — wide time window, no filtering
sequence by host.id with maxspan=30m
[process where process.name == "cmd.exe"]
[network where true]
// FAST — narrow window, filtered events
sequence by host.id with maxspan=2m
[process where process.name == "cmd.exe"
and process.args : "*download*"]
[network where destination.port != 443
and destination.port != 80]
Sequence optimization tips:
maxspan possiblewhere true)by clause to low-cardinality fieldsuntil to define sequence-breaking events: (Wildcard) Over regex// SLOW — regex is expensive
process where process.command_line regex ".*(?i)encodedcommand.*"
// FAST — wildcard match (case-insensitive by default)
process where process.command_line : "*encodedcommand*"
? for Optional Fields// ERROR if field is missing in some events
process where process.parent.name == "winlogon.exe"
// SAFE — handles missing fields gracefully
process where ?process.parent.name == "winlogon.exe"
// SLOW — processes all events then filters
FROM logs-endpoint.events.process-*
| EVAL lower_name = TO_LOWER(process.name)
| WHERE lower_name == "powershell.exe"
// FAST — filter first, compute later
FROM logs-endpoint.events.process-*
| WHERE process.name == "powershell.exe"
| WHERE @timestamp > NOW() - 1 hour
// SLOW — carries all columns
FROM logs-endpoint.events.process-*
| WHERE process.name == "powershell.exe"
| STATS count = COUNT() BY host.name
// FAST — drop unneeded columns early
FROM logs-endpoint.events.process-*
| WHERE process.name == "powershell.exe"
| KEEP process.name, process.command_line, host.name, @timestamp
| STATS count = COUNT() BY host.name
// SLOW — computes on all rows
FROM logs-endpoint.events.process-*
| EVAL cmd_lower = TO_LOWER(process.command_line)
| WHERE cmd_lower LIKE "*encoded*"
// FAST — filter with WHERE first, compute on fewer rows
FROM logs-endpoint.events.process-*
| WHERE process.command_line LIKE "*encoded*"
| Anti-Pattern | Problem | Fix |
|-------------|---------|-----|
| any where instead of typed event | Searches all event types | Use process where, file where, etc. |
| regex for simple matching | Expensive pattern evaluation | Use : wildcard operator |
| Wide maxspan in sequences | Large state to maintain | Use narrowest window possible |
| where true in sequence events | Matches everything | Add meaningful filters |
| No time range in rule settings | Scans full retention | Set look-back to detection interval |
| High-cardinality by in sequences | Memory explosion | Use host.id or user.name, not process.pid |
// Suspicious Encoded PowerShell Execution
// MITRE: T1059.001
process where host.os.type == "windows"
and process.name : ("powershell.exe", "pwsh.exe")
and process.args : ("-EncodedCommand", "-enc", "-e")
and not process.parent.executable : (
"C:\\Windows\\System32\\svchost.exe",
"C:\\Program Files\\*"
)
_search API with profile: true to see shard-level timingThese principles apply regardless of which SIEM you use:
Sigma rules are platform-agnostic and are compiled by pySigma backends into the target query language. Optimization happens at two levels:
To get the best compiled output, avoid overly complex Sigma conditions (deeply nested AND/OR/NOT) that backends may translate inefficiently. Prefer flat condition structures when possible.
testing
Expert at analyzing unstructured threat intelligence reports (CISA alerts, vendor blogs, research papers) and extracting actionable detection logic, TTPs, behavioral indicators, and MITRE ATT&CK mappings. Focuses on behaviors over IOCs. Use when provided with threat reports, security advisories, or campaign documentation.
testing
Analyze software supply chain attacks across package registries (npm, PyPI, RubyGems), CI/CD pipelines (GitHub Actions, GitLab CI), and container ecosystems. Includes detection engineering patterns for Splunk, Sentinel, Elastic, and Sigma.
tools
Analyze pull requests for detection coverage gaps and recommend additional detections, story alignments, and test coverage to extend PRs before merge.
testing
Expert at creating and validating detection rule files for multiple SIEM platforms. Supports Splunk security_content YAML, Sigma rules, Elastic detection TOML, and KQL analytics. Ensures compliance with repository conventions and optimal query performance. Use when creating or modifying detection rules.