skills/detect-anomalies/SKILL.md
Detect anomalies in Axiom datasets using statistical analysis. Use when looking for unusual patterns, volume spikes, outliers, or new error types in observability data.
npx skillsauth add axiomhq/cli detect-anomalies

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Detect anomalies in Axiom datasets by comparing recent patterns to historical baselines using statistical analysis.
When invoked with a dataset name (e.g., /detect-anomalies logs), the name is available as $ARGUMENTS.
Statistical anomaly detection requires sufficient data:
If these aren't met, results may be misleading. Consider using simpler threshold-based alerting instead.
Always verify field names first:
axiom query "['<dataset>'] | getschema" --start-time -1h
Compare recent volume to baseline:
Calculate baseline (past 24h excluding last hour):
axiom query "['<dataset>']
| where _time between (ago(25h) .. ago(1h))
| summarize count() by bin(_time, 1h)
| summarize
avg_hourly = avg(count_),
stdev_hourly = stdev(count_)" --start-time -25h -f json
Check recent volume:
axiom query "['<dataset>']
| where _time >= ago(1h)
| summarize
current_count = count(),
current_hour = min(_time)" --start-time -1h -f json
Z-score calculation:
z_score = (current - avg) / stdev

`|z_score| > 2` indicates an anomaly.

Find values that appeared recently but weren't seen before:
axiom query "['<dataset>']
| where _time >= ago(1h)
| summarize by error_code
| join kind=leftanti (
['<dataset>']
| where _time between (ago(25h) .. ago(1h))
| summarize by error_code
) on error_code" --start-time -25h -f json
Replace error_code with any categorical field (service, endpoint, status).
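The `leftanti` join keeps only the rows from the recent window that have no match in the baseline window. As a rough local illustration of that semantics (not how Axiom evaluates the join), the sketch below does the same thing for two plain-text lists. The helper name `new_values` and the one-value-per-line file layout are assumptions made for this example.

```shell
# new_values BASELINE_FILE RECENT_FILE
# Prints each value found in RECENT_FILE but not in BASELINE_FILE, once,
# in order of first appearance. Both files hold one value per line.
# (Hypothetical helper, for illustration only.)
new_values() {
  awk '
    # Lines from the first file: remember them as baseline values.
    FILENAME == ARGV[1] { seen[$0] = 1; next }
    # Lines from the second file: print those never seen, once each.
    !($0 in seen) && !printed[$0]++
  ' "$1" "$2"
}
```

Given a baseline list containing `E1` and `E2` and a recent list containing `E2`, `E3`, `E3`, the function prints `E3` once, which is the value a left anti-join would surface.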
Find values outside normal distribution:
Calculate bounds:
axiom query "['<dataset>']
| where _time between (ago(25h) .. ago(1h))
| summarize
avg_val = avg(duration),
stdev_val = stdev(duration)
| extend
lower_bound = avg_val - 3 * stdev_val,
upper_bound = avg_val + 3 * stdev_val" --start-time -25h -f json
Find outliers:
axiom query "['<dataset>']
| where _time >= ago(1h)
| where duration < <lower_bound> or duration > <upper_bound>
| limit 100" --start-time -1h -f json
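The outlier query needs the two bounds from the previous step substituted for `<lower_bound>` and `<upper_bound>`. A minimal sketch of that substitution follows. The helper name `build_outlier_query` and the sample numbers are assumptions for illustration, and the sketch only builds the query text rather than parsing any CLI output.

```shell
# build_outlier_query DATASET LOWER UPPER
# Prints the outlier query with the dataset name and both bounds filled in.
# (Hypothetical helper, for illustration only.)
build_outlier_query() {
  printf "['%s']\n| where _time >= ago(1h)\n| where duration < %s or duration > %s\n| limit 100\n" "$1" "$2" "$3"
}

# Hypothetical bounds, standing in for the values returned by the bounds query.
query=$(build_outlier_query "logs" "-50" "950")
echo "$query"

# The result could then be passed to the CLI in the same way as the other
# queries in this document:
#   axiom query "$query" --start-time -1h -f json
```

For a non-negative field such as `duration`, `avg - 3 * stdev` is often negative (as in the sample values above), so in practice only the upper-bound check can match.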
Find infrequent occurrences:
axiom query "['<dataset>']
| where _time >= ago(1h)
| summarize count() by error_message
| where count_ == 1" --start-time -1h -f json
Compare error rate to baseline:
axiom query "['<dataset>']
| where _time >= ago(6h)
| summarize
total = count(),
errors = countif(status >= 500)
by bin(_time, 15m)
| extend error_rate = errors * 100.0 / total
| sort by _time asc" --start-time -6h -f json
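The query returns one `error_rate` per 15-minute bucket but does not itself say what the baseline is. One possible reading, assumed here, is to treat the most recent bucket as "current" and the mean of all earlier buckets as the baseline, reporting the difference in percentage points. The helper name `error_rate_change` and the sample rates are hypothetical.

```shell
# error_rate_change
# Reads error rates (percent, oldest first, one per line) on stdin and prints
# the baseline (mean of all but the last value), the current (last) value,
# and the change in percentage points.
# (Hypothetical helper; the baseline definition is an assumption.)
error_rate_change() {
  awk '
    NF { rate[++n] = $1 }   # collect non-empty lines
    END {
      if (n < 2) { print "need at least two buckets"; exit 1 }
      for (i = 1; i < n; i++) sum += rate[i]
      baseline = sum / (n - 1)
      printf "baseline=%.2f current=%.2f change=%+.2f\n", baseline, rate[n], rate[n] - baseline
    }
  '
}

# Hypothetical rates for four consecutive buckets.
printf '1.0\n1.5\n0.5\n4.0\n' | error_rate_change
# prints: baseline=1.00 current=4.00 change=+3.00
```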
Track percentile changes:
axiom query "['<dataset>']
| where _time >= ago(6h)
| summarize
p50 = percentile(duration, 50),
p95 = percentile(duration, 95),
p99 = percentile(duration, 99)
by bin(_time, 15m)
| sort by _time asc" --start-time -6h -f json
| Type | Detection Method | Indicates |
|------|------------------|-----------|
| Volume Spike | Z-score on count | Traffic surge, attack, incident |
| Volume Drop | Z-score on count | Outage, data collection issue |
| New Values | Left anti-join | New errors, new services |
| Statistical Outlier | 3-sigma rule | Extreme performance issue |
| Rare Events | Count = 1 | Unusual conditions |
| Error Spike | Error rate increase | Service degradation |
| Latency Spike | Percentile increase | Performance issue |
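Volume spikes and volume drops use the same z-score on the event count and differ only in its sign. The sketch below applies the z-score formula and the `|z_score| > 2` threshold given earlier to tell the two apart. The helper name `classify_volume` and the sample numbers are assumptions for illustration; the three arguments would come from the baseline and recent-volume queries.

```shell
# classify_volume CURRENT AVG STDEV
# Prints the z-score of CURRENT against the baseline and a label:
# "volume spike" (z > 2), "volume drop" (z < -2), or "normal".
# (Hypothetical helper, for illustration only.)
classify_volume() {
  awk -v cur="$1" -v avg="$2" -v sd="$3" 'BEGIN {
    # A zero standard deviation makes the z-score undefined.
    if (sd <= 0) { print "undefined (stdev is zero)"; exit }
    z = (cur - avg) / sd
    label = (z > 2) ? "volume spike" : ((z < -2) ? "volume drop" : "normal")
    printf "z=%.2f %s\n", z, label
  }'
}

# Hypothetical counts against a baseline of avg=1000, stdev=200.
classify_volume 1500 1000 200   # prints: z=2.50 volume spike
classify_volume 400 1000 200    # prints: z=-3.00 volume drop
```

The zero-stdev guard matters for perfectly flat baselines, where the formula would otherwise divide by zero.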
## Anomaly Report: <dataset>
### Summary
- Analysis period: <timeframe>
- Anomalies found: <count>
### Volume Anomalies
| Time | Count | Expected | Z-Score |
|------|-------|----------|---------|
| ... | ... | ... | ... |
### New Values
- Field: `error_code`
- New values: `TIMEOUT_ERROR`, `CONNECTION_REFUSED`
### Statistical Outliers
- Field: `duration`
- Outliers: <count> events above <threshold>
### Error Rate
- Baseline: X%
- Current: Y%
- Change: +Z%
### Recommendations
1. <Investigation action>
2. <Monitoring suggestion>
For query syntax, invoke the axiom-apl skill, which provides anomaly detection patterns and function documentation.