plugins/claude-ops/skills/ops-monitor/SKILL.md
Unified APM and monitoring surface. Polls Datadog, New Relic, and OpenTelemetry backends for active alerts, error traces, and entity health. Use --watch for live polling every 60 seconds. Use --setup to configure monitoring credentials.
npx skillsauth add davepoon/buildwithclaude ops-monitor
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
PREFS="${CLAUDE_PLUGIN_DATA_DIR:-$HOME/.claude/plugins/data/ops-ops-marketplace}/preferences.json"
DD_API_KEY=$(jq -r '.datadog_api_key // empty' "$PREFS" 2>/dev/null)
# The Datadog requests below also send DD-APPLICATION-KEY, so load the app key too
DD_APP_KEY=$(jq -r '.datadog_app_key // empty' "$PREFS" 2>/dev/null)
NR_API_KEY=$(jq -r '.newrelic_api_key // empty' "$PREFS" 2>/dev/null)
OTEL_ENDPOINT=$(jq -r '.otel_endpoint // empty' "$PREFS" 2>/dev/null)
Determine $ARGUMENTS mode:
--setup → run Setup flow
--watch → run Watch mode
Setup flow (--setup)
Ask which backends to configure:
Which monitoring backends would you like to configure?
[Datadog] [New Relic] [OpenTelemetry] [All three]
For each selected backend, collect credentials via AskUserQuestion free-text input (one at a time, ≤4 options per call):
Datadog:
datadog_api_key: API Key from app.datadoghq.com/organization-settings/api-keys
datadog_app_key: Application Key from app.datadoghq.com/organization-settings/application-keys
New Relic:
newrelic_api_key: User API Key from one.newrelic.com/api-keys
newrelic_account_id: Numeric Account ID from New Relic admin portal
OpenTelemetry:
otel_endpoint: Base URL of your OTEL-compatible backend (e.g., https://otlp.grafana.net)
Write each credential to preferences.json using atomic tmpfile swap:
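# Start from an empty object if preferences.json does not exist yet
[ -s "$PREFS" ] || { mkdir -p "$(dirname "$PREFS")" && echo '{}' > "$PREFS"; }
tmp=$(mktemp "${PREFS}.XXXXXX")  # same directory as the target, so mv is a same-filesystem rename
jq --arg k "$KEY" --arg v "$VALUE" '.[$k] = $v' "$PREFS" > "$tmp" && mv "$tmp" "$PREFS"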
Run smoke test after saving:
curl -sf -H "DD-API-KEY: $DD_API_KEY" -H "DD-APPLICATION-KEY: $DD_APP_KEY" "https://api.datadoghq.com/api/v1/validate" → expect {"valid": true}
curl -sf -H "Api-Key: $NR_API_KEY" -H "Content-Type: application/json" "https://api.newrelic.com/graphql" -d '{"query":"{ actor { user { name } } }"}' → expect data.actor.user
curl -sf "$OTEL_ENDPOINT/healthz" → expect HTTP 200
Report ✅ or ❌ with status for each backend.
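The per-backend report can be sketched as a small wrapper around each smoke test. This is a minimal sketch, not the skill's actual implementation: `smoke_report` is a hypothetical helper name, and it judges success only by the command's exit status (with `curl -f`, an HTTP error response yields a non-zero exit), so it does not inspect the response body for `{"valid": true}` or `data.actor.user`.

```shell
# smoke_report NAME COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND quietly and prints one status line for NAME.
smoke_report() {
  name=$1
  shift
  if "$@" >/dev/null 2>&1; then
    printf '✅ %s: ok\n' "$name"
  else
    rc=$?  # exit status of the failed command
    printf '❌ %s: failed (exit %s)\n' "$name" "$rc"
  fi
}

# Example wiring (needs network access and real credentials, so left commented out):
#   smoke_report "Datadog" curl -sf -H "DD-API-KEY: $DD_API_KEY" \
#     -H "DD-APPLICATION-KEY: $DD_APP_KEY" "https://api.datadoghq.com/api/v1/validate"
#   smoke_report "OTEL" curl -sf "$OTEL_ENDPOINT/healthz"
```

A stricter check would additionally parse the JSON body with jq before printing ✅.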
If CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 is set, use Agent Teams when querying multiple backends simultaneously. This enables:
Team setup (only when flag is enabled, multiple backends configured):
TeamCreate("monitor-probes")
Agent(team_name="monitor-probes", name="datadog-probe", subagent_type="ops:monitor-agent", ...)
Agent(team_name="monitor-probes", name="newrelic-probe", subagent_type="ops:monitor-agent", ...)
Agent(team_name="monitor-probes", name="otel-probe", subagent_type="ops:monitor-agent", ...)
If the flag is NOT set or only one backend is configured, use a single monitor-agent subagent.
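The choice between a team and a single subagent reduces to two conditions. The sketch below assumes the caller already knows how many backends are configured; `probe_strategy` and its output words `team` and `single` are hypothetical names used only for illustration.

```shell
# probe_strategy FLAG_VALUE BACKEND_COUNT
# Hypothetical helper: prints "team" only when the flag is 1 AND more than one
# backend is configured; otherwise prints "single".
probe_strategy() {
  if [ "$1" = "1" ] && [ "$2" -gt 1 ]; then
    printf 'team\n'
  else
    printf 'single\n'
  fi
}

# Usage: probe_strategy "${CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS:-}" 3
```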
Spawn monitor-agent via the Agent tool. Display the result as a formatted dashboard:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
OPS ► MONITOR [<timestamp>]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
DATADOG ✅ healthy (0 alerts)
NEW RELIC 🔴 2 critical entities
OTEL ✅ healthy
──────────────────────────────────────────────────────
Total alerts: 2 Severity: CRITICAL
Status icons:
✅: healthy (0 alerts / configured and reachable)
⚠️: warning (warn-level alerts present)
🔴: critical (critical alerts or unreachable)
⬜: not configured
For each alert or critical entity, display: service name, alert name, and link to the relevant dashboard.
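One way to render the icons and the summary severity is a pair of small lookup functions. This is a sketch under assumptions: the state names (`healthy`, `warning`, `critical`, `unreachable`, `unconfigured`) and the `OK` and `WARNING` labels are hypothetical, since the dashboard above only shows `CRITICAL`.

```shell
# status_icon STATE: map an assumed per-backend state name to its icon.
status_icon() {
  case $1 in
    healthy)              printf '%s\n' '✅' ;;
    warning)              printf '%s\n' '⚠️' ;;
    critical|unreachable) printf '%s\n' '🔴' ;;
    unconfigured)         printf '%s\n' '⬜' ;;
    *)                    printf '%s\n' '?' ;;
  esac
}

# overall_severity STATE...: print the worst severity across all given states.
overall_severity() {
  worst=OK
  for state in "$@"; do
    case $state in
      critical|unreachable) worst=CRITICAL ;;
      warning) [ "$worst" = CRITICAL ] || worst=WARNING ;;
    esac
  done
  printf '%s\n' "$worst"
}
```

For the example dashboard, `overall_severity healthy critical healthy` yields `CRITICAL`.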
If no backends are configured, show a setup prompt:
No monitoring backends configured. Run /ops:monitor --setup to add Datadog, New Relic, or OTEL.
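The "nothing configured" check can be sketched from the three values loaded out of preferences.json. `configured_backends` is a hypothetical helper; it treats a backend as configured when its key or endpoint is non-empty, which is an assumption about how the skill decides.

```shell
# configured_backends DD_API_KEY NR_API_KEY OTEL_ENDPOINT
# Hypothetical helper: prints the configured backend names, or the setup prompt
# when all three values are empty.
configured_backends() {
  list=""
  if [ -n "$1" ]; then list="$list datadog"; fi
  if [ -n "$2" ]; then list="$list newrelic"; fi
  if [ -n "$3" ]; then list="$list otel"; fi
  if [ -z "$list" ]; then
    printf '%s\n' "No monitoring backends configured. Run /ops:monitor --setup to add Datadog, New Relic, or OTEL."
  else
    printf '%s\n' "${list# }"  # strip the leading space
  fi
}

# Usage: configured_backends "$DD_API_KEY" "$NR_API_KEY" "$OTEL_ENDPOINT"
```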
Watch mode (--watch)
Poll every 60 seconds. On each tick:
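PREV=""
while true; do
  # Spawn monitor-agent via the Agent tool and capture its JSON output
  RESULT=""  # placeholder: set this from the monitor-agent output
  # Diff RESULT against PREV (the previous tick)
  # Print: timestamp, changed items only
  #   🆕 new alert: <name>
  #   ✅ resolved: <name>
  PREV="$RESULT"
  sleep 60
done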
Exit on Ctrl-C.
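The diff step in the watch loop can be sketched as follows. This assumes each tick has already been reduced to a newline-separated list of alert names (for example with jq); that intermediate format and the `diff_alerts` name are hypothetical, not something the skill specifies. The timestamp is left to the caller.

```shell
# diff_alerts PREVIOUS CURRENT
# Hypothetical helper: both arguments are newline-separated alert names.
# Prints one line per alert that appeared or disappeared since the last tick.
diff_alerts() {
  prev=$1
  curr=$2
  # Names present now but absent from the previous tick are new
  printf '%s\n' "$curr" | while IFS= read -r name; do
    [ -n "$name" ] || continue
    printf '%s\n' "$prev" | grep -Fxq -- "$name" || printf '🆕 new alert: %s\n' "$name"
  done
  # Names present in the previous tick but absent now are resolved
  printf '%s\n' "$prev" | while IFS= read -r name; do
    [ -n "$name" ] || continue
    printf '%s\n' "$curr" | grep -Fxq -- "$name" || printf '✅ resolved: %s\n' "$name"
  done
}
```

Matching whole lines as fixed strings (`grep -Fx`) prevents an alert name that contains regex characters, or that is a substring of another name, from producing a false match.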
--backend filter
If --backend datadog|newrelic|otel is specified, query and display only that backend.
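Taken together, the flags can be dispatched with a small argument parser. The sketch below makes several assumptions: `parse_args` is a hypothetical helper, `dashboard` is an invented name for the default mode when neither --setup nor --watch is given, `all` stands for "no filter", and the backend is passed as a separate word after --backend.

```shell
# parse_args [ARGS...]
# Hypothetical helper: prints "<mode> <backend>" or returns 1 on invalid input.
parse_args() {
  mode=dashboard  # assumed name for the default mode
  backend=all     # assumed value meaning "no filter"
  while [ $# -gt 0 ]; do
    case $1 in
      --setup) mode=setup ;;
      --watch) mode=watch ;;
      --backend)
        shift
        case ${1:-} in
          datadog|newrelic|otel) backend=$1 ;;
          *) echo "unknown backend: ${1:-<missing>}" >&2; return 1 ;;
        esac
        ;;
      *) echo "unknown argument: $1" >&2; return 1 ;;
    esac
    shift
  done
  printf '%s %s\n' "$mode" "$backend"
}

# Usage: parse_args $ARGUMENTS   (unquoted on purpose so the flags split into words)
```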
| Backend | Auth header | Base URL | Health endpoint |
|-------------|------------------------------------------------|---------------------------------|------------------------|
| Datadog | DD-API-KEY: $key + DD-APPLICATION-KEY: $app_key | https://api.datadoghq.com | /api/v1/validate |
| New Relic | Api-Key: $key | https://api.newrelic.com/graphql | POST GraphQL query |
| OTEL | varies by backend | $OTEL_ENDPOINT | /healthz |
# Datadog: active alerts
curl -sf \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
  "https://api.datadoghq.com/api/v1/monitor?monitor_tags=*&with_downtimes=false" \
  | jq '[.[] | select(.overall_state == "Alert" or .overall_state == "Warn")]'
# New Relic: critical entities (GraphQL)
curl -sf \
  -H "Api-Key: ${NR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"query":"{ actor { entitySearch(queryBuilder: {alertSeverity: CRITICAL}) { results { entities { name alertSeverity entityType } } } } }"}' \
  "https://api.newrelic.com/graphql"
# OTEL: health check
curl -sf "${OTEL_ENDPOINT}/healthz"