.claude/skills/observability-analyzer/SKILL.md
Query and analyze Claude Code observability data (metrics, logs, traces). Use when analyzing performance, costs, errors, tool usage, sessions, conversations, or subagents.
npx skillsauth add adaptationio/skrillz observability-analyzer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Query Claude Code telemetry and generate insights from metrics, logs, and traces. Works with both default OTEL telemetry and enhanced hook-based telemetry.
| Source | Job Name | Contains |
|--------|----------|----------|
| Default OTEL | claude_code | API metrics, token usage, costs |
| Enhanced Hooks | claude_code_enhanced | Sessions, conversations, tools, subagents |
query-metrics <promql>: Execute a PromQL query against Prometheus.
query-metrics 'sum(increase(claude_code_token_usage[7d]))'
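The internals of `query-metrics` are not shown here. As a rough sketch, a PromQL expression can be sent to Prometheus's instant-query HTTP endpoint (`/api/v1/query`). The base URL `http://localhost:9090` and the helper name `build_prometheus_url` are assumptions for illustration, not part of this skill.

```python
from urllib.parse import urlencode

# Assumed Prometheus address (its conventional default port); adjust as needed.
PROMETHEUS_BASE = "http://localhost:9090"


def build_prometheus_url(promql: str, base: str = PROMETHEUS_BASE) -> str:
    """Return the instant-query URL for a PromQL expression (hypothetical helper)."""
    # urlencode percent-escapes characters such as parentheses and brackets.
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    print(build_prometheus_url("sum(increase(claude_code_token_usage[7d]))"))
```

The resulting URL can be fetched with any HTTP client; the JSON response carries the samples under `data.result`.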
query-logs <logql>: Execute a LogQL query against Loki.
query-logs '{job="claude_code_enhanced", event_type="tool_call"} | json' --since 24h
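How `query-logs` maps `--since` onto a request is not documented here. One plausible sketch uses Loki's range-query endpoint (`/loki/api/v1/query_range`), which accepts `start` and `end` as nanosecond Unix timestamps. The base URL `http://localhost:3100`, the helper names, and the duration syntax handled by `parse_since` are assumptions.

```python
import time
from urllib.parse import urlencode

# Assumed Loki address (its conventional default port); adjust as needed.
LOKI_BASE = "http://localhost:3100"

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_since(since: str) -> int:
    """Convert a duration such as '24h', '30m', or '7d' to seconds."""
    return int(since[:-1]) * _UNIT_SECONDS[since[-1]]


def build_loki_url(logql, since="1h", now=None, limit=1000, base=LOKI_BASE):
    """Return a query_range URL covering the last `since` (hypothetical helper)."""
    end_s = int(time.time()) if now is None else int(now)
    start_s = end_s - parse_since(since)
    params = {
        "query": logql,
        "start": start_s * 10**9,  # Loki accepts nanosecond epoch timestamps
        "end": end_s * 10**9,
        "limit": limit,
    }
    return f"{base}/loki/api/v1/query_range?{urlencode(params)}"


if __name__ == "__main__":
    query = '{job="claude_code_enhanced", event_type="tool_call"} | json'
    print(build_loki_url(query, since="24h"))
```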
analyze-errors: Detect and group error patterns from enhanced telemetry.
{job="claude_code_enhanced", event_type="tool_result", status="error"} | json
Output: Error types, frequencies, affected tools, recommendations.
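The grouping step can be illustrated with a minimal sketch that counts error events per tool from JSON log lines. It assumes `tool_result` events carry a `tool_name` field alongside `status` and `is_error`; the function name `group_errors` and the sample lines are made up for illustration and are not the skill's actual implementation.

```python
import json
from collections import Counter


def group_errors(lines):
    """Count error events per tool, most frequent first (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if event.get("status") == "error" or event.get("is_error"):
            # Assumption: tool_result events include the tool name.
            counts[event.get("tool_name", "unknown")] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Illustrative sample lines, not real telemetry.
    sample = [
        '{"tool_name": "Bash", "status": "error"}',
        '{"tool_name": "Bash", "status": "success"}',
        '{"tool_name": "Read", "status": "error"}',
        '{"tool_name": "Bash", "status": "error"}',
    ]
    for tool, count in group_errors(sample):
        print(tool, count)
```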
analyze-performance: Identify slow operations and large responses.
{job="claude_code_enhanced", event_type="tool_result"} | json | response_length > 50000
Output: Large responses, estimated token costs, slow patterns.
analyze-costs: Calculate token usage from content size estimates.
sum by (repo) (sum_over_time({job="claude_code_enhanced", event_type="context_utilization"} | json | unwrap estimated_session_tokens [24h]))
Output: Token estimates by repo, session costs, projections.
analyze-tools: Tool usage statistics and sequences.
sum by (tool) (count_over_time({job="claude_code_enhanced", event_type="tool_call"} | json [24h]))
Output: Call frequency, success rates, tool sequences, common patterns.
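As a small illustration of the "tool sequences" output, consecutive tool pairs can be counted from an ordered list of tool names. This sketch assumes the calls have already been sorted within one session (for example by `sequence_position`); `tool_transitions` is a hypothetical name.

```python
from collections import Counter


def tool_transitions(tools):
    """Count consecutive (previous, next) tool pairs in an ordered list of tool names."""
    return Counter(zip(tools, tools[1:]))


if __name__ == "__main__":
    # Illustrative call order for a single session, not real telemetry.
    calls = ["Read", "Edit", "Bash", "Read", "Edit"]
    for (prev, nxt), count in tool_transitions(calls).most_common():
        print(f"{prev} -> {nxt}: {count}")
```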
analyze-sessions: Session lifecycle and duration analytics.
{job="claude_code_enhanced", event_type="session_end"} | json
Output: Session durations, turn counts, tools per session, termination reasons.
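A minimal sketch of the kind of summary this could produce, assuming each `session_end` event has already been parsed into a dict with numeric `duration_seconds` and `turn_count` fields. The function name and the 300-second "long session" threshold are illustrative choices, not the skill's actual logic.

```python
from statistics import mean, median


def summarize_sessions(events, long_threshold_s=300):
    """Summarize parsed session_end events (hypothetical helper)."""
    if not events:
        return {"sessions": 0, "median_duration_s": 0, "mean_turns": 0, "long_sessions": 0}
    durations = [e["duration_seconds"] for e in events]
    turns = [e["turn_count"] for e in events]
    return {
        "sessions": len(events),
        "median_duration_s": median(durations),
        "mean_turns": mean(turns),
        # Sessions longer than the threshold.
        "long_sessions": sum(d > long_threshold_s for d in durations),
    }


if __name__ == "__main__":
    # Illustrative events, not real telemetry.
    sample = [
        {"duration_seconds": 120, "turn_count": 4},
        {"duration_seconds": 600, "turn_count": 10},
        {"duration_seconds": 900, "turn_count": 16},
    ]
    print(summarize_sessions(sample))
```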
analyze-conversations: Conversation and prompt analytics.
sum by (pattern) (count_over_time({job="claude_code_enhanced", event_type="user_prompt"} | json [24h]))
Output: Prompt patterns (question/debugging/creation/ultrathink), turn distribution.
analyze-subagents: Subagent/Task tool usage.
{job="claude_code_enhanced", event_type="tool_call", tool="Task"} | json
Output: Subagent types used, completion rates, parallel execution patterns.
analyze-skills: Skill invocation analytics.
sum by (skill_name) (count_over_time({job="claude_code_enhanced", event_type="skill_usage"} | json [24h]))
Output: Most used skills, skill usage by repo, trends.
analyze-context: Context window utilization.
{job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 50
Output: High utilization sessions, compaction events, token efficiency.
analyze-repos: Repository/project activity.
sum by (repo, tool) (count_over_time({job="claude_code_enhanced", event_type="tool_call"} | json [24h]))
Output: Activity per repo, tool usage by project, branch patterns.
generate-report: Comprehensive analysis report (all dimensions).
Output: Markdown report with errors, performance, costs, sessions, conversations, tools.
# All events (last hour)
{job="claude_code_enhanced"} | json
# Session analytics
{job="claude_code_enhanced", event_type="session_end"} | json | duration_seconds > 300
# Tool errors
{job="claude_code_enhanced", event_type="tool_result", status="error"} | json
# High context usage
{job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 75
# Subagent spawns
{job="claude_code_enhanced", event_type="tool_call", tool="Task"} | json
# Skill invocations
{job="claude_code_enhanced", event_type="skill_usage"} | json
# Prompt patterns
{job="claude_code_enhanced", event_type="user_prompt"} | json | pattern="ultrathink"
# Tool sequences
{job="claude_code_enhanced", event_type="tool_call"} | json | line_format "{{.tool_name}} → {{.previous_tool}}"
# Context compaction
{job="claude_code_enhanced", event_type="context_compact"} | json
# Permission requests
{job="claude_code_enhanced", event_type="permission_request"} | json
# Total token usage (7 days)
sum(increase(claude_code_token_usage[7d]))
# Error rate by tool
sum by (tool_name) (rate(claude_code_tool_result{status="failure"}[1h]))
# P95 tool latency (over the last hour)
histogram_quantile(0.95, sum by (le) (rate(claude_code_tool_duration_bucket[1h])))
# Daily costs
sum(increase(claude_code_cost_usage[24h]))
| Event Type | Description | Key Fields |
|------------|-------------|------------|
| session_start | Session initialization | source, permission_mode |
| session_end | Session termination | duration_seconds, turn_count, tools_used |
| user_prompt | User message submitted | pattern, prompt_length, estimated_tokens |
| tool_call | Tool invocation | tool_name, tool_details, sequence_position |
| tool_result | Tool completion | status, response_length, is_error |
| skill_usage | Skill invoked | skill_name |
| context_utilization | Token estimate | estimated_session_tokens, context_percentage |
| context_compact | Compaction event | trigger (manual/auto) |
| subagent_complete | Task agent finished | total_subagents |
| permission_request | Permission dialog | notification_type |
| notification | System notification | notification_type |
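The table above can double as a lightweight schema check. The sketch below maps each event type to its key fields and reports which ones are absent from a parsed event. It assumes the parsed JSON payload includes an `event_type` key (in the queries above it appears as a stream label); `KEY_FIELDS` and `missing_fields` are hypothetical names.

```python
# Key fields per event type, copied from the event table.
KEY_FIELDS = {
    "session_start": ("source", "permission_mode"),
    "session_end": ("duration_seconds", "turn_count", "tools_used"),
    "user_prompt": ("pattern", "prompt_length", "estimated_tokens"),
    "tool_call": ("tool_name", "tool_details", "sequence_position"),
    "tool_result": ("status", "response_length", "is_error"),
    "skill_usage": ("skill_name",),
    "context_utilization": ("estimated_session_tokens", "context_percentage"),
    "context_compact": ("trigger",),
    "subagent_complete": ("total_subagents",),
    "permission_request": ("notification_type",),
    "notification": ("notification_type",),
}


def missing_fields(event):
    """Return the key fields absent from a parsed event dict (hypothetical helper)."""
    # Assumption: the event dict carries its own event_type.
    expected = KEY_FIELDS.get(event.get("event_type"), ())
    return [name for name in expected if name not in event]


if __name__ == "__main__":
    print(missing_fields({"event_type": "tool_result", "status": "error"}))
```

Unknown event types yield an empty list, so the check never rejects events the table does not describe.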
Access: http://localhost:3000 (admin/admin)
- scripts/query-prometheus.sh - PromQL query helper
- scripts/query-loki.sh - LogQL query helper
- scripts/analyze-errors.sh - Error analysis automation
- scripts/analyze-sessions.sh - Session analytics
- scripts/generate-report.sh - Full analysis report