skills/opentelemetry/SKILL.md
Implement OpenTelemetry (OTEL) observability - Collector configuration, Kubernetes deployment, traces/metrics/logs pipelines, instrumentation, and troubleshooting. Use when working with OTEL Collector, telemetry pipelines, observability infrastructure, or Kubernetes monitoring.
npx skillsauth add julianobarbosa/claude-code-skills opentelemetryInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
OpenTelemetry (OTel) is a vendor-neutral observability framework for instrumenting, generating, collecting, and exporting telemetry data (traces, metrics, logs). This skill provides guidance for implementing OTEL in Kubernetes environments.
# Add Helm repo
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
# Install with basic config
helm install otel-collector open-telemetry/opentelemetry-collector \
--namespace monitoring --create-namespace \
--set mode=daemonset
# gRPC endpoint: 4317, HTTP endpoint: 4318
curl -X POST http://otel-collector:4318/v1/traces \
-H "Content-Type: application/json" \
-d '{"resourceSpans":[]}'
Signals: Three types of telemetry data:
Collector Components:
config:
receivers:
otlp:
protocols:
grpc:
endpoint: ${env:MY_POD_IP}:4317
http:
endpoint: ${env:MY_POD_IP}:4318
processors:
batch:
timeout: 10s
send_batch_size: 1024
memory_limiter:
check_interval: 5s
limit_percentage: 80
spike_limit_percentage: 25
exporters:
prometheusremotewrite:
endpoint: "http://prometheus:9090/api/v1/write"
loki:
endpoint: "http://loki:3100/loki/api/v1/push"
service:
pipelines:
metrics:
receivers: [otlp]
processors: [memory_limiter, batch]
exporters: [prometheusremotewrite]
logs:
receivers: [otlp]
processors: [memory_limiter, batch]
exporters: [loki]
traces:
receivers: [otlp]
processors: [memory_limiter, batch]
exporters: [otlp/tempo]
processors:
k8sattributes:
auth_type: "serviceAccount"
passthrough: false
filter:
node_from_env_var: ${env:K8S_NODE_NAME}
extract:
metadata:
- k8s.pod.name
- k8s.namespace.name
- k8s.deployment.name
- k8s.node.name
| Mode | Use Case | Pros | Cons | |------|----------|------|------| | DaemonSet | Node-level collection | Full coverage, host metrics | Higher resource usage | | Deployment | Centralized gateway | Scalable, easier management | Single point of failure | | Sidecar | Per-pod collection | Isolated, fine-grained | Resource overhead per pod |
For in-depth guidance, see:
# Check collector pods
kubectl get pods -n monitoring -l app.kubernetes.io/name=otel-collector
# View collector logs
kubectl logs -n monitoring -l app.kubernetes.io/name=otel-collector --tail=100
# Test OTLP endpoint
kubectl run test-otlp --image=curlimages/curl:latest --rm -it -- \
curl -v http://otel-collector.monitoring:4318/v1/traces
# Validate config syntax
otelcol validate --config=config.yaml
mode: "daemonset" # or "deployment"
presets:
logsCollection:
enabled: true
hostMetrics:
enabled: true
kubernetesAttributes:
enabled: true
kubeletMetrics:
enabled: true
useGOMEMLIMIT: true
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
cpu: 100m
memory: 256Mi
traceparent headers don't auto-convert; mixed environments silently break trace continuity.cloud.account.id, host.id) that bloat traces — disable when not used for routing.development
End-to-end branch delivery: commit (no AI attribution) → push → open a pull request → ensure a Board work item exists (create one per task, assigned to the configured user, if none) and link it → after merge, clean up branch and worktree. Auto-detects the platform from the remote — Azure Repos + Boards (azure-devops-node-api SDK; OAuth Bearer push fallback via `az`) or GitHub (Octokit; `gh` for auth). Scripts are TypeScript, run via `bun`. Use whenever asked to "ship", "ship it", "ship this branch", "open a PR", "push and open a PR", "raise a PR", "deliver this", "send this for review", or "create a PR and link the work item" — and when a direct push to main is blocked and the change needs to go through a PR instead.
testing
Brief description of what this skill does. Include specific triggers - when should Claude use this skill? Example triggers, file types, or keywords that indicate this skill applies.
tools
Manage and troubleshoot PATH configuration in zsh. Use when adding tools to PATH (bun, nvm, Python venv, cargo, go), diagnosing "command not found" errors, validating PATH entries, or organizing shell configuration in .zshrc and .zshrc.local files.
tools
Zabbix monitoring system automation via API and Python. Use when: (1) Managing hosts, templates, items, triggers, or host groups, (2) Automating monitoring configuration, (3) Sending data via Zabbix trapper/sender, (4) Querying historical data or events, (5) Bulk operations on Zabbix objects, (6) Maintenance window management, (7) User/permission management