harsh040506/monitoring-observability

Name: monitoring-observability
Author: harsh040506

engineering/devops/skills/monitoring-observability/SKILL.md

npx skillsauth add harsh040506/claude-code-unified-skill-plugin-library monitoring-observability

Monitoring & Observability

Build observable systems using the three pillars: metrics, logs, and traces.

The Three Pillars of Observability

| Pillar | What it tells you | Tool examples |
|--------|------------------|---------------|
| Metrics | Aggregated numerical measurements over time | Prometheus, Datadog, CloudWatch |
| Logs | Discrete events with context | Loki, Elasticsearch, CloudWatch Logs, Datadog Logs |
| Traces | End-to-end request flows across services | Jaeger, Tempo, Datadog APM, X-Ray |

You need all three. Metrics tell you something is wrong. Logs tell you what happened. Traces tell you where in the system it went wrong.

SLOs and Error Budgets

Define service reliability targets before building dashboards. Every team should have explicit SLOs.

SLO Definition Template

# slo.yaml
service: api-service
slos:
  - name: availability
    description: "Percentage of requests that succeed (non-5xx)"
    target: 99.9%         # 0.1% error budget = 43.2 min per 30 days
    measurement:
      metric: sum(rate(http_requests_total{status!~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

  - name: latency-p99
    description: "P99 response time for all requests"
    target: 200ms         # 99% of requests under 200ms
    measurement:
      metric: histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
      
  - name: throughput
    description: "Service processes at least N requests/second under normal load"
    target: 100 rps

Error Budget Calculation

Error budget = (1 - SLO target) × time window
99.9% availability over 30 days = 0.1% × 30 × 24 × 60 = 43.2 minutes

Burn rate = observed error rate ÷ error rate the SLO allows
- Burn rate 1x = the budget lasts exactly the full 30-day window (sustainable)
- Burn rate 14.4x = 2% of the 30-day budget consumed in 1 hour; budget gone in about 2 days (page immediately)
- Burn rate 6x = 5% of the 30-day budget consumed in 6 hours; budget gone in 5 days (ticket, handle within 24 hours)
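The arithmetic above can be sketched in a few lines of TypeScript. This is a minimal illustration, not part of any library: the function names `errorBudgetMinutes`, `burnRate`, and `hoursToExhaustion` are assumptions made up for this example.

```typescript
// Error budget and burn rate arithmetic for an availability SLO.
// All function names here are illustrative, not from a library.

/** Allowed "bad" minutes in the window for an SLO target such as 0.999. */
function errorBudgetMinutes(sloTarget: number, windowDays: number): number {
  return (1 - sloTarget) * windowDays * 24 * 60;
}

/** Burn rate: observed error ratio divided by the error ratio the SLO allows. */
function burnRate(observedErrorRatio: number, sloTarget: number): number {
  return observedErrorRatio / (1 - sloTarget);
}

/** Hours until the whole budget is spent if the given burn rate is sustained. */
function hoursToExhaustion(rate: number, windowDays: number): number {
  return (windowDays * 24) / rate;
}

// 99.9% over 30 days: about 43.2 minutes of budget.
const budget = errorBudgetMinutes(0.999, 30);
// A sustained 1.44% error ratio against a 99.9% SLO is a burn rate of about 14.4,
// which would spend the whole 30-day budget in about 50 hours.
const rate = burnRate(0.0144, 0.999);
const hoursLeft = hoursToExhaustion(rate, 30);
console.log({ budget, rate, hoursLeft });
```

Results are floating-point values, so compare them with a tolerance rather than with strict equality.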

SLO-Based Alerting (Multi-Window, Multi-Burn-Rate)

# Alert when error budget burns too fast (99.9% SLO, so the allowed error ratio is 0.001)
# Page immediately: 2% of 30-day budget consumed in 1 hour (burn rate 14.4)
- alert: ErrorBudgetBurnRatePage
  expr: |
    (
      sum(rate(http_requests_total{status=~"5.."}[1h])) / sum(rate(http_requests_total[1h])) > (14.4 * 0.001)
    ) and (
      sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > (14.4 * 0.001)
    )
  for: 2m
  labels:
    severity: page
  annotations:
    summary: "Error budget burning at 14.4x rate: page immediately"

# Ticket: 5% of 30-day budget consumed in 6 hours (burn rate 6)
- alert: ErrorBudgetBurnRateTicket
  expr: |
    sum(rate(http_requests_total{status=~"5.."}[6h])) / sum(rate(http_requests_total[6h])) > (6 * 0.001)
  for: 15m
  labels:
    severity: ticket
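The 14.4 and 6 multipliers in these rules follow from the budget fraction and the window length. The sketch below shows one way to derive them and to express the two-window condition. The names `burnRateThreshold` and `shouldFire` are hypothetical, and the 30-day period and 99.9% target are taken from the examples above.

```typescript
// Where the 14.4 and 6 multipliers come from, and how the two-window check works.
// Function names are illustrative only.

/** Burn rate that spends `budgetFraction` of the budget within `alertWindowHours`. */
function burnRateThreshold(
  budgetFraction: number,
  alertWindowHours: number,
  sloPeriodHours: number,
): number {
  return (budgetFraction * sloPeriodHours) / alertWindowHours;
}

/** Fires only when both the long and the short window exceed the threshold. */
function shouldFire(
  longWindowErrorRatio: number,
  shortWindowErrorRatio: number,
  sloTarget: number,
  burnRate: number,
): boolean {
  const threshold = burnRate * (1 - sloTarget);
  return longWindowErrorRatio > threshold && shortWindowErrorRatio > threshold;
}

const periodHours = 30 * 24; // 30-day SLO period
const pageRate = burnRateThreshold(0.02, 1, periodHours);   // 2% of budget in 1 hour
const ticketRate = burnRateThreshold(0.05, 6, periodHours); // 5% of budget in 6 hours

// Still failing: both windows are above the threshold, so the page fires.
const ongoing = shouldFire(0.02, 0.02, 0.999, pageRate);
// Already recovered: the 1h window is still elevated but the 5m window is clean.
const recovered = shouldFire(0.02, 0.0005, 0.999, pageRate);
console.log({ pageRate, ticketRate, ongoing, recovered });
```

The short window is what lets the alert stop firing soon after the errors stop, instead of waiting for the one-hour average to decay.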

Prometheus & Grafana Stack

Metrics to Instrument in Every Service

The Four Golden Signals (Google SRE):

| Signal | What to measure | Prometheus metric type |
|--------|----------------|----------------------|
| Latency | Time to serve a request (P50, P95, P99) | histogram |
| Traffic | Requests per second | counter |
| Errors | Rate of failed requests | counter |
| Saturation | How full the service is (CPU, queue depth) | gauge |
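Latency percentiles from a histogram are estimates, not exact values. The sketch below approximates how a quantile is interpolated from cumulative buckets, in the spirit of PromQL's `histogram_quantile`. It is a simplified model under stated assumptions: `Bucket` and `estimateQuantile` are hypothetical names, and the sample counts are invented for illustration.

```typescript
/** Cumulative bucket: number of observations <= upperBound (like the `le` label). */
interface Bucket {
  upperBound: number; // Number.POSITIVE_INFINITY for the +Inf bucket
  cumulativeCount: number;
}

/** Linear interpolation inside the bucket that contains the requested rank. */
function estimateQuantile(q: number, buckets: Bucket[]): number {
  const last = buckets[buckets.length - 1];
  if (last === undefined || last.cumulativeCount === 0) return Number.NaN;

  const rank = q * last.cumulativeCount;
  let lowerBound = 0;
  let lowerCount = 0;
  for (const bucket of buckets) {
    if (bucket.cumulativeCount >= rank) {
      // A rank in the +Inf bucket is reported as the highest finite bound.
      if (!Number.isFinite(bucket.upperBound)) return lowerBound;
      const inBucket = bucket.cumulativeCount - lowerCount;
      if (inBucket === 0) return bucket.upperBound;
      const fraction = (rank - lowerCount) / inBucket;
      return lowerBound + (bucket.upperBound - lowerBound) * fraction;
    }
    lowerBound = bucket.upperBound;
    lowerCount = bucket.cumulativeCount;
  }
  return lowerBound;
}

// Invented data: 90 requests took <= 0.1s, the other 10 landed in the (0.1, 0.25] bucket.
const sampleBuckets: Bucket[] = [
  { upperBound: 0.1, cumulativeCount: 90 },
  { upperBound: 0.25, cumulativeCount: 100 },
  { upperBound: Number.POSITIVE_INFINITY, cumulativeCount: 100 },
];
const p99 = estimateQuantile(0.99, sampleBuckets);
console.log(p99);
```

With these numbers the estimate is roughly 0.235 s, wherever the ten slow requests really fell between 0.1 s and 0.25 s. When a latency SLO sits at 200 ms, a bucket boundary at or near 0.2 makes the P99 estimate around the target much more precise.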

Instrumentation (Node.js Example with prom-client)

import express from 'express';
import { Counter, Histogram, Gauge, register } from 'prom-client';

const app = express();

// HTTP request metrics (latency + traffic + errors)
// The status label is named `status` so that it matches the PromQL selectors
// used elsewhere in this document (status=~"5..").
export const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'HTTP request duration in seconds',
  labelNames: ['method', 'route', 'status'],
  buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5],
});

export const httpRequestTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total HTTP requests',
  labelNames: ['method', 'route', 'status'],
});

// Queue depth (saturation)
export const jobQueueDepth = new Gauge({
  name: 'job_queue_depth',
  help: 'Number of jobs waiting in the queue',
  labelNames: ['queue_name'],
});

// Business metric
export const ordersProcessed = new Counter({
  name: 'orders_processed_total',
  help: 'Total orders successfully processed',
  labelNames: ['payment_method', 'region'],
});

// Expose metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});
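The metrics above only produce data once something records into them on every request. A minimal sketch of such a middleware follows. It is written against small structural interfaces (`DurationHistogram`, `RequestCounter`, `RequestLike`, `ResponseLike`) that are assumptions for this example; prom-client's `Histogram` and `Counter` and Express's request and response objects are expected to fit them, but that is not verified here. The status label name is a parameter because it has to match both the `labelNames` declared on the metrics and the label used in the PromQL queries.

```typescript
// Hypothetical minimal shapes, so the sketch runs without any dependency.
type Labels = Record<string, string | number>;
interface DurationHistogram { startTimer(): (labels?: Labels) => number; }
interface RequestCounter { inc(labels: Labels, value?: number): void; }
interface RequestLike { method: string; route?: { path: string }; }
interface ResponseLike {
  statusCode: number;
  on(event: 'finish', listener: () => void): unknown;
}

function createMetricsMiddleware(
  histogram: DurationHistogram,
  counter: RequestCounter,
  statusLabel: string,
) {
  return (req: RequestLike, res: ResponseLike, next: () => void): void => {
    const stopTimer = histogram.startTimer();
    // Record once the response has been sent, when the status code is known.
    res.on('finish', () => {
      const labels: Labels = {
        method: req.method,
        // Use the route template, not the raw URL, to keep label cardinality bounded.
        route: req.route?.path ?? 'unmatched',
        [statusLabel]: res.statusCode,
      };
      stopTimer(labels); // observes the elapsed seconds with these labels
      counter.inc(labels);
    });
    next();
  };
}
```

In an Express app this would be registered before the routes, for example `app.use(createMetricsMiddleware(httpRequestDuration, httpRequestTotal, 'status'))`, passing whichever status label name the metrics actually declare.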

Prometheus Scrape Config

# prometheus.yml
scrape_configs:
  - job_name: 'api-service'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: api-service
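      # Join the pod address with the port from the prometheus.io/port annotation
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2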

Essential Grafana Dashboards

Dashboard 1: Service Health Overview

Request rate (RPS) — line chart
Error rate (%) — line chart with red alert threshold
P50/P95/P99 latency — line chart
Active pods — gauge
Error budget remaining — stat panel

Dashboard 2: Resource Utilization

CPU usage vs limit per pod
Memory usage vs limit per pod
Network I/O
Disk I/O

Dashboard 3: Business Metrics

Orders/signups/events per minute
Revenue-impacting error rate
Key funnel conversion rates

Structured Logging

Always emit structured JSON logs. Never console.log("error: " + err).

import pino from 'pino';

const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  base: {
    service: 'api-service',
    version: process.env.APP_VERSION,
    env: process.env.NODE_ENV,
  },
  redact: ['req.headers.authorization', 'body.password', 'body.credit_card'],
});

// Good: structured, searchable fields
logger.info({
  event: 'order.created',
  order_id: order.id,
  user_id: user.id,
  amount: order.total,
  payment_method: order.paymentMethod,
  duration_ms: Date.now() - startTime,
}, 'Order created successfully');

// Good: errors with full context
logger.error({
  event: 'payment.failed',
  order_id: order.id,
  error_code: err.code,
  err,  // pino serializes Error objects correctly
}, 'Payment processing failed');

Log Levels

| Level | Use for |
|-------|---------|
| trace | Very detailed debugging (disabled in production) |
| debug | Debug info useful during development |
| info | Normal operational events (request received, job completed) |
| warn | Unexpected but handled situations (retry succeeded, deprecated API used) |
| error | Errors that require attention but didn't crash the service |
| fatal | Errors that cause the process to exit |

In production: Set LOG_LEVEL=info. Only enable debug when actively investigating.
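To make the effect of `LOG_LEVEL` concrete, here is a small sketch of threshold-based filtering. The numeric values are pino's default level numbers; the helper names `isEnabled` and `parseLevel` are hypothetical and exist only for this illustration (pino performs this check internally).

```typescript
// pino's default numeric levels; a record is emitted when its level is >= the threshold.
const LEVELS = { trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60 } as const;
type Level = keyof typeof LEVELS;

function isEnabled(recordLevel: Level, threshold: Level): boolean {
  return LEVELS[recordLevel] >= LEVELS[threshold];
}

/** Parses LOG_LEVEL-style input, falling back to 'info' for missing or unknown values. */
function parseLevel(raw: string | undefined): Level {
  const value = (raw ?? '').toLowerCase();
  return Object.prototype.hasOwnProperty.call(LEVELS, value) ? (value as Level) : 'info';
}

const threshold = parseLevel('info');
const debugShown = isEnabled('debug', threshold); // false: debug is below info
const errorShown = isEnabled('error', threshold); // true
console.log({ debugShown, errorShown });
```

Raising the threshold to `debug` during an investigation turns on one extra level without touching `trace`, which stays off.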

Distributed Tracing with OpenTelemetry

Instrumentation Setup (Node.js)

// tracing.ts — must be loaded before everything else
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
import { ExpressInstrumentation } from '@opentelemetry/instrumentation-express';
import { PgInstrumentation } from '@opentelemetry/instrumentation-pg';

const sdk = new NodeSDK({
  serviceName: 'api-service',
  traceExporter: new OTLPTraceExporter({
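    // `url` is used as-is, so read the traces-specific variable, which holds the full /v1/traces URL
    url: process.env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT || 'http://jaeger:4318/v1/traces',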
  }),
  instrumentations: [
    new HttpInstrumentation(),
    new ExpressInstrumentation(),
    new PgInstrumentation(),
  ],
});

sdk.start();

Adding Custom Spans

import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('api-service');

async function processOrder(orderId: string) {
  return tracer.startActiveSpan('processOrder', async (span) => {
    span.setAttributes({
      'order.id': orderId,
      'order.source': 'api',
    });
    
    try {
      const result = await chargePayment(orderId);
      span.setStatus({ code: SpanStatusCode.OK });
      return result;
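    } catch (err) {
      // `err` is `unknown` under strict TypeScript, so narrow it before use
      const error = err instanceof Error ? err : new Error(String(err));
      span.recordException(error);
      span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
      throw err;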
    } finally {
      span.end();
    }
  });
}

Alerting Best Practices

Alert Fatigue Prevention

Symptom-based alerts over cause-based alerts:

❌ "CPU is above 80%" — might not impact users
✅ "P99 latency is above SLO threshold" — always user-impacting

Thresholds that eliminate noise:

Use `for: 5m` so transient spikes don't page
Use multi-window SLO burn rate (see above)
Page on user impact, ticket on operational concern
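The effect of a `for:` duration can be modelled in a few lines. This is a simplified sketch of the pending-then-firing behaviour, not Prometheus's actual implementation: `Sample` and `firstFiringTime` are hypothetical names, and the samples are assumed to be sorted by time, one per rule evaluation.

```typescript
interface Sample {
  timestampMs: number;
  value: number;
}

/**
 * Returns the timestamp at which the alert first fires, or null if it never does.
 * The condition must hold at every consecutive sample for at least `forMs`.
 */
function firstFiringTime(samples: Sample[], threshold: number, forMs: number): number | null {
  let pendingSince: number | null = null;
  for (const sample of samples) {
    if (sample.value > threshold) {
      if (pendingSince === null) pendingSince = sample.timestampMs; // enters "pending"
      if (sample.timestampMs - pendingSince >= forMs) return sample.timestampMs; // "firing"
    } else {
      pendingSince = null; // any healthy sample resets the timer
    }
  }
  return null;
}

const MINUTE = 60_000;
const toSamples = (values: number[]): Sample[] =>
  values.map((value, i) => ({ timestampMs: i * MINUTE, value }));

// A two-minute spike never fires with for: 5m.
const spike = firstFiringTime(toSamples([0.02, 0.02, 0, 0, 0, 0, 0]), 0.0144, 5 * MINUTE);
// A sustained problem starting at minute 3 fires at minute 8.
const sustained = firstFiringTime(
  toSamples([0.02, 0.02, 0, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02]),
  0.0144,
  5 * MINUTE,
);
console.log({ spike, sustained });
```

The trade-off is detection delay: a longer `for:` suppresses more noise but also postpones the page by the same amount.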

On-Call Runbook Template

Every alert must link to a runbook:

# Alert: HighErrorRate

**When does this fire?** Error rate > 1.44% over both the 1-hour and 5-minute windows (14.4x burn rate against the 99.9% SLO)

**Severity:** P1 — Page immediately

## 1. Initial Assessment (< 2 min)
- Check [Grafana dashboard](https://grafana.example.com/d/api-errors)
- Which endpoints are erroring? `sum by (route) (rate(http_requests_total{status=~"5.."}[5m]))`
- When did it start? Look at the deployment history

## 2. Common Causes

### Bad deployment
- Check if deploy happened with: `kubectl rollout history deployment/api-service -n production`
- If yes: `kubectl rollout undo deployment/api-service -n production`

### Database connectivity
- Check DB errors: `kubectl logs -l app=api-service --since=5m | grep "connection"`
- Check RDS health in AWS console
- If DB is down: enable read-only mode via feature flag

### Downstream service failure
- Check dependencies: `[list of dependencies and their status pages]`

## 3. Escalation
- 15 min without mitigation: page engineering lead
- 30 min without mitigation: page VP Engineering

## 4. Post-Incident
- Open postmortem issue within 24 hours
- Complete postmortem within 5 business days

Deeper Reference

For complete alerting rule sets and distributed tracing integration guides, see:

references/metrics-alerting.md — Prometheus recording rules, alert thresholds, and Grafana dashboard JSON for the four golden signals
references/tracing-patterns.md — OpenTelemetry SDK setup, span attribute conventions, and Tempo/Jaeger query patterns for distributed trace analysis

This skill should be used when the user asks about "monitoring", "observability", "Prometheus", "Grafana", "Datadog", "Loki", "Jaeger", "distributed tracing", "OpenTelemetry", "metrics", "logs", "traces", "alerting", "SLO", "SLA", "error budget", "on-call", "PagerDuty", "incident alert", "dashboard", "runbook", "MTTR", "MTTD", "log aggregation", "log parsing", "structured logging", "service mesh observability", "Istio metrics", "APM", or "application performance monitoring". Also trigger for "how do I know if my service is healthy", "I can't see what's happening in production", "why is P99 latency high", or "set up alerting for my service".

2 stars

testing

Updated Apr 5, 2026

npx skillsauth add harsh040506/claude-code-unified-skill-plugin-library monitoring-observability

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage shown |
|---------|-------------|--------|------------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 5, 2026, 5:11 PM; 4.0s; 3 files scanned

Install manually

# Clone the repo
git clone https://github.com/harsh040506/claude-code-unified-skill-plugin-library.git

# Copy into Claude Code skills folder (global)
cp -r claude-code-unified-skill-plugin-library/engineering/devops/skills/monitoring-observability ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

harsh040506/claude-code-unified-skill-plugin-library

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT