fatih-developer/api-observability-planner

Name: api-observability-planner
Author: fatih-developer

skills/api-observability-planner/SKILL.md

npx skillsauth add fatih-developer/fth-skills api-observability-planner

API Observability Planner Protocol

This skill aims to ensure that when an API degrades or goes down, the team can see why, ideally before users notice. It moves telemetry from ad hoc error logging ("just log the errors") to a structured observability pipeline.

Core assumption: If you can't measure it, you can't manage it. Blind APIs cause prolonged outages.

1. The Three Pillars Strategy (Static)

Define exactly what your framework will emit:

Logs (Events): Use structured JSON logging, never raw text strings. Write {"event": "login_failed", "user_id": 123, "reason": "bad_password"} rather than "User 123 failed login".
Metrics (Aggregations): Implement the RED Method:
- Rate: Requests per second.
- Errors: Failed request rate, with 4xx and 5xx tracked separately.
- Duration: Latency percentiles (p50, p90, p99).
Traces (Workflows): Distributed tracing (W3C Trace Context). Ensure trace_id and span_id propagate across microservices and database calls.
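The logging pillar above can be sketched with Python's standard `logging` module. This is a minimal illustration, not the skill's prescribed implementation: the `JsonFormatter` class, the `build_logger` helper, and the convention of passing event fields through `extra={"fields": {...}}` are all assumptions made for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Hypothetical convention: structured fields arrive via
        # logger.info("event_name", extra={"fields": {...}}).
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str = "api") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger().info("login_failed", extra={"fields": {"user_id": 123, "reason": "bad_password"}})` would emit one JSON object per event. Adding `trace_id` and `span_id` to the same `fields` dictionary is one way to correlate logs with traces.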

2. Health & Alerting Design

Define what constitutes "Healthy" and when pagers should go off.

Deep Health Checks: /healthz shouldn't just return 200 OK. It should verify the DB connection, Redis reachability, and critical downstream dependencies.
Alert Rules:
- Warning: p99 latency > 500ms for 5 minutes.
- Critical: 5xx error rate > 5% for 2 minutes.
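The health check and the two alert rules above can be sketched as plain functions. This is an illustration under stated assumptions: the function names are hypothetical, returning 503 for an unhealthy service is a common convention rather than something the skill specifies, and the caller is assumed to pass only samples from the relevant window (5 minutes for latency, 2 minutes for errors).

```python
from typing import Callable, Dict, List


def deep_health(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run every dependency probe; report healthy only if all of them pass."""
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False  # a probe that raises counts as a failed dependency
    healthy = all(results.values())
    # Assumption: 503 signals "unhealthy" to the load balancer or orchestrator.
    return {"status": 200 if healthy else 503, "checks": results}


def p99(latencies_ms: List[float]) -> float:
    """Nearest-rank 99th percentile of a non-empty sample."""
    ordered = sorted(latencies_ms)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n) in integer arithmetic
    return ordered[rank - 1]


def evaluate_alerts(latencies_ms: List[float], status_codes: List[int]) -> List[str]:
    """Apply the warning and critical rules to already-windowed samples."""
    alerts = []
    if latencies_ms and p99(latencies_ms) > 500:
        alerts.append("warning: p99 latency > 500ms")
    if status_codes:
        server_errors = sum(1 for code in status_codes if 500 <= code <= 599)
        if server_errors / len(status_codes) > 0.05:
            alerts.append("critical: 5xx error rate > 5%")
    return alerts
```

In a real deployment these rules usually live in the monitoring system rather than in application code; the sketch only shows the arithmetic behind them.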

3. Output Generation

Required Outputs (Must write BOTH to docs/api-report/):

Human-Readable Markdown (docs/api-report/api-observability-report.md)

### 🔭 API Observability Blueprint

**Instrumentation Strategy:** OpenTelemetry (OTel)
**Log Format:** Structured JSON

#### 📊 Core Metrics (RED Method)
1. **Rate:** Tracked via Prometheus `http_requests_total`.
2. **Errors:** Alert on HTTP 500-599. (4xx responses are client problems: track them, but don't page on-call.)
3. **Duration:** Tracked via `http_request_duration_seconds` (Buckets: 50ms, 100ms, 500ms, 1s, 5s).

#### 🚨 Alert Configuration (PagerDuty / Slack)
- **High Severity:** Order Creation 5xx Rate > 1% over 5m.
- **Low Severity:** Database free disk space < 20%.

#### 🆔 Tracing Propagation
Inject `traceparent` and `tracestate` headers into all outgoing upstream HTTP/gRPC requests.
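The propagation step in the template can be illustrated with a small sketch of the W3C Trace Context `traceparent` format (`version-traceid-parentid-flags`, all lowercase hex). The helper names are hypothetical, header keys are assumed to be lowercase already, and only version `00` is handled. In practice an OpenTelemetry SDK propagator would normally do this work.

```python
import re
import secrets
from typing import Optional

# Version 00 layout: 2 hex digits, 32 hex trace-id, 16 hex parent-id, 2 hex flags.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(value: str) -> Optional[dict]:
    """Return the header's parts, or None if the value is malformed."""
    match = _TRACEPARENT.match(value)
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero identifiers are invalid under the specification.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {"trace_id": trace_id, "parent_id": parent_id, "flags": flags}


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace with random identifiers."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def outgoing_headers(incoming: dict) -> dict:
    """Continue the incoming trace with a fresh span id, or start a new trace."""
    parent = parse_traceparent(incoming.get("traceparent", ""))
    if parent is None:
        return {"traceparent": new_traceparent()}
    span_id = secrets.token_hex(8)  # this service's span becomes the next parent-id
    headers = {"traceparent": f"00-{parent['trace_id']}-{span_id}-{parent['flags']}"}
    if "tracestate" in incoming:
        headers["tracestate"] = incoming["tracestate"]  # forwarded unchanged
    return headers
```

The key property is that the trace id stays constant across every hop while each service contributes its own span id.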

Machine-Readable JSON (docs/api-report/api-observability-output.json)

{
  "skill": "api-observability-planner",
  "framework": "OpenTelemetry",
  "metrics": {
    "latency_thresholds_ms": {"p95": 200, "p99": 500}
  },
  "alerts": [
    {"name": "High 5xx Rate", "condition": "error_rate > 1%", "duration": "5m", "severity": "High"}
  ]
}
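Writing both required outputs to docs/api-report/ could look like the sketch below. The `write_reports` function and the `REQUIRED_KEYS` check are assumptions for illustration: the skill shows a sample JSON document but does not define a formal schema, so the key list simply mirrors the top-level keys of that sample.

```python
import json
from pathlib import Path

# Assumption: the top-level keys of the sample output are treated as required.
REQUIRED_KEYS = ("skill", "framework", "metrics", "alerts")


def write_reports(root: Path, report_md: str, output: dict) -> tuple:
    """Write the Markdown report and the JSON output side by side."""
    missing = [key for key in REQUIRED_KEYS if key not in output]
    if missing:
        raise ValueError(f"output is missing keys: {missing}")

    out_dir = Path(root) / "docs" / "api-report"
    out_dir.mkdir(parents=True, exist_ok=True)

    md_path = out_dir / "api-observability-report.md"
    json_path = out_dir / "api-observability-output.json"
    md_path.write_text(report_md, encoding="utf-8")
    json_path.write_text(json.dumps(output, indent=2) + "\n", encoding="utf-8")
    return md_path, json_path
```

Validating before writing means a malformed plan fails loudly instead of leaving one file updated and the other stale.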

Guardrails

Log Forging / Injection: Sanitize user-controlled values before logging them (for example, strip or escape CR/LF characters) so that an attacker cannot spoof additional log lines.
PII in Logs: Explicitly call out that passwords, tokens, credit card numbers, and email addresses must be masked or scrubbed before being written to stdout or log aggregators.
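Both guardrails can be sketched as a single scrubbing pass applied to event fields before they are logged. The `SENSITIVE_KEYS` set, the `***` mask, and the function names are hypothetical choices for this example. Real field names vary by application, and matching on key names alone will not catch PII embedded in free text.

```python
# Hypothetical field names; adjust to the application's own schema.
SENSITIVE_KEYS = {"password", "token", "credit_card", "email"}


def sanitize_value(value: str) -> str:
    """Escape CR/LF so user input cannot start a forged log line."""
    return value.replace("\r", "\\r").replace("\n", "\\n")


def scrub(fields: dict) -> dict:
    """Mask sensitive fields and neutralize line breaks in string values."""
    clean = {}
    for key, value in fields.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "***"  # never log the original value
        elif isinstance(value, str):
            clean[key] = sanitize_value(value)
        else:
            clean[key] = value
    return clean
```

JSON serialization already escapes newlines inside string values, so the line-break handling matters most for plain-text sinks or for systems that re-render log fields.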

Architects which metrics to collect, how logs should be formatted, and how distributed tracing should be implemented across boundaries.

4 stars

development

Updated Apr 13, 2026

Install this skill globally with one command: npx skillsauth add fatih-developer/fth-skills api-observability-planner

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy: container and dependency vulnerability scanner. Clean (95%)
- Semgrep: static code analysis for vulnerabilities. Clean (95%)
- mcp-scan (Snyk): Model Context Protocol security validation. Clean (95%)
- Snyk (dep): open source security scanning. Skipped (50%)
- Socket.dev: supply chain security analysis. Skipped (50%)
- VirusTotal: multi-engine malware detection. Skipped (50%)
- CrowdStrike: advanced threat intelligence. Skipped (50%)
- OSV-Scanner: Open Source Vulnerability database check. Skipped (50%)
- OWASP Dep-Check: Skipped (50%)

Last scanned: Apr 13, 2026, 4:45 AM (66.2s, 2 files scanned)

Related Skills

fatih-developer/prompt-crafter

tools

Verified, Trusted, Community

Create, optimize, critique, and programmatically structure prompts for AI systems. Use this skill whenever the user is designing or improving a static prompt, system prompt, coding prompt, agent prompt, workflow prompt, MCP-oriented prompt package, or an algorithmic prompt optimization pipeline. Also use it when the user asks to turn vague AI behavior into a precise instruction set, tool policy, agent spec, evaluation metric, or prompt architecture.

5 | SKILL.md | Updated Jun 4, 2026

fatih-developer/plan-hardener

testing

Verified, Trusted, Community

Assumption-first architecture review skill to stress-test project plans and expose hidden risks.

5 | SKILL.md | Updated Jun 4, 2026

fatih-developer/design-md-enforcer

testing

Verified, Trusted, Community

Enforce and manage DESIGN.md specifications, extract design systems from URLs, and combine design reasoning with token roles to prevent drift.

5 | SKILL.md | Updated Jun 4, 2026

fatih-developer/claude-style-coding

testing

Verified, Trusted, Community

Forces the agent to act with a Claude-like product mindset, prioritizing user journey, UX states, and visual quality before coding.

5 | SKILL.md | Updated Jun 4, 2026

Install manually

# Clone the repo
git clone https://github.com/fatih-developer/fth-skills.git

# Copy into the Claude Code skills folder (global), creating it first if needed
mkdir -p ~/.claude/skills
cp -r fth-skills/skills/api-observability-planner ~/.claude/skills/

Repository: fatih-developer/fth-skills (4 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT