curiositech/log-aggregation-architect

Name: log-aggregation-architect
Author: curiositech

skills/log-aggregation-architect/SKILL.md

Log Aggregation Architect

Expert in designing centralized log pipelines with structured logging, efficient collection, and cost-effective retention.

Activation Triggers

Activate on: "log aggregation", "structured logging", "Fluentd config", "Vector pipeline", "Loki setup", "ELK stack", "log pipeline", "log retention policy", "centralized logging", "log shipping"

NOT for: Metrics/dashboards → monitoring-stack-deployer | Distributed tracing → logging-observability | Alerting → site-reliability-engineer

Quick Start

Standardize log format — JSON structured logs with consistent fields across all services
Deploy collection agents — Vector or Fluentd as DaemonSet on every node
Choose storage backend — Grafana Loki (cost-effective), Elasticsearch (full-text search), or ClickHouse (analytics)
Define retention tiers — hot (7d searchable), warm (30d compressed), cold (1y archived)
Build correlation — trace ID propagation so logs link to traces and metrics

Core Capabilities

| Domain | Technologies |
|--------|-------------|
| Collection | Vector 0.43, Fluentd 1.17, Fluent Bit 3.2, OTEL Collector |
| Storage | Grafana Loki 3.x, Elasticsearch 8.x, ClickHouse, S3 archive |
| Structured Logging | JSON, logfmt, OpenTelemetry Logs, pino, winston, slog (Go) |
| Pipeline | Transform, filter, route, sample, deduplicate, redact PII |
| Visualization | Grafana (Loki), Kibana (Elastic), Grafana Explore |

Architecture Patterns

Vector Pipeline (Recommended 2026)

# vector.toml — collect, transform, route
[sources.kubernetes]
type = "kubernetes_logs"
auto_partial_merge = true

[transforms.structured]
type = "remap"
inputs = ["kubernetes"]
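source = '''
  # Parse JSON logs and merge the fields into the event so the kubernetes
  # metadata survives; non-JSON lines keep their raw .message
  parsed, err = parse_json(.message)
  if err == null && is_object(parsed) {
    . = merge(., object!(parsed))
  }
  # Keep the source timestamp; only fill it in when missing
  if !exists(.timestamp) { .timestamp = now() }
  # || falls back on null; ?? only handles errors
  .service = .kubernetes.pod_labels."app.kubernetes.io/name" || "unknown"
  .environment = .kubernetes.pod_namespace
  # Redact PII (email addresses)
  if is_string(.message) {
    .message = redact(string!(.message),
      filters: [r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'])
  }
'''

[transforms.sampler]
type = "sample"
inputs = ["structured"]
rate = 10  # Keep 1 in 10 debug logs
exclude = '.level != "debug"'  # Always keep non-debug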

[sinks.loki]
type = "loki"
inputs = ["sampler"]
endpoint = "http://loki-gateway:3100"
labels.service = "{{ service }}"
labels.level = "{{ level }}"
encoding.codec = "json"
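
The intent of the sampling stage (keep every non-debug event, keep roughly one in `rate` debug events) can be illustrated outside Vector. The following Python sketch is not Vector's implementation; the `DebugSampler` class and its counter-based selection are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class DebugSampler:
    """Hypothetical helper: keep all non-debug events and 1 in `rate` debug events."""

    rate: int = 10
    _seen_debug: int = 0

    def keep(self, event: dict) -> bool:
        # Non-debug events bypass sampling entirely.
        if event.get("level") != "debug":
            return True
        # Counter-based sampling: keep the 1st, (rate + 1)th, ... debug event.
        self._seen_debug += 1
        return (self._seen_debug - 1) % self.rate == 0


if __name__ == "__main__":
    sampler = DebugSampler(rate=10)
    events = [{"level": "debug"}] * 100 + [{"level": "error"}] * 5
    kept = [e for e in events if sampler.keep(e)]
    print(len(kept))  # prints 15 (10 sampled debug events + 5 errors)
```

A counter keeps the example deterministic; a production sampler might instead hash a key such as a trace ID so that all lines of one request are kept or dropped together.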

Structured Log Schema (Cross-Language Standard)

{
  "timestamp": "2026-03-20T14:30:00.000Z",
  "level": "info",
  "message": "Order processed successfully",
  "service": "order-api",
  "trace_id": "abc123def456",
  "span_id": "789ghi",
  "user_id": "usr_masked",
  "order_id": "ord_12345",
  "duration_ms": 142,
  "http": {
    "method": "POST",
    "path": "/api/v1/orders",
    "status": 201
  }
}
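
One way a service could emit this schema is sketched below with Python's standard `logging` module. The `JsonFormatter` class, the `build_logger` helper, and the convention of passing extra fields under `extra={"fields": ...}` are assumptions for the example, not part of the skill.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: render each record as one JSON object per line."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc)
        level = record.levelname.lower()
        if level == "warning":
            level = "warn"  # match the level names used in the pipeline
        payload = {
            "timestamp": ts.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": level,
            "message": record.getMessage(),
            "service": self.service,
        }
        # Fields passed via extra={"fields": {...}} are merged in as-is.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(service: str) -> logging.Logger:
    """Attach the JSON formatter to a stream handler for the given service."""
    logger = logging.getLogger(service)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter(service))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    log = build_logger("order-api")
    log.info(
        "Order processed successfully",
        extra={"fields": {"trace_id": "abc123def456", "order_id": "ord_12345", "duration_ms": 142}},
    )
```

Keeping the shared fields (`timestamp`, `level`, `message`, `service`) in the formatter means individual call sites only supply the event-specific fields.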

Retention Tier Strategy

HOT (0-7 days):
  ├─ Searchable in Loki (label-indexed) or Elasticsearch (full-text indexed)
  ├─ Instant query response (<1s)
  └─ Cost: $$$ (SSD, indexed)

WARM (7-30 days):
  ├─ Compressed, queryable with delay
  ├─ Query response 5-30s
  └─ Cost: $$ (HDD, partial index)

COLD (30-365 days):
  ├─ S3/GCS archive, queryable via Athena/BigQuery
  ├─ Query response: minutes
  └─ Cost: $ (object storage, no index)

DELETED (365+ days):
  └─ Lifecycle policy auto-deletes (compliance permitting)
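
The tier boundaries above can be expressed as a small lookup. This Python sketch assumes half-open ranges (a log that is exactly 7 days old counts as warm), which the tier list does not specify; the function name `retention_tier` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def retention_tier(log_time: datetime, now: datetime) -> str:
    """Map a log timestamp to a tier name, assuming half-open age ranges."""
    age = now - log_time
    if age < timedelta(days=7):
        return "hot"
    if age < timedelta(days=30):
        return "warm"
    if age < timedelta(days=365):
        return "cold"
    return "deleted"


if __name__ == "__main__":
    now = datetime(2026, 3, 20, tzinfo=timezone.utc)
    for days in (1, 7, 45, 400):
        print(days, retention_tier(now - timedelta(days=days), now))
```

In practice the transitions are usually enforced by the storage backend's lifecycle rules; a function like this is mainly useful for tests or cost estimates.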

Anti-Patterns

Unstructured string logs — console.log("User " + id + " did thing") is unsearchable. Use structured JSON with consistent field names.
Logging sensitive data — PII, tokens, passwords in logs. Redact at the pipeline level (Vector remap, Fluentd filter) before storage.
No log levels — everything at INFO. Use DEBUG for development, INFO for business events, WARN for recoverable issues, ERROR for failures requiring attention.
Unbounded retention — keeping all logs forever. Define retention tiers with automatic lifecycle policies. Most logs lose value after 30 days.
Missing trace correlation — logs without trace IDs cannot be correlated with distributed traces. Propagate OpenTelemetry trace context into every log line.
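
To make the redaction advice concrete, here is a minimal Python sketch of what an email-redaction step does to a log message. It is meant for checking a pattern in a unit test, not as a replacement for pipeline-level redaction; the `redact_emails` name and the `[REDACTED]` placeholder are assumptions.

```python
import re

# Simple email pattern; real-world PII detection usually needs more filters.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def redact_emails(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every email-like substring with a placeholder."""
    return EMAIL_RE.sub(placeholder, text)


if __name__ == "__main__":
    print(redact_emails("Password reset sent to alice@example.com"))
    # prints: Password reset sent to [REDACTED]
```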

Quality Checklist

[ ] All services emit JSON structured logs
[ ] Consistent field schema across services (timestamp, level, service, trace_id)
[ ] Log collection agents deployed as DaemonSet (Vector or Fluent Bit)
[ ] PII redaction applied in pipeline before storage
[ ] Debug logs sampled (not all collected in production)
[ ] Retention tiers defined: hot, warm, cold with lifecycle policies
[ ] Trace IDs propagated into log context
[ ] Log-based alerts configured for error rate spikes
[ ] Grafana Explore or Kibana connected for log search
[ ] Storage costs monitored and budget-capped
[ ] Log pipeline has backpressure handling (no data loss under load)
[ ] Compliance requirements met (GDPR right-to-erasure for logs with PII)
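
The first two checklist items can be spot-checked mechanically. The sketch below, which assumes one JSON object per log line, reports which of the shared fields are missing; the `missing_fields` helper is hypothetical.

```python
import json

# Shared fields named in the checklist above.
REQUIRED_FIELDS = ("timestamp", "level", "service", "trace_id")


def missing_fields(line: str) -> list[str]:
    """Return the required fields absent from one log line.

    Lines that are not JSON objects are treated as missing every field.
    """
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return list(REQUIRED_FIELDS)
    if not isinstance(event, dict):
        return list(REQUIRED_FIELDS)
    return [name for name in REQUIRED_FIELDS if name not in event]


if __name__ == "__main__":
    sample = '{"timestamp": "2026-03-20T14:30:00.000Z", "level": "info", "service": "order-api"}'
    print(missing_fields(sample))  # prints ['trace_id']
```

Running a check like this against a sample of each service's output is one way to catch schema drift before it reaches the storage backend.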

testing

Updated Apr 4, 2026

npx skillsauth add curiositech/windags-skills log-aggregation-architect

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | % |
|--------|---------|-------------|---|
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: Apr 4, 2026, 2:15 PM (4.5s, 1 file scanned)

SKILL.md

license: Apache-2.0
name: log-aggregation-architect
description: Centralized log pipeline architect with structured logging, Fluentd/Vector, and retention policies. Activate on: log aggregation, structured logging, Fluentd, Vector, Loki, ELK stack, log pipeline, log retention, centralized logging. NOT for: metrics and dashboards (use monitoring-stack-deployer), distributed tracing (use logging-observability), alerting rules (use site-reliability-engineer).
allowed-tools: Read,Write,Edit,Bash(docker:*,kubectl:*,terraform:*,npm:*,npx:*)
category: DevOps & Infrastructure
- skill: monitoring-stack-deployer
  reason: Metrics and logs often share infrastructure and correlation

Install manually

# Clone the repo
git clone https://github.com/curiositech/windags-skills.git

# Copy into Claude Code skills folder (global), creating it if needed
mkdir -p ~/.claude/skills
cp -r windags-skills/skills/log-aggregation-architect ~/.claude/skills/

Repository

curiositech/windags-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT