mongo-ai/posthog-selfhosted-diagnosis

Name: posthog-selfhosted-diagnosis
Author: mongo-ai

.claude/skills/posthog-selfhosted-diagnosis/SKILL.md

npx skillsauth add mongo-ai/posthog-triage-agent posthog-selfhosted-diagnosis

Self-hosted diagnosis workflow

Before you diagnose: fetch live docs

Run docs-search("self-hosted troubleshooting {specific symptom}")
Run docs-search("self-hosted deployment {docker OR helm OR kubernetes}")
If the symptom looks like a known bug, run gh search issues --repo PostHog/posthog "self-hosted {symptom}"
Use the fetched docs as your primary reference

The golden rule

PostHog support helps with PostHog configuration. Customer infrastructure (Kubernetes clusters, Docker hosts, cloud provider settings, networking) is the customer's responsibility.

Be helpful and point them in the right direction, but be clear about boundaries. Don't guess at infrastructure fixes you can't verify.

Step 0: Determine deployment type

| Signal | Deployment type |
|--------|----------------|
| "docker compose" or "docker-compose.yml" | Docker Compose (hobby/small) |
| "helm", "kubernetes", "k8s", "EKS", "GKE", "AKS" | Kubernetes / Helm |
| "posthog cloud" or no mention of self-hosting | Cloud (wrong skill; use standard diagnosis) |

If unclear, ask: "Are you using PostHog Cloud (app.posthog.com / eu.posthog.com) or a self-hosted deployment?"
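The keyword matching in the table above could be sketched as a small shell function. This is only an illustration: the function name `detect_deployment` and its return labels are hypothetical, and it returns "unclear" when it finds no signal so that the clarifying question can be asked, rather than assuming Cloud.

```shell
# Hypothetical helper: map free-form ticket text to a deployment type
# using the signal keywords from the Step 0 table.
detect_deployment() (
  # Lowercase the text, then turn every character except letters, digits,
  # and '-' into a space so short keywords match as whole words only
  # (so "weeks" does not count as "EKS").
  text=" $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9-' ' ') "
  case "$text" in
    *" docker compose "*|*" docker-compose yml "*)
      echo "docker-compose" ;;
    *" helm "*|*" kubernetes "*|*" k8s "*|*" eks "*|*" gke "*|*" aks "*)
      echo "kubernetes" ;;
    *" posthog cloud "*)
      echo "cloud" ;;
    *)
      # No signal found: ask the customer instead of guessing.
      echo "unclear" ;;
  esac
)
```

For example, `detect_deployment "We deployed with Helm on EKS."` prints `kubernetes`.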

Step 1: Check instance health

Ask the customer to check:

Instance status page: {their-posthog-url}/_health or /api/projects/
ClickHouse status: Check if ClickHouse container/pod is running
Postgres status: Check if Postgres is accepting connections
Redis status: Check if Redis is responding to PING
Kafka status (if applicable): Check consumer lag
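A minimal sketch of how the first few checks might be scripted, assuming `curl`, `pg_isready`, and `redis-cli` are available on a machine that can reach the services. The function names, the default ports, and the reading of any 2xx response from `/_health` as "healthy" are assumptions for illustration.

```shell
# Classify the HTTP status code returned by the health endpoint.
health_verdict() {
  case "$1" in
    2??) echo "healthy" ;;
    000) echo "unreachable" ;;  # curl reports 000 when no HTTP response arrived
    *)   echo "unhealthy" ;;
  esac
}

# Probe {their-posthog-url}/_health and print a verdict.
probe_posthog() {
  code=$(curl -sS -o /dev/null -m 10 -w '%{http_code}' "$1/_health" 2>/dev/null)
  health_verdict "${code:-000}"
}

# Postgres: pg_isready exits 0 when the server accepts connections.
probe_postgres() { pg_isready -h "$1" -p "${2:-5432}"; }

# Redis: prints PONG when the server responds.
probe_redis() { redis-cli -h "$1" -p "${2:-6379}" ping; }
```

Usage would look like `probe_posthog "https://posthog.example.com"`, where the URL is a placeholder for the customer's instance.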

For Docker Compose:

docker compose ps       # All containers should be "Up"
docker compose logs -f  # Check for error patterns

For Kubernetes:

kubectl get pods -n posthog    # All pods should be Running/Ready
kubectl logs -n posthog <pod>  # Check for errors
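When the pod list is long, it can help to filter it down to the pods that need attention. The sketch below assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...); the function name `unhealthy_pods` is hypothetical.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and prints pods that
# are not Running, or that are Running with fewer ready containers than
# expected. Completed pods (finished jobs) are skipped.
unhealthy_pods() {
  awk '
    NF < 3 { next }                 # ignore blank or malformed lines
    $3 == "Completed" { next }
    {
      split($2, ready, "/")         # READY column looks like "1/2"
      if ($3 != "Running" || ready[1] != ready[2])
        print $1 " (" $3 ", ready " $2 ")"
    }'
}
```

Usage: `kubectl get pods -n posthog --no-headers | unhealthy_pods`. An empty result means every pod is Running and fully ready.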

Step 2: Classify the self-hosted issue

| Symptom | Likely cause | Category |
|---------|-------------|----------|
| "No events coming in" | Ingestion pipeline, reverse proxy, or SDK config | PostHog + Infra |
| "Events arrive but queries are slow" | ClickHouse resource limits or schema issues | Infra |
| "Session replay not working" | Recording config, MinIO/object storage, or CSP | PostHog + Infra |
| "Feature flags not evaluating" | API endpoint not reachable, or /decide blocked | PostHog + Infra |
| "Can't log in / SSO broken" | Auth config, Postgres, or network issues | Infra |
| "Upgrade failed" | Migration issues, version compatibility | PostHog + Infra |
| "Out of disk / memory" | Resource limits, ClickHouse data growth | Infra |
| "CORS / CSP errors" | Reverse proxy misconfiguration | Infra |
| "SSL certificate errors" | Certificate config, proxy termination | Infra |
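As a rough illustration, the symptom table can be expressed as a first-match lookup. The keyword patterns below are assumptions chosen to mirror the table rows; they are naive substring matches and will misclassify unusual phrasings, so treat the output as a starting hint only. The function name `classify_symptom` is hypothetical.

```shell
# Hypothetical helper: print the table's Category for a symptom description.
# Patterns are checked top to bottom; the first match wins.
classify_symptom() (
  text=" $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$text" in
    *"no events"*)                          echo "PostHog + Infra" ;;
    *slow*)                                 echo "Infra" ;;
    *replay*|*recording*)                   echo "PostHog + Infra" ;;
    *"feature flag"*|*/decide*)             echo "PostHog + Infra" ;;
    *"log in"*|*login*|*" sso"*)            echo "Infra" ;;
    *upgrade*|*migration*)                  echo "PostHog + Infra" ;;
    *"out of disk"*|*"out of memory"*)      echo "Infra" ;;
    *cors*|*csp*)                           echo "Infra" ;;
    *ssl*|*certificate*)                    echo "Infra" ;;
    *)                                      echo "unclassified" ;;
  esac
)
```

For example, `classify_symptom "Events arrive but queries are slow"` prints `Infra`.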

Step 3: Common self-hosted issues

Reverse proxy / CORS / SSL

The most common self-hosted issue. Run: docs-search("self-hosted reverse proxy CORS configuration")

Key checks:

Is the reverse proxy (nginx, Caddy, Traefik) forwarding WebSocket connections? Session replay and live events need WebSocket support.
Is the Host header being passed through correctly?
Are CORS headers configured to allow the customer's app domain?
Is SSL termination happening at the proxy or at PostHog?
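One way to test the CORS check from outside is to send a preflight request through the proxy and inspect the response headers. The sketch below is an assumption-laden example: the URLs are placeholders, `/decide` is used only because it is the endpoint named in Step 2, and the function name `cors_allows_origin` is hypothetical.

```shell
# Reads HTTP response headers on stdin. Succeeds (exit 0) when an
# Access-Control-Allow-Origin header is "*" or exactly the given origin.
cors_allows_origin() {
  tr -d '\r' | awk -v want="$1" '
    tolower($1) == "access-control-allow-origin:" {
      if ($2 == "*" || $2 == want) ok = 1
    }
    END { exit (ok ? 0 : 1) }'
}

# Example preflight (placeholder URLs, not run here):
#   curl -sS -o /dev/null -D - -X OPTIONS \
#     -H "Origin: https://app.example.com" \
#     -H "Access-Control-Request-Method: POST" \
#     "https://posthog.example.com/decide" \
#     | cors_allows_origin "https://app.example.com" && echo "CORS ok"
```

If the function fails, the proxy is either stripping the header or not allowing the customer's app domain, which points back at the proxy configuration.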

ClickHouse resource issues

ClickHouse is the most resource-hungry component.

Common patterns:

OOM kills: ClickHouse container/pod restarting → needs more memory
Slow queries: Check system.query_log for long-running queries
Disk full: ClickHouse data grows fast → check retention settings, add more disk, or configure TTL
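The three patterns above map to three quick checks a customer's operator could run. This is a sketch under assumptions: it presumes `clickhouse-client` and `kubectl` are reachable from the operator's shell, the `posthog` namespace matches the earlier examples, and the function names are hypothetical.

```shell
# OOM kills: print why each container in a pod was last terminated
# (prints "OOMKilled" for an out-of-memory kill).
last_termination_reason() {
  kubectl get pod "$1" -n "${2:-posthog}" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
}

# Slow queries: the ten slowest finished queries from the last hour.
slowest_queries() {
  clickhouse-client --query "
    SELECT event_time, query_duration_ms, memory_usage,
           substring(query, 1, 120) AS query_head
    FROM system.query_log
    WHERE type = 'QueryFinish' AND event_time > now() - INTERVAL 1 HOUR
    ORDER BY query_duration_ms DESC
    LIMIT 10"
}

# Disk full: reads \`df -P <path>\` output on stdin and succeeds when the
# Capacity column is at or above the given percentage.
disk_usage_at_least() {
  awk -v limit="$1" '
    NR == 2 { sub(/%/, "", $5); if ($5 + 0 >= limit + 0) over = 1 }
    END { exit (over ? 0 : 1) }'
}
```

Usage for the disk check, with a placeholder path for the ClickHouse data volume: `df -P /var/lib/clickhouse | disk_usage_at_least 90 && echo "disk nearly full"`.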

Ask:

"How much memory is allocated to ClickHouse? (Recommend minimum 4GB, 8GB+ for production)"
"How much disk space is available? Check df -h on the ClickHouse data volume"

Event ingestion failures

If events aren't arriving:

Check if the Kafka consumer is running and processing
Check for ingestion warnings in PostHog (if UI is accessible)
Verify the SDK is pointed at the correct self-hosted URL (not app.posthog.com)
Check the reverse proxy logs for blocked requests
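The SDK URL check is easy to get wrong by eye, so a small helper can make it explicit. This sketch assumes that any hostname under `posthog.com` means PostHog Cloud; the function name `points_at_cloud` is hypothetical.

```shell
# Succeeds (exit 0) when the given SDK host URL points at PostHog Cloud
# rather than a self-hosted instance.
points_at_cloud() {
  # Strip the scheme, then everything from the first "/" or ":" onward,
  # leaving only the hostname.
  host=$(printf '%s' "$1" \
    | sed -e 's|^[a-zA-Z]*://||' -e 's|[/:].*$||' \
    | tr '[:upper:]' '[:lower:]')
  case "$host" in
    posthog.com|*.posthog.com) return 0 ;;
    *) return 1 ;;
  esac
}
```

For a self-hosted customer, `points_at_cloud "$SDK_HOST"` succeeding is a strong hint that events are being sent to Cloud instead of their own instance.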

Upgrade / migration failures

Run: docs-search("self-hosted upgrade migration troubleshooting")

Common issues:

Async migrations stuck: Check the async migrations page in PostHog UI (Settings → Instance Status → Async Migrations)
Version skipping: PostHog sometimes requires sequential upgrades (can't skip major versions)
Schema mismatch: ClickHouse schema not matching expected state after upgrade

Step 4: Search known self-hosted issues

gh search issues --repo PostHog/posthog "self-hosted {symptom}" --limit 10
gh search issues --repo PostHog/posthog "self hosted {alternate terms}" --limit 10

Also check the self-hosted docs:

gh api repos/PostHog/posthog.com/contents/contents/docs/self-host --jq '.[].name'
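The two searches above can be wrapped so both spellings are tried in one call and duplicate hits are merged. This is a sketch, assuming an authenticated `gh` CLI; the function names are hypothetical, and the output format is just one reasonable choice.

```shell
# Print one search phrase per spelling variant for a symptom.
issue_queries() {
  for prefix in "self-hosted" "self hosted"; do
    printf '%s %s\n' "$prefix" "$1"
  done
}

# Run each phrase against PostHog/posthog and print one line per issue:
# number, state, title, and URL. sort -u drops issues found by both phrases.
search_known_issues() {
  issue_queries "$1" | while IFS= read -r q; do
    gh search issues --repo PostHog/posthog "$q" --limit 10 \
      --json number,state,title,url \
      --jq '.[] | "#\(.number) [\(.state)] \(.title) \(.url)"'
  done | sort -u
}
```

Usage: `search_known_issues "CORS error"`.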

Step 5: Determine support boundary

In scope for PostHog support:

PostHog configuration and settings
SDK configuration pointed at self-hosted instance
PostHog upgrade guidance and migration issues
Feature configuration (flags, replay, experiments)
Explaining PostHog's infrastructure requirements
Known bugs in PostHog that affect self-hosted

Out of scope (customer's responsibility):

Kubernetes cluster management and troubleshooting
Docker host configuration and resource allocation
Cloud provider settings (AWS, GCP, Azure)
Network configuration, firewalls, load balancers
SSL certificate management
Database administration (Postgres tuning, ClickHouse optimization)
Backup and disaster recovery

How to communicate the boundary:

DO: "PostHog needs at least 4GB of memory for ClickHouse to run smoothly. It looks like your deployment might be hitting resource limits. Here's our infrastructure requirements doc: [link]. Your DevOps team can use this to right-size the deployment."

DON'T: Say "That's an infrastructure issue, not our problem," and don't attempt to debug their Kubernetes cluster configuration yourself.

Step 6: Consider Cloud recommendation

If the customer is struggling with self-hosted complexity, it's appropriate to gently suggest PostHog Cloud:

"If managing the infrastructure is becoming a challenge, PostHog Cloud handles all of this automatically and includes the same features. Happy to help you evaluate whether Cloud might be a better fit — you can migrate your data over."

Only suggest this when:

The issue is primarily infrastructure, not PostHog
The customer has expressed frustration with self-hosted maintenance
The customer isn't on self-hosted for compliance/data residency reasons
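Read as a decision rule, the conditions above might look like the sketch below. It assumes all three conditions must hold together, which is one interpretation of the list; the function name and the yes/no argument convention are hypothetical.

```shell
# Arguments are "yes" or "no" answers, in this order:
#   1. Is the issue primarily infrastructure?
#   2. Has the customer expressed frustration with self-hosted maintenance?
#   3. Is the customer self-hosting for compliance or data residency reasons?
# Succeeds (exit 0) only when suggesting Cloud is appropriate.
should_suggest_cloud() {
  [ "$1" = yes ] && [ "$2" = yes ] && [ "$3" = no ]
}
```

For example, `should_suggest_cloud yes yes no` succeeds, while `should_suggest_cloud yes yes yes` fails because the compliance constraint rules Cloud out.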

Escalation

Escalate to engineering when:

A PostHog bug specifically affects self-hosted but not Cloud
Migration/upgrade is stuck and the documented steps don't resolve it
Data corruption or loss in ClickHouse/Postgres
A security issue specific to the self-hosted deployment

Diagnose PostHog self-hosted deployment issues including Docker Compose and Helm/Kubernetes problems, ClickHouse and Postgres failures, resource limits, reverse proxy misconfigurations, and data ingestion gaps. Clearly distinguishes PostHog bugs from customer infrastructure issues.

Category: tools

Updated Apr 16, 2026

npx skillsauth add mongo-ai/posthog-triage-agent posthog-selfhosted-diagnosis

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | What it checks | Status | Score |
|---------|----------------|--------|-------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 17, 2026, 12:53 AM (31.0s, 1 file scanned)

Related Skills

mongo-ai/posthog-web-analytics-diagnosis (tools; SKILL.md updated Apr 16, 2026): Diagnose PostHog web analytics issues including missing pageviews, incorrect bounce rates, broken channel attribution, missing UTM data, reverse proxy problems, and discrepancies with other analytics tools.

mongo-ai/posthog-triage-report (business; SKILL.md updated Apr 16, 2026): Final synthesis skill. Produce a structured, evidence-graded triage report with a clear root-cause assessment, honest confidence, and a ready-to-send customer response.

mongo-ai/posthog-ticket-intake (tools; SKILL.md updated Apr 16, 2026): Normalize an incoming support ticket into structured investigation inputs: product area, identifiers, scope clues, URLs, timeframe, and likely first diagnostic path.

mongo-ai/posthog-survey-diagnosis (development; SKILL.md updated Apr 16, 2026): Diagnose PostHog survey issues including surveys not appearing, targeting mismatches, response collection failures, display timing problems, and API-mode survey integration issues.

Install manually

# Clone the repo
git clone https://github.com/mongo-ai/posthog-triage-agent.git

# Copy into Claude Code skills folder (global)
cp -r posthog-triage-agent/.claude/skills/posthog-selfhosted-diagnosis ~/.claude/skills/
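To confirm the copy worked, it may be enough to check that the skill's SKILL.md landed in the skills folder. A minimal sketch, where the function name `skill_installed` is hypothetical and the default folder is the one used in the copy command above:

```shell
# Succeeds (exit 0) when SKILL.md exists under the given skills folder
# (defaults to the global Claude Code skills folder).
skill_installed() {
  [ -f "${1:-$HOME/.claude/skills}/posthog-selfhosted-diagnosis/SKILL.md" ]
}
```

Usage: `skill_installed && echo "installed" || echo "missing"`.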

Repository: mongo-ai/posthog-triage-agent

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT