posthog

110 verified skills · 6,127 total stars

building-a-dashboard

Build a new dashboard, or update an existing one, from a set of insights — the same job the in-app assistant does with its upsert-dashboard tool, but over MCP. Use when a user asks to create a dashboard, put several metrics/charts together on one page, assemble a dashboard for a topic (product analytics, retention, revenue, activation, etc.), or add/remove/replace insights on a dashboard they already have. Covers deciding create vs update, reusing existing insights vs creating new ones, and using PostHog's vetted dashboard templates as reference for what a strong dashboard on a topic looks like.

tools 62

signals-scout-mcp-tool-calls

Signals scout for PostHog MCP tool calls. Watches $mcp_tool_call telemetry for tools that need improvement — high, broad-reach failure rates, retry/hammering that betrays a confusing schema, slow or context-bloating responses — groups problem tools by $mcp_tool_category (the owning product team) and files one report per problem category listing that category's problem tools each with a fix suggestion; falls back to one report per tool where category coverage is absent. Immediately-actionable reports carry a fix-loop metric (measurement query, baseline, goal) so the auto-started implementation task iterates until the number moves. Otherwise writes durable memory and closes out empty. Adapts to which fields the project actually captures.

tools 62

testing-mcp-tools-locally

Set up the local dev environment, seed data, and API keys to test the staff-only managed migrations MCP tools (managed-migrations-support-list, managed-migrations-support-get) end to end. Use when testing batch import support tooling, debugging MCP tool responses or discovery (tools not appearing), or verifying the support API before deploying. Covers the discovery gate: hidden scope, is_staff, user:read, and why wildcard keys and OAuth never work.

tools 62

authoring-error-tracking-alerts

Author error tracking alerts that fire when an issue is created, reopened, or starts spiking. Use when the user asks to set up error notifications, route exceptions to Slack/webhook/Linear, or evaluate which error events are worth alerting on. Covers trigger-event selection, integration choice, dedup against existing alerts, and shipping with the canonical message body shape.

development 62

turning-engineering-analytics-into-insights

Converts engineering analytics (PR / CI) data into saved PostHog insights, dashboards, and subscriptions, and explains what data the product reads so it can be queried directly with SQL. The engineering analytics dashboard and MCP tools run curated HogQL privately over per-team GitHub warehouse tables; this skill teaches discovering those tables via engineering-analytics-sources, replicating the curated column semantics in HogQL, reading the exposed engineering_analytics_* warehouse views where product logic is involved (CI cost, fingerprinted failure lines, commit attribution), saving the query with insight-create, and scheduling delivery with subscriptions-create. Use when asked to "save this as an insight", "put CI health / merge times on a dashboard", "email me PR throughput weekly", "chart CI cost", "track time to first review", "subscribe to these numbers", "alert on CI success rate", or "what data/tables/views does engineering analytics read". For ad-hoc CI and merge questions use diagnosing-ci-and-merge-bottlenecks; to investigate one specific CI failure use investigating-ci-failures.

tools 62

exploring-llm-costs

Investigate LLM spend in PostHog — total cost over time, cost by model, provider, user, trace, or custom dimension, token and cache-hit economics, and cost regressions. Use when the user asks "how much are we spending on LLMs?", "which model / user / feature is most expensive?", "why did cost spike?", wants to build a cost dashboard or alert, or pastes a trace URL and asks about its cost.

development 61

setting-up-a-custom-rest-source

Connect an arbitrary REST API to the PostHog data warehouse as a Custom source by authoring a JSON manifest, with no per-source code. Use when the user points at an API that has no built-in PostHog connector — "import data from this REST API", "sync my internal API", "connect this API from its docs", "build a custom data warehouse source" — and gives a docs URL or a natural-language description of the endpoints. Walks through drafting the RESTAPIConfig manifest (auth — bearer, API key, HTTP basic, or OAuth2 client credentials / refresh token — pagination, record path, incremental cursor, parent/child fan-out), validating it, test-reading live rows to verify the field mappings, and creating the source. If the API already has a native PostHog connector, use setting-up-a-data-warehouse-source instead — this skill checks the connector registry first and only handles APIs with no native connector.

tools 61

signals-scout-logs

Signals scout for PostHog logs. Watches for emerging and rate-shifted message patterns (window-over-window deltas), volume bursts, severity-distribution shifts, service silence, and trace-correlated bursts.

tools 61

signals-scout-csp-violations

Signals scout for Content Security Policy violation reports. Watches `$csp_violation` events for blocked-URL clusters, per-directive bursts, post-deploy regressions, and suspicious third-party domains, and files each validated cluster as a report in the inbox.

testing 61

designing-email-templates

Author, save, and edit email templates in the PostHog workflows library — compose email design JSON with Liquid personalization and create and round-trip-edit templates over MCP. Use when asked to design, build, update, or fix an email template for workflows, broadcasts, or campaigns.

tools 61

exploring-mcp-intent-clusters

Explore PostHog MCP intent clusters — agent goals grouped by semantic similarity, with each cluster's tool distribution and error rates. Use when the user asks "what are agents trying to do with the MCP?", "group the intents", "which goals fail most?", "what does each cluster route to?", wants to recompute the clustering, or pastes an MCP analytics intent-clustering URL.

tools 61

grouping-noisy-errors

Consolidate PostHog error tracking issues that are the same actual error reported under different fingerprints. Use when the user asks "why do I have so many TypeError issues that look the same?", "merge these duplicates", "stop splitting this error into new issues", or wants to clean up fingerprint sprawl. Decides between a one-shot merge of existing issues and a durable grouping rule that keeps future events from creating new fingerprints. Does NOT group conceptually similar bugs across different runtimes, SDKs, or call sites.

development 61

copying-endpoints-across-projects

Copy a PostHog endpoint (a saved HogQL/insight query exposed as an API route) to another project in the same organization, or duplicate it under a new name in the same project. Use when the user wants to duplicate an endpoint, promote an endpoint from staging to production, replicate an endpoint's query/variables/freshness config in another workspace, or clone an endpoint to iterate on it. Unlike feature flags and experiments, endpoints have NO native cross-project copy tool — this skill covers the read-then-recreate flow (endpoint-get then endpoint-create), the active-project switching it requires, name-collision checks, and the safe defaults (land unmaterialised in the target, verify with endpoint-run). Does not cover editing endpoint versions (see managing-endpoint-versions) or authoring a brand-new endpoint from scratch (see creating-an-endpoint).

tools 61

signals-scout-anomaly-detection

Signals scout that watches a project's most-viewed dashboards and insights for recent anomalies — bursts, drops, flat-lines, and trend breaks scored against each insight's own seasonality-matched baseline. Files each anomaly as a finished 1:1 inbox report on the report channel (emit_report / edit_report) rather than a weak signal.

data-ai 61

exploring-mcp-tool-usage

Starting point for exploring how a PostHog MCP server's tools are used — routes a broad question to the typed tool that answers it. Use when the user asks "how is my MCP doing?", "what should I look at?", "explore my tool calls", "who uses my MCP tools?", "what are agents doing with the MCP?", or pastes an MCP analytics URL without a specific question. Offers a menu of questions, each backed by a query tool, then hands off to the focused skill.

tools 61

investigating-error-issue

Investigates a single PostHog error tracking issue end-to-end. Use when the user provides an issue ID or pastes an issue URL (`/error_tracking/<id>`) and wants to understand the error — who it affects, what triggers it, when it started, whether it correlates with a release, browser, OS, or feature flag, and what the next step should be. Pulls aggregated metrics, sample exception events, segment breakdowns, linked replays, and synthesizes a hypothesis-grade summary in one pass.

tools 61

setting-up-a-data-warehouse-source

Guide the user through connecting a new data warehouse source — Postgres, MySQL, Stripe, Hubspot, MongoDB, Salesforce, BigQuery, Snowflake, and so on. Use when the user wants to "connect Stripe", "import data from Postgres", "add a new data source", "sync my warehouse tables", or wants to pick sync methods for each table. Walks through source-type discovery, credential validation, table discovery, per-table sync_type selection, and the final create call. Also covers picking a good prefix and what to do right after creation.

development 61

signals-scout-customer-analytics

Signals scout for PostHog Customer analytics (Accounts). Watches per-account engagement for churn-risk shapes — engagement cliffs, dormancy, champion departure — and the expansion inverse, weighted by commercial ownership, and files each validated risk as a report in the inbox.

tools 61

signals-scout-data-pipelines

Signals scout for PostHog data pipelines — CDP destinations and transformations, batch exports, and hog flows. Watches for delivery failures, degraded functions, and stalled exports against each pipeline's baseline, and files each validated delivery contradiction as a report in the inbox.

development 61

signals-scout-experiments

Signals scout for PostHog A/B experiments. Watches running experiments for validity threats (sample ratio mismatch, contamination, exposure stalls, mid-run flag mutations) and lifecycle drift (zombies, decided-but-running), and files each validated validity threat as a report in the inbox.

testing 61
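
The sample ratio mismatch check named in the signals-scout-experiments entry can be illustrated with a chi-square goodness-of-fit test. This is a minimal sketch, not the scout's actual logic: the function names, the 50/50 expected split, and the 0.001 alert threshold are all assumptions made for illustration.

```python
import math


def srm_p_value(control: int, test: int, expected_control_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value for a two-variant exposure split."""
    total = control + test
    expected_control = total * expected_control_share
    expected_test = total - expected_control
    chi2 = (
        (control - expected_control) ** 2 / expected_control
        + (test - expected_test) ** 2 / expected_test
    )
    # With one degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)), so no stats library is needed.
    return math.erfc(math.sqrt(chi2 / 2))


def has_srm(control: int, test: int, alpha: float = 0.001) -> bool:
    """Flag a mismatch when the split is very unlikely under the expected ratio.

    alpha=0.001 is an assumed, deliberately strict threshold.
    """
    return srm_p_value(control, test) < alpha
```

A perfectly balanced split such as `has_srm(5000, 5000)` is not flagged, while a 5,000 vs 5,500 split is.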

signals-scout-ai-observability

Signals scout for PostHog AI observability. Watches LLM traces for cost, latency, error, volume, and eval-performance regressions, sliced by the dimensions it discovers over time, and files each validated regression as a report in the inbox.

testing 61

signals-scout-replay-vision

Signals scout for PostHog Replay Vision scanners. Watches that enabled scanners keep observing (throughput / quota cliffs) and that what they see in aggregate gets surfaced (score shifts, recurring themes across sessions), and files each validated finding as a report in the inbox.

tools 61

signals-scout-web-analytics

Signals scout for PostHog web traffic. Watches per-channel session volume, attribution breakage, and landing-page health (bounce / 404 steps) against the site's own baseline, and files each validated divergence as a report in the inbox. Per-page web vitals have their own dedicated `signals-scout-web-vitals`.

development 61

signals-scout-skills-store

Skill-hygiene scout for the team's PostHog skills store, read entirely via the MCP skill tools. Watches recently-changed skills — plus a slow rotation over the most-used, highest-leverage ones — for statically-verifiable authoring violations: vague descriptions, bloated bodies, dead bundled-file links, kitchen-sink scope, committed secrets. Files each non-compliant skill as a report in the inbox, with the copy-ready fix inside.

tools 61

signals-scout-product-analytics

Signals scout for core product-analytics flows — funnels, retention, lifecycle, stickiness, and paths. Watches the team's saved flows for a derived-rate regression (conversion or retention sliding) while entrants hold, and files it as a report in the inbox.

data-ai 61

signals-scout-insight-alerts

Signals scout over a project's own configured insight alerts. Reads each alert's recent firing history and files a report for the firings a human likely missed — especially ones the standard notification path stayed silent on.

testing 61

signals-scout-revenue-analytics

Signals scout for PostHog revenue analytics. Watches for upstream failures (Stripe sync stalls, capture regressions), config drift, and goal-miss escalations, and files each validated finding as a report in the inbox.

development 61

exploring-mcp-sessions

Investigate individual PostHog MCP sessions — the sequence of tool calls a single agent made in one run, what it was trying to do, and where it went wrong. Use when the user asks "what did this MCP session do?", "show me the tool calls for session X", "what was the agent's goal?", "which sessions had errors?", "who is connecting to my MCP?", or pastes an MCP analytics sessions URL.

tools 61

triaging-error-issues

Triage PostHog error tracking issues during a daily or on-call review. Use when the user asks "what's broken?", "what new errors do we have?", "show me top errors today", "what should I look at this morning", or wants a prioritized list of active issues to work on. Surfaces new and high-impact issues, ranks by users affected and recency, points at linked replays, and proposes next actions (investigate, assign, suppress, merge).

data-ai 61

signals-scout-ingestion-warnings

Signals scout for ingestion warnings — events and person/group updates that were dropped, mangled, or partially rejected during ingestion. Watches the warnings stream for new warning types, bursts above a type's own baseline, and error-severity clusters with broad reach, and files each actionable root cause as a report with the affected events and the fix.

business 61

authoring-log-alerts

Author useful, low-noise log alerts on services in a PostHog project. Use when the user asks to set up alerts for their logs, suggest alerts they should add, or evaluate whether a service is worth monitoring. Covers service triage, baseline characterisation, threshold drafting, back-testing via simulate, and shipping with a notification destination.

testing 61

exploring-autocapture-events

Guides exploration of $autocapture events captured by posthog-js to understand user interactions, find CSS selectors (especially data-attr attributes), evaluate selector uniqueness, query matching clicks ad-hoc, and create actions. Use when the user asks about autocapture data, wants to find what users are clicking, needs to build actions from click events, asks about elements_chain, wants to build a trend or funnel filtered by clicks or other autocapture interactions, asks which properties autocapture sends, or asks how to filter $autocapture events. Only applies to projects using posthog-js autocapture.

tools 61

signals-scout-general

Cross-product Signals scout. Looks for cross-product correlations and explores the surfaces the per-product specialist scouts don't cover.

testing 61

creating-experiments

Guides agents through the 3-step experiment creation flow: defining the hypothesis, configuring rollout, and setting up analytics. Delegates rollout decisions to configuring-experiment-rollout and metric setup to configuring-experiment-analytics. TRIGGER when: user asks to create a new experiment or A/B test, OR when you are about to call experiment-create. DO NOT TRIGGER when: user is updating an existing experiment, managing lifecycle, or only browsing experiments.

testing 61

suppressing-noisy-errors

Create PostHog error tracking suppression rules to drop high-volume, low-value errors at ingestion. Use when the user asks "stop capturing this error", "drop browser extension errors", "ignore ResizeObserver loops", "suppress bot-driven errors", or wants to reduce ingestion cost from noisy unactionable errors. Identifies suppression candidates, scopes the filter tightly, decides between full suppression and sampling, and confirms the rule before creating it. Suppressed errors are dropped permanently — this skill defaults to caution.

tools 61

exploring-llm-traces

Required for debugging and inspecting LLM/AI agent traces with PostHog's MCP tools. Use when the user pastes a trace or session URL (e.g. /ai-observability/traces/<id> or /ai-observability/sessions/<id>), asks to debug a trace, figure out what went wrong, check if an agent used a tool correctly, verify context/files were surfaced, inspect subagent behavior, investigate LLM decisions, or analyze token usage and costs. Also use when raw SQL/HogQL against `events.properties.$ai_input` / `$ai_output_choices` returns empty: message content lives only on the dedicated `posthog.ai_events` table.

tools 61

managing-path-cleaning-rules

Inspects URL paths and proposes, tests, orders, and applies project-level path cleaning rules so dynamic segments (numeric IDs, UUIDs, slugs, dates) collapse into readable aliases. Use when the user says "clean the paths", "normalize URLs", "group similar pages", "too many distinct paths", "/users/123 and /users/456 are the same page", "set up path cleaning", or asks why a Web analytics or Paths breakdown is fragmented across thousands of nearly-identical URLs. Covers regex syntax (re2), alias placeholder convention, rule ordering, the test workflow, and applying rules via the path-cleaning-rules-update MCP tool.

tools 61
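
To make the rule-ordering point in the managing-path-cleaning-rules entry concrete, here is a small sketch of ordered regex rules collapsing dynamic segments into aliases. It uses Python's `re` module rather than re2 (the patterns avoid lookarounds and backreferences, which re2 does not support). The `<uuid>` and `<id>` alias spellings, the rule list, and the assumption that each rule runs on the output of the previous one are illustrative, not taken from the skill itself.

```python
import re

# Hypothetical rules, applied top to bottom. Each pair is (pattern, alias).
RULES = [
    # UUID segments first, so the numeric rule below cannot split them.
    (
        r"/[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
        r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}",
        "/<uuid>",
    ),
    # Then any segment that starts with digits.
    (r"/\d+", "/<id>"),
]


def clean_path(path: str, rules=RULES) -> str:
    """Apply each rule in order; later rules see the output of earlier ones."""
    for pattern, alias in rules:
        path = re.sub(pattern, alias, path)
    return path


# Both URLs collapse to the same cleaned path.
assert clean_path("/users/123") == clean_path("/users/456") == "/users/<id>"
```

Reversing the rule order would let the numeric rule consume the leading digit of a UUID such as `3f2b8c1e-...`, leaving a half-rewritten segment, which is why ordering deserves a test pass before rules are applied.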

signals-scout-apm

Signals scout for PostHog distributed tracing (APM / OpenTelemetry spans). Watches RED metrics per (service, operation) — error rate, p95 latency, request volume — for regressions, new error signatures, and traffic cliffs, and files each validated regression as a report in the inbox.

tools 61
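
The RED metrics per (service, operation) mentioned in the signals-scout-apm entry can be sketched as a plain aggregation over spans. The span field names (`service`, `operation`, `duration_ms`, `is_error`) and the nearest-rank percentile method are assumptions for illustration; the real span schema and the scout's own queries may differ.

```python
from collections import defaultdict


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def red_metrics(spans):
    """Group spans by (service, operation) and compute rate, errors, duration."""
    groups = defaultdict(list)
    for span in spans:
        groups[(span["service"], span["operation"])].append(span)

    metrics = {}
    for key, group in groups.items():
        metrics[key] = {
            "requests": len(group),
            "error_rate": sum(1 for s in group if s["is_error"]) / len(group),
            "p95_ms": p95([s["duration_ms"] for s in group]),
        }
    return metrics
```

Comparing these numbers between two time windows for the same key is one simple way to spot a regression or a traffic cliff.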

signals-scout-inbox-validation

Follow-up Signals scout for the inbox itself. After a deployment soak window, re-measures the problems behind recently resolved reports and files a report when a fix didn't hold, plus a gated escalation check on dismissed reports.

testing 61

signals-scout-session-replay

Signals scout for PostHog session replay. Watches that sessions keep recording (capture cliffs) and that friction inside recordings — rage/dead-click clusters, error-after-interaction cohorts — gets surfaced, and files each validated cliff or cluster as a report in the inbox.

tools 61

signals-scout-web-vitals

Focused Signals scout for PostHog projects capturing Core Web Vitals (`$web_vitals`). Watches each page's p75 LCP / INP / CLS / FCP against the absolute Google thresholds (good / needs-improvement / poor) and against its own history: pages standing in the poor band, pages crossing a band boundary after a deploy, and sharp in-band regressions. Reads the historical trajectory — not just the moment a value changes — so a page that is steadily slow surfaces even when nothing moved today. Every finding carries a metric-specific cause hypothesis and a concrete remediation, filed as a report in the inbox only above the confidence bar; otherwise writes durable memory and closes out empty. Self-contained peer in the signals-scout-* fleet.

development 61
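
The band logic described in the signals-scout-web-vitals entry can be sketched as follows. The threshold values are Google's published Core Web Vitals boundaries (LCP, INP, and FCP in milliseconds, CLS unitless); the function names and the dictionary layout are hypothetical and not part of the scout.

```python
# (good upper bound, needs-improvement upper bound) per metric.
THRESHOLDS = {
    "LCP": (2500, 4000),   # milliseconds
    "INP": (200, 500),     # milliseconds
    "CLS": (0.1, 0.25),    # unitless layout-shift score
    "FCP": (1800, 3000),   # milliseconds
}


def band(metric: str, p75: float) -> str:
    """Classify a p75 value as good, needs-improvement, or poor."""
    good_max, needs_improvement_max = THRESHOLDS[metric]
    if p75 <= good_max:
        return "good"
    if p75 <= needs_improvement_max:
        return "needs-improvement"
    return "poor"


def crossed_band(metric: str, before: float, after: float) -> bool:
    """True when a page's p75 moved into a different band, e.g. after a deploy."""
    return band(metric, before) != band(metric, after)
```

A page whose LCP p75 goes from 2,400 ms to 2,600 ms crosses from good to needs-improvement, while a move from 2,600 ms to 3,900 ms stays in band and would need a separate in-band regression check.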

diagnosing-missing-recordings

Diagnoses why a session recording is missing or was not captured. Use when a user asks why a session has no replay, why recordings aren't appearing, or wants to troubleshoot session replay capture issues for a specific session ID or across their project. Covers SDK diagnostic signals, project settings, sampling, triggers, ad blockers, and quota/billing scenarios.

development 61

investigate-metric

Diagnose why a product metric changed (dropped, spiked, or plateaued) by orchestrating breakdowns, actors, paths, lifecycle, retention, and annotations queries. Use when the user reports an anomaly, asks "why did X change?", or needs root-cause analysis for a trend, funnel, retention, stickiness, or lifecycle metric.

business 61

signals-scout-surveys

Signals scout for PostHog surveys. Watches active surveys for score regressions, response-volume drops, abandonment spikes, and targeting drift, and aggregates open-text responses into recurring themes — filing each as a report in the inbox.

business 61

signals-scout-error-tracking

Signals scout for PostHog error tracking. Watches `$exception` bursts, stuck loops, multi-fingerprint clusters, and status regressions, and files each validated issue as a report in the inbox.

testing 61

signals-scout-observability-gaps

Signals scout for observability gaps — significant event volumes with no insight, dashboard, or alert coverage. Files a report recommending new insights, dashboards, or alerts as the team's product evolves.

data-ai 61

signals-scout-data-warehouse

Focused Signals scout for PostHog projects importing external data into the warehouse. Watches the import side — external data sources, per-table sync schemas, webhook push channels, and materialized views — for the moments an import quietly stops keeping its promise: a source connection in Error, a schema Failed or stuck Running, silent staleness behind a green Completed status, a broken webhook push channel, a row-volume cliff, and failed materialized views. When armed imports are healthy, switches to the optimization lane: reads the per-team `query_log` table for recurring, multi-user query time and read-bytes concentrated on warehouse tables or repeated query shapes, filing materialization candidates and unused matviews as P3 suggestions. Files each validated import contradiction as an inbox report; otherwise writes durable memory and closes out empty.

tools 61

signals-scout-health-checks

Signals scout over PostHog's own health checks. Reads the project's active health issues, bundles them by kind, weights by blast radius, and files the ones genuinely worth acting on as reports in the inbox.

testing 61

diagnosing-stacktrace-symbolication

Help users debug PostHog Error Tracking stack-trace symbolication for any supported platform — JavaScript/TypeScript web, React Native (Hermes), Android (Proguard / R8), or iOS / macOS (dSYM). The PostHog symbol-set lookup flow is universal across platforms; build-tool and artifact details live in per-platform references (JavaScript is fleshed out, others come as we encounter them). Use when stack frames stay minified or obfuscated after symbols are uploaded, PostHog symbol sets show last_used but frames are not readable, chunk IDs or dSYM UUIDs do not match, "Token not found" appears, uploaded source maps / dSYMs / Proguard mappings look empty, or bundler / symbol-upload configuration needs troubleshooting.

tools 61

review-hog-authoring

How to author custom ReviewHog skills — the review perspectives, blind-spot checks, and validation criteria that drive ReviewHog's automated PR reviews. Use when a user wants a new review perspective (a specialist lens on their PRs), a custom blind-spot sweep, or their own validation bar for which findings get published. Trigger on "create a ReviewHog perspective", "custom review perspective", "my own blind-spot check", "custom validation criteria", "tune what ReviewHog publishes".

testing 59

analyzing-expensive-users

Analyze the most expensive users in AI observability and explain why they cost so much. Use when the user asks about top spenders, expensive users, per-user LLM cost, user-level cost drivers, or patterns behind high AI observability spend.

data-ai 59

review-hog-validation-criteria

The validation criteria for ReviewHog — the bar for deciding whether a flagged PR issue is worth keeping. Keeps real, user-affecting correctness / security / data-loss / contract / performance problems; drops overengineering, speculation, paranoia, never-gonna-happen edge cases, and style.

development 59

review-hog-perspective-performance-reliability

The Performance & Reliability review perspective for ReviewHog. Verifies that changed code will perform and hold up in production — resource efficiency, error handling and recovery, scalability, and operational readiness. Reports performance and reliability issues only.

development 59

review-hog-perspective-contracts-security

The Contracts & Security review perspective for ReviewHog. Verifies that changed code is safe and maintains compatibility — API contracts and breaking changes, injection / authz / data exposure, input validation, and schema / interface alignment. Reports security and contract issues only.

development 59

review-hog-perspective-logic-correctness

The Logic & Correctness review perspective for ReviewHog. Verifies that changed code does what it is supposed to do — business logic, edge cases, data transformations, and query / data-access correctness. Reports correctness issues only; security and performance are separate perspectives.

development 59

configuring-experiment-analytics

Configures the analytics side of a PostHog experiment — exposure criteria (default `$feature_flag_called` vs custom exposure events), primary and secondary metrics, the supported metric types (count, sum, ratio with `math` and `math_property`, retention with `retention_window_start` and `start_handling`), multivariate user handling ("Exclude" vs "First seen variant"), and how to read results once the experiment is live. Use when the user adds or edits a primary or secondary metric (e.g. "add a secondary metric tracking 'downloaded_file' per user"), sets up a ratio metric (e.g. "revenue from purchase_completed / pageviews"), sets up a retention metric (e.g. "$pageview → uploaded_file, 7-day window"), configures custom exposure (e.g. "only count users who hit /checkout"), changes multivariate handling, or asks "who is in the analysis?", "how do I measure impact?", "is this winning?", "what's the confidence level?", or "should I ship?".

testing 59

review-hog-blind-spots-general

The general blind-spot check for ReviewHog — the final sweep that runs after every enabled review perspective has reviewed a chunk. Hunts for real, high-value issues that ALL of the perspectives missed, conditioned on what they actually found; returns an empty list over padding.

testing 59

feature-usage-feed

Set up an LLM-judge evaluation that extracts canonical use cases for a PostHog feature at scale and streams the results to a Slack channel as a live feed. Use when someone wants to understand how users are actually using a specific AI/LLM-powered feature in production (what they're investigating, what questions they're trying to answer, and what patterns surface) without manually reading hundreds of traces. Assumes the feature emits `$ai_generation` and `$ai_evaluation` events with `$session_id` linkage to the trigger user's recording (the standard setup after the session-summary linkage PRs).

testing 59

debugging-surveys

Debug, support, and build PostHog Surveys across the backend and all five SDKs (web/posthog-js, iOS, Android, Flutter, React Native). Use whenever a Surveys support ticket is pasted ("survey not showing", "fewer responses than expected", "responses disappeared", "survey shows on wrong platform"), when diagnosing why a survey does or doesn't display, or when doing survey feature work that must ship across SDKs. Covers the eligibility pipeline, cross-SDK feature parity, the known-cause catalog, read-only diagnostic queries, staff access, and the customer-reply style guide.

development 59

finding-deleted-feature-flags

Find feature flags that were soft-deleted in the active project within a recent time window. Use when the user asks "what flags were deleted in the last N days", "show me recently deleted feature flags", "who deleted flag X", "audit recent flag deletions", or anything similar. Handles the non-obvious gotcha that system.feature_flags exposes the deleted boolean but does not expose a deletion timestamp — the actual deleted-at time lives in the per-flag activity log and must be cross-referenced.

development 59

suggesting-data-imports

Use when the user asks about revenue, payments, subscriptions, billing, CRM deals, support tickets, ad spend, production database tables, or other data PostHog does not collect natively — or wants to join or correlate PostHog product events with that external business data. Also use when a query fails because a table does not exist or returns no results for expected external data. The data warehouse can import from SaaS tools (Stripe, Hubspot, Zendesk, etc.), ad platforms, production databases (Postgres, MySQL, BigQuery, Snowflake), and other arbitrary data sources. Covers checking existing sources, identifying the right source type, and guiding the setup.

tools 59

exploring-llm-clusters

Investigate AI observability clusters — understand usage patterns in AI/LLM traffic, compare cluster behavior, compute cost/latency metrics, and drill into individual traces within clusters.

data-ai 59

authoring-signals-scouts

How to author, edit, and adapt PostHog Signals scouts — the scheduled agents that scan a project and emit findings into the Signals inbox. Use when a user wants to customize a canonical scout for their own setup (narrow its scope, retune its thresholds, add disqualifiers), tweak a scout's schedule or dry-run posture, or write a brand-new scout from scratch for a specific use case (a custom event, a product surface no canonical scout covers). Covers the scout SKILL.md anatomy, the emit contract, the dedupe + scratchpad-memory conventions, the per-team skills-store path vs the canonical in-repo path, and the emit-and-inspect test loop (with dry-run as an optional safety net). Trigger on "write/edit/customize a signals scout", "new scout for X", "tune my scout schedule", "make a scout that watches <event>".

testing 59

diagnosing-endpoint-performance

Diagnose why a PostHog endpoint is slow or expensive and propose a concrete fix — bump the cache TTL, enable materialisation, restructure variables, or rewrite the query. Use when the user says "this endpoint is slow", "my endpoint times out", "we're hitting the cost cap on this one", or asks "should I materialise this?". Focuses on a single named endpoint, not a project-wide audit.

testing 59

exploring-ai-failures

Find where an AI/LLM application is failing in production and surface the failure patterns, working from real traces. Use when someone wants to understand what's going wrong with an AI feature, find and categorize failure modes, triage errors, or investigate quality issues (wrong answers, ignored instructions, hallucinations, tool misuse) — "what's failing in my agent", "surface error patterns", "why are the responses bad", "find the common failure modes", "what should I fix next". Covers scoping to one use case, finding failing traces by whichever signal fits the context (code errors, metric outliers, trace-type slices, manual review, existing-eval spikes, clustering), and reading them into a ranked failure taxonomy.

tools 59

exploring-apm-traces

Investigates distributed application performance using PostHog APM (OpenTelemetry span) data via MCP. Use when the user asks about service traces, slow HTTP/database spans, error spans, error-rate trends or spikes, latency distributions, trace IDs, or span attributes — not AI observability traces or product logs. Uses posthog:query-apm-spans, posthog:apm-trace-get, posthog:apm-spans-sparkline, posthog:apm-services-list, posthog:apm-attributes-list, and posthog:apm-attribute-values-list.

tools 59

planning-voice-agent-user-interviews

Plan a round of user interviews conducted by PostHog's AI voice agent (a "robo interviewer") — the automated voice-agent interview product. Captures a UserInterviewTopic (who to target, what to ask, framing context, question list) and calls user-interview-topics-create. ONLY trigger when the user clearly wants an AI voice agent to actually run the interview calls (e.g. "set up robo user interviews", "have the voice agent interview these users"). Do NOT trigger for ordinary user research that does not involve the voice agent — finding or shortlisting users to talk to ("who'd be a good fit to interview about Y"), planning questions for a human-run interview, or analysing feedback are audience discovery, handled with normal data queries, not this skill. Also do NOT trigger for uploading a recorded interview audio file or browsing topics with user-interview-topics-list. When intent is ambiguous, first confirm what kind of research it is and whether they want an AI voice agent to conduct it (see Step 0).

development59

improving-mcp-tools

Run an improve-my-MCP campaign: an autoresearch-style loop that measures the MCP agent experience with the eval harness, picks the highest-impact tool problem from production data, makes one bounded fix, and keeps it only if before/after scores improve. Use when asked to "improve my MCP", run an MCP improvement campaign, fix tool discoverability or descriptions based on evidence, or prepare an eval-backed PR for a tool change. Every shipped change must carry eval evidence; guardrails below are hard rules.

tools59

exploring-replay-vision-observations

Guides agents through pulling a Replay Vision scanner's observations, reading the findings, and acting on them — summarizing patterns across sessions, drilling into individual recordings, and turning real, corroborated issues into PostHog tasks, insights, or an investigating-replay hand-off. TRIGGER when: user wants to pull/read/triage Replay Vision observations, asks "what has my scanner found", wants to act on or summarize scanner findings, turn observations into tasks/work, or points at a /replay-vision/<scanner-id> URL. DO NOT TRIGGER when: creating or sizing a scanner (use creating-replay-vision-scanners), running a one-off scan you don't then analyse, or authoring a signals scout.

documentation58

exploring-signals-scouts

How to explore and make sense of PostHog Signals scouts — the scheduled agents that scan a project and emit findings into the Signals inbox. Use when a user wants to understand what scouts they have, how each one is behaving, and whether the fleet is actually working. Covers surveying the fleet and its schedules, reading recent scout runs and drilling into a single run's reasoning, inspecting the durable scratchpad memory the fleet has built up, tracing a run to the findings it emitted, and assessing a scout's health and performance over time (cadence, success rate, emit rate, signal-to-noise). Read-only and exploratory — to write or tune a scout, use `authoring-signals-scouts` instead. Trigger on "what are my scouts doing", "how is my <x> scout performing", "show me recent scout runs", "why did this scout find/emit nothing", "what has the fleet learned", "explore scout run <id>", "is my scout working".

testing58

checking-deploy-timing

Determine when a PostHog code change reached a given environment by reading the hidden GIT deploy annotations in the project and correlating them with the merge commit on GitHub. Use when PostHog staff ask "when was X deployed", "is my change live in the US/EU yet", "has my PR shipped", "did the fix roll out to prod-us", or otherwise want to know whether/when a commit, PR, or feature went out to a region. Do not answer deploy-timing questions from event/data volume alone — that only shows when data changed, not when code shipped.

development58

analysing-exported-session-recordings

Decode and analyse a downloaded session-recording export zip (the `data/` blocks plus ClickHouse metadata) to understand what is making a recording large or slow: size composition by DOM mutations vs full snapshots vs network bodies, biggest payloads, content-type breakdown, churned tags/attributes. Use when asked "why is this recording so big?", "what's in this export?", "break down replay size for this session", "analyse the exported recording", or to validate whether a size-reduction change (e.g. skipping binary network bodies) would actually help a given session. Pairs with `exporting-session-recordings`, which produces the zip this skill reads.

tools57

debugging-signals-pipeline

Debug the signals pipeline locally end-to-end. Covers emitting test signals from fixtures, monitoring Temporal workflows via the REST API, reading sandbox agent logs from object storage, inspecting Docker sandbox containers, and diagnosing common failures (stale ClickHouse embeddings, agentsh network denials, inactivity timeouts). Use when a signal isn't reaching the inbox, a signal-report-summary workflow fails, or a sandbox task run times out.

tools57

filtering-bot-traffic

Identify, measure, and exclude bot / crawler / AI-agent traffic in PostHog web and product analytics using the traffic classification surface (the isLikelyBot / getTrafficType HogQL functions and the $virt_* virtual properties). Use when the user asks to "exclude bots", "filter out crawlers", "remove bot traffic from my numbers", "how much of my traffic is bots / AI crawlers", "is GPTBot / ChatGPT / Claude hitting my site", "break down traffic by human vs bot", or wants clean human-only counts in an insight or dashboard. For the real-time Live tab bot tiles, use exploring-live-traffic instead.

development57

exploring-llm-evaluations

Investigate AI observability evaluations — `hog` (deterministic code-based), `llm_judge` (LLM-prompt-based), and `sentiment` (user-message sentiment). Find existing evaluations, inspect their configuration, run them against specific generations, query individual results, and generate AI-powered summaries for boolean pass/fail runs. Use when the user asks to debug why an evaluation is failing, surface common failure modes, compare results across filters, dry-run a Hog evaluator, prototype a new LLM-judge prompt, inspect sentiment classifications, or manage the evaluation lifecycle.

development54

auditing-warehouse-view-health

Audit the health of a PostHog project's materialized views (saved queries) — find every failed materialization and flag unused or stale materialized views that cost storage and compute. Use when the user asks "which of my views are broken?", "why is this materialized view failing?", "are any of my views wasting compute?", or wants a one-shot triage of view health. For source/sync health use `auditing-warehouse-source-health`.

testing54

diagnosing-failed-warehouse-syncs

Diagnose why a data warehouse sync is failing and recommend the right recovery action. Use when the user asks "why isn't my Stripe/Postgres/Hubspot sync working?", "this table has been stuck for hours", "the data in the warehouse looks wrong", or wants to troubleshoot a specific source or schema. Covers source-level vs schema-level failures, stuck Running states, credential and schema-drift errors, incremental-field misconfig, CDC prerequisite failures, and the cancel / reload / resync / delete-data recovery actions.

testing54

investigating-metric-anomalies

Investigates server/infrastructure metric anomalies in PostHog Metrics — from "this metric is rising/dropping/spiking" or a fired alert to a probable cause with evidence. Use when asked why a metric looks wrong (ingestion lag rising, error rate spiking, latency degrading, queue depth growing, throughput dropping), when an alert fires on an OTel/Prometheus metric, or for any incident triage that starts from a metric symptom. Composes characterize-metric-anomaly, query-metrics, and metric-names-list with logs (query-logs) and traces (APM span tools) for cross-signal root-cause correlation.

tools54

skills-store

Discover and use shared team skills stored in PostHog. Use when the user asks to list, browse, load, or manage "shared skills", "team skills", or references the "skills store" / "skill store".

tools54

choosing-trend-or-slope-view

Clarify how to visualize change over a time range before building a trend. Use whenever the user asks how much something changed, grew, dropped, improved, or regressed between two points or periods — "how much did X change from A to B", "before vs after", "start vs end", "week over week", "compare this month to last", "change over time" — or mentions a "slope chart" / "slopegraph". Two readings of "change" need different charts: the whole trend (a line, every interval) versus just the two endpoints (a slope, start vs end). Ask which they want, then render it. Not for choosing a saved insight ChartDisplayType in the insight editor.

development54

consuming-endpoints-from-client-code

Wire a PostHog endpoint into a client app or SDK. Covers fetching the OpenAPI spec, generating a typed client with openapi-generator or @hey-api/openapi-ts, sending the right auth header, shaping the variables payload (HogQL code_name vs insight breakdown property), and handling rate-limit and materialised-endpoint error responses. Use when the user says "how do I call my endpoint", "generate a client for this", or "what auth header do I use".

tools54

working-with-skills

Best practices for agents managing PostHog skills via the MCP `skill-*` tools — how to discover, read, create, update, and refactor skills efficiently, especially large skills with many bundled files. Use whenever you are about to call any `skill-*` tool, asked to author or edit a shared skill, or troubleshoot why a skill write was rejected. Pairs with `skills-store` (which covers the raw tool surface) by adding the decision-tree, efficiency, and pitfall guidance.

tools54

auditing-warehouse-source-health

Audit the health of a PostHog project's data warehouse sources and syncs — find every broken or degraded source connection, sync schema, and webhook channel. Use when the user asks "why are my imports failing?", "what's broken with my sources?", "why is my warehouse data stale?", or wants a one-shot triage of source/sync health before deciding where to dig in. Produces a prioritized report grouped by severity, with recommended next steps. For materialized-view health use `auditing-warehouse-view-health`; for a single failing sync use `diagnosing-failed-warehouse-syncs`.

development54

exporting-session-recordings

Export a single session recording's raw data (recording blocks + ClickHouse metadata) to a downloadable zip, and download it. Staff-only, via the Django admin portal. Use when asked to "export a session recording", "get the raw recording data", "download a replay export", "pull a recording for analysis/support", or to inspect recording block / canvas frame sizes offline. Explains the team-ownership rule, how to build the admin export / download links for the owning team, and that there is no export or download API/MCP tool.

tools54

assessing-heatmaps

Assesses what a page's heatmap is telling you and recommends concrete changes. Pulls click / rageclick / scroll-depth data for a URL, names the hot elements by cross-referencing autocapture events on the same page, and can create a saved heatmap the user opens in PostHog, then summarizes the behavior and proposes improvements. TRIGGER when: user asks what a heatmap shows, why people aren't clicking something, where users rage-click, how far they scroll, what to change on a page based on heatmap/click data, or to 'analyze/assess/review the heatmap' for a URL. DO NOT TRIGGER when: the user only wants to create a saved heatmap screenshot with no analysis (use heatmaps-saved-create directly), or is asking about session replay in general (use investigating-replay).

tools54

managing-reminders

Create and manage PostHog reminders — private, human-paced nudges that fire as in-app notifications on a schedule, optionally linked to a PostHog resource. Use when the user says "remind me to…", wants a one-off or recurring nudge (daily/weekly/monthly/yearly, a cron schedule, or a specific date/time), wants to be reminded to look at a dashboard, insight, experiment, feature flag, survey, notebook, replay, or error, or wants to list, change, or cancel their reminders. Covers when to pick a reminder over an alert or subscription, the one-off vs recurring vs cron schedule field mappings, timezones, and attaching a resource.

testing54

creating-ai-subscription

Create a recurring AI-generated PostHog report — schedule a free-text prompt to run on a cron, with the LLM-synthesized markdown delivered to email or Slack on each tick. Use when the user wants a recurring AI summary of X on any cadence (daily, weekly, monthly, yearly) rather than a one-off report. (To attach an AI summary to an existing insight/dashboard subscription instead of a free-text prompt, see `managing-subscriptions` and its `summary_enabled` option.)

data-ai53

finding-replay-for-issue

Finds the most informative session recording linked to an error tracking issue. Use when a user has an error tracking issue ID and wants to watch a replay showing what the user was doing when the error occurred. Ranks linked sessions by recency, activity score, and journey completeness, then summarizes the pre-error context. Replaces blind session picking from potentially hundreds of linked recordings.

development53

exploring-endpoint-execution-logs

Explore and diagnose a PostHog endpoint's execution logs — error messages, failed runs, cache misses, slow runs, or unexpected row counts during endpoint invocations. Use when the user says "my endpoint is failing", "show me the logs for endpoint X", "what error did endpoint Y produce", "why did endpoint Z return no rows", "is this endpoint hitting cache", or "check the last N runs". Focused on a single named endpoint's runtime log entries, not project-wide auditing or query performance profiling.

testing53

managing-subscriptions

Manage PostHog subscriptions — scheduled email, Slack, or webhook deliveries of insight or dashboard snapshots, optionally with an AI-written summary attached to each delivery. Use when the user wants to subscribe to an insight or dashboard, get an AI summary attached to those deliveries, check existing subscriptions, change delivery frequency, add or remove recipients, or stop receiving updates.

development53

formatting-insight-axes

Pick the right y-axis unit when creating or updating a TrendsQuery insight via `posthog:insight-create` or `posthog:insight-update`. Use when the agent is about to add a `formula` purely to convert units (e.g. dividing seconds by 60 to display minutes), when a `math_property` is a duration, currency, ratio, or large count, or whenever the user mentions "format the y-axis", "duration", "seconds", "minutes", "hours", "milliseconds", "ms", "percentage", "currency", "decimals", "axis label", or "axis unit" in the context of a graph insight.

development53

auditing-experiments-flags

Audit PostHog experiments and feature flags for configuration issues, staleness, and best-practice violations. Read when the user asks to audit, health-check, or review experiments or feature flags, check flag hygiene, or verify experiment setup.

testing53

analyzing-experiment-session-replays

Analyze session replay patterns across experiment variants to understand user behavior differences. Use when the user wants to see how users interact with different experiment variants, identify usability issues, compare behavior patterns between control and test groups, or get qualitative insights to complement quantitative experiment results.

testing53

diagnosing-sdk-health

Diagnoses the health of a project's PostHog SDK integrations — which SDKs are out of date and how to fix them. Use when a user asks about PostHog SDK versions, outdated SDKs, upgrade recommendations, "SDK health", "SDK doctor" (the former name), or when events or features seem off and it might be due to an old SDK.

development51

creating-an-endpoint

Create a PostHog endpoint with the right shape on the first try — covers query kind choice, name conventions, what to expose as variables (HogQL code_name vs insight breakdown), data_freshness_seconds, and whether to materialise on day one. Use when the user says "create an endpoint", "expose this query as an API", "turn this insight into an endpoint", or asks for help structuring a new endpoint. Steers away from common mistakes: materialising a query with cohort breakdowns or compare mode, inline-only variables on a materialised endpoint, unbounded date ranges, ambiguous names.

development50

managing-endpoint-versions

Work safely with endpoint versions — preview a draft in the playground, roll back to an older version, update settings on one version without bumping query history, deactivate a specific version. Use when the user asks "how do I roll back my endpoint", "preview my changes before publishing", "I want to fix v5 without bumping the version", or anything involving the version history. Calls out today's limitations honestly: there is no pointer flip; "rollback" means forking the old query into a new top version.

testing50

auditing-endpoints

Audit every endpoint in a PostHog project for staleness, failed materialisations, and unused materialised versions. Use when the user asks "what endpoints can I clean up?", "are any of my endpoints broken?", "which materialised versions are still being called?", or wants a one-shot cleanup pass over the Endpoints product. Produces a prioritised report grouped by issue type, with recommended actions, but does not modify anything without explicit confirmation.

testing50

triaging-visual-review-runs

Inspects PostHog Visual Review (VR) runs that gate PR merges with screenshot regression checks. Use when the user mentions "visual review", "VR", "snapshot diff", "screenshot test", "storybook regression", "playwright snapshot", asks why a PR is blocked or what changed visually, wants to triage the VR backlog, decide whether a snapshot diff is real vs flaky, or check whether a story has been changing across runs. Also invoke when a PR has a failing `visual-review` status check, when a PR comment mentions "Visual review", or when the user is on a branch with an open VR run.

testing50

signals-scout-llm-analytics

Focused Signals scout for PostHog projects using LLM analytics. Watches `$ai_generation`, `$ai_evaluation`, `$ai_trace` and related events for cost spikes, latency drift, eval pass-rate drops, runaway loops, and error rates. Emits findings only when they clear the confidence bar; otherwise writes durable memory and closes out empty. Self-contained peer in the signals-scout-* fleet — no dependencies on other skills. Picked uniformly at random by the coordinator alongside `signals-scout-general` and other specialists.

testing49

finding-sessions-to-watch

Guides a user from "I want to watch recordings but don't know which ones" to a short, high-signal list of sessions worth watching. Use when the user asks which sessions or replays to watch, wants help finding interesting / useful recordings, says they don't know where to start in session replay, or wants to watch sessions about a goal (signup, pricing, onboarding, checkout, a feature, rageclicks, errors, mobile, a specific person) without naming exact filters. Turns a vague intent into a focused RecordingsQuery via `query-session-recordings-list`, then deep-links the best few and hands off to `investigating-replay`. Do NOT use when the user already has a recording/session ID (use investigating-replay) or wants the replay for a known error issue (use finding-replay-for-issue).

tools49

downloading-batch-export-files

Export PostHog events, persons, or sessions on demand and download the resulting files. Use when the user asks to download/export raw PostHog data, create a one-off file export, fetch a Parquet or JSONLines export, or use the file_download_batch_exports API. Covers starting the export with MCP, polling completion, and downloading via the existing REST redirect endpoint.

tools47

planning-user-interviews

Plan a user interview topic in PostHog — pick who to target (cohort, emails, or PostHog distinct IDs), draft what to ask about, and prepare the voice-agent context plus a question list. Use when the user asks to "talk to users", "check how users feel about X", "interview some customers", "set up a user interview", "run a user-research call", "find users to ask about Y", or otherwise wants qualitative feedback through a conversation. Walks the user through targeting (cohorts-list, persons-list, or accepting emails / distinct IDs directly), captures the topic, and prompts for agent context and questions before calling user-interview-topics-create. Cohort targeting is resolved to explicit emails/distinct_ids at create time — topics snapshot their audience and do not re-evaluate cohort membership later. Do NOT trigger when the user is uploading a recorded interview audio file (that's the separate UserInterview/transcript flow) or only browsing existing topics with user-interview-topics-list.

testing38

exploring-live-traffic

Inspects PostHog Web analytics Live tab data — current users online, last-30-minutes pageviews, top pages, referrers, devices, browsers, countries, bot traffic, and the per-minute bot/users charts. Use when the user asks "who is on my site right now?", "what is happening live?", "what bots are crawling me?", asks about the "live tab" / "live dashboard", wants live numbers (last 30 min), or wants help filtering or drilling into the live view. Also covers building product-analytics insights that mirror what the tiles show.

development31

copying-flags-across-projects

Copy a feature flag from one PostHog project to one or more target projects in the same organization. Use when the user wants to duplicate a flag, promote a flag from staging to production, sync flags across projects, or replicate a flag configuration in a different workspace. Covers cohort remapping, scheduled-change handling, encrypted payloads, and the safe defaults (disabled in target, no scheduled changes).

tools29

debugging-local-replay

Debugs why session recordings aren't appearing in the local dev environment. Use when a developer reports that local replay ingestion isn't working, recordings aren't showing up despite /s calls, or the replay pipeline seems broken after hogli start. Covers the full local pipeline: SDK capture, Caddy proxy, capture-replay (Rust), Kafka, ingestion-sessionreplay (Node), recording-api (Node), SeaweedFS, and common failure modes like orphaned processes, stuck phrocs workers, and trigger misconfiguration.

development29

signals

How to query the document_embeddings table for raw signal data using HogQL. Use when you need to perform semantic search over signals, fetch every signal that contributed to a specific report, or list signal types. For browsing the curated report layer (the Inbox) — listing reports, filtering by status/source, drilling into a single report by ID — use the `inbox-exploration` skill first; drop into this skill afterwards if the user wants the underlying observations.

testing29

tuning-incremental-sync-config

Change the sync configuration of an existing data warehouse schema — switch sync_type, pick a different incremental_field, set primary_key_columns, choose cdc_table_mode, or change sync_frequency. Use when the user asks "switch my orders table from full refresh to incremental", "this table is syncing too slowly / too frequently", "I need to pick a different incremental column", "set up CDC for this Postgres table", or when diagnosis of a failing sync pointed to an incremental-field or PK misconfiguration.

testing26

auditing-warehouse-data-health

Audit the health of a PostHog project's data warehouse — find every broken or degraded pipeline item across sources, sync schemas, materialized views, batch exports, and transformations. Use when the user asks "what's broken in my warehouse?", "give me a health check", "audit my data pipeline", "why are some dashboards stale?", or wants a one-shot triage summary before deciding where to spend time. Produces a prioritized report of issues grouped by severity and type, with recommended next steps.

development26

cleaning-up-stale-feature-flags

Identify and clean up stale feature flags in a PostHog project. Use when the user wants to find unused, fully rolled out, or abandoned feature flags, review them for safety, and then disable or delete them. Covers staleness detection, dependency checking, and safe removal workflows.

testing22

query-examples

HogQL query examples and reference material for PostHog data. Read when writing SQL queries to find patterns for analytics (trends, funnels, retention, lifecycle, paths, stickiness, web analytics, error tracking, logs, sessions, LLM traces) and system data (insights, dashboards, cohorts, feature flags, experiments, surveys, hog flows, data warehouse). Includes HogQL syntax differences, system model schemas, and available functions.

development22