skills/suggesting-data-imports/SKILL.md
Use when the user asks about revenue, payments, subscriptions, billing, CRM deals, support tickets, production database tables, or other data that PostHog does not collect natively. Also use when a query fails because a table does not exist or returns no results for expected external data. The data warehouse can import from SaaS tools (Stripe, Hubspot, etc.), production databases (Postgres, MySQL, BigQuery, Snowflake), and other arbitrary data sources. Covers checking existing sources, identifying the right source type, and guiding the setup.
`npx skillsauth add posthog/ai-plugin suggesting-data-imports`
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill helps identify when data the user needs lives outside PostHog and guides them toward importing it via the data warehouse. The key insight is recognizing the gap — then connecting it to the right source type.
PostHog collects product analytics events, persons, sessions, and groups via its SDKs. Additional products are available but must be enabled: session replay, feature flags, experiments, surveys, web analytics, error tracking, AI observability, conversations, logs, revenue analytics, workflows, CDP destinations, and batch exports. PostHog does not collect external business data like payments, subscriptions, CRM records, support tickets from other systems, or production database tables — that data must be imported via the data warehouse.
Listen for signals that the user needs external data:
If a query failed, check the error — if it's "table not found" or similar, the data likely needs to be imported.
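A rough way to triage a failed query is to pattern-match its error text. The following is a minimal Python sketch: the function name `looks_like_missing_table` and the error phrasings are assumptions for illustration, not the actual error messages PostHog returns.

```python
import re

# Hypothetical phrasings; the real error text returned by a failed query may differ.
MISSING_TABLE_PATTERNS = [
    r"table .* (not found|does not exist)",
    r"unknown table",
    r"no such table",
]


def looks_like_missing_table(error_message: str) -> bool:
    """Return True if the error text suggests the queried table does not exist."""
    text = error_message.lower()
    return any(re.search(pattern, text) for pattern in MISSING_TABLE_PATTERNS)
```

A match is only a hint that the data may need importing; the existing sources should still be checked before recommending a new one.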
Call posthog:external-data-sources-list to see existing sources. The data might already be imported but the user doesn't know the table name or prefix.
If a source exists for the system they're asking about, call posthog:external-data-schemas-list to show the available tables. The data might be there but under a different name or prefix.
Also call posthog:read-data-warehouse-schema to see all queryable tables — the data might already be available as a view or joined table.
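Because imported tables may sit under a prefix the user doesn't know, it can help to search the returned table names loosely. This sketch assumes an underscore-separated prefix (as in `stripe_charges`) and uses a hypothetical helper `find_candidate_tables`; the table names are made up for illustration.

```python
def find_candidate_tables(table_names: list[str], wanted: str) -> list[str]:
    """Return tables whose name equals `wanted` or ends with `_<wanted>`."""
    wanted = wanted.lower()
    return [
        name
        for name in table_names
        if name.lower() == wanted or name.lower().endswith("_" + wanted)
    ]


# Illustrative names only, standing in for the output of a schema listing.
tables = ["events", "persons", "stripe_charges", "stripe_customers"]
print(find_candidate_tables(tables, "charges"))  # ['stripe_charges']
```

An empty result suggests the data has not been imported yet, which is the cue to move on to source selection.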
If the data isn't imported yet, call posthog:external-data-sources-wizard to see available source types. Match the user's need to a source:
Common patterns:
| User wants                 | Source type                                    | Key tables                                  |
| -------------------------- | ---------------------------------------------- | ------------------------------------------- |
| Revenue / payment data     | Stripe, Chargebee, Shopify                     | charges, subscriptions, invoices, customers |
| CRM / sales pipeline       | Hubspot, Salesforce, Attio                     | contacts, deals, companies                  |
| Support tickets            | Zendesk                                        | tickets, users, organizations               |
| Product data from their DB | Postgres, MySQL, BigQuery, Snowflake, Redshift | user's own tables                           |
| Marketing / ads            | Google Ads, Meta Ads, LinkedIn Ads, TikTok Ads | campaigns, ad_groups, ads                   |
| Email marketing            | Mailchimp, Klaviyo                             | campaigns, lists, subscribers               |
| Project management         | Linear                                         | issues, projects                            |
| Error tracking (external)  | Sentry                                         | issues, events                              |
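The mapping in the common-patterns table can be treated as a simple lookup. The sketch below encodes a subset of its rows; the category keys (`"revenue"`, `"crm"`, and so on) and the function `suggest_sources` are hypothetical labels, not part of any PostHog API.

```python
# Subset of the common-patterns table; category keys are hypothetical labels.
SOURCE_PATTERNS: dict[str, dict[str, list[str]]] = {
    "revenue": {
        "sources": ["Stripe", "Chargebee", "Shopify"],
        "tables": ["charges", "subscriptions", "invoices", "customers"],
    },
    "crm": {
        "sources": ["Hubspot", "Salesforce", "Attio"],
        "tables": ["contacts", "deals", "companies"],
    },
    "support": {
        "sources": ["Zendesk"],
        "tables": ["tickets", "users", "organizations"],
    },
    "project_management": {
        "sources": ["Linear"],
        "tables": ["issues", "projects"],
    },
}


def suggest_sources(category: str) -> list[str]:
    """Return candidate source types for a category, or an empty list if unknown."""
    entry = SOURCE_PATTERNS.get(category)
    return entry["sources"] if entry else []
```

The list of source types returned by posthog:external-data-sources-wizard remains the authority on what is actually available; a static lookup like this is only a starting point.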
Present the recommendation concisely:
Example: "Your Stripe data isn't in PostHog yet. If you connect a Stripe source, you'll get tables like charges, subscriptions, and customers that you can join with PostHog events to analyze revenue by user behavior."
If the user wants to proceed, hand off to the setting-up-a-data-warehouse-source skill — it covers the full
three-step flow (wizard → db-schema → create), sync-type selection, webhook registration, and prefix guidance.
Do not duplicate that workflow here.
Once connected, help the user write their first query joining PostHog data with the imported data. Use posthog:execute-sql to demonstrate.
Common join patterns:
SELECT * FROM stripe_customers sc JOIN persons p ON sc.email = p.properties.$email

- Check posthog:read-data-warehouse-schema and posthog:external-data-schemas-list before saying data doesn't exist.
- Imported tables may be prefixed (e.g. stripe_charges, not charges). The user might not know the prefix.
- … /data-warehouse/new.

- posthog:external-data-sources-list: Check existing source connections
- posthog:external-data-schemas-list: Check what tables are already imported
- posthog:read-data-warehouse-schema: See all queryable tables including views
- posthog:external-data-sources-wizard: Get available source types
- posthog:execute-sql: Run queries to demonstrate what's possible
- setting-up-a-data-warehouse-source: Full source creation workflow; hand off here once the user decides to connect a source
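Since the imported table name depends on the prefix chosen at setup, the email join under the common join patterns can be parameterized on the table name. This is a minimal sketch: `email_join_query` is a hypothetical helper, and it assumes the imported table has an `email` column as in the Stripe example.

```python
import re

# Plain identifiers only, so the table name cannot smuggle in extra SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def email_join_query(customers_table: str = "stripe_customers") -> str:
    """Build the email-match join between an imported table and persons."""
    if not _IDENTIFIER.fullmatch(customers_table):
        raise ValueError(f"not a plain table identifier: {customers_table!r}")
    return (
        f"SELECT * FROM {customers_table} sc "
        "JOIN persons p ON sc.email = p.properties.$email"
    )
```

The resulting string is what would be passed to posthog:execute-sql; confirm the real table name with posthog:read-data-warehouse-schema first.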