Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

posthog/creating-experiments

Name: creating-experiments
Author: posthog

skills/creating-experiments/SKILL.md

npx skillsauth add posthog/ai-plugin creating-experiments

Creating experiments

This skill walks through the 3-step flow for creating a new A/B test experiment.

Core principle: draft first, iterate on details

Create the experiment as a draft quickly, then iterate on metrics and configuration. The user gets a tangible draft immediately and can refine it.

The 3-step creation flow

Step 1: What are we testing?

Gather these before calling experiment-create:

Experiment name: descriptive, inferred from context when possible
Hypothesis: what you expect to happen (goes in description)
Feature flag key: kebab-case. Ask if they want a new flag or to reuse an existing one. The flag is auto-created, so do NOT create one separately.
Type: leave empty; it defaults internally to "product". The "web" value is reserved for no-code experiments configured visually with the PostHog toolbar in a browser, and it cannot be meaningfully driven via MCP. If a user asks for a no-code/toolbar experiment, point them to the PostHog UI instead of creating one here.

If the user gives enough context to infer these, don't ask — just proceed.
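The Step 1 inputs can be collected in a small structure and checked before the create call. This is a minimal sketch: `ExperimentBrief` and its `problems` method are hypothetical names, not PostHog types, and the regular expression is only an assumed reading of "kebab-case" (lowercase alphanumeric groups joined by single hyphens).

```python
import re
from dataclasses import dataclass

# Assumed kebab-case rule: lowercase letters/digits in hyphen-separated groups.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


@dataclass
class ExperimentBrief:
    """Hypothetical container for the Step 1 inputs (not a PostHog type)."""

    name: str
    hypothesis: str
    feature_flag_key: str

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means ready to create."""
        issues = []
        if not self.name.strip():
            issues.append("experiment name is missing")
        if not self.hypothesis.strip():
            issues.append("hypothesis is missing")
        if not KEBAB_CASE.match(self.feature_flag_key):
            issues.append("feature flag key is not kebab-case")
        return issues


brief = ExperimentBrief(
    name="Checkout button color",
    hypothesis="A green button increases checkout completion",
    feature_flag_key="checkout-button-color",
)
print(brief.problems())  # [] when nothing needs to be asked
```

The experiment type is deliberately absent from the structure, since the guidance above is to leave it empty.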

Step 2: Who sees what variant?

This is about rollout configuration.

Before asking any rollout question, load configuring-experiment-rollout. The disambiguation wording, recommendations, and post-answer branches live there — do not formulate rollout questions yourself, and do not assume an example you remember covers the user's path.

Key decision points (covered in detail by configuring-experiment-rollout):

Variant split (how many variants, what percentage each)
Overall rollout percentage (what % of all users enter the experiment)
Whether to persist the flag across authentication steps

If the user doesn't mention rollout specifics, use defaults: 50/50 control/test, 100% rollout.
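When the user asks for more than two variants without giving percentages, an even split still has to sum to exactly 100. The sketch below shows one way to do that; `even_split` is a hypothetical helper, and it assumes integer percentages, which the source does not state.

```python
def even_split(keys: list[str]) -> list[dict]:
    """Split 100 percent as evenly as possible across the given variant keys.

    Assumes integer percentages. Any remainder is handed out one point at a
    time to the earliest variants, so the total is always exactly 100.
    """
    if not keys:
        raise ValueError("at least one variant key is required")
    base, remainder = divmod(100, len(keys))
    return [
        {
            "key": key,
            "name": key.capitalize(),
            "rollout_percentage": base + (1 if index < remainder else 0),
        }
        for index, key in enumerate(keys)
    ]


variants = even_split(["control", "test"])
print([v["rollout_percentage"] for v in variants])  # [50, 50]
```

With three keys the same helper yields 34/33/33, which keeps the sum at 100 at the cost of a slightly uneven split.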

Step 3: How to measure impact?

This is about analytics and metrics. Load the configuring-experiment-analytics skill for guidance. That skill's first step checks for an existing shared metric to reuse before building a new one — don't duplicate a metric the project already has set up.

Do NOT configure metrics on creation. Metrics are not passed to experiment-create — they are added afterwards via experiment-update. This keeps the creation call lightweight.

When the user specifies metrics upfront, acknowledge them and add them immediately after creation. When they don't, create the draft and then guide them through metric setup as a follow-up.
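The ordering described above (create the draft, then add metrics in a separate call) can be sketched as follows. `call_tool` is a hypothetical stand-in for however the agent invokes MCP tools, and the arguments for experiment-update are left to a caller-supplied function because their shape is not given here.

```python
from typing import Callable, Optional

# Hypothetical signature: (tool name, arguments) -> response dict.
ToolCaller = Callable[[str, dict], dict]


def create_then_add_metrics(
    call_tool: ToolCaller,
    create_args: dict,
    build_metrics_update: Optional[Callable[[dict], dict]] = None,
) -> dict:
    """Create the draft first; add metrics in a separate experiment-update call."""
    created = call_tool("experiment-create", create_args)
    if build_metrics_update is not None:
        # The caller builds the update arguments from the create response;
        # this sketch does not assume any particular field names.
        call_tool("experiment-update", build_metrics_update(created))
    return created


# Usage with a fake tool caller that only records the call order.
calls: list[str] = []


def fake_call_tool(tool: str, args: dict) -> dict:
    calls.append(tool)
    return {"_posthogUrl": "https://example.invalid/experiments/1"}


create_then_add_metrics(fake_call_tool, {"name": "Demo"}, lambda created: {"placeholder": True})
print(calls)  # ['experiment-create', 'experiment-update']
```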

How to create

Call experiment-create with:

{
  "name": "Descriptive experiment name",
  "feature_flag_key": "kebab-case-key",
  "description": "Hypothesis: [what you expect to happen]",
  "feature_flag": {
    "filters": {
      "multivariate": {
        "variants": [
          { "key": "control", "name": "Control", "rollout_percentage": 50 },
          { "key": "test", "name": "Test", "rollout_percentage": 50 }
        ]
      },
      "groups": [{ "properties": [], "rollout_percentage": 100 }]
    },
    "ensure_experience_continuity": false
  }
}
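For agents that assemble the arguments programmatically, the payload above can be produced by a small builder. This is a sketch only: `build_experiment_payload` is a hypothetical helper, and it emits exactly the keys shown in the example and nothing else (no type, no metrics).

```python
import json


def build_experiment_payload(
    name: str,
    feature_flag_key: str,
    hypothesis: str,
    variants: list[dict],
    rollout_percentage: int = 100,
    ensure_experience_continuity: bool = False,
) -> dict:
    """Assemble experiment-create arguments in the shape shown above."""
    return {
        "name": name,
        "feature_flag_key": feature_flag_key,
        "description": f"Hypothesis: {hypothesis}",
        "feature_flag": {
            "filters": {
                # Split of users inside the experiment across variants.
                "multivariate": {"variants": variants},
                # Overall gate: share of all users entering the experiment.
                "groups": [{"properties": [], "rollout_percentage": rollout_percentage}],
            },
            "ensure_experience_continuity": ensure_experience_continuity,
        },
    }


payload = build_experiment_payload(
    name="Checkout button color",
    feature_flag_key="checkout-button-color",
    hypothesis="A green button increases checkout completion",
    variants=[
        {"key": "control", "name": "Control", "rollout_percentage": 50},
        {"key": "test", "name": "Test", "rollout_percentage": 50},
    ],
)
print(json.dumps(payload, indent=2))
```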

Flag config goes in the feature_flag object, in the flag's own filters shape (not the deprecated parameters keys). Two different percentages live there; do NOT mix them up:

filters.multivariate.variants[].rollout_percentage is how users inside the experiment are split across variants (the values must sum to 100; an even split is recommended).
filters.groups[0].rollout_percentage is the overall gate: what fraction of all users enter the experiment at all (0-100, defaults to 100).
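A short worked example may help keep the two percentages apart. Assuming the gate and the variant split compose multiplicatively (which is how the two descriptions above read, though the source does not spell it out), the share of all users that lands in each variant is gate times split divided by 100. `effective_exposure` is a hypothetical helper for illustration.

```python
def effective_exposure(variants: list[dict], gate_percentage: float) -> dict[str, float]:
    """Return the percentage of ALL users that ends up in each variant."""
    total = sum(v["rollout_percentage"] for v in variants)
    if total != 100:
        raise ValueError(f"variant percentages must sum to 100, got {total}")
    if not 0 <= gate_percentage <= 100:
        raise ValueError("gate percentage must be between 0 and 100")
    return {
        v["key"]: gate_percentage * v["rollout_percentage"] / 100
        for v in variants
    }


split = [
    {"key": "control", "rollout_percentage": 50},
    {"key": "test", "rollout_percentage": 50},
]
print(effective_exposure(split, 20))  # {'control': 10.0, 'test': 10.0}
```

With a 20% gate and a 50/50 split, each variant receives 10% of all users and the remaining 80% never enter the experiment.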

Key details:

Minimum 2, maximum 20 variants. No specific variant key is required — the analysis baseline defaults to the variant keyed "control" when present, else the first variant (override with stats_config.baseline_variant_key). Convention: key the baseline "control" unless the user asks for specific keys.
filters.groups[0].rollout_percentage defaults to 100 if omitted.
ensure_experience_continuity persists a user's variant across authentication steps; leave it false unless the flag is shown to both logged-out and logged-in users (see configuring-experiment-rollout).
Stats default to Bayesian. Only set stats_config if the user requests Frequentist.
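The variant limits and the baseline default described above can be mirrored in a few lines. This is an illustrative sketch, not PostHog's implementation: both function names are hypothetical, and requiring the override to be one of the variant keys is an assumption.

```python
from typing import Optional

MIN_VARIANTS, MAX_VARIANTS = 2, 20


def check_variant_count(keys: list[str]) -> None:
    """Raise if the number of variants is outside the 2 to 20 range."""
    if not MIN_VARIANTS <= len(keys) <= MAX_VARIANTS:
        raise ValueError(
            f"expected {MIN_VARIANTS} to {MAX_VARIANTS} variants, got {len(keys)}"
        )


def resolve_baseline(keys: list[str], override: Optional[str] = None) -> str:
    """Pick the analysis baseline: override, else "control", else the first key."""
    check_variant_count(keys)
    if override is not None:
        # Assumption: an override only makes sense if it names an existing variant.
        if override not in keys:
            raise ValueError(f"baseline override {override!r} is not a variant key")
        return override
    return "control" if "control" in keys else keys[0]


print(resolve_baseline(["control", "test"]))          # control
print(resolve_baseline(["blue", "green"]))            # blue
print(resolve_baseline(["blue", "green"], "green"))   # green
```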

After creation

Always show the experiment URL. The experiment-create response includes _posthogUrl — always display this link so the user can view and configure the experiment in the UI.
Remind the user to implement the feature flag in code. Link to the experiment page and say "implement the flag as shown here" — the experiment detail page shows implementation snippets for the user's SDK.
Guide through metrics if not yet configured — load the configuring-experiment-analytics skill.
Launch when ready — use the experiment-launch tool.
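The follow-up steps above can be folded into a single message shown to the user after creation. The sketch below assumes only that the response carries `_posthogUrl`, as stated above; `post_creation_message` and its wording are hypothetical.

```python
def post_creation_message(response: dict, metrics_configured: bool) -> str:
    """Build the follow-up text shown to the user after experiment-create."""
    url = response["_posthogUrl"]
    lines = [
        f"Experiment draft created: {url}",
        f"Implement the feature flag in your code as shown here: {url}",
    ]
    if not metrics_configured:
        lines.append("Next: set up metrics (see configuring-experiment-analytics).")
    lines.append("When ready, launch with the experiment-launch tool.")
    return "\n".join(lines)


# The URL below is a placeholder, not a real PostHog link.
example_response = {"_posthogUrl": "https://example.invalid/experiments/1"}
print(post_creation_message(example_response, metrics_configured=False))
```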

Guides agents through the 3-step experiment creation flow: defining the hypothesis, configuring rollout, and setting up analytics. Delegates rollout decisions to configuring-experiment-rollout and metric setup to configuring-experiment-analytics. TRIGGER when: user asks to create a new experiment or A/B test, OR when you are about to call experiment-create. DO NOT TRIGGER when: user is updating an existing experiment, managing lifecycle, or only browsing experiments.

61 stars

testing

Updated Jul 18, 2026

npx skillsauth add posthog/ai-plugin creating-experiments

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: container and dependency vulnerability scanner. Clean, 95%
Semgrep: static code analysis for vulnerabilities. Clean, 95%
mcp-scan (Snyk): Model Context Protocol security validation. Clean, 95%
Snyk (dep): open source security scanning. Skipped, 50%
Socket.dev: supply chain security analysis. Skipped, 50%
VirusTotal: multi-engine malware detection. Skipped, 50%
CrowdStrike: advanced threat intelligence. Skipped, 50%
OSV-Scanner: Open Source Vulnerability database check. Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Jul 18, 2026, 6:04 AM · 163.9s · 1 file scanned

Related Skills

posthog/signals-scout-tasks

data-ai

Verified, Trusted, Community

Signals scout for PostHog Tasks, the agent work items a project runs. Two lenses: delivery health (runs failing, clustered by repository and error class, and retry storms) every run, and on a slower rotation demand (recurring asks across human-authored tasks that point at a product gap). Skips the scout fleet's own run rows.

65 · SKILL.md · Updated Jul 29, 2026

posthog/signals-scout-conversations

devops

Verified, Trusted, Community

Signals scout for the PostHog Conversations (support inbox) product. Watches the `$conversation_*` ticket-lifecycle events for support-delivery regressions — SLA breach-rate steps, first-response latency blowouts, backlog inflow-vs-resolution imbalance, and channel / assignment concentration — and files each dated regression as a report. Complements the per-ticket product-feedback signals the emission pipeline already fires; does not re-surface individual ticket content.

65 · SKILL.md · Updated Jul 29, 2026

posthog/setting-up-data-catalog

development

Verified, Trusted, Community

Populates and maintains a project's data catalog (semantic layer): canonical metrics, trust marks (certifications) on warehouse tables/views, and reviewed table relationships. Use when asked to set up / seed / bootstrap the data catalog or semantic layer, to catalog a project's metrics, to certify or deprecate data sources, to propose or review table joins, or to work through the proposal review queue. To *use* an existing catalog to answer a business-number question, see querying-posthog-data instead. Trigger terms: data catalog, semantic layer, canonical metric, certify table, deprecate source, relationship proposal, metric drift, review queue.

65 · SKILL.md · Updated Jul 29, 2026

posthog/investigating-logs

tools

Verified, Trusted, Community

Investigate logs in a PostHog project: verify a service or deployment is healthy, explain an error spike, triage an incident, or understand what a log stream is saying. Use when the user asks to "check the logs", asks whether a service, deploy, release, or change is working or broke anything, asks why errors are up or what changed, or wants the root cause of failures visible in logs. Routes the logs MCP tools (services overview, pattern mining, before/after pattern diffing, bucketed counts, facets, raw rows) so investigations start from summaries instead of raw rows or hand-written SQL over the logs table.

65 · SKILL.md · Updated Jul 29, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

# Clone the repo
git clone https://github.com/posthog/ai-plugin.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into the global skills folder
cp -r ai-plugin/skills/creating-experiments ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

posthog/ai-plugin

61 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT