skills/creating-replay-vision-scanners/SKILL.md
Guides agents through creating and safely sizing a Replay Vision scanner: choosing the scanner type (monitor/classifier/scorer/summarizer), shaping the RecordingsQuery that selects sessions, and — crucially — estimating observation volume and checking the org's monthly quota before creating, so a broad scanner doesn't exhaust the budget on its first scheduled sweep. TRIGGER when: user asks to create, set up, or configure a Replay Vision scanner, OR when you are about to call vision-scanners-create, OR when widening an existing scanner's query or sampling_rate via vision-scanners-update. DO NOT TRIGGER when: only reading scanners or observations, deleting a scanner, or running an existing scanner against a single session on demand (vision-scanners-scan-session).
A scanner is a standing LLM probe over session recordings. Once created and enabled, it runs on a
Temporal schedule that sweeps every 5 minutes, applying its prompt to each new matching recording and
recording the result as an observation (a queryable $recording_observed event). Each observation counts
against a monthly org quota (a fixed number of observations per calendar month).
That schedule is exactly why creation needs a gut-check: a scanner with a permissive query and full sampling starts consuming quota automatically and can drain the whole month's budget within its first few sweeps. Creation itself does not check quota — that protection only kicks in at observation time, by which point the budget may already be gone.
Never create an enabled scanner blind. Estimate its volume, check remaining quota, and — when the projected volume is a meaningful fraction of what's left — show the user the numbers and get confirmation before creating. This is the heart of the skill; the rest is supporting detail.
Pick a scanner_type and write its scanner_config. Every type needs a prompt; the rest is type-specific:
| Type | What it produces | scanner_config shape |
| ------------ | ----------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| monitor | Open-ended observation against a prompt (e.g. "flag rage clicks") | {"prompt": "..."} |
| classifier | Assigns tags from a fixed label set | {"prompt": "...", "tags": ["tag-a", "tag-b"]} — tags needs ≥1 entry; optional "multi_label": true, "allow_freeform_tags": false |
| scorer | Numeric score on a rubric | {"prompt": "...", "scale": {"min": 1, "max": 5, "label": "frustration"}} — min < max; label optional |
| summarizer | Free-text summary; optional facet embeddings for search | {"prompt": "..."}; optional "length": "short" \| "medium" \| "long" (default "medium"), "emits_embeddings": false |
scanner_type is locked after creation — to change it you delete and recreate, so confirm the type is
right up front, and get the scanner_config shape right (a wrong shape is a create error, not a silent
default).
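Because a wrong scanner_config shape is a create error, it can help to check the shape locally first. The following is a minimal sketch of the rules in the table above. The function name validate_scanner_config is hypothetical and is not part of any Replay Vision tool; the real API's validation may be stricter.

```python
def validate_scanner_config(scanner_type: str, config: dict) -> list:
    """Return a list of problems; an empty list means the shape looks valid.

    Hypothetical local pre-check mirroring the table above, not the API's own validator.
    """
    problems = []

    # Every scanner type needs a non-empty prompt.
    prompt = config.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required for every scanner type")

    if scanner_type == "monitor":
        pass  # prompt only
    elif scanner_type == "classifier":
        tags = config.get("tags")
        if not isinstance(tags, list) or len(tags) < 1:
            problems.append("classifier needs a tags list with at least one entry")
    elif scanner_type == "scorer":
        scale = config.get("scale")
        if not isinstance(scale, dict) or "min" not in scale or "max" not in scale:
            problems.append("scorer needs scale.min and scale.max")
        elif not scale["min"] < scale["max"]:
            problems.append("scorer needs scale.min < scale.max")
    elif scanner_type == "summarizer":
        # length is optional and defaults to "medium".
        if config.get("length", "medium") not in ("short", "medium", "long"):
            problems.append("summarizer length must be short, medium, or long")
    else:
        problems.append(f"unknown scanner_type: {scanner_type!r}")

    return problems
```

An empty result means the config matches the documented shape; anything else is worth fixing before calling vision-scanners-create.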
If the user's intent makes the type and prompt obvious, just proceed — don't interrogate them.
The query is a RecordingsQuery shape that selects which recordings the scanner watches. date_from and
date_to are ignored (the schedule controls time), so don't bother setting them. Narrow the query to the
sessions that actually matter — by event, URL, person property, duration, etc. A narrow query is the single
biggest lever on cost.
sampling_rate (0..1, default 1.0) is a random downsample applied after the query matches. Lower it to
trade coverage for budget.
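To see how the query and sampling_rate interact, here is a rough back-of-the-envelope sketch. It assumes monthly volume scales linearly from a measured window and that sampling is a simple multiplier. The function name and the 30-day month are illustrative assumptions; the actual estimate should always come from vision-scanners-estimate-create.

```python
def project_monthly_observations(
    matched_sessions_in_window: int,
    window_days: int,
    sampling_rate: float = 1.0,
    days_per_month: int = 30,  # assumption: a flat 30-day month
) -> float:
    """Rough linear projection; an illustration, not the tool's own formula."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be between 0 and 1")

    sessions_per_day = matched_sessions_in_window / window_days
    return sessions_per_day * days_per_month * sampling_rate
```

Under these assumptions, halving sampling_rate halves the projected volume, while narrowing the query reduces the matched sessions the projection starts from.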
Before creating, run both checks and reason about them together:

1. vision-scanners-estimate-create with the proposed query + sampling_rate. It returns matched_sessions_in_window, the window_days measured, and estimated_observations_per_month.
2. vision-quota-retrieve for remaining and exhausted against the org's monthly_quota.

Then decide:

- If estimated_observations_per_month comfortably fits within remaining, proceed.
- If it is a meaningful fraction of, or exceeds, remaining, stop and tell the user the concrete numbers (e.g. "This scanner is projected to produce ~X observations/month; you have Y of Z left this month.") and confirm before creating, or suggest tightening the query or lowering sampling_rate first.
- If the quota is already exhausted, say so: a new enabled scanner won't produce anything until the quota resets, and its observations will be silently skipped.

Confirmation here is a conversation step, not an API capability: surface the trade-off and let the user choose. When the projected volume is clearly small relative to the budget, you don't need to ask.
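The decision above can be sketched as a small function. This is only an illustration: the names quota_decision and confirmation_message are hypothetical, and the 25% cutoff for "a meaningful fraction" is an assumption, since the skill leaves that threshold to judgment.

```python
def quota_decision(
    estimated_per_month: float,
    remaining: float,
    exhausted: bool,
    confirm_fraction: float = 0.25,  # assumption: what counts as "meaningful"
) -> str:
    """Return "proceed", "confirm", or "exhausted" for a proposed scanner."""
    if exhausted or remaining <= 0:
        return "exhausted"
    if estimated_per_month >= confirm_fraction * remaining:
        return "confirm"
    return "proceed"


def confirmation_message(estimated_per_month: float, remaining: int, monthly_quota: int) -> str:
    """Format the concrete numbers to show the user before creating."""
    return (
        f"This scanner is projected to produce ~{round(estimated_per_month)} "
        f"observations/month; you have {remaining} of {monthly_quota} left this month."
    )
```

A "confirm" result is a cue to show the message and wait for the user's answer, or to suggest a narrower query or lower sampling_rate. It is not something the API enforces.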
Call vision-scanners-create. Minimal example:
```json
{
  "name": "Rage click monitor",
  "scanner_type": "monitor",
  "scanner_config": { "prompt": "Flag sessions where the user repeatedly clicks the same element in frustration." },
  "query": { "kind": "RecordingsQuery", "events": [{ "id": "$rageclick", "type": "events" }] },
  "sampling_rate": 1.0,
  "model": "gemini-3-flash-preview",
  "enabled": true
}
```
name must be unique within the team. Set enabled: false if the user wants to create it paused (no
schedule, no quota consumption) and turn it on later.
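When the user is unsure about volume, creating the scanner paused is the safe default. The sketch below assembles a create payload with enabled set to false unless asked otherwise, and drops date_from and date_to because the schedule ignores them. The helper build_create_payload is hypothetical; it only builds the dictionary and does not call vision-scanners-create.

```python
import json

# Keys the schedule ignores, per the query guidance above.
IGNORED_QUERY_KEYS = ("date_from", "date_to")


def build_create_payload(name, scanner_type, scanner_config, query, model,
                         sampling_rate=1.0, enabled=False):
    """Assemble a create payload; paused by default (hypothetical helper)."""
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be between 0 and 1")

    cleaned_query = {k: v for k, v in query.items() if k not in IGNORED_QUERY_KEYS}
    return {
        "name": name,
        "scanner_type": scanner_type,
        "scanner_config": scanner_config,
        "query": cleaned_query,
        "sampling_rate": sampling_rate,
        "model": model,
        "enabled": enabled,
    }


payload = build_create_payload(
    name="Rage click monitor",
    scanner_type="monitor",
    scanner_config={"prompt": "Flag sessions where the user repeatedly clicks the same element in frustration."},
    query={
        "kind": "RecordingsQuery",
        "date_from": "-7d",  # dropped: the schedule controls time
        "events": [{"id": "$rageclick", "type": "events"}],
    },
    model="gemini-3-flash-preview",
)
print(json.dumps(payload, indent=2))
```

Defaulting to paused means a forgotten argument costs nothing; the scanner can be enabled later with a partial update.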
- Read results with vision-scanners-observations-list for one scanner over time, or vision-observations-list (requires session_id) for every scanner's findings on a single session. To dig into a recording, hand off to the investigating-replay skill.
- vision-scanners-update is a partial update: send only changed fields. Re-run the pre-creation gut-check (estimate plus quota) whenever you widen scope: a broader query or a higher sampling_rate raises the sweep volume just like a fresh broad scanner would. Toggling enabled, tweaking the prompt, or narrowing the query don't need a re-estimate. Editing config bumps scanner_version; past observations keep a snapshot of the old config.
- An observation can end as ineligible (e.g. too_short, no_recording), which is a terminal non-error outcome. Check error_reason when triaging why a scanner produced nothing.
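The rule for when an update needs a fresh estimate can be sketched as follows. The function needs_reestimate is hypothetical, and it is deliberately conservative: code cannot tell a narrowed query from a widened one, so it flags any query change, and you can skip the re-estimate when you know the change only narrows scope.

```python
def needs_reestimate(current: dict, patch: dict) -> bool:
    """Decide whether a partial update should trigger the estimate + quota check.

    current: the scanner's existing fields; patch: only the fields being changed.
    """
    # A higher sampling_rate widens scope; a lower one does not.
    new_rate = patch.get("sampling_rate")
    if new_rate is not None and new_rate > current.get("sampling_rate", 1.0):
        return True

    # Any query change is treated as a possible widening (conservative assumption).
    if "query" in patch and patch["query"] != current.get("query"):
        return True

    # Toggling enabled or editing the prompt does not change sweep volume.
    return False
```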