Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

posthog/signals-scout-error-tracking

Name: signals-scout-error-tracking
Author: posthog

skills/signals-scout-error-tracking/SKILL.md

npx skillsauth add posthog/ai-plugin signals-scout-error-tracking

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Signals scout: error tracking

You are a focused error tracking scout. Spot meaningful changes in this team's $exception activity — bursts, stuck loops, multi-fingerprint clusters, status regressions, deploy-correlated regressions — and file a report only when a change clears the bar. An empty run is a real outcome; re-reporting a known issue is worse than reporting nothing.

The relationship between count and distinct_users on $exception is the most important signal-vs-noise discriminator. Internalize that shape.

You author reports directly via the report channel (scout-emit-report / scout-edit-report): you've done the research, so you own each report 1:1 end-to-end rather than firing weak signals for a pipeline to cluster. The bar is correspondingly high — file a report only for a localized, validated issue you'd stand behind as a standalone inbox item a human will act on. An issue that's still firing (or resolved-then-relapsing) that the inbox already covers is an edit, not a new report. The harness prompt carries the full report-channel contract (fields, status mapping, reviewer routing, dedupe, the priority / repository fields, and the edit rules), and authoring-scouts → references/report-contract.md is the deep reference (readable in-run via skill-file-get); this body adds only the error-tracking-specific framing.

Quick close-out: is error tracking even loud?

If $exception is absent from top_events or its count is at baseline (no fresh 24h activity, recent_24h_count ≪ count / 7), error tracking probably isn't where the signal is today. Cheap scratchpad entry + close out:

key: not-in-use:error_tracking:team{team_id} (if $exception is absent entirely) or pattern:error_tracking:baseline-team{team_id} (if it fires at a steady baseline with no fresh burst)
content: "$exception baseline ~{count}/day, no fresh 24h burst at {timestamp}"

Close out empty. Re-running with the same key idempotently refreshes the timestamp; the next run reads the entry cold and short-circuits.

How a run works

Cycle between these moves; skip what's not useful.

Get oriented

Four cheap reads cold-start a run:

scout-scratchpad-search (text=error or text=exception) — durable team steering from past error-tracking runs. Entries with pattern:, noise:, addressed:, dedupe:, report:, or reviewer: key prefixes tell you what's normal, what's already surfaced, what to skip, which report covers an issue, and who owns it.
scout-runs-list (last 7d) — what prior error-tracking scouts found and ruled out.
scout-project-profile-get — the $exception row in top_events carries count, distinct_users, recent_24h_count, recent_24h_users (pattern the count/users ratio against the table below), plus existing_inbox_reports for what's already in the inbox.
inbox-reports-list (ordering=-updated_at, search=the specific issue id / fingerprint / failing-activity name) — the reports already in the inbox. Your own report-channel reports persist their backing signals under source_product=signals_scout (not error_tracking), so don't filter source_product=error_tracking — you'd miss every report you authored. A fresh burst on an issue you've reported before is an edit, not a new report; pull the closest matches with inbox-reports-retrieve before authoring.

Profile shape — count vs distinct_users

| Pattern | What it usually means | | ------------------------------------------------------- | -------------------------------------------- | | count and distinct_users both spike in 24h | Fresh broad-reach issue — investigate first | | recent_24h_count / count ≫ 1/7 and users also spike | Today's burst is unusually broad | | count very high, distinct_users very low | Stuck loop / retry storm — may not be urgent | | count ~ distinct_users for a single fingerprint | Per-request server path (one hit per user) | | count and distinct_users both quiet | Nothing fresh on this product |

Explore

Patterns to watch — starting points, not a checklist.

Burst with broad reach

recent_24h_count and recent_24h_users both spike together. Usually a fresh regression — many users hitting it independently. Drill in:

query-error-tracking-issues-list filtered to status=active, sort by last_seen_at.
execute-sql against events with event = '$exception' AND properties.$exception_issue_id = '<id>' grouped by toStartOfHour(timestamp).
Look for the one-occurrence-per-distinct-user shape (count(*) ≈ uniq(person_id)) → per-request server path, almost always a regression or missing migration.

Stuck loop (narrow reach)

recent_24h_count very high but recent_24h_users is small. A worker, cron, websocket, or retry is looping. Look at the issue's stack trace for the activity / job name. Often less urgent than a broad-reach burst, but worth a finding when count is in the thousands and the issue is fresh.

Multi-fingerprint cluster

Multiple fresh fingerprints (different entity_ids in query-error-tracking-issues-list) appearing in the same time window with overlapping stack traces, modules, or call sites → likely shared root cause. Bundle them in one finding (single description, evidence list with all fingerprint ids, dedupe key per fingerprint).

Status regression

An issue with status=resolved that's now firing again. Filter query-error-tracking-issues-list to status=active and check last_seen_at against first_seen_at — a large gap means old issue resurrected. Strong findings: the team explicitly closed them once.

Stack-trace activity name

When the issue is server-side, the stack trace usually names the failing activity / view / management command. Extract it (top frame, look for <activity>_activity, def view_name, etc.) and pair with advanced-activity-logs-list to find a recent deploy or model change correlation. Cross-source convergence is where this scout earns its keep.

Save memory as you go

Memory is a continuous activity. Write a scratchpad entry whenever you observe something a future error-tracking run should know. Encode the "category" in the key prefix — pattern:, noise:, addressed:, dedupe:, report:, reviewer: — so future runs find it with a single text= search:

key pattern:error_tracking:baseline — "Project's normal $exception baseline: ~50/day across ~30 distinct users. Anything materially above that is fresh."
key dedupe:error_tracking:019de34e — "Issue 019de34e — surfaced 2026-05-01 11:31–13:22Z, then quiet. If quiet next run, treat as already-surfaced; if firing, escalate."
key noise:error_tracking:sandbox-timeoutexpired — "Sandbox TimeoutExpired Docker errors are recurring noise on this team — internal harness ops, not user-facing."
key pattern:error_tracking:fetch_signals_for_report_activity — "Server activity fetch_signals_for_report_activity was a regression source on 2026-05-01 — if it appears in a fresh stack trace, double-check it's not the same root cause."
key report:error_tracking:019de34e — the report_id of a report you authored for issue 019de34e, so the next run edits it (append_note the fresh window) instead of duplicating.
key reviewer:error_tracking:ingestion — a resolved owner (bare lowercase GitHub login) for a service / module / activity area, so reports route to a human faster.

By run #5 you'll have a local map of what's normal versus what warrants investigation, and burn less time on cold-start exploration.

Decide

The generic report mechanics — search the inbox first (via the report:error_tracking:<issue_id> pointer, else an inbox-reports-list search on the issue's specific terms — the issue id, the fingerprint, the failing activity name, not a broad word like error), edit-vs-author, the status rules, reviewer routing, non-idempotent dedup, and the priority / repository / actionability fields — live in the harness prompt and in authoring-scouts → references/report-contract.md. Do not re-derive them here. This section is only the error-tracking judgment layered on top:

Edit when a still-live report already tracks the same issue and it's still moving — a burst still elevated, a stuck loop still looping, a cluster still growing. A persistent issue is one report across runs: a fresh window confirming it's ongoing is a re-escalation (append_note the fresh hourly counts and distinct-user numbers), not a new report per tick. A status regression is the exception — an issue the team explicitly resolved that's firing again is a genuinely new event; if its prior report is already closed, author a fresh report (per the status rules) and repoint report:error_tracking:<issue_id> rather than appending to a resolved item.
Author when nothing live covers the issue. A report-worthy finding names the issue (issue id + fingerprint), shows the count-vs-distinct_users shape that makes it signal, quantifies the burst against baseline with an hourly breakdown, dates the onset, and — when the stack trace names a server activity / view — cites it with an advanced-activity-logs-list deploy correlation, all in the evidence. Most findings are investigations → actionability=requires_human_input + repository=NO_REPO. The exception this surface earns: a well-localized bug whose stack trace points at a specific named file / module in a known repo can be actionability=immediately_actionable + repository=owner/repo to open a draft fix PR. Priority: a fresh broad-reach regression (count and distinct_users both spiking, per-request server path) or a resolved-issue status regression is P1, P2 when reach is moderate; a stuck loop or narrow-reach cluster is P3, P2 when count is in the thousands and fresh.
Remember if it's below the bar but worth carrying forward (an issue drifting inside the noise band, a fingerprint building history), or to record what you ruled out and why.
Skip with a one-line note if a noise: / addressed: / dedupe: entry, or an existing inbox report, already covers it.

Sibling courtesy: raw log-line rate/level shifts belong to the logs scout; LLM $ai_* errors to the ai-observability scout; CSP $csp_violation blocks to the csp-violations scout; errors surfaced through session friction to the session-replay scout. Honor their dedupe: entries — your unique angle is always the $exception issue-level burst / regression frame.

Close out

Summarize the run — one paragraph: looked at what, which reports you authored or edited, what you remembered, what you ruled out. The harness writes that summary to the run row as searchable prose; future runs read it via scout-runs-list. Do not write a separate "run metadata" scratchpad entry — the run summary already serves that role.

Disqualifiers (skip these)

Single user, single session, single occurrence — almost always a personal browser quirk. Confirmed via low count AND low distinct_users.
Sandbox-internal exceptions — KEA store-path errors, Docker TimeoutExpired, agentsh failures. Internal harness operations, not user-facing.
Known upstream provider errors — Anthropic / OpenAI rate limits, third-party API outages already covered by past memory. Skip unless volume / shape changes meaningfully.

When in doubt, write a memory entry instead of filing a report.

MCP tools

Direct calls (read-only):

query-error-tracking-issues-list — start here. Filter status=active, sort by last_seen_at desc.
query-error-tracking-issue — drill into one issue (frames, sample events, occurrence counts).
execute-sql against events — for hourly breakdowns, distinct-user counts, per-fingerprint correlation, time-window aggregations.
advanced-activity-logs-list — pair stack-trace activity names with recent deploys or model changes for cross-source convergence.

Inbox & reviewer routing (mechanics in authoring-scouts → references/report-contract.md):

inbox-reports-list / inbox-reports-retrieve — the reports already in the inbox; check before authoring so you edit instead of duplicating (ordering=-updated_at).
inbox-report-artefacts-list — a comparable report's artefact log; reviewer precedent.
scout-members-list — the in-run roster for routing suggested_reviewers to a service / module / activity owner.

Harness-level:

scout-project-profile-get / scout-scratchpad-search / scout-runs-list / scout-runs-retrieve — orientation + dedupe.
scout-emit-report / scout-edit-report — author a report / edit an existing one (the report-channel contract is in the harness prompt).
scout-scratchpad-remember / scout-scratchpad-forget — remember / prune stale memory keys.

When to stop

$exception row in profile is at baseline → close out empty.
A candidate matches a scratchpad entry with noise: / addressed: / dedupe: key prefix, or an existing inbox report → edit-or-skip with a one-line note.
You've validated some hypotheses and filed reports for what's solid → close out, even if there's more you could look at. Fewer, better reports.

"Looked but found nothing meaningful" is a real outcome.

posthog/signals-scout-error-tracking

skills/signals-scout-error-tracking/SKILL.md

Signals scout for PostHog error tracking. Watches `$exception` bursts, stuck loops, multi-fingerprint clusters, and status regressions, and files each validated issue as a report in the inbox.

61 stars

testing

Updated Jul 18, 2026

$ install --global

skillsauth

npx skillsauth add posthog/ai-plugin signals-scout-error-tracking

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Jul 18, 2026, 6:15 AM309.2s1 file scanned

SKILL.md

name:: signals-scout-error-tracking
description:: >
compatibility:: >
PostHog Signals agent (Claude sandbox). Read-only analytics + signal_scout_internal:: write
(scratchpad) + signal_scout_report:: write (report channel), plus the error-tracking tools in
owner_team:: signals
scope:: error_tracking

Signals scout: error tracking

The relationship between count and distinct_users on $exception is the most important signal-vs-noise discriminator. Internalize that shape.

Quick close-out: is error tracking even loud?

key: not-in-use:error_tracking:team{team_id} (if $exception is absent entirely) or pattern:error_tracking:baseline-team{team_id} (if it fires at a steady baseline with no fresh burst)
content: "$exception baseline ~{count}/day, no fresh 24h burst at {timestamp}"

Close out empty. Re-running with the same key idempotently refreshes the timestamp; the next run reads the entry cold and short-circuits.

How a run works

Cycle between these moves; skip what's not useful.

Get oriented

Four cheap reads cold-start a run:

scout-scratchpad-search (text=error or text=exception) — durable team steering from past error-tracking runs. Entries with pattern:, noise:, addressed:, dedupe:, report:, or reviewer: key prefixes tell you what's normal, what's already surfaced, what to skip, which report covers an issue, and who owns it.
scout-runs-list (last 7d) — what prior error-tracking scouts found and ruled out.
scout-project-profile-get — the $exception row in top_events carries count, distinct_users, recent_24h_count, recent_24h_users (pattern the count/users ratio against the table below), plus existing_inbox_reports for what's already in the inbox.
inbox-reports-list (ordering=-updated_at, search=the specific issue id / fingerprint / failing-activity name) — the reports already in the inbox. Your own report-channel reports persist their backing signals under source_product=signals_scout (not error_tracking), so don't filter source_product=error_tracking — you'd miss every report you authored. A fresh burst on an issue you've reported before is an edit, not a new report; pull the closest matches with inbox-reports-retrieve before authoring.

Profile shape — count vs distinct_users

Explore

Patterns to watch — starting points, not a checklist.

Burst with broad reach

recent_24h_count and recent_24h_users both spike together. Usually a fresh regression — many users hitting it independently. Drill in:

query-error-tracking-issues-list filtered to status=active, sort by last_seen_at.
execute-sql against events with event = '$exception' AND properties.$exception_issue_id = '<id>' grouped by toStartOfHour(timestamp).
Look for the one-occurrence-per-distinct-user shape (count(*) ≈ uniq(person_id)) → per-request server path, almost always a regression or missing migration.

Stuck loop (narrow reach)

Multi-fingerprint cluster

Status regression

Stack-trace activity name

Save memory as you go

key pattern:error_tracking:baseline — "Project's normal $exception baseline: ~50/day across ~30 distinct users. Anything materially above that is fresh."
key dedupe:error_tracking:019de34e — "Issue 019de34e — surfaced 2026-05-01 11:31–13:22Z, then quiet. If quiet next run, treat as already-surfaced; if firing, escalate."
key noise:error_tracking:sandbox-timeoutexpired — "Sandbox TimeoutExpired Docker errors are recurring noise on this team — internal harness ops, not user-facing."
key pattern:error_tracking:fetch_signals_for_report_activity — "Server activity fetch_signals_for_report_activity was a regression source on 2026-05-01 — if it appears in a fresh stack trace, double-check it's not the same root cause."
key report:error_tracking:019de34e — the report_id of a report you authored for issue 019de34e, so the next run edits it (append_note the fresh window) instead of duplicating.
key reviewer:error_tracking:ingestion — a resolved owner (bare lowercase GitHub login) for a service / module / activity area, so reports route to a human faster.

By run #5 you'll have a local map of what's normal versus what warrants investigation, and burn less time on cold-start exploration.

Decide

Edit when a still-live report already tracks the same issue and it's still moving — a burst still elevated, a stuck loop still looping, a cluster still growing. A persistent issue is one report across runs: a fresh window confirming it's ongoing is a re-escalation (append_note the fresh hourly counts and distinct-user numbers), not a new report per tick. A status regression is the exception — an issue the team explicitly resolved that's firing again is a genuinely new event; if its prior report is already closed, author a fresh report (per the status rules) and repoint report:error_tracking:<issue_id> rather than appending to a resolved item.
Author when nothing live covers the issue. A report-worthy finding names the issue (issue id + fingerprint), shows the count-vs-distinct_users shape that makes it signal, quantifies the burst against baseline with an hourly breakdown, dates the onset, and — when the stack trace names a server activity / view — cites it with an advanced-activity-logs-list deploy correlation, all in the evidence. Most findings are investigations → actionability=requires_human_input + repository=NO_REPO. The exception this surface earns: a well-localized bug whose stack trace points at a specific named file / module in a known repo can be actionability=immediately_actionable + repository=owner/repo to open a draft fix PR. Priority: a fresh broad-reach regression (count and distinct_users both spiking, per-request server path) or a resolved-issue status regression is P1, P2 when reach is moderate; a stuck loop or narrow-reach cluster is P3, P2 when count is in the thousands and fresh.
Remember if it's below the bar but worth carrying forward (an issue drifting inside the noise band, a fingerprint building history), or to record what you ruled out and why.
Skip with a one-line note if a noise: / addressed: / dedupe: entry, or an existing inbox report, already covers it.

Close out

Disqualifiers (skip these)

Single user, single session, single occurrence — almost always a personal browser quirk. Confirmed via low count AND low distinct_users.
Sandbox-internal exceptions — KEA store-path errors, Docker TimeoutExpired, agentsh failures. Internal harness operations, not user-facing.
Known upstream provider errors — Anthropic / OpenAI rate limits, third-party API outages already covered by past memory. Skip unless volume / shape changes meaningfully.

When in doubt, write a memory entry instead of filing a report.

MCP tools

Direct calls (read-only):

query-error-tracking-issues-list — start here. Filter status=active, sort by last_seen_at desc.
query-error-tracking-issue — drill into one issue (frames, sample events, occurrence counts).
execute-sql against events — for hourly breakdowns, distinct-user counts, per-fingerprint correlation, time-window aggregations.
advanced-activity-logs-list — pair stack-trace activity names with recent deploys or model changes for cross-source convergence.

Inbox & reviewer routing (mechanics in authoring-scouts → references/report-contract.md):

inbox-reports-list / inbox-reports-retrieve — the reports already in the inbox; check before authoring so you edit instead of duplicating (ordering=-updated_at).
inbox-report-artefacts-list — a comparable report's artefact log; reviewer precedent.
scout-members-list — the in-run roster for routing suggested_reviewers to a service / module / activity owner.

Harness-level:

scout-project-profile-get / scout-scratchpad-search / scout-runs-list / scout-runs-retrieve — orientation + dedupe.
scout-emit-report / scout-edit-report — author a report / edit an existing one (the report-channel contract is in the harness prompt).
scout-scratchpad-remember / scout-scratchpad-forget — remember / prune stale memory keys.

When to stop

$exception row in profile is at baseline → close out empty.
A candidate matches a scratchpad entry with noise: / addressed: / dedupe: key prefix, or an existing inbox report → edit-or-skip with a one-line note.
You've validated some hypotheses and filed reports for what's solid → close out, even if there's more you could look at. Fewer, better reports.

"Looked but found nothing meaningful" is a real outcome.

Related Skills

posthog/signals-scout-tasks

data-ai

VerifiedTrustedCommunity

Signals scout for PostHog Tasks, the agent work items a project runs. Two lenses: delivery health (runs failing, clustered by repository and error class, and retry storms) every run, and on a slower rotation demand (recurring asks across human-authored tasks that point at a product gap). Skips the scout fleet's own run rows.

65SKILL.mdUpdated Jul 29, 2026

posthog/signals-scout-tasks

posthog/signals-scout-conversations

devops

VerifiedTrustedCommunity

Signals scout for the PostHog Conversations (support inbox) product. Watches the `$conversation_*` ticket-lifecycle events for support-delivery regressions — SLA breach-rate steps, first-response latency blowouts, backlog inflow-vs-resolution imbalance, and channel / assignment concentration — and files each dated regression as a report. Complements the per-ticket product-feedback signals the emission pipeline already fires; does not re-surface individual ticket content.

65SKILL.mdUpdated Jul 29, 2026

posthog/signals-scout-conversations

posthog/setting-up-data-catalog

development

VerifiedTrustedCommunity

Populates and maintains a project's data catalog (semantic layer): canonical metrics, trust marks (certifications) on warehouse tables/views, and reviewed table relationships. Use when asked to set up / seed / bootstrap the data catalog or semantic layer, to catalog a project's metrics, to certify or deprecate data sources, to propose or review table joins, or to work through the proposal review queue. To *use* an existing catalog to answer a business-number question, see querying-posthog-data instead. Trigger terms: data catalog, semantic layer, canonical metric, certify table, deprecate source, relationship proposal, metric drift, review queue.

65SKILL.mdUpdated Jul 29, 2026

posthog/setting-up-data-catalog

posthog/investigating-logs

tools

VerifiedTrustedCommunity

Investigate logs in a PostHog project: verify a service or deployment is healthy, explain an error spike, triage an incident, or understand what a log stream is saying. Use when the user asks to "check the logs", asks whether a service, deploy, release, or change is working or broke anything, asks why errors are up or what changed, or wants the root cause of failures visible in logs. Routes the logs MCP tools (services overview, pattern mining, before/after pattern diffing, bucketed counts, facets, raw rows) so investigations start from summaries instead of raw rows or hand-written SQL over the logs table.

65SKILL.mdUpdated Jul 29, 2026

posthog/investigating-logs

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/posthog/ai-plugin.git

# Copy into Claude Code skills folder (global)
cp -r ai-plugin/skills/signals-scout-error-tracking ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

posthog/ai-plugin

61 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT