skills/tuning-incremental-sync-config/SKILL.md
Change the sync configuration of an existing data warehouse schema — switch sync_type, pick a different incremental_field, set primary_key_columns, choose cdc_table_mode, or change sync_frequency. Use when the user asks "switch my orders table from full refresh to incremental", "this table is syncing too slowly / too frequently", "I need to pick a different incremental column", "set up CDC for this Postgres table", or when diagnosis of a failing sync pointed to an incremental-field or PK misconfiguration.
A sync's configuration lives on the ExternalDataSchema and can be changed any time via
external-data-schemas-partial-update. Most changes are non-destructive (take effect on the next sync), but a few
(switching sync_type, changing primary keys) require careful handling to avoid corrupting the synced data.
If the user is setting up a brand-new source, use setting-up-a-data-warehouse-source instead — configuration is
chosen at creation time there.
| Tool | Purpose |
| ------------------------------------------------------ | ------------------------------------------------------------------------- |
| external-data-schemas-retrieve | Current sync_type, incremental_field, PKs, sync_frequency |
| external-data-schemas-incremental-fields-create | Refresh candidate incremental fields from the live source |
| external-data-schemas-partial-update | Apply the config change |
| external-data-schemas-reload | Trigger a sync with the new config |
| external-data-schemas-resync | Wipe and re-import from scratch when the change invalidates existing data |
| external-data-schemas-delete-data | Drop the synced table while keeping the schema entry |
| external-data-sources-check-cdc-prerequisites-create | Pre-flight Postgres CDC (only when switching to/from CDC) |
| external-data-sources-webhook-info-retrieve | Current webhook state (when switching to/from sync_type=webhook) |
| external-data-sources-create-webhook-create | Register a webhook after switching a schema to sync_type=webhook |
| external-data-sources-update-webhook-inputs-create | Rotate a webhook signing secret |
| external-data-sources-delete-webhook-create | Unregister webhook when switching schemas off sync_type=webhook |
From the partial-update endpoint:
| Field | Values | Notes |
| ------------------------ | ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------ |
| sync_type | full_refresh, incremental, append, cdc, webhook | Source must support the target type — check via incremental-fields |
| incremental_field | Column name from the source | Must appear in incremental_fields list for the schema |
| incremental_field_type | datetime, date, timestamp, integer, numeric, objectid | Must match the column's real type |
| primary_key_columns | Array of column names | Required for CDC. Used for upsert dedup on incremental |
| cdc_table_mode | consolidated, cdc_only, both | Only meaningful when sync_type=cdc |
| sync_frequency | 1min, 5min, 15min, 30min, 1hour, 6hour, 12hour, 24hour, 7day, 30day, never | Applies to all non-CDC types |
| sync_time_of_day | HH:MM:SS | When sync_frequency is daily/weekly-scale |
| should_sync | true / false | Pause the schema without deleting it |
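A minimal sketch of checking a change payload against the table above before sending it. The allowed values are copied from the table; the `validate_changes` helper and its `current` argument are hypothetical and not part of any PostHog API.

```python
import re

# Allowed values copied from the field table; this validator is an illustrative helper only.
ALLOWED = {
    "sync_type": ("full_refresh", "incremental", "append", "cdc", "webhook"),
    "incremental_field_type": ("datetime", "date", "timestamp", "integer", "numeric", "objectid"),
    "cdc_table_mode": ("consolidated", "cdc_only", "both"),
    "sync_frequency": ("1min", "5min", "15min", "30min", "1hour", "6hour",
                       "12hour", "24hour", "7day", "30day", "never"),
}
TIME_OF_DAY = re.compile(r"^([01]\d|2[0-3]):[0-5]\d:[0-5]\d$")


def validate_changes(changes, current=None):
    """Return a list of human-readable problems; an empty list means the payload looks sane."""
    current = current or {}
    problems = []
    for field, allowed in ALLOWED.items():
        if field in changes and changes[field] not in allowed:
            problems.append(f"{field}={changes[field]!r} is not one of {list(allowed)}")
    if "sync_time_of_day" in changes and not TIME_OF_DAY.match(str(changes["sync_time_of_day"])):
        problems.append("sync_time_of_day must look like HH:MM:SS")
    if "should_sync" in changes and not isinstance(changes["should_sync"], bool):
        problems.append("should_sync must be true or false")
    # primary_key_columns is required for CDC; fall back to the values already on the schema.
    effective_type = changes.get("sync_type", current.get("sync_type"))
    effective_pks = changes.get("primary_key_columns", current.get("primary_key_columns"))
    if effective_type == "cdc" and not effective_pks:
        problems.append("sync_type=cdc requires primary_key_columns")
    return problems
```

This only catches values that the table rules out. Whether the source actually supports the target sync type still has to be confirmed against the live source.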
Always start with external-data-schemas-retrieve({id}). Understanding the current state prevents mistakes like
"fixing" an incremental_field that's actually correct.
Note:
- sync_type, incremental_field, incremental_field_type, primary_key_columns
- status (don't tune a schema that's currently Running; wait or cancel first)
- last_synced_at (so you can tell if the next sync worked)
- latest_error if present (the error often tells you exactly what to change)

Call external-data-schemas-incremental-fields-create({id}). Even though the operation name says "create", it
re-reads the source and returns the current candidate fields. Use it to confirm that the field you want to set
actually exists on the source and to see which sync types are now available for this table.
The response:
{
"incremental_fields": [{"field": "updated_at", "type": "datetime", ...}, ...],
"incremental_available": true,
"append_available": true,
"cdc_available": true,
"full_refresh_available": true,
"detected_primary_keys": ["id"],
"available_columns": [...]
}
If your target incremental_field isn't in the list, tell the user — they need to either pick a different field or
change the source table to add one.
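The check described above can be sketched as a small function over the response. This assumes the response has the shape shown in the sample; `check_target` is a hypothetical helper name, and the `supports_webhooks` flag is taken from the webhook workflow later in this document.

```python
# Maps each sync_type to the response flag that says whether it is available.
AVAILABILITY_FLAG = {
    "full_refresh": "full_refresh_available",
    "incremental": "incremental_available",
    "append": "append_available",
    "cdc": "cdc_available",
    "webhook": "supports_webhooks",
}


def check_target(response, sync_type, field=None):
    """Return (problems, field_type) for a desired sync_type and optional incremental field."""
    problems = []
    flag = AVAILABILITY_FLAG.get(sync_type)
    if flag is None:
        problems.append(f"unknown sync_type {sync_type!r}")
    elif not response.get(flag, False):
        problems.append(f"{sync_type} is not available for this table ({flag} is not true)")

    field_type = None
    if field is not None:
        # Index the candidate fields by name so the type can be reused in the update payload.
        by_name = {f["field"]: f.get("type") for f in response.get("incremental_fields", [])}
        if field in by_name:
            field_type = by_name[field]
        else:
            problems.append(f"{field!r} is not in incremental_fields; pick another field")
    return problems, field_type
```

When `problems` is empty, the returned `field_type` is a natural candidate for `incremental_field_type` in the update payload.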
Call external-data-schemas-partial-update({id}, {...changed fields}).
Only send the fields that are actually changing. Partial update means unspecified fields stay as they are.
Examples:
// Switch from full_refresh to incremental
{
"sync_type": "incremental",
"incremental_field": "updated_at",
"incremental_field_type": "datetime"
}
// Change sync frequency to hourly
{"sync_frequency": "1hour"}
// Fix wrong PK on a CDC table
{"primary_key_columns": ["tenant_id", "order_id"]}
// Pause a schema
{"should_sync": false}
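One way to make sure only changing fields are sent is to diff the desired config against what external-data-schemas-retrieve returned. This is a sketch only: `changed_fields` is a hypothetical helper, and it assumes both configs are plain dictionaries keyed by the field names in the table above.

```python
_MISSING = object()  # sentinel so that an explicit None still counts as a change on a missing key


def changed_fields(current, desired):
    """Return only the entries of `desired` whose value differs from `current`."""
    return {
        key: value
        for key, value in desired.items()
        if current.get(key, _MISSING) != value
    }


current = {"sync_type": "full_refresh", "sync_frequency": "24hour", "incremental_field": None}
desired = {
    "sync_type": "incremental",
    "incremental_field": "updated_at",
    "incremental_field_type": "datetime",
    "sync_frequency": "24hour",  # unchanged, so it is dropped from the payload
}
payload = changed_fields(current, desired)
```

Here `payload` contains `sync_type`, `incremental_field`, and `incremental_field_type`, matching the first example above.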
This is the step that's easy to get wrong. Some config changes invalidate the synced data; others don't.
Changes that DON'T invalidate existing data:
- sync_frequency, sync_time_of_day: scheduling only
- should_sync: on/off
- cdc_table_mode in most cases: the next sync starts writing to the new shape, but historical consolidated rows
  stay valid
- Switching between incremental and full_refresh with the same incremental_field: the next sync just re-runs
  fresh
- Switching to or from sync_type: "webhook" keeps the synced data valid; only the ingestion path changes.
  Remember to register or unregister the webhook (see sections below) alongside the sync_type change.

Changes that MAY invalidate existing data and need a resync:
- Changing incremental_field to a different column: the high-water mark is from the old column and won't match.
  Without a resync you'll miss rows that were updated between the two fields' histories.
- Changing primary_key_columns: existing rows may be deduplicated incorrectly against the new PK definition.
- Switching from full_refresh to append: the existing rows don't have the version-history shape that append
  expects.
- Switching from append to full_refresh: the opposite problem; you'll end up with duplicate historical versions.
- Switching to or from cdc: the table shape changes fundamentally.

When the change invalidates data, the clean flow is:
1. external-data-schemas-partial-update with the new config
2. external-data-schemas-resync to wipe and re-import under the new config

Or equivalently, external-data-schemas-delete-data → external-data-schemas-reload. delete-data + reload is
cleaner when the table is large and the user wants to start from zero.
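The rules above can be sketched as a function that decides between reload and resync. This is one reading of those rules, not the product's own logic: `resync_reasons` is a hypothetical helper, and it assumes that setting incremental_field for the first time (no previous value) does not count as changing it.

```python
def resync_reasons(current, changes):
    """Return the reasons a change needs a resync; an empty list means a reload is enough."""
    reasons = []

    old_type = current.get("sync_type")
    new_type = changes.get("sync_type", old_type)
    if new_type != old_type:
        if "cdc" in (old_type, new_type):
            reasons.append("switching to or from cdc changes the table shape")
        elif {old_type, new_type} == {"full_refresh", "append"}:
            reasons.append("full_refresh and append store rows in different shapes")

    # Assumption: only replacing an existing incremental_field moves the high-water mark.
    old_field = current.get("incremental_field")
    new_field = changes.get("incremental_field", old_field)
    if old_field is not None and new_field != old_field:
        reasons.append("the high-water mark belongs to the old incremental_field")

    if "primary_key_columns" in changes and changes["primary_key_columns"] != current.get("primary_key_columns"):
        reasons.append("existing rows were deduplicated against the old primary key")

    return reasons
```

A caller would follow partial-update with external-data-schemas-resync when the list is non-empty (after warning the user), and with external-data-schemas-reload otherwise.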
For non-destructive changes, call external-data-schemas-reload({id}) to pick up the new config immediately rather
than waiting for the schedule.
Wait a moment, then call external-data-schemas-retrieve({id}) to confirm that status moves to Running and then
to Completed. Report last_synced_at and any new latest_error.
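The verification step can be sketched as a polling loop. Everything here is illustrative: `wait_for_sync` is a hypothetical helper, `retrieve` stands in for whatever performs the external-data-schemas-retrieve call, and the only status names assumed are Running and Completed, as used in this document.

```python
import time


def wait_for_sync(retrieve, previous_last_synced_at, timeout_s=600.0, poll_s=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until a new sync completes. Returns (succeeded, last_schema_seen)."""
    deadline = clock() + timeout_s
    seen_running = False
    while True:
        schema = retrieve()
        status = schema.get("status")
        if status == "Running":
            seen_running = True
        elif status == "Completed" and schema.get("last_synced_at") != previous_last_synced_at:
            # A newer last_synced_at shows that this run, not an older one, finished.
            return True, schema
        elif seen_running:
            # The run started and then ended in some other state; report latest_error.
            return False, schema
        if clock() >= deadline:
            return False, schema
        sleep(poll_s)
```

On a `False` result, the returned schema's latest_error (if any) is what to report to the user.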
Switching from full_refresh to incremental:
1. incremental-fields-create to confirm the desired field exists and incremental_available: true.
2. partial-update: {sync_type: "incremental", incremental_field, incremental_field_type}.

Switching to CDC:
1. external-data-sources-check-cdc-prerequisites-create on the parent source. Only proceed if valid: true.
2. incremental-fields-create to confirm cdc_available: true and see detected_primary_keys.
3. partial-update: {sync_type: "cdc", primary_key_columns: [...], cdc_table_mode: "consolidated"}.
4. external-data-schemas-resync after the update. Warn the user this wipes existing data.

Recovering from a missing incremental field:
The source dropped the updated_at column, and the sync has been failing with "column does not exist".
1. incremental-fields-create to see what fields remain.
2. Pick a replacement field (or fall back to full_refresh if none are suitable).
3. partial-update with the new field + type (or new sync_type).
4. reload to retry.

Fixing a wrong primary key:
1. partial-update: {primary_key_columns: [...]}.
2. resync, and warn the user that this wipes existing data.

Changing sync frequency:
1. partial-update: {sync_frequency: "1hour"}.

Switching to sync_type: "webhook":
Only works for sources that implement WebhookSource (today: Stripe) and tables where supports_webhooks: true
from incremental-fields-create.
1. incremental-fields-create to confirm supports_webhooks: true for the table.
2. partial-update: {sync_type: "webhook"}.
3. If no webhook is registered on the source yet (check with webhook-info-retrieve), call
   external-data-sources-create-webhook-create({source_id}) to register it.
4. Keep a sync_frequency set (e.g. 24hour); it acts as a safety-net reconciliation in case any webhook delivery
   is missed.

Switching off sync_type: "webhook":
1. partial-update: {sync_type: "incremental"} (or whatever bulk type is appropriate) with the required
   incremental_field + incremental_field_type.
2. If no other schema on the source still uses sync_type: "webhook", call
   external-data-sources-delete-webhook-create({source_id}) to unregister. Leaving an orphaned webhook
   registered on the source side just means events will be received and dropped, which is not harmful but messy.

Rotating a webhook signing secret:
The source's signing secret (e.g. Stripe's whsec_...) was rotated, and payloads are now failing signature
verification.
1. external-data-sources-update-webhook-inputs-create({source_id}, {inputs: {signing_secret: "whsec_..."}}).

Pausing a schema:
1. partial-update: {should_sync: false}. Schema stops syncing but stays configured.

Resuming a schema:
1. partial-update: {should_sync: true}, then reload for an immediate run.

Tips:
- Retrieve the current config before updating. partial-update doesn't complain if you set a field to the value
  it already had, but you might be about to change something you didn't realize was already set.
- Refresh the candidate fields before changing sync_type. The incremental-fields-create response tells you
  what's available right now, which can be different from what was available at creation (e.g. CDC may have
  been enabled for the team since).
- Don't switch to sync_type: "cdc" without running check-cdc-prerequisites-create first. The sync will just
  fail immediately.
- If the schema is currently Running, call external-data-schemas-cancel before applying the change. Updating
  config mid-sync can leave the incremental high-water mark inconsistent.
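The guardrails in this document (no updates while a sync is Running, no CDC switch without a passing prerequisite check, awareness of no-op fields) can be collected into one pre-flight check. This is a sketch under assumptions: `preflight` is a hypothetical helper, `schema` is the retrieve response as a dictionary, and `cdc_prerequisites_valid` is the `valid` flag from check-cdc-prerequisites-create, or None if the check was never run.

```python
def preflight(schema, changes, cdc_prerequisites_valid=None):
    """Return (blockers, noop_fields) for a planned partial-update."""
    blockers = []
    if schema.get("status") == "Running":
        blockers.append("schema is Running: wait, or call external-data-schemas-cancel first")

    switching_to_cdc = changes.get("sync_type") == "cdc" and schema.get("sync_type") != "cdc"
    if switching_to_cdc and cdc_prerequisites_valid is not True:
        blockers.append("run check-cdc-prerequisites-create and proceed only if valid is true")

    # Fields that already hold the requested value; the update would accept them silently.
    noop_fields = [key for key, value in changes.items() if key in schema and schema[key] == value]
    return blockers, noop_fields
```

An empty `blockers` list means the update can go ahead; `noop_fields` is informational and worth mentioning to the user before sending the payload.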