skills/clickhouse-managed-postgres-rca/SKILL.md
MUST USE when investigating performance issues on a ClickHouse-managed Postgres instance. Provides an evidence-based RCA workflow that scrapes the Prometheus endpoint for system signal, pulls per-digest evidence from the Slow Query Patterns API, and recommends (does not apply) a fix.
npx skillsauth add ClickHouse/agent-skills clickhouse-managed-postgres-rcaInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Trigger whenever a user reports slowness, high CPU, low throughput, cache thrash, or any unexplained pain on a ClickHouse-managed Postgres instance.
Two APIs on https://api.clickhouse.cloud (HTTP Basic auth
using a ClickHouse Cloud API key/secret pair):
postgresInstancePrometheusGet
under the Prometheus tag. Returns Prometheus exposition format.
System and workload metrics for one Postgres service.slowQueryPatternsGetList
under the Postgres tag. Returns per-digest latency, IO, and
call statistics for normalized query patterns. Beta.Both endpoints require an organizationId and a serviceId as
path parameters. The user must supply both, plus the API
key/secret pair.
seq_scan / idx_scan).Reason from IO and timing signals, not from a plan tree.
Six steps, in order. Do not skip ahead.
Steps 2 and 3 only share auth — no data dependency between
them. Run them in parallel (background curls, & + wait) to
cut wall time from sequential ~2s to ~1s.
These endpoints are Beta — paths, params, and JSON field names
can shift. Follow rules/openapi-discovery.md to:
https://api.clickhouse.cloud/v1.operationId:
postgresInstancePrometheusGet (Prometheus tag)slowQueryPatternsGetList (Postgres tag){ semantic role → actual field name }.Use the resolved names in every subsequent request and citation. Never hardcode field names from memory.
Follow rules/prometheus-scrape.md. One scrape, no wait.
You're after gauges (current values) that don't need a delta:
CacheHitRatio, ActiveConnections, MemoryUsedPercent,
FilesystemUsedPercent.
A CacheHitRatio well below ~95% on a workload that should
fit in cache is a real signal on its own. Climbing
ActiveConnections toward the pool ceiling is a real signal
on its own. These don't need rate-of-change.
A second scrape for counter deltas is opt-in, used only when Step 4 triage points at write-congestion (where deadlock and rollback rates matter and the Slow Query Patterns API can't substitute). For the read-path case (the most common RCA shape) the single scrape is enough.
Request the slow query patterns. Follow
rules/slow-query-patterns-fields.md for the fields that
matter and how to read them. This is the primary diagnostic —
it returns per-pattern accumulated totals (call count, runtime,
blocks, rows) over the window you request, which is the
"rate-of-change" data you'd otherwise derive from two Prom
scrapes — but per query and without waiting.
If no patterns return a meaningful totalDurationUs, the
report may be overstated or the issue isn't query-shaped.
Stop and tell the user what you looked at.
Follow rules/triage.md. Match the combined Prom + slow-query
signal to one of the heuristic shapes. Each shape points to a
specific heuristic file:
rules/heuristic-full-scan.md — read-path full scan.rules/heuristic-hot-loop.md — N+1 / hot loop from the app.rules/heuristic-write-congestion.md — deadlocks, slow
writes, high rollback rate.If the signal does not match any shape cleanly, do not invent a hypothesis. Surface the top patterns and ask the user which workload they recognize. New heuristics are welcome as PRs.
Use the format in rules/output-template.md. Always include:
symptom, evidence, hypothesis (noting any alternative cause
you cannot rule out from this surface alone), short-term fix,
and long-term follow-ups.
Follow rules/recommend-only.md. Never run DDL. Never call
pg_cancel_backend or pg_terminate_backend. Write the
recommendation, explain why, and let the human apply it.
For the complete guide with every rule expanded in a single
context load: AGENTS.md.
tools
Use when a user wants to build an application with ClickHouse, set up a local ClickHouse development environment, install ClickHouse, create a local server, create tables, or start developing with ClickHouse. Covers the full flow from zero to a working local ClickHouse setup.
tools
Use when the user wants to run SQL — especially analytical SQL — on local files (parquet/csv/json), URLs, S3 paths, or remote databases (Postgres, MySQL, MongoDB, ClickHouse Cloud, Iceberg, Delta Lake) without setting up a server. Provides chDB — embedded ClickHouse SQL in Python with 1000+ functions, Session for stateful multi-step pipelines, parametrized queries, and cross-source joins via `s3()`, `mysql()`, `postgresql()`, `iceberg()`, `deltaLake()`, `remoteSecure()` table functions. TRIGGER when: user wants SQL on parquet/csv/files or across remote analytical sources; uses ClickHouse SQL features (window functions, windowFunnel, geoToH3, JSON path ops, Session, parametrized queries); imports `chdb` or calls `chdb.query()`. SKIP this skill for pandas-style DataFrame method-chaining (use chdb-datastore instead) or ClickHouse server administration.
tools
Use when the user has tabular data (pandas DataFrame, parquet, csv, Arrow, json) and wants to filter, group, aggregate, join, or speed up slow pandas. Provides chDB DataStore — same pandas API, ClickHouse engine underneath. Also handles reading from S3, MySQL, PostgreSQL, MongoDB, ClickHouse Cloud, Iceberg, Delta Lake as DataFrames and joining across sources. TRIGGER when: user mentions DataFrame, parquet, csv, "fast pandas", "speed up pandas", or cross-source DataFrame joins; user imports `chdb.datastore` or `from datastore import DataStore`. SKIP this skill for raw SQL syntax (use chdb-sql instead), ClickHouse server administration, or non-Python DataStore API work.
tools
Write idiomatic application code with the ClickHouse Node.js client (`@clickhouse/client`). Use this skill whenever a user is *building* against the Node.js client — configuring the client, pinging, inserting rows in JSON or raw formats, selecting and parsing results, binding query parameters, managing sessions and temporary tables, working with data types or customizing JSON parsing. Do NOT use for browser/Web client code.