ClickHouse/clickhouse-managed-postgres-rca

Name: clickhouse-managed-postgres-rca
Author: ClickHouse

skills/clickhouse-managed-postgres-rca/SKILL.md

npx skillsauth add ClickHouse/agent-skills clickhouse-managed-postgres-rca

ClickHouse Managed Postgres RCA

When to use

Trigger whenever a user reports slowness, high CPU, low throughput, cache thrash, or any unexplained pain on a ClickHouse-managed Postgres instance.

What you have access to

Two APIs on https://api.clickhouse.cloud (HTTP Basic auth using a ClickHouse Cloud API key/secret pair):

- Prometheus metrics: operation postgresInstancePrometheusGet under the Prometheus tag. Returns system and workload metrics for one Postgres service in Prometheus exposition format.
- Slow Query Patterns (Beta): operation slowQueryPatternsGetList under the Postgres tag. Returns per-digest latency, IO, and call statistics for normalized query patterns.

Both endpoints require an organizationId and a serviceId as path parameters. The user must supply both, plus the API key/secret pair.
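
As a rough sketch of the authentication described above, the snippet below builds an HTTP Basic Authorization header from a key/secret pair and fills organizationId and serviceId into a path template. The template string and all credential and ID values are placeholders invented for illustration; the real path template has to be resolved from the OpenAPI spec.

```python
import base64


def basic_auth_header(key_id: str, key_secret: str) -> dict[str, str]:
    """Return an HTTP Basic Authorization header for a key/secret pair."""
    token = base64.b64encode(f"{key_id}:{key_secret}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def fill_path(template: str, organization_id: str, service_id: str) -> str:
    """Substitute the two required path parameters into a path template."""
    return (template
            .replace("{organizationId}", organization_id)
            .replace("{serviceId}", service_id))


BASE_URL = "https://api.clickhouse.cloud"
# Hypothetical template, for illustration only; resolve the real one from the spec.
EXAMPLE_TEMPLATE = "/example/{organizationId}/{serviceId}"

headers = basic_auth_header("my-key-id", "my-key-secret")
url = BASE_URL + fill_path(EXAMPLE_TEMPLATE, "org-123", "svc-456")
```

Keeping the header builder separate from the path filler means the same credentials can be reused for both endpoints once their templates are known.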

What you do NOT have

- Query plans / EXPLAIN output.
- Per-table scan-type counters (seq_scan / idx_scan).
- Autovacuum or last-ANALYZE timestamps.

Reason from IO and timing signals, not from a plan tree.

Workflow

Six steps, in order. Do not skip ahead.

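Steps 2 and 3 share only authentication and have no data dependency between them, so they are the one exception to strict ordering: run them in parallel (for example, background curl processes joined with wait) to cut wall time from about 2s sequential to about 1s.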
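
The parallel run of steps 2 and 3 could look roughly like this in Python, using a thread pool in place of background curl processes. The two callables are stand-ins supplied by the caller (for instance, functions that perform the Prometheus scrape and the slow-query request); their names are illustrative and not part of the skill.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def run_in_parallel(scrape: Callable[[], A], patterns: Callable[[], B]) -> tuple[A, B]:
    """Run two independent fetches concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        scrape_future = pool.submit(scrape)
        patterns_future = pool.submit(patterns)
        # result() blocks until done and re-raises any exception from the worker.
        return scrape_future.result(), patterns_future.result()
```

Because neither call consumes the other's output, total wall time is bounded by the slower of the two requests rather than their sum.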

1. Discover the live API shape

These endpoints are Beta — paths, params, and JSON field names can shift. Follow rules/openapi-discovery.md to:

- Fetch the OpenAPI spec from https://api.clickhouse.cloud/v1.
- Locate the two operations by operationId:
  - postgresInstancePrometheusGet (Prometheus tag)
  - slowQueryPatternsGetList (Postgres tag)
- Resolve their path templates, required query parameters, and (for the slow-query endpoint) the response schema.
- Build a session-scoped role map from the schema property descriptions: { semantic role → actual field name }.

Use the resolved names in every subsequent request and citation. Never hardcode field names from memory.
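
The discovery step can be sketched as two small helpers over an already-parsed OpenAPI document: one that finds an operation by operationId, and one that builds a role map by matching keywords against property descriptions. This is a minimal illustration, not the procedure in rules/openapi-discovery.md: it assumes the spec is a plain dict, does not resolve $ref entries, and the keyword-matching strategy and role names are assumptions.

```python
from typing import Any, Optional

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def find_operation(spec: dict[str, Any], operation_id: str) -> Optional[dict[str, Any]]:
    """Locate an operation by operationId; return path, method, required query params."""
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS or op.get("operationId") != operation_id:
                continue
            # Parameters may be declared on the path item or on the operation.
            params = item.get("parameters", []) + op.get("parameters", [])
            required_query = [p["name"] for p in params
                              if p.get("in") == "query" and p.get("required")]
            return {"path": path, "method": method.upper(),
                    "required_query": required_query}
    return None


def build_role_map(properties: dict[str, Any],
                   role_keywords: dict[str, str]) -> dict[str, str]:
    """Map each semantic role to the first field whose description mentions its keyword."""
    role_map: dict[str, str] = {}
    for field_name, schema in properties.items():
        description = schema.get("description", "").lower()
        for role, keyword in role_keywords.items():
            if role not in role_map and keyword.lower() in description:
                role_map[role] = field_name
    return role_map
```

Every later request and citation would then read field names through the role map, so a renamed field in the Beta API only changes the map, not the downstream logic.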

2. Scrape Prom once for system gauges

Follow rules/prometheus-scrape.md. One scrape, no wait. You're after gauges (current values) that don't need a delta: CacheHitRatio, ActiveConnections, MemoryUsedPercent, FilesystemUsedPercent.

A CacheHitRatio well below ~95% on a workload that should fit in cache is a real signal on its own. Climbing ActiveConnections toward the pool ceiling is a real signal on its own. These don't need rate-of-change.

A second scrape for counter deltas is opt-in, used only when Step 4 triage points at write-congestion (where deadlock and rollback rates matter and the Slow Query Patterns API can't substitute). For the read-path case (the most common RCA shape) the single scrape is enough.
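
Reading the gauges out of a single scrape might be sketched as below. The parser handles the common line shapes of the Prometheus text exposition format (optional labels, optional trailing timestamp) but is deliberately simple. Whether the endpoint exposes metrics under exactly these names, and whether CacheHitRatio is reported as a percentage, are assumptions to confirm against a real scrape.

```python
def parse_gauges(text: str, wanted: set[str]) -> dict[str, float]:
    """Extract the first sample of each wanted metric from exposition-format text."""
    found: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "{" in line:
            name, _, rest = line.partition("{")
            value_part = rest.rpartition("}")[2]
        else:
            name, _, value_part = line.partition(" ")
        fields = value_part.split()
        if name in wanted and name not in found and fields:
            found[name] = float(fields[0])  # a second field, if present, is a timestamp
    return found


def below_cache_floor(gauges: dict[str, float], floor_percent: float = 95.0) -> bool:
    """True when CacheHitRatio is present and under the floor (assumes percent units)."""
    ratio = gauges.get("CacheHitRatio")
    return ratio is not None and ratio < floor_percent
```

Since these are gauges, one scrape gives a usable reading; no second sample or delta is computed here.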

3. Pull top slow query patterns

Request the slow query patterns. Follow rules/slow-query-patterns-fields.md for the fields that matter and how to read them. This is the primary diagnostic: it returns per-pattern accumulated totals (call count, runtime, blocks, rows) over the window you request. That is the rate-of-change data you would otherwise derive from two Prom scrapes, but broken down per query and available without waiting.

If no pattern shows a meaningful totalDurationUs, the user's report may be overstated or the issue may not be query-shaped. Stop and tell the user what you looked at.
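
One way to make "meaningful" concrete is to rank patterns by their total duration and compute each pattern's share of the total, as in the sketch below. The duration field name is passed in rather than hardcoded, in line with the role map from step 1; the min_total_us cutoff and the digest key used in the usage example are illustrative assumptions, not values from the skill.

```python
from typing import Any


def rank_patterns(patterns: list[dict[str, Any]], duration_field: str,
                  min_total_us: int = 0) -> list[tuple[dict[str, Any], float]]:
    """Return (pattern, share of total duration) pairs, largest first.

    An empty result means no pattern crossed the cutoff: stop and report that.
    """
    kept = [p for p in patterns if p.get(duration_field, 0) > min_total_us]
    total = sum(p[duration_field] for p in kept)
    ranked = sorted(kept, key=lambda p: p[duration_field], reverse=True)
    return [(p, p[duration_field] / total) for p in ranked]


example = [
    {"digest": "a", "totalDurationUs": 100},
    {"digest": "b", "totalDurationUs": 300},
    {"digest": "c", "totalDurationUs": 0},
]
top = rank_patterns(example, "totalDurationUs")
```

A single pattern holding most of the share points toward a query-shaped problem; a flat distribution or an empty list is the cue to stop and report what was examined.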

4. Triage: pick the right heuristic

Follow rules/triage.md. Match the combined Prom + slow-query signal to one of the heuristic shapes. Each shape points to a specific heuristic file:

- rules/heuristic-full-scan.md: read-path full scan.
- rules/heuristic-hot-loop.md: N+1 / hot loop from the app.
- rules/heuristic-write-congestion.md: deadlocks, slow writes, high rollback rate.

If the signal does not match any shape cleanly, do not invent a hypothesis. Surface the top patterns and ask the user which workload they recognize. New heuristics are welcome as PRs.
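
The dispatch itself can be sketched as a lookup that returns a heuristic file only when exactly one shape matches, and nothing otherwise. The file names come from the list above; the shape keys and the matcher predicates are hypothetical and supplied by the caller, since the actual matching criteria live in rules/triage.md.

```python
from typing import Callable, Optional

Signals = dict[str, float]

# Shape key (illustrative) -> heuristic file named in the skill.
HEURISTIC_FILES = {
    "full_scan": "rules/heuristic-full-scan.md",
    "hot_loop": "rules/heuristic-hot-loop.md",
    "write_congestion": "rules/heuristic-write-congestion.md",
}


def pick_heuristic(signals: Signals,
                   matchers: dict[str, Callable[[Signals], bool]]) -> Optional[str]:
    """Return the heuristic file for the single matching shape, else None.

    None covers both "no shape matched" and "several matched": in either case
    the signal is not clean and no hypothesis should be invented.
    """
    matched = [shape for shape, matches in matchers.items() if matches(signals)]
    if len(matched) != 1:
        return None
    return HEURISTIC_FILES[matched[0]]
```

Treating an ambiguous match the same as no match keeps the fallback behaviour in one place: surface the top patterns and ask the user.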

5. Reason, then recommend

Use the format in rules/output-template.md. Always include: symptom, evidence, hypothesis (noting any alternative cause you cannot rule out from this surface alone), short-term fix, and long-term follow-ups.
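
To keep any of the required parts from being dropped, the report can be modelled as a small data structure before it is rendered. The sketch below is only an illustration of the five required elements; the class name, field names, and text layout are assumptions, and the authoritative layout is whatever rules/output-template.md specifies.

```python
from dataclasses import dataclass, field


@dataclass
class RcaReport:
    symptom: str
    evidence: list[str]
    hypothesis: str
    short_term_fix: str
    long_term_followups: list[str]
    # Alternative causes that cannot be ruled out from this surface alone.
    alternatives: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the report as plain text, one section per required element."""
        lines = [f"Symptom: {self.symptom}", "Evidence:"]
        lines += [f"- {item}" for item in self.evidence]
        lines.append(f"Hypothesis: {self.hypothesis}")
        if self.alternatives:
            lines.append("Not ruled out: " + "; ".join(self.alternatives))
        lines.append(f"Short-term fix: {self.short_term_fix}")
        lines.append("Long-term follow-ups:")
        lines += [f"- {item}" for item in self.long_term_followups]
        return "\n".join(lines)
```

Making the first five fields mandatory constructor arguments means an incomplete report fails at construction time rather than reaching the user.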

6. Do not apply the fix

Follow rules/recommend-only.md. Never run DDL. Never call pg_cancel_backend or pg_terminate_backend. Write the recommendation, explain why, and let the human apply it.
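
If the surrounding agent harness happens to expose a SQL execution tool, a defensive check along these lines could refuse the statements this step forbids. It is a belt-and-braces sketch, not the content of rules/recommend-only.md: the DDL keyword list is an assumption and not exhaustive, and the simplest way to comply remains never executing anything at all.

```python
import re

# Assumed, non-exhaustive set of leading keywords treated as DDL.
DDL_KEYWORDS = ("CREATE", "ALTER", "DROP", "TRUNCATE")
FORBIDDEN_CALLS = ("pg_cancel_backend", "pg_terminate_backend")


def violates_recommend_only(sql: str) -> bool:
    """True if the statement starts with a DDL keyword or calls a forbidden function."""
    text = sql.strip()
    first_word = text.split(None, 1)[0].upper() if text else ""
    if first_word in DDL_KEYWORDS:
        return True
    return any(re.search(rf"\b{name}\s*\(", text, re.IGNORECASE)
               for name in FORBIDDEN_CALLS)
```

A recommendation such as CREATE INDEX would still appear in the written report; the check only guards against it being executed.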

Full Compiled Document

For the complete guide with every rule expanded in a single context load: AGENTS.md.

MUST USE when investigating performance issues on a ClickHouse-managed Postgres instance. Provides an evidence-based RCA workflow that scrapes the Prometheus endpoint for system signal, pulls per-digest evidence from the Slow Query Patterns API, and recommends (does not apply) a fix.

459 stars

tools

Updated Jun 9, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add ClickHouse/agent-skills clickhouse-managed-postgres-rca

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy: Container and dependency vulnerability scanner. Clean (95%)
- Semgrep: Static code analysis for vulnerabilities. Clean (95%)
- mcp-scan (Snyk): Model Context Protocol security validation. Clean (95%)
- Snyk (dep): Open source security scanning. Skipped (50%)
- Socket.dev: Supply chain security analysis. Skipped (50%)
- VirusTotal: Multi-engine malware detection. Skipped (50%)
- CrowdStrike: Advanced threat intelligence. Skipped (50%)
- OSV-Scanner: Open Source Vulnerability database check. Skipped (50%)
- OWASP Dep-Check: Skipped (50%)

Last scanned: Jun 9, 2026, 4:24 AM (56.9s, 13 files scanned)

SKILL.md

name: clickhouse-managed-postgres-rca
description: MUST USE when investigating performance issues on a ClickHouse-managed Postgres instance. Provides an evidence-based RCA workflow that scrapes the Prometheus endpoint for system signal, pulls per-digest evidence from the Slow Query Patterns API, and recommends (does not apply) a fix.
license: Apache-2.0
author: ClickHouse Inc
version: 0.1.0

Related Skills

ClickHouse/infra-postgres

tools

Verified · Trusted · Official

Sets up and manages Postgres using the clickhousectl CLI — runs a local Docker-backed Postgres for development, and creates and operates managed ClickHouse Cloud Postgres services (connections, TLS, runtime config, read replicas, failover, point-in-time restore). Use when the user wants a Postgres or PostgreSQL database for their application, a local Postgres dev environment, psql access, or a managed/production Postgres in ClickHouse Cloud, or mentions moving a local Postgres to production.

498 · SKILL.md · Updated Jul 28, 2026

ClickHouse/infra-clickhouse

tools

Verified · Trusted · Official

Sets up and manages ClickHouse using the clickhousectl CLI — installs and runs a local ClickHouse server for development, and creates managed ClickHouse Cloud services for production (authentication, service creation, schema migration, application connection). Use when the user wants to build an application with ClickHouse, set up a local ClickHouse dev environment, create tables and start querying, deploy ClickHouse to production or ClickHouse Cloud, or migrate from a local setup to the cloud.

498 · SKILL.md · Updated Jul 28, 2026

ClickHouse/clickstack-otel-collector

tools

Verified · Trusted · Official

Use when a user wants to wire an OpenTelemetry collector into a Managed ClickStack service on ClickHouse Cloud, either by deploying a new local collector (Docker run or Docker Compose) or by configuring their own existing collector, then send rich synthetic telemetry and verify it is visible in ClickStack.

489 · SKILL.md · Updated Jul 14, 2026

ClickHouse/clickhouse-js-node-troubleshooting

tools

Verified · Trusted · Official

Troubleshoot and resolve common issues with the ClickHouse Node.js client (@clickhouse/client). Use this skill whenever a user reports errors, unexpected behavior, or configuration questions involving the Node.js client specifically — including socket hang-up errors, Keep-Alive problems, stream handling issues, data type mismatches, read-only user restrictions, proxy/TLS setup problems, or long-running query timeouts. Trigger even when the user hasn't precisely named the issue; vague symptoms like "my inserts keep failing" or "connection drops randomly" in a Node.js context are strong signals to use this skill. Do NOT use for browser/Web client issues.

482 · SKILL.md · Updated May 19, 2026

Install manually

# Clone the repo
git clone https://github.com/ClickHouse/agent-skills.git

# Create the global Claude Code skills folder if it does not exist yet, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r agent-skills/skills/clickhouse-managed-postgres-rca ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

ClickHouse/agent-skills

459 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT