altinity-expert-clickhouse/skills/altinity-expert-clickhouse-merges/SKILL.md
Diagnose ClickHouse merge performance, part backlog, and 'too many parts' errors. Use for merge issues and part management problems.
npx skillsauth add altinity/skills altinity-expert-clickhouse-mergesInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Run all queries from checks.sql in this skill's directory (cluster-wide) one by one and produce a decision-ready report.
system.merges.system.part_log.merge_ok, merge_failed, last success/failure).merge_reason, merge_algorithm).system.merges.memory_usage, system.part_log.peak_memory_usage).database = 'system'database != 'system'Use one of these final verdicts explicitly:
PROVED: cluster-wide merge stop (no successful merges in the selected window).DECLINED: cluster-wide stop is false.PARTIAL: merges are blocked for specific table(s) while others still merge.Additional rules:
merge_ok = 0 with repeated MEMORY_LIMIT_EXCEEDED, report table-level block.Horizontal, state that planner selected horizontal merges (do not say vertical is disabled unless settings prove it).system.* tables, call out alert-source mismatch to avoid misattribution to business tables.For a specific table, run ad-hoc checks (time-bound and limited):
select
toStartOfMinute(event_time) as minute,
countIf(event_type = 'NewPart') as new_parts,
countIf(event_type = 'MergeParts') as merges,
countIf(event_type = 'MergeParts') - countIf(event_type = 'NewPart') as net_reduction
from system.part_log
where database = '{database}'
and table = '{table}'
and event_time > now() - interval 1 hour
group by minute
order by minute desc
limit 60
If net_reduction is negative consistently, inserts outpace merges.
Check system.merge_tree_settings if modified Suggest changing (reducing or increasing) in case of a problem as remediation.
Call out anti-patterns explicitly:
partition_id='all')TTL ... GROUP BY ... SET ... on a hot ingestion tableRecommended long-term direction:
-- Always include LIMIT
limit 100
-- Always time-bound historical queries
where event_time >= now() - interval 24 hour
-- For part_log, always filter event_type
where event_type in ('NewPart', 'MergeParts', 'MutatePart')
select * from system.part_log*_log tables| Finding | Load Module | Reason |
|---------|-------------|--------|
| High memory during merges / OOM | altinity-expert-clickhouse-memory | Memory limits and pressure |
| Slow merges + normal disk | altinity-expert-clickhouse-schema | ORDER BY/partitioning anti-patterns |
| Slow merges + high disk IO | altinity-expert-clickhouse-storage | Storage bottleneck |
| Merges blocked by mutations | altinity-expert-clickhouse-mutations | Mutation backlog |
| Replication lag + merge issues | altinity-expert-clickhouse-replication | Queue/replica bottlenecks |
system vs non-system)tools
Read-only ClickHouse security audit expert for live or exported systems. Use when assessing ClickHouse security posture, reviewing users, roles, grants, settings profiles, row policies, table functions, external sources, table engines, executable UDFs, audit logs, named collections, password hash hygiene, SQL SECURITY DEFINER, impersonation, TLS/network exposure, Keeper/interserver security, encryption at rest, backups, the HTTP interface surface, cluster security, or version-specific ClickHouse security behavior. Diagnoses from SQL/system tables, supplied configuration files, query logs, access metadata, and ClickHouse/Altinity documentation.
tools
Diagnose and resolve ClickHouse grant and authentication errors, especially after upgrades. Use when queries fail with ACCESS_DENIED/NOT_ENOUGH_PRIVILEGES, AUTHENTICATION_FAILED/WRONG_PASSWORD/REQUIRED_PASSWORD, or ON CLUSTER privilege errors; when system.* or INFORMATION_SCHEMA access is denied; or when grant behavior changes after version upgrades.
tools
Profile a ClickHouse cluster via MCP and emit a per-cluster "analyst" Skill the user can save in claude.ai. Activate when the user asks to "profile this ClickHouse", "generate an analyst skill", "build a schema guide", "map the data in this cluster", or regenerate an existing cluster-analyst Skill after schema changes. Works against any ClickHouse with read-only SELECT/SHOW/DESCRIBE access via an `execute_query` MCP tool (e.g. the Altinity MCP server). Outputs a 5-file markdown bundle plus a README.
tools
Diagnose ClickHouse disk usage, compression efficiency, part sizes, and storage bottlenecks. Use for disk space issues and slow IO.