altinity-expert-clickhouse/skills/altinity-expert-clickhouse-index-analysis/SKILL.md
Analyze whether ClickHouse indexes (PRIMARY KEY, ORDER BY, skipping indexes, projections) are being used effectively for actual query patterns. Use when investigating index effectiveness, ORDER BY key design, query-to-index alignment, or when queries scan more data than expected.
npx skillsauth add altinity/skills altinity-expert-clickhouse-index-analysisInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Run all queries from checks.sql in this skill's directory and analyze the results.
EXPLAIN indexes = 1
{query_without_format}
Look for:
PrimaryKey condition should not be true (means no filtering)Granules: X/Y ratio shows selectivity (low X/Y = good)Skip indexes should reduce parts/granules furtherSELECT
{columns} APPLY uniq
FROM {database}.{table}
WHERE {time_column} > now() - INTERVAL {days} DAY
Optimal ORDER BY ordering: Low cardinality columns first, high cardinality last.
WITH
any(query) AS q,
arrayJoin(extractAll(query, '\\b(?:PRE)?WHERE\\s+(.*?)\\s+(?:GROUP BY|ORDER BY|UNION|SETTINGS|FORMAT|$)')) AS w,
arrayFilter(x -> (position(w, extract(x, '\\.(`[^`]+`|[^\\.]+)$')) > 0), columns) AS c,
arrayJoin(c) AS c2
SELECT
c2,
count() AS usage_count
FROM system.query_log
WHERE event_time >= now() - toIntervalDay({days})
AND arrayExists(x -> x LIKE '%{table}%', tables)
AND query ILIKE 'SELECT%'
AND type = 'QueryFinish'
GROUP BY c2
ORDER BY usage_count DESC
FORMAT PrettyCompactMonoBlock
WITH
arrayJoin(extractAll(normalizeQuery(query), '\\b(?:PRE)?WHERE\\s+(.*?)\\s+(?:GROUP BY|ORDER BY|UNION|SETTINGS|FORMAT|$)')) AS w
SELECT
w AS where_pattern,
count() AS frequency
FROM system.query_log
WHERE event_time >= now() - toIntervalDay({days})
AND arrayExists(x -> x LIKE '%{table}%', tables)
AND query ILIKE 'SELECT%'
AND type = 'QueryFinish'
GROUP BY w
ORDER BY frequency DESC
LIMIT 20
SELECT
query_id,
normalized_query_hash,
selected_parts,
selected_marks,
read_rows,
round(read_rows / nullIf(selected_marks, 0)) AS rows_per_mark,
query_duration_ms,
formatReadableSize(read_bytes) AS read_bytes
FROM system.query_log
WHERE event_time >= now() - toIntervalDay({days})
AND arrayExists(x -> x LIKE '%{table}%', tables)
AND query ILIKE 'SELECT%'
AND type = 'QueryFinish'
ORDER BY selected_marks DESC
LIMIT 20
High selected_marks / total marks = poor index utilization.
-- Table structure with ORDER BY, PRIMARY KEY, indexes
SHOW CREATE TABLE {database}.{table}
-- Skipping indexes
SELECT name, type, expr, granularity
FROM system.data_skipping_indices
WHERE database = '{database}' AND table = '{table}'
Run the WHERE column extraction and normalized pattern queries to understand:
Compare cardinalities of columns in:
| Query Pattern | Index Support | Action | |---------------|---------------|--------| | Filters on ORDER BY prefix | ✅ Good | None | | Filters on non-ORDER BY cols | ⚠️ Skip index? | Add bloom_filter or projection | | Time range + entity | ⚠️ Check order | Time in ORDER BY or partition? | | High-cardinality first in ORDER BY | ❌ Bad | Reorder (low→high cardinality) |
toDate(ts))| Anti-Pattern | Problem | Fix |
|--------------|---------|-----|
| High-cardinality UUID first | No granule skipping | Move after low-cardinality columns |
| DateTime64 microseconds first | Too granular | Use toDate() or toStartOfHour() |
| Column in WHERE not in ORDER BY | Full scan | Add to ORDER BY or create projection |
| Bloom filter on ORDER BY column | Redundant | Remove skip index |
| Time not in ORDER BY or partition | Range queries scan all | Add toDate(ts) to ORDER BY prefix |
Given cardinalities:
entity_type: 6entity: 18,588cast_hash: 335,620Recommended ORDER BY: (entity_type, entity, cast_hash, ...)
tools
Read-only ClickHouse security audit expert for live or exported systems. Use when assessing ClickHouse security posture, reviewing users, roles, grants, settings profiles, row policies, table functions, external sources, table engines, executable UDFs, audit logs, named collections, password hash hygiene, SQL SECURITY DEFINER, impersonation, TLS/network exposure, Keeper/interserver security, encryption at rest, backups, the HTTP interface surface, cluster security, or version-specific ClickHouse security behavior. Diagnoses from SQL/system tables, supplied configuration files, query logs, access metadata, and ClickHouse/Altinity documentation.
tools
Diagnose and resolve ClickHouse grant and authentication errors, especially after upgrades. Use when queries fail with ACCESS_DENIED/NOT_ENOUGH_PRIVILEGES, AUTHENTICATION_FAILED/WRONG_PASSWORD/REQUIRED_PASSWORD, or ON CLUSTER privilege errors; when system.* or INFORMATION_SCHEMA access is denied; or when grant behavior changes after version upgrades.
tools
Profile a ClickHouse cluster via MCP and emit a per-cluster "analyst" Skill the user can save in claude.ai. Activate when the user asks to "profile this ClickHouse", "generate an analyst skill", "build a schema guide", "map the data in this cluster", or regenerate an existing cluster-analyst Skill after schema changes. Works against any ClickHouse with read-only SELECT/SHOW/DESCRIBE access via an `execute_query` MCP tool (e.g. the Altinity MCP server). Outputs a 5-file markdown bundle plus a README.
tools
Diagnose ClickHouse disk usage, compression efficiency, part sizes, and storage bottlenecks. Use for disk space issues and slow IO.