skills/hotdata-analytics/SKILL.md
Use this skill when the user wants OLAP-style SQL analytics in Hotdata — aggregations, GROUP BY, JOINs, reporting, exploratory queries, query run history, stored results, or materialized follow-up tables (Chain via datasets or managed databases). Activate for "analyze", "aggregate", "rollup", "pivot", "report", "metrics", "GROUP BY", "query history", "past queries", "query runs", "stored results", "materialize", "chain", "intermediate table", or sorted indexes for filters/range scans. Do not load for BM25/vector search or geospatial SQL — use hotdata-search or hotdata-geospatial. Requires the core hotdata skill for connections, tables, datasets, and auth.
npx skillsauth add hotdata-dev/hotdata-cli hotdata-analytics
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
OLAP-style analytics in Hotdata: PostgreSQL-dialect SQL, query execution, run history, stored results, Chain materializations, and sorted indexes for filters and joins.
Prerequisites: authentication, workspace, and catalog discovery via the hotdata skill (connections, tables, datasets, databases).
Related skills: hotdata-search (BM25, vector, retrieval indexes), hotdata-geospatial (spatial SQL).
hotdata query "<sql>" [--workspace-id <workspace_id>] [--connection <connection_id>] [--output table|json|csv]
hotdata query status <query_run_id> [--output table|json|csv]

- PostgreSQL dialect: double-quote mixed-case identifiers, e.g. "CustomerName".
- Use hotdata tables list for schema discovery, not information_schema via query.
- Table names are fully qualified: <connection>.<schema>.<table>, datasets.<schema>.<table>, <database>.<schema>.<table>.
- Asynchronous runs return a query_run_id → poll with query status (exit 2 = still running). Do not re-run identical heavy SQL while polling.
- Database context (hotdata context list → show DATAMODEL): see the hotdata skill.

Typical analytics SQL (all via hotdata query):

- COUNT, SUM, AVG, MIN, MAX with GROUP BY
- INNER / LEFT JOIN across <connection>.<schema>.<table> names
- WHERE on partition-friendly columns (consider sorted indexes below)
- ORDER BY on metrics or dimensions
- LIMIT while iterating; widen once validated

Column names from CSV uploads may be case-sensitive; use double quotes when they are not all-lowercase.
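The query notes above can be sketched as a pair of shell helpers: one that holds a typical rollup, and one that polls a run until it stops reporting "still running". The connection name pg_main, the tables, and the columns are hypothetical placeholders. Treating every exit code other than 2 as final is an assumption, since only exit 2 is described here.

```shell
# Hypothetical names throughout: pg_main, orders, customers, and their
# columns are placeholders, not objects that Hotdata provides.
rollup_sql() {
  cat <<'SQL'
SELECT c."CustomerName",
       COUNT(*)      AS order_count,
       SUM(o.amount) AS total_amount
FROM pg_main.public.orders AS o
LEFT JOIN pg_main.public.customers AS c ON c.id = o.customer_id
WHERE o.created_at >= DATE '2024-01-01'
GROUP BY c."CustomerName"
ORDER BY total_amount DESC
LIMIT 20
SQL
}

# Poll one run until its status command stops exiting with 2 (still running).
# Assumption: any other exit code is final and is passed back to the caller.
wait_for_query() {
  local run_id="$1" interval="${2:-5}" rc
  while true; do
    rc=0
    hotdata query status "$run_id" --output json || rc=$?
    if [ "$rc" -ne 2 ]; then
      return "$rc"
    fi
    sleep "$interval"
  done
}

# Usage (needs an authenticated hotdata CLI):
#   hotdata query "$(rollup_sql)" --output json
#   wait_for_query <query_run_id>
```

Polling the existing run rather than resubmitting the SQL follows the rule above about not re-running identical heavy queries.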
Query run history uses the active workspace only (no --workspace-id; set with hotdata workspaces set).
hotdata queries list [--limit <int>] [--cursor <token>] [--status <csv>] [--output table|json|yaml]
hotdata queries <query_run_id> [--output table|json|yaml]
- list: status, duration, row count, SQL preview (default limit 20). Filter: --status running,failed.
- <query_run_id>: full metadata, formatted SQL, result_id when present.
- Review history for recurring WHERE / JOIN / GROUP BY patterns before adding indexes (search skill) or chains.

hotdata results list [--workspace-id <workspace_id>] [--limit <int>] [--offset <int>] [--output table|json|yaml]
hotdata results <result_id> [--workspace-id <workspace_id>] [--output table|json|csv]
- Prefer results <id> over re-running identical heavy queries.
- Result IDs appear as [result-id: rslt...]; also available from queries <query_run_id>.

Pattern: run SQL → materialize a smaller table → query the materialized name.
Base query
hotdata query "SELECT ..."
hotdata query status <query_run_id> # if async
Materialize (pick one)
hotdata datasets create --name chain_slice [--description "chain slice"] --sql "SELECT ..."
hotdata datasets create --name chain_from_saved [--description "from saved"] --query-id <query_id>
Or managed parquet:
hotdata databases create --catalog analytics
hotdata databases load --catalog analytics --table slice --file ./slice.parquet
Chain query — use printed full_name or datasets list FULL NAME column:
hotdata query "SELECT * FROM datasets.main.chain_slice WHERE ..."
hotdata query "SELECT * FROM analytics.public.slice WHERE ..."
Document stable chains in context:DATAMODEL → Derived tables (Chain).
Full procedure: references/WORKFLOWS.md.
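As a rough illustration of the Chain pattern, the two steps can be wrapped in one shell function. The dataset name daily_orders and the source table pg_main.public.orders are hypothetical. The datasets.main prefix follows the example above and should be checked against the full_name that datasets create prints.

```shell
# Sketch of run -> materialize -> query. All object names are placeholders.
chain_daily_orders() {
  # Step 1: materialize a smaller table from SQL (flags as listed above).
  hotdata datasets create --name daily_orders \
    --description "orders per day" \
    --sql "SELECT CAST(created_at AS DATE) AS day, COUNT(*) AS orders
           FROM pg_main.public.orders
           GROUP BY 1" || return 1

  # Step 2: query the materialized name instead of the base table.
  hotdata query "SELECT day, orders
                 FROM datasets.main.daily_orders
                 ORDER BY day DESC
                 LIMIT 30"
}
```

Stopping on a failed create keeps the follow-up query from running against a dataset that does not exist.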
Sorted indexes are for equality, range, and sort-heavy OLAP, not full-text or vector (see hotdata-search):
hotdata indexes create --connection-id <id> --schema <schema> --table <table> \
--name idx_orders_created --column created_at --type sorted [--async]
List and delete use the same hotdata indexes commands as in the search skill; only --type sorted is the analytics focus here.
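A minimal sketch of how a sorted index and a matching range scan might fit together. The connection ID argument, the pg_main connection name, and the orders table are hypothetical. Whether the engine actually uses the index for this query is not something the sketch verifies.

```shell
# Create a sorted index, then run the kind of query it is meant to help.
# Assumption: pg_main is the connection name that matches the ID passed in.
index_and_scan_orders() {
  local conn_id="$1"

  # Index the column used for range filters and ordering.
  hotdata indexes create --connection-id "$conn_id" --schema public --table orders \
    --name idx_orders_created --column created_at --type sorted || return 1

  # Half-open date range plus ORDER BY on the indexed column.
  hotdata query "SELECT id, created_at, amount
                 FROM pg_main.public.orders
                 WHERE created_at >= DATE '2024-01-01'
                   AND created_at <  DATE '2024-02-01'
                 ORDER BY created_at
                 LIMIT 100"
}

# Usage: index_and_scan_orders <connection_id>
```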
Sandbox datasets use datasets.<sandbox_id>.<table>, not datasets.main. Run queries with active sandbox config or hotdata sandbox <id> run hotdata query "...". See hotdata skill Sandboxes.
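The sandbox form can be sketched as a small wrapper that builds the datasets.<sandbox_id>.<table> name from its arguments. Both arguments are placeholders, and the sketch assumes the sandbox ID is usable as an unquoted SQL identifier, which may not hold for every ID.

```shell
# Count rows in a sandbox dataset via the documented sandbox run form.
# Assumption: sandbox_id needs no quoting inside the SQL table name.
sandbox_row_count() {
  local sandbox_id="$1" table="$2"
  hotdata sandbox "$sandbox_id" run hotdata query \
    "SELECT COUNT(*) AS row_count FROM datasets.${sandbox_id}.${table}"
}

# Usage: sandbox_row_count <sandbox_id> <table>
```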