skills/hotdata/SKILL.md
Use this skill when the user wants to run core hotdata CLI commands — auth, workspaces, connections, managed databases, datasets, tables, basic SQL query, sandboxes, database context (context:DATAMODEL), jobs, and skill install. Activate for "run hotdata", "list workspaces", "list connections", "create a connection", "list databases", "managed database", "load parquet", "list tables", "list datasets", "create a dataset", "execute a query", "list sandboxes", "database context", "context:DATAMODEL", or general Hotdata CLI usage. For full-text/vector search and retrieval indexes use hotdata-search; for OLAP analytics, query history, stored results, and Chain materializations use hotdata-analytics; for geospatial/GIS use hotdata-geospatial.
npx skillsauth add hotdata-dev/hotdata-cli hotdataInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Use the hotdata CLI to interact with the Hotdata service. In this project, run it as:
hotdata <command> [args]
Or if installed on PATH: hotdata <command> [args]
Install all skills with hotdata skills install. Load specialized skills only when the task needs them:
| Skill | Use for |
|-------|---------|
| hotdata (this file) | Auth, workspaces, connections, databases, datasets, tables, basic query, context, sandboxes, jobs |
| hotdata-search | BM25, vector search, hotdata search, bm25/vector indexes, embedding providers |
| hotdata-analytics | OLAP SQL, aggregations, query/results history, Chain materializations, sorted indexes |
| hotdata-geospatial | PostGIS-style ST_*, WKB, spatial joins |
Run hotdata auth login (or hotdata auth with no subcommand—same behavior) to authenticate via browser login. Config is stored in ~/.hotdata/config.yml.
API key resolution (lowest to highest priority):
hotdata auth login / hotdata auth)HOTDATA_API_KEY environment variable (or .env file)--api-key <key> flag (works on any command)API URL defaults to https://api.hotdata.dev/v1 or overridden via HOTDATA_API_URL.
Optional: pass --debug on any command to print verbose HTTP request/response details.
Commands that accept --workspace-id default to the active workspace from config when omitted. Use hotdata workspaces set to switch interactively, or hotdata workspaces set <workspace_id> for a direct choice. In hotdata workspaces list, the * marker labels the default workspace the CLI resolves to.
hotdata queries does not accept --workspace-id: query run history always uses the active workspace—set it with workspaces set first if needed.
If HOTDATA_WORKSPACE is set in the environment, the workspace is locked to that value: passing a different --workspace-id is an error, and hotdata workspaces set fails (“workspace is locked”). workspaces set is also blocked while the current process was started under hotdata sandbox run (nested workspace changes are not allowed in that tree).
Omit --workspace-id unless you need to target a specific workspace (and it is not locked by env or session).
Notation context:<STEM>: In this skill, context:DATAMODEL, context:GLOSSARY, and context:<NAME> mean the authoritative Markdown document stored on the server under that stem via the Hotdata context API (/v1/databases/{database_id}/context, hotdata context …). That is not the same as generic English (“a data model”, “a glossary”), and not the same as local ./DATAMODEL.md except as pull/push transport. CLI commands use the bare stem (no context: prefix): e.g. hotdata context show DATAMODEL, hotdata context push GLOSSARY.
Context is scoped to the active database (set via hotdata databases set <id>). All context commands operate against the database returned by the active-database config unless you pass --database-id <id> (short: -d) explicitly. The authoritative copy always lives on the server under the stem; common stems are context:DATAMODEL (semantic map) and context:GLOSSARY (glossary / runbooks).
The CLI command hotdata context push reads ./<NAME>.md and pull writes that file in the current working directory—those files exist only as a transport surface for the API, not as a second source of truth. hotdata context show <name> prints Markdown to stdout so agents can read context:<NAME> without any local file. Stems follow SQL table–identifier rules (ASCII letters, digits, underscore; no dot in the API name; max 128 characters; SQL reserved words are not allowed). For show, pull, and push, the CLI accepts a trailing .md on the argument (e.g. USER.md) and treats it as stem USER—the database still stores USER, not USER.md.
Agents: do not blindly run
hotdata context show DATAMODELon session start. Runhotdata context listfirst (optional--prefix DATAMODEL). Callhotdata context show DATAMODELonly if the list includes theDATAMODELstem. Ifshowexits 1 with no context named …, that is normal when nothing has been pushed yet—not a hard failure; do not retry in a loop, and avoid speculativeshowin parallel with other shell tools where one failure cancels sibling calls. Proceed without context:DATAMODEL until the user asks to create or load one.
Agents (Claude and similar): use database context as the only durable store for context:DATAMODEL, context:GLOSSARY, and any other context:<STEM> documents you introduce. Keep transient analysis notes in sandbox markdown or the conversation until you promote them into context:DATAMODEL when they should guide the whole database (details below).
hotdata context list (and other stems such as context:GLOSSARY as needed). Only if DATAMODEL appears in the list, load it: hotdata context show DATAMODEL. If it is absent, skip show and treat context:DATAMODEL as unset—use references/DATA_MODEL.template.md when the user wants to bootstrap, then push when ready.hotdata context push DATAMODEL. The CLI requires a local ./DATAMODEL.md for that step: write the body there (from context show, the template, or your edits), then run push from the project directory.hotdata context pull DATAMODEL when you intentionally want a local ./DATAMODEL.md copy (for example a human editor); it still reflects API state for context:DATAMODEL, not a parallel document.The standard stem for the database semantic map is DATAMODEL (skill notation context:DATAMODEL). Add other stems the same way (e.g. GLOSSARY → context:GLOSSARY) for glossary or runbooks.
Keep two layers separate:
Analysis modeling (day to day) — Understanding data for the current task: exploratory SQL, join checks, column semantics for one report, hypotheses, scratch notes. Often conversational or short-lived. Sandbox markdown (sandbox update --markdown) is the right home while you explore; it dies with the sandbox unless you copy it elsewhere.
context:DATAMODEL (Hotdata database data model) — A durable, database-scoped map stored only via the context API: entities and tables across connections, PK/FK relationships, how datasets tie back to sources, naming and query conventions the whole team should rely on. This is higher-level shared structure, not a transcript of one investigation.
Promotion: When analysis findings should outlive the sandbox or session and guide everyone, merge them into context:DATAMODEL (hotdata context list → if DATAMODEL is listed, hotdata context show DATAMODEL → reconcile → hotdata context push DATAMODEL). You do not need to update context:DATAMODEL after every ad-hoc query—only when the database story or join graph meaningfully changes.
Use references/DATA_MODEL.template.md and references/MODEL_BUILD.md for what to write inside the Markdown you store under context: stems. Never put database-specific model text inside agent skill install paths—only in database context (and transient ./<NAME>.md for push/pull when needed).
These are patterns built from the commands below—not separate CLI subcommands:
context:DATAMODEL) — The shared Markdown semantic map of the active database (entities, keys, joins across connections). Store and read it only via database context (hotdata context list, then hotdata context show DATAMODEL only when listed, context push DATAMODEL); refresh using connections, connections refresh, tables list, and datasets list. For a deep pass (connector enrichment, indexes, per-table detail), see references/MODEL_BUILD.md. Contrast analysis modeling in sandboxes or chat (see Analysis modeling vs context:DATAMODEL).hotdata-analytics and references/WORKFLOWS.md.hotdata-search.Catalog, skill decision tree, epic flows (onboard, chain, retrieval), datasets vs databases, and sandbox procedures: references/WORKFLOWS.md.
Top-level subcommands (each detailed below): auth, datasets, query, workspaces, connections, databases, tables, skills, results, jobs, indexes, embedding-providers, search, queries, sandbox, context, completions. Search, indexes (bm25/vector), and embedding providers are documented in hotdata-search; query history, results, Chain, and OLAP patterns in hotdata-analytics.
Global CLI options: --api-key, -v / --version, -h / --help. Hidden developer flag: --debug (verbose HTTP logs).
hotdata workspaces list [--output table|json|yaml]
Returns workspaces with public_id, name, active, favorite, provision_status. Table output marks the default workspace with *.
hotdata connections list [--workspace-id <workspace_id>] [--output table|json|yaml]
hotdata connections <connection_id> [--workspace-id <workspace_id>] [--output table|json|yaml]
list returns id, name, source_type for each connection.hotdata connections refresh <connection_id> [--workspace-id <workspace_id>] [--data] [--schema <name> --table <name>] [--async] [--include-uncached]
hotdata tables list and queries. Use after DDL or other changes in the source database when the workspace view is stale.--data re-syncs cached row data from the source instead of refreshing the catalog.--schema and --table narrow a data refresh to a single table (must be supplied together).--async submits a data refresh as a background job and returns a job ID; poll with hotdata jobs <job_id>. Only valid with --data — schema refresh is always synchronous.--include-uncached includes tables that haven't been cached yet in a connection-wide data refresh. Only valid with --data and no --table.hotdata connections create list [--workspace-id <workspace_id>] [--output table|json|yaml]
Returns all available connection types with name and label.
hotdata connections create list <name> [--workspace-id <workspace_id>] [--output json]
Returns config and auth JSON Schema objects describing all required and optional fields for that connection type. Use --output json to get the full schema detail.
config — connection configuration fields (host, port, database, etc.). May be null for services that need no configuration.auth — authentication fields (password, token, credentials, etc.). May be null for services that need no authentication. May be a oneOf with multiple authentication method options.hotdata connections create \
--name "my-connection" \
--type <source_type> \
--config '<json object>' \
[--workspace-id <workspace_id>] [--output table|json|yaml]
The --config JSON object must contain all required fields from config plus the auth fields merged in at the top level. Auth fields are not nested — they sit alongside config fields in the same object.
Example for PostgreSQL (required: host, port, user, database + auth field password):
hotdata connections create \
--name "my-postgres" \
--type postgres \
--config '{"host":"db.example.com","port":5432,"user":"myuser","database":"mydb","password":"..."}'
Security: never expose credentials in plain text. Passwords, tokens, API keys, and any field with "format": "password" in the schema must never be hardcoded as literal strings in CLI commands. Always use one of these safe approaches:
--config "{\"host\":\"db.example.com\",\"port\":5432,\"user\":\"myuser\",\"database\":\"mydb\",\"password\":\"$DB_PASSWORD\"}"
--config "{\"token\":\"$(cat ~/.secrets/my-token)\"}"
Field-building rules from the schema:
config.required — these are mandatory.auth with a single method (no oneOf): include all auth.required fields in the config object.auth with oneOf: pick one authentication method and include only its required fields."format": "password" are credentials — apply the security rules above."type": "integer" must be JSON numbers, not strings (e.g. "port": 5432 not "port": "5432")."type": "boolean" must be JSON booleans (e.g. "use_tls": true)."type": "array" must be JSON arrays (e.g. "spreadsheet_ids": ["abc", "def"]).oneOf fields must be a JSON object including a "type" discriminator field matching the chosen variant's const value.databases)Managed databases are Hotdata-owned catalogs you create and populate yourself — no remote source to sync. Query them in SQL as <database_id>.<schema>.<table>. Prefer hotdata databases for this workflow.
Parquet vs datasets: databases tables load accepts parquet only. For SQL-query or saved-query materializations, use hotdata datasets create.
Active database: hotdata databases set <id_or_description> saves the active database to config. All databases tables subcommands and all context commands default to the active database; pass --database <id> to override per-command.
hotdata databases list [--workspace-id <workspace_id>] [--output table|json|yaml]
hotdata databases create [--name <display_name>] [--catalog <alias>] [--table <table> ...] [--schema public] [--expires-at <duration|timestamp>] [--workspace-id <workspace_id>] [--output table|json|yaml]
hotdata databases set <id_or_name>
hotdata databases unset
hotdata databases <id_or_name> [--workspace-id <workspace_id>] [--output table|json|yaml]
hotdata databases delete <id_or_name> [--workspace-id <workspace_id>]
hotdata databases run [--database <id>] [--name <label>] [--schema public] [--table <table> ...] [--expires-at <duration|timestamp>] [--workspace-id <workspace_id>] <cmd> [args...]
hotdata databases <id> run <cmd> [args...]
# Preferred: load by catalog alias (auto-declares table if needed)
hotdata databases load --catalog <alias> --table <table> [--schema public] (--file <path> | --url <url> | --upload-id <id>) [--workspace-id <workspace_id>]
# Also available via tables subcommand
hotdata databases tables list [--database <id_or_name>] [--schema <name>] [--workspace-id <workspace_id>] [--output table|json|yaml]
hotdata databases tables load <table> [--database <id_or_name>] [--schema public] (--file <path> | --url <url> | --upload-id <id>) [--workspace-id <workspace_id>]
hotdata databases tables delete <table> [--database <id_or_name>] [--schema public] [--workspace-id <workspace_id>]
list — all managed databases in the workspace. Active database is marked with *.create — creates a new managed database. --name is an optional human-readable display name. --catalog sets the SQL alias used in queries (SELECT … FROM <catalog>.schema.table); must be [a-z_][a-z0-9_]*. --expires-at accepts relative durations (24h, 7d, 90m) or an RFC 3339 timestamp; omitting means no expiry. Repeat --table to declare tables up front.set — saves <id_or_name> as the active database. Subsequent databases tables and context commands use it automatically.unset — clears the active database from config.<id_or_name> — inspect one database (id, catalog, name, expires_at).delete — removes the managed database; clears the active-database config if it matched.load (top-level shorthand) — loads parquet into --catalog.--schema.--table. Accepts --file, --url, or --upload-id. If the table was not declared at create time, the CLI automatically deletes and recreates the database with the table declared, then retries the load.tables list — lists tables with TABLE (<catalog>.<schema>.<table>), SYNCED, LAST_SYNC. Uses active database when --database is omitted.tables load — uploads a local parquet file (--file), a remote parquet URL (--url), or a pre-staged upload (--upload-id) and publishes with replace mode.tables delete — drops a table from the managed database.run — mints a database-scoped JWT (via POST /v1/auth/database) and execs <cmd> with HOTDATA_DATABASE_TOKEN, HOTDATA_DATABASE_REFRESH_TOKEN, HOTDATA_DATABASE, HOTDATA_WORKSPACE, and HOTDATA_API_URL injected. Pass a database id as a group positional (hotdata databases <id> run ..., sandbox-style) or via --database <id>; omit both to auto-create a scratch database using --name / --schema / --table / --expires-at. Use this to launch an agent or child process whose API access is scoped to a single database. The minted JWT carries database, workspaces, permissions:["read","write"], source:"database_token". The session is persisted at ~/.hotdata/database_session.json (mode 0600); the child's exit code is propagated.Example:
hotdata databases create --catalog airbnb
hotdata databases load --catalog airbnb --table listings --url https://example.com/listings.parquet
hotdata query "SELECT count(*) FROM airbnb.public.listings"
hotdata tables list [--workspace-id <workspace_id>] [--connection-id <connection_id>] [--schema <pattern>] [--table <pattern>] [--limit <int>] [--cursor <cursor>] [--output table|json|yaml]
table.query command to query information_schema for this purpose.--connection-id: lists all tables with table, synced, last_sync. The table column is formatted as <connection>.<schema>.<table>.--connection-id: includes column definitions. Lists each column as its own row with table, column, data_type, nullable. Use this to inspect the schema before writing queries.<connection>.<schema>.<table> name when referencing tables in SQL queries.--schema and --table support SQL % wildcard patterns (e.g. --table order% matches orders, order_items, etc.).--cursor token is printed — pass it to fetch the next page.Datasets are managed files uploaded to Hotdata and queryable as tables.
hotdata datasets list [--workspace-id <workspace_id>] [--limit <int>] [--offset <int>] [--output table|json|yaml]
table.id, label, and created_at; table output includes a FULL NAME column (datasets.<schema>.<table>).--offset to fetch further pages.datasets list always returns all datasets in the workspace. To tell sandbox-scoped datasets from workspace-wide ones, read FULL NAME: the middle segment is the sandbox id (e.g. datasets.s_ufmblmvq.tac_csat) for sandbox data, and usually main (e.g. datasets.main.my_table) for ordinary uploads.hotdata datasets <dataset_id> [--workspace-id <workspace_id>] [--output table|json|yaml]
name, data_type, nullable.FULL NAME from datasets list or the full_name printed by datasets create—especially for sandbox datasets, where the schema is datasets.<sandbox_id>, not datasets.main.hotdata datasets update <dataset_id> [--description <label>] [--name <table_name>] [--workspace-id <workspace_id>] [--output table|json|yaml]
--description or --name.hotdata datasets create --name <table_name> [--description "My Dataset"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
--name (required) — SQL table name the dataset is addressable as (e.g. my_view).--description (optional) — human-readable display label; defaults to --name when omitted.--sql or --query-id is required:
--sql — create from an inline SQL query result.--query-id — create from a previously saved query.hotdata databases tables load instead.datasets create, the CLI prints a full_name line (e.g. datasets.main.my_view). Always use that full_name in SQL—do not assume datasets.main.hotdata datasets refresh <dataset_id> [--workspace-id <workspace_id>] [--async]
400 directly.--async submits the refresh as a background job and returns a job_id; poll with hotdata jobs <job_id>.Qualified dataset tables are datasets.<schema>.<table_name>: main for workspace-scoped datasets (created outside a sandbox), or the sandbox id for sandbox-created data (e.g. datasets.s_ufmblmvq.tac_csat). The create output’s full_name is authoritative—copy it into FROM / JOIN clauses instead of guessing datasets.main.….
Example (workspace dataset on main):
hotdata query "SELECT * FROM datasets.main.my_dataset LIMIT 10"
Use hotdata datasets <dataset_id> to inspect schema and names before writing queries.
Reads and writes database-scoped context API documents. Context is tied to the active database (set via hotdata databases set); pass --database-id <id> (short: -d) to target a specific database. show needs no local file; push / pull use ./<NAME>.md in the current directory only as the CLI transport format. See Database context (API).
hotdata context list [--database-id <id>] [--prefix <stem>] [--output table|json|yaml]
hotdata context show <name> [--database-id <id>]
hotdata context pull <name> [--database-id <id>] [--force] [--dry-run]
hotdata context push <name> [--database-id <id>] [--dry-run]
list — names, updated_at, and character counts for each stored context in the active database. Use --prefix to narrow names (case-sensitive). Agents: call list before show for DATAMODEL (or any stem) so you do not rely on show failing when the document does not exist yet.show — print the Markdown body to stdout (use this when there is no local ./<NAME>.md; ideal for agents). Errors if no context with that name exists (exit 1)—expected for a new database; use list first to avoid that path.pull — download context name to ./<NAME>.md. Refuses to overwrite an existing file unless --force. --dry-run prints target path and size only.push — upload ./<NAME>.md to upsert context name on the server. --dry-run prints size only. Body size must stay within the API limit (order of 512k characters).Convention: context:DATAMODEL is the primary database semantic map; context:GLOSSARY (or other context:<STEM> docs) for additional narrative context. Same identifier rules as SQL table names. CLI: hotdata context show DATAMODEL (bare stem).
hotdata query "<sql>" [--workspace-id <workspace_id>] [--connection <connection_id>] [--output table|json|csv]
hotdata query status <query_run_id> [--output table|json|csv]
table (row count and execution time).hotdata tables list for discovery — not information_schema via query.query_run_id → poll with query status (do not re-run the same heavy SQL).hotdata-analytics skill.hotdata-search skill.To create a dataset from a saved query: hotdata datasets create --query-id <saved_query_id>.
hotdata jobs list [--workspace-id <workspace_id>] [--job-type <type>] [--status <status>] [--all] [--limit <n>] [--offset <n>] [--output table|json|yaml]
hotdata jobs <job_id> [--workspace-id <workspace_id>] [--output table|json|yaml]
list shows only active jobs (pending, running) by default. Use --all to see all jobs.--job-type: data_refresh_table, data_refresh_connection, dataset_refresh, create_index, create_dataset_index.--status: pending, running, succeeded, partially_succeeded, failed.hotdata jobs <job_id> to inspect a specific job's status, error, and result.skills)Bundled Markdown skills (hotdata, hotdata-search, hotdata-analytics, hotdata-geospatial) ship with the CLI release tarball.
hotdata skills install [--project]
hotdata skills status
install — Downloads and installs skills to ~/.hotdata/skills/<skill>, then symlinks into ~/.agents/skills and into ~/.claude/skills / ~/.pi/skills when those directories exist. --project instead copies into ./.agents/skills/<skill> in the current directory (and links ./.claude / ./.pi when present). The CLI may auto-refresh skills after an upgrade when appropriate.status — Reports installed vs current CLI version and where skills are linked.hotdata completions <bash|zsh|fish>
Writes completion script for the chosen shell to stdout (redirect into your shell’s completion path as usual).
hotdata auth login # Browser-based login (same as: hotdata auth)
hotdata auth # Browser-based login (same as: hotdata auth login)
hotdata auth status # Check current auth status
hotdata auth logout # Remove saved auth for the default profile
Sandboxes are for ad-hoc, exploratory work that does not need to be long-lived. They group related CLI activity (queries, dataset operations, etc.) under a single context so it can be tracked and cleaned up together. Datasets created inside a sandbox are tied to that sandbox and will be removed when the sandbox ends. If you need data to persist beyond the sandbox, create datasets outside of a sandbox context.
Active sandbox in config vs sandbox run: If you already have the right sandbox selected (hotdata sandbox new or hotdata sandbox set <sandbox_id> shows it with * in sandbox list), run follow-up commands directly (hotdata datasets create …, hotdata query …, etc.). The CLI attaches the sandbox from saved config to API requests. hotdata sandbox run <cmd> with no sandbox ID before run always creates a brand-new sandbox and runs the child under that new ID—it does not reuse the active sandbox from config. To wrap a command in an existing sandbox, use hotdata sandbox <sandbox_id> run <cmd> [args…].
IMPORTANT: If
HOTDATA_SANDBOXis set in the environment, you are inside an active sandbox. NEVER attempt to unset, override, or work around this variable. Do not clear it, do not start a new sandbox, do not runsandbox runorsandbox neworsandbox set. All your work should be attributed to the current sandbox. Attempting to nest or escape a sandbox will fail with an error.
hotdata sandbox list [--workspace-id <workspace_id>] [--output table|json|yaml]
hotdata sandbox <sandbox_id> [--workspace-id <workspace_id>] [--output table|json|yaml]
hotdata sandbox new [--name "Sandbox Name"] [--output table|json|yaml]
hotdata sandbox set [<sandbox_id>]
hotdata sandbox read
hotdata sandbox update [<sandbox_id>] [--name "New Name"] [--markdown "..."] [--output table|json|yaml]
hotdata sandbox run <cmd> [args...]
hotdata sandbox <sandbox_id> run <cmd> [args...]
list shows all sandboxes with a * marker on the active one.new creates a sandbox and sets it as active. Blocked inside an existing sandbox.set switches the active sandbox. Omit the ID to clear. Blocked inside an existing sandbox.read prints the markdown content of the current sandbox. Use this to retrieve sandbox state at the start of work or between steps.update modifies a sandbox's name or markdown. Defaults to the active sandbox if no ID is given. The --markdown field is for writing details about the work being done in the sandbox — observations, intermediate findings, next steps, etc. This state persists for the life of the sandbox and is the primary way to record context that should survive across commands or agent invocations within the sandbox.run launches a command with HOTDATA_SANDBOX and HOTDATA_WORKSPACE set in the child process environment. hotdata sandbox run <cmd> (no ID before run) always POSTs a new sandbox; it never picks up the active sandbox from sandbox set / sandbox new. Use hotdata sandbox <sandbox_id> run <cmd> to run under an existing sandbox. Blocked inside an existing sandbox.HOTDATA_SANDBOX is set or a sandbox is the saved default (sandbox new / sandbox set), the CLI includes sandbox scope on API calls — no extra sandbox flags on query, datasets, etc.Sandbox-scoped data access: Queries and other operations against sandbox-only resources must run with sandbox context attached—either the active sandbox in config (sandbox set) or a child process started with hotdata sandbox <sandbox_id> run … (which sets HOTDATA_SANDBOX). Running hotdata query or similar with no sandbox in config and not under sandbox … run can produce access denied for tables or datasets that exist only inside a sandbox.
Use a sandbox to explore tables and capture analysis-oriented notes in sandbox markdown (keys, joins, open questions)—day-to-day modeling for this investigation, not context:DATAMODEL until you promote it.
hotdata sandbox new --name "Sales pipeline"
hotdata tables list --connection-id <connection_id>
hotdata query "SELECT DISTINCT status FROM sales.public.deals LIMIT 20"
hotdata query "SELECT count(*), count(DISTINCT account_id) FROM sales.public.deals"
hotdata sandbox update --markdown "## sales pipeline model
### deals (sales.public.deals)
- PK: id
- FK: account_id -> accounts.id
- status: open | won | lost
- ~50k rows, one row per deal
### accounts (sales.public.accounts)
- PK: id
- name, industry, created_at
- ~12k rows, one row per company
### TODO
- check how line_items joins to deals
- confirm revenue column semantics"
./DATAMODEL.md in the project directory and run hotdata context push DATAMODEL (if context:DATAMODEL already exists on the server, merge with hotdata context show DATAMODEL first—confirm DATAMODEL appears in hotdata context list before show).Also available: hotdata connections new — interactive connection wizard (no substitute for the programmatic connections create flow above).
hotdata context list first; only if DATAMODEL is listed, run hotdata context show DATAMODEL to load context:DATAMODEL. If it is not listed, do not run show—ad-hoc analysis does not require populated context:DATAMODEL.hotdata connections list
hotdata tables list
hotdata tables list --connection-id <connection_id>
hotdata query "SELECT 1"
hotdata query "SELECT \"CustomerName\" FROM datasets.main.my_csv LIMIT 10"
hotdata databases create --catalog mydb
hotdata databases load --catalog mydb --table events --file ./events.parquet
hotdata databases load --catalog mydb --table events --url https://example.com/events.parquet
hotdata databases tables list
hotdata query "SELECT * FROM mydb.public.events LIMIT 10"
For CSV/JSON file uploads, use hotdata datasets create instead.
hotdata connections create list
hotdata connections create list <type_name> --output json
hotdata connections create --name "my-connection" --type <type_name> --config '<json>'
data-ai
Use this skill when the user wants full-text search, BM25 keyword search, vector similarity search, semantic search, embeddings, or retrieval indexes in Hotdata. Activate for "hotdata search", "BM25", "full-text", "vector search", "semantic search", "similarity", "embedding", "embedding provider", "create an index" (bm25 or vector), "list indexes" for search, or SQL using bm25_search or vector_distance. Do not load for general SQL analytics (aggregations, GROUP BY) or geospatial work — use hotdata-analytics or hotdata-geospatial instead. Requires the core hotdata skill for auth and workspace basics.
data-ai
Use this skill when the user wants OLAP-style SQL analytics in Hotdata — aggregations, GROUP BY, JOINs, reporting, exploratory queries, query run history, stored results, or materialized follow-up tables (Chain via datasets or managed databases). Activate for "analyze", "aggregate", "rollup", "pivot", "report", "metrics", "GROUP BY", "query history", "past queries", "query runs", "stored results", "materialize", "chain", "intermediate table", or sorted indexes for filters/range scans. Do not load for BM25/vector search or geospatial SQL — use hotdata-search or hotdata-geospatial. Requires the core hotdata skill for connections, tables, datasets, and auth.
development
Use this skill only when the user is working with geospatial data in Hotdata (PostGIS-style SQL like ST_* functions, geometry/WKB, bbox filtering, point-in-polygon, distance/area, lat/lon, spatial joins, “geospatial”, “GIS”, “PostGIS”). Do not load this skill for non-geospatial SQL or general Hotdata usage.
tools
Use when work should span one or more detached tasks but still behave like one job with a single owner context. TaskFlow is the durable flow substrate under authoring layers like Lobster, ACPX, plugins, or plain code. Keep conditional logic in the caller; use TaskFlow for flow identity, child-task linkage, waiting state, revision-checked mutations, and user-facing emergence.