skills/metabase-database-metadata/SKILL.md
Understands the Metabase Database Metadata Format — a YAML-based on-disk representation of databases, tables, and fields synced from a Metabase instance. Use when the user needs to read, edit, or understand metadata files produced by `@metabase/database-metadata`, or when reasoning about a project's schema (columns, types, FK relationships) through the `.metadata/databases` folder.
npx skillsauth add metabase/agent-skills metabase-database-metadataInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Metabase represents database metadata — synced databases, their tables, and their fields — as a tree of YAML files. Files are diff-friendly: numeric IDs are omitted entirely, and foreign keys use natural-key tuples like ["Sample Database", "PUBLIC", "ORDERS"] instead of database identifiers.
The format is defined by a specification bundled alongside this file as spec.md (upstream source: metabase/database-metadata). The same project ships a CLI (@metabase/database-metadata on npm) that converts the raw JSON exported from a Metabase instance into the YAML tree described by the spec.
All metadata for a project lives under a top-level .metadata/ directory:
.metadata/databases/ — the YAML tree. This is the canonical source for the agent. Read these files to understand the schema, columns, types, and FK relationships..metadata/table_metadata.json — the raw JSON exported from the Metabase instance. Potentially multi-megabyte (or multi-gigabyte) JSON with flat databases / tables / fields arrays. Never open, grep, or pass it to tools. It exists only as input to the extractor.The .metadata/ directory should be gitignored. On large warehouses the extracted metadata can reach gigabytes — committing it would make the repo painful or unusable.
Do not run any of the steps below proactively at session start. Only run them when the user explicitly asks to fetch metadata, set up the workflow, or requests something that plainly requires knowledge of the database schema (e.g. "write a query against ORDERS", "describe what tables exist").
When setup is triggered:
.metadata/ is gitignoredRead the repo's .gitignore and confirm .metadata/ is listed. If it isn't, ask the user before modifying .gitignore — e.g.:
.metadata/is not in.gitignore. Committing it would bloat the repo (metadata can be gigabytes). Shall I add it?
Only edit .gitignore after the user confirms.
Fetch table_metadata.json by calling POST /api/ee/serialization/metadata/export on the Metabase instance and writing the response to .metadata/table_metadata.json. The endpoint accepts three boolean query parameters that opt sections in or out — they all default to false, so requests must explicitly set the sections they want:
with-databases — include the databases array.with-tables — include the tables array.with-fields — include the fields array.A typical full export sets all three to true. The user supplies the base URL and an API key (e.g. via METABASE_URL and METABASE_API_KEY env vars):
mkdir -p .metadata
curl -sf -X POST "$METABASE_URL/api/ee/serialization/metadata/export?with-databases=true&with-tables=true&with-fields=true" \
-H "X-API-Key: $METABASE_API_KEY" \
-o .metadata/table_metadata.json
If the user has not provided credentials, ask for them before running the call.
Once .metadata/table_metadata.json is in place:
rm -rf .metadata/databases
npx @metabase/database-metadata extract-table-metadata .metadata/table_metadata.json .metadata/databases
Then read the YAML tree under .metadata/databases/ to answer the user's question.
At the start of a session, do not run any fetch commands. Just observe what's on disk:
.metadata/table_metadata.json and .metadata/databases/ both exist, assume the tree is sufficiently up to date and use it directly. Do not refetch.If something in the tree looks stale or inconsistent while you're using it, mention it to the user and let them decide whether to refetch. Never refresh silently.
If the user explicitly asks to refresh metadata, re-run the export call to overwrite .metadata/table_metadata.json, then re-run the extract step. Always remove .metadata/databases before re-extracting so stale files are not left behind.
Three entity types, two file types:
| Entity | File | Description |
|--------|------|-------------|
| Database | .metadata/databases/{db}/{db}.yaml | A connected data source (Postgres, MySQL, BigQuery, etc.). Identified by name. |
| Table | .metadata/databases/{db}/schemas/{schema}/tables/{table}.yaml (or .../tables/{table}.yaml for schemaless DBs) | A physical table or view. Contains a fields array with all its columns nested inline. |
| Field | (nested inside a Table YAML, no separate file) | A column. Includes base_type, database_type, and optionally effective_type, semantic_type, coercion_strategy, parent_id, fk_target_field_id. |
Foreign keys use natural-key tuples, not numeric IDs:
"Sample Database"[database, schema_or_null, table] — e.g. ["Sample Database", "PUBLIC", "ORDERS"][database, schema_or_null, table, field, ...nested_field_names] — e.g. ["Sample Database", "PUBLIC", "EVENTS", "DATA", "user", "name"] for a JSON-unfolded column DATA.user.nameField-level FKs show up as parent_id (nested field parent) and fk_target_field_id (referenced PK for FK columns).
database_type — the raw native type string from the driver (BIGINT, VARCHAR, JSONB, etc.). Database-specific.base_type — the Metabase type matching the native type (type/BigInteger, type/Text, type/Structured, etc.).effective_type — the type Metabase treats the column as at query time. Only emitted when it differs from base_type (i.e. coercion is configured).coercion_strategy — the rule producing effective_type from base_type (e.g. Coercion/ISO8601->DateTime, Coercion/UNIXMilliSeconds->DateTime).semantic_type — business-domain label (type/PK, type/FK, type/Email, type/Category, type/Latitude, etc.). Drives UI and some analytical behavior.See the bundled spec for the full type hierarchy and available coercion strategies.
This skill ships with a local snapshot of the spec as spec.md, alongside SKILL.md.
Read it on demand, not eagerly. Open spec.md only when you actually need detail beyond what SKILL.md summarizes — e.g. the full base-type / semantic-type hierarchy, the complete list of coercion strategies, or the exact folder-path rules. Do not open it at session start, and do not open it for tasks unrelated to the metadata tree.
If the bundled copy looks out of date with the upstream package, the skill's own README.md documents how to refresh it with extract-spec.
tools
Drive a Metabase instance from the terminal via the `mb` CLI. Authenticate with named profiles; inspect databases (list, get, full metadata rollup, schemas, tables in a schema) and trigger manual schema sync / field-values rescan; inspect tables, fields; list/get/create/update/archive cards (questions, models, metrics) and run them as JSON/CSV/XLSX; list/get/create/update dashboards and patch dashcards; list/get/create collections and traverse the hierarchy by id, entity_id, or "root"/"trash" (with items and recursive tree); list/get/create/update/archive native query snippets, segments, and measures; author/update/run transforms and schedule transform-jobs; read/update settings; search content (cards, dashboards, collections, transforms, metrics); manage Enterprise workspaces; git-sync to/from a git remote (status, dirty, import, export, branches, stash, add/remove a collection from sync). Use whenever the user wants to interact with a Metabase from the terminal — "log into metabase", "what profiles do I have", "list cards", "run card 42 as CSV", "create a transform", "list dashboards", "move a dashcard", "list collections", "what's in collection 4", "show the collection tree", "list snippets", "create a segment", "archive a measure", "search metabase for X", "spin up a workspace", "import the latest changes", "add a directory to git sync", "set a setting", "what schemas are in this database", "trigger a sync", "rescan field values", or anything hitting `mb <verb>`.
development
Runs the Metabase semantic checker against a tree of Representation Format YAML files to verify that all references resolve — cross-entity references (collection_id, dashboard_id, parent_id, parameter source cards, snippet references, transform tags, etc.) and references to columns inside MBQL and native queries. Slow (≥1 min per run). Only use when the user explicitly asks to verify entity references or column references in MBQL/SQL queries; in most cases this runs as a CI step, not locally. Requires database metadata on disk (by default `.metadata/table_metadata.json`).
development
Understands the Metabase Representation Format — a YAML-based serialization format for Metabase content (collections, cards, dashboards, documents, segments, measures, snippets, transforms). Use when the user needs to create, edit, understand, or validate Metabase representation YAML files, or when working with Metabase serialization/deserialization (serdes). Covers entity schemas, MBQL and native queries, visualization settings, parameters, and folder structure.
development
Migrates a project from Metabase static embedding to guest embeds (web components via embed.js). Use when the user wants to migrate/convert/switch/upgrade from static embedding to guest embeds, from signed embed iframes to web components, or replace /embed/ iframes with metabase-dashboard/metabase-question components.