config/agents/skills/databricks/SKILL.md
Databricks Expert Engineer Skill - Comprehensive guide for data engineering, machine learning infrastructure, and permission design.
Use when:
- Running databricks CLI commands (auth, api)
- Executing SQL queries via Databricks SQL Warehouse
- Working with Unity Catalog permissions
- Managing Lakeflow Jobs or Delta Lake
This skill provides a comprehensive guide for Databricks development.
If the profile sets auth_type=databricks-cli, run user-to-machine (U2M) authentication first
databricks auth login --host https://xxx.cloud.databricks.com --profile PROFILE_NAME
Check authentication status
databricks auth profiles
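
Whether a profile needs the login step can be checked from the CLI configuration file. The following is a minimal sketch, assuming profiles live in `~/.databrickscfg` in INI form with an `auth_type` key; the function name `profile_uses_u2m` is hypothetical.

```shell
# Succeeds (exit 0) when the named profile sets auth_type = databricks-cli.
# Assumes an INI-style config file; defaults to ~/.databrickscfg.
profile_uses_u2m() {
  local profile="$1" cfg="${2:-$HOME/.databrickscfg}"
  [ -f "$cfg" ] || return 1
  awk -v want="[$profile]" '
    /^\[/ { in_section = ($0 == want) }          # track which [section] we are in
    in_section && /^[ \t]*auth_type[ \t]*=/ {
      sub(/^[^=]*=[ \t]*/, "")                   # keep only the value
      sub(/[ \t]+$/, "")                         # trim trailing whitespace
      if ($0 == "databricks-cli") found = 1
    }
    END { exit(found ? 0 : 1) }
  ' "$cfg"
}

# Example: log in only when the profile expects U2M authentication.
# profile_uses_u2m PROFILE_NAME && \
#   databricks auth login --host https://xxx.cloud.databricks.com --profile PROFILE_NAME
```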
# Execute query
databricks api post /api/2.0/sql/statements --profile "DEFAULT" --json '{
"warehouse_id": "xxxxxxxxxx",
"catalog": "catalog_name",
"schema": "schema_name",
"statement": "select * from table_name limit 10"
}'
# Get results (statement_id is returned from execution)
databricks api get /api/2.0/sql/statements/{statement_id} --profile "DEFAULT"
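
The get call above often has to be repeated, because the statement may still be running when it is first fetched. The following is a minimal sketch of such a retry loop, assuming the response carries the state under `status.state`. The helper names `extract_state` and `wait_for_statement`, the try count, the delay, and the exit codes are illustrative. `PENDING` comes from the API's list of states rather than from this guide.

```shell
# Extract the first "state" value from a statement response on stdin.
# A text-matching shortcut for this sketch; a JSON tool such as jq is more robust.
extract_state() {
  grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"' | head -n 1 |
    sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Poll a statement until it reaches a terminal state.
# Usage: wait_for_statement STATEMENT_ID [PROFILE] [MAX_TRIES] [DELAY_SECONDS]
wait_for_statement() {
  local statement_id="$1" profile="${2:-DEFAULT}" max_tries="${3:-30}" delay="${4:-2}"
  local try=1 response state
  while [ "$try" -le "$max_tries" ]; do
    response=$(databricks api get "/api/2.0/sql/statements/${statement_id}" \
      --profile "$profile") || return 1
    state=$(printf '%s\n' "$response" | extract_state)
    case "$state" in
      SUCCEEDED)                       # results are ready: print the response
        printf '%s\n' "$response"
        return 0 ;;
      PENDING|RUNNING)                 # still executing: wait and retry
        sleep "$delay" ;;
      FAILED)                          # SQL error: surface the response for inspection
        printf '%s\n' "$response" >&2
        return 2 ;;
      CLOSED)                          # results were fetched too late
        echo "result closed: re-run the statement and fetch sooner" >&2
        return 3 ;;
      *)
        echo "unexpected state: ${state:-none}" >&2
        return 4 ;;
    esac
    try=$((try + 1))
  done
  echo "gave up after ${max_tries} tries" >&2
  return 5
}
```

On success the function prints the full response, so it can be piped into whatever reads the result rows.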
Query execution flow
1. post executes the query -> returns statement_id
2. get retrieves results (wait until state is SUCCEEDED)
3. sleep and retry

Error handling
- state: CLOSED: Result retrieval was too slow. Fetch results sooner
- state: FAILED: SQL error. Check error_message
- state: RUNNING: Still executing. Wait and retry get
- limit to verify

Reading results
- data_array: Actual data (2D array)
- schema.columns: Column names and type info
- total_row_count: Total count (shown even with limit)
- state: Query execution state

Parameterized queries
databricks api post /api/2.0/sql/statements --profile "DEFAULT" --json '{
"warehouse_id": "xxxxxxxxxx",
"statement": "select * from table where date >= :start_date",
"parameters": [{"name": "start_date", "value": "2025-01-01", "type": "DATE"}]
}'
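
When the parameter value comes from a shell variable instead of a literal, it is worth checking it before splicing it into the JSON body. The following is a minimal sketch, assuming DATE values in YYYY-MM-DD form; the helper name `date_param` is hypothetical, and the check covers shape only, not calendar validity.

```shell
# Print one entry for the "parameters" array, accepting only a plain
# identifier as the name and a YYYY-MM-DD shaped string as the value.
date_param() {
  local name="$1" value="$2"
  case "$name" in
    ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*) return 1 ;;   # empty or not an identifier
  esac
  case "$value" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
    *) return 1 ;;                                # anything else is rejected
  esac
  printf '{"name": "%s", "value": "%s", "type": "DATE"}\n' "$name" "$value"
}

# Example (placeholders as in the request above; START_DATE is illustrative):
# param=$(date_param start_date "$START_DATE") || exit 1
# databricks api post /api/2.0/sql/statements --profile "DEFAULT" --json "{
#   \"warehouse_id\": \"xxxxxxxxxx\",
#   \"statement\": \"select * from table where date >= :start_date\",
#   \"parameters\": [${param}]
# }"
```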
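The Databricks well-architected lakehouse framework consists of 7 pillars:
1. Data and AI governance: Policies and practices to securely manage data and AI assets. Minimize data copies with a unified governance solution.
2. Interoperability and usability: Consistent user experience and seamless integration with external systems.
3. Operational excellence: Processes supporting continuous production operations.
4. Security, privacy, and compliance: Implement safeguards against threats.
5. Reliability: Ensure disaster recovery capabilities.
6. Performance efficiency: Adaptability to workload changes.
7. Cost optimization: Cost management to maximize value delivery.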
Unity Catalog uses a 3-level namespace: catalog.schema.table
-- Check permissions
SHOW GRANTS ON SCHEMA main.default;
-- Grant permissions
GRANT CREATE TABLE ON SCHEMA main.default TO `finance-team`;
-- Revoke permissions
REVOKE CREATE TABLE ON SCHEMA main.default FROM `finance-team`;
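
Privileges follow the same three levels as the namespace: to read a table, a principal generally needs USE CATALOG on the catalog and USE SCHEMA on the schema in addition to SELECT on the table. The following small sketch prints those statements for review before they are run. The function name `grant_read_sql` is hypothetical, and names are inserted without escaping, so it assumes simple identifiers.

```shell
# Print the GRANT statements a principal typically needs to read one table:
# USE CATALOG on the catalog, USE SCHEMA on the schema, SELECT on the table.
grant_read_sql() {
  local catalog="$1" schema="$2" table="$3" principal="$4"
  printf 'GRANT USE CATALOG ON CATALOG %s TO `%s`;\n' "$catalog" "$principal"
  printf 'GRANT USE SCHEMA ON SCHEMA %s.%s TO `%s`;\n' "$catalog" "$schema" "$principal"
  printf 'GRANT SELECT ON TABLE %s.%s.%s TO `%s`;\n' "$catalog" "$schema" "$table" "$principal"
}

# Example:
# grant_read_sql main default table_name finance-team
```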
Lakeflow unifies data ingestion, transformation, and orchestration.
Task types:
Triggers:
Limits:
-- Check table columns first
DESCRIBE TABLE catalog.schema.table_name;
-- Then write your query using verified column names
SELECT column_name FROM catalog.schema.table_name;
-- Basic column info
DESCRIBE TABLE catalog.schema.table_name;
-- Extended info (types, nullability, comments)
DESCRIBE EXTENDED catalog.schema.table_name;
-- List tables in schema
SHOW TABLES IN catalog.schema;
-- Table properties and metadata
DESCRIBE DETAIL catalog.schema.table_name;
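
The same column check can be run from the CLI through the statement execution endpoint shown earlier. The following is a minimal sketch that builds the request body for DESCRIBE TABLE. The helper name `describe_payload` is hypothetical, and it deliberately accepts only unquoted three-part names made of letters, digits, and underscores.

```shell
# Print a statement-execution request body that runs DESCRIBE TABLE on a
# fully qualified catalog.schema.table name; fail on anything else.
describe_payload() {
  local warehouse_id="$1" table="$2"
  case "$warehouse_id" in
    ''|*[!A-Za-z0-9]*) return 1 ;;               # empty or unexpected characters
  esac
  case "$table" in
    *[!A-Za-z0-9_.]*|.*|*.|*..*) return 1 ;;     # odd characters or empty parts
    *.*.*.*) return 1 ;;                         # more than three parts
    *.*.*) ;;                                    # catalog.schema.table
    *) return 1 ;;                               # fewer than three parts
  esac
  printf '{"warehouse_id": "%s", "statement": "DESCRIBE TABLE %s"}\n' \
    "$warehouse_id" "$table"
}

# Example:
# databricks api post /api/2.0/sql/statements --profile "DEFAULT" \
#   --json "$(describe_payload xxxxxxxxxx catalog.schema.table_name)"
```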
| Issue               | Cause                          | Prevention                    |
| ------------------- | ------------------------------ | ----------------------------- |
| Column name case    | Databricks preserves case      | Use DESCRIBE before query     |
| Data type mismatch  | Implicit conversion fails      | Check column types explicitly |
| NULL handling       | Unexpected NULL in aggregation | Use COALESCE or filter NULLs  |
| Timestamp precision | TIMESTAMP vs TIMESTAMP_NTZ     | Verify type before comparison |
When encountering schema-related issues, update this skill with:
NOTE: Do not include project-specific table names or business logic. Keep entries generalizable across environments.