skills/codex/databricks-asset-bundles/SKILL.md
<!-- AUTO-GENERATED by export-skills.py - DO NOT EDIT -->
---
name: databricks-asset-bundles
---

# Databricks Asset Bundle (DABs) Writer

## Overview

Create DABs for multi-environment deployment (dev/staging/prod).

## Reference Files

- **[SDP_guidance.md](SDP_guidance.md)** - Spark Declarative Pipeline configurations
- **[alerts_guidance.md](alerts_guidance.md)** - SQL Alert schemas (critical - API differs)
## Bundle Structure

```
project/
├── databricks.yml           # Main config + targets
├── resources/*.yml          # Resource definitions
└── src/                     # Code/dashboard files
```
```yaml
bundle:
  name: project-name

include:
  - resources/*.yml

variables:
  catalog:
    default: 'default_catalog'
  schema:
    default: 'default_schema'
  warehouse_id:
    lookup:
      warehouse: 'Shared SQL Warehouse'

targets:
  dev:
    default: true
    mode: development
    workspace:
      profile: dev-profile
    variables:
      catalog: 'dev_catalog'
      schema: 'dev_schema'

  prod:
    mode: production
    workspace:
      profile: prod-profile
    variables:
      catalog: 'prod_catalog'
      schema: 'prod_schema'
```
Support for dataset_catalog and dataset_schema parameters added in Databricks CLI 0.281.0 (January 2026)
```yaml
resources:
  dashboards:
    dashboard_name:
      display_name: '[${bundle.target}] Dashboard Title'
      file_path: ../src/dashboards/dashboard.lvdash.json # Relative to resources/
      warehouse_id: ${var.warehouse_id}
      dataset_catalog: ${var.catalog} # Default catalog used by all datasets in the dashboard if not otherwise specified in the query
      dataset_schema: ${var.schema} # Default schema used by all datasets in the dashboard if not otherwise specified in the query
      permissions:
        - level: CAN_RUN
          group_name: 'users'
```
Permission levels: CAN_READ, CAN_RUN, CAN_EDIT, CAN_MANAGE
See SDP_guidance.md for pipeline configuration
See alerts_guidance.md - Alert schema differs significantly from other resources
```yaml
resources:
  jobs:
    job_name:
      name: '[${bundle.target}] Job Name'
      tasks:
        - task_key: 'main_task'
          notebook_task:
            notebook_path: ../src/notebooks/main.py # Relative to resources/
          new_cluster:
            spark_version: '13.3.x-scala2.12'
            node_type_id: 'i3.xlarge'
            num_workers: 2
      schedule:
        quartz_cron_expression: '0 0 9 * * ?'
        timezone_id: 'America/Los_Angeles'
      permissions:
        - level: CAN_VIEW
          group_name: 'users'
```
Permission levels: CAN_VIEW, CAN_MANAGE_RUN, CAN_MANAGE
⚠️ Cannot modify "admins" group permissions on jobs - verify custom groups exist before use
⚠️ Critical: Paths depend on file location:
| File Location | Path Format | Example |
| ------------------------ | ------------ | ----------------------------- |
| resources/*.yml | ../src/... | ../src/dashboards/file.json |
| databricks.yml targets | ./src/... | ./src/dashboards/file.json |
Why: resources/ files are one level deep, so use ../ to reach bundle root. databricks.yml
is at root, so use ./
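The path rule above can be checked with a small shell sketch. It assumes, as the table states, that a relative path resolves from the directory of the file that declares it. The temporary layout and the `resolve_from` helper are hypothetical, and the script only mimics the lookup; it does not call the Databricks CLI.

```shell
# Throwaway layout mirroring the bundle structure (all names are illustrative).
bundle_root="$(mktemp -d)"
mkdir -p "$bundle_root/resources" "$bundle_root/src/dashboards"
touch "$bundle_root/src/dashboards/file.json"

# Hypothetical helper: resolve a relative path from a base directory to an absolute path.
resolve_from() {
  base_dir="$1"
  rel_path="$2"
  (
    cd "$base_dir" && cd "$(dirname "$rel_path")" &&
      printf '%s/%s\n' "$(pwd -P)" "$(basename "$rel_path")"
  )
}

# Path as written in resources/*.yml (resolved from resources/).
from_resources="$(resolve_from "$bundle_root/resources" ../src/dashboards/file.json)"
# Path as written in databricks.yml (resolved from the bundle root).
from_root="$(resolve_from "$bundle_root" ./src/dashboards/file.json)"

[ "$from_resources" = "$from_root" ] && echo "both spellings resolve to the same file"
```

Swapping the two spellings (for example, using `./src/...` inside `resources/`) makes `resolve_from` fail because `resources/src/` does not exist, which mirrors the "Path resolution fails" symptom.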
```yaml
resources:
  volumes:
    my_volume:
      catalog_name: ${var.catalog}
      schema_name: ${var.schema}
      name: 'volume_name'
      volume_type: 'MANAGED'
```
⚠️ Volumes use grants not permissions - different format from other resources
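To make the difference concrete, here is a hedged sketch that scaffolds a volume resource file with a `grants` block. The file name, the `data-readers` principal, and the `READ_VOLUME` privilege are illustrative assumptions, not values from this skill; the `principal`/`privileges` layout is the assumed grants format, so confirm it against `databricks bundle schema` before relying on it.

```shell
# Scaffold a volume resource that uses `grants` (file name and principal are hypothetical).
bundle_dir="$(mktemp -d)"
mkdir -p "$bundle_dir/resources"
volume_file="$bundle_dir/resources/my_volume.volume.yml"

# The quoted 'EOF' keeps ${var.catalog} literal so the bundle, not the shell, expands it.
cat > "$volume_file" <<'EOF'
resources:
  volumes:
    my_volume:
      catalog_name: ${var.catalog}
      schema_name: ${var.schema}
      name: 'volume_name'
      volume_type: 'MANAGED'
      grants:                       # assumed format: principal + privileges list
        - principal: 'data-readers' # hypothetical group; verify it exists first
          privileges:
            - READ_VOLUME
EOF

cat "$volume_file"
```

Note that the entry has no `level` or `group_name` keys, which is the shape used by `permissions` on dashboards and jobs.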
Apps resource support added in Databricks CLI 0.239.0 (January 2025)
Apps in DABs have a minimal configuration - environment variables are defined in app.yaml in the
source directory, NOT in databricks.yml.
```bash
# Generate bundle config from existing CLI-deployed app
databricks bundle generate app --existing-app-name my-app --key my_app --profile DEFAULT

# This creates:
# - resources/my_app.app.yml (minimal resource definition)
# - src/app/ (downloaded source files including app.yaml)
```
resources/my_app.app.yml:
```yaml
resources:
  apps:
    my_app:
      name: my-app-${bundle.target} # Environment-specific naming
      description: 'My application'
      source_code_path: ../src/app # Relative to resources/ dir
```
src/app/app.yaml: (Environment variables go here)
```yaml
command:
  - 'python'
  - 'dash_app.py'

env:
  - name: USE_MOCK_BACKEND
    value: 'false'
  - name: DATABRICKS_WAREHOUSE_ID
    value: 'your-warehouse-id'
  - name: DATABRICKS_CATALOG
    value: 'main'
  - name: DATABRICKS_SCHEMA
    value: 'my_schema'
```
databricks.yml:
```yaml
bundle:
  name: my-bundle

include:
  - resources/*.yml

variables:
  warehouse_id:
    default: 'default-warehouse-id'

targets:
  dev:
    default: true
    mode: development
    workspace:
      profile: dev-profile
    variables:
      warehouse_id: 'dev-warehouse-id'
```
| Aspect | Apps | Other Resources |
| -------------------- | --------------------------------- | ---------------------------------- |
| Environment vars | In app.yaml (source dir) | In databricks.yml or resource file |
| Configuration | Minimal (name, description, path) | Extensive (tasks, clusters, etc.) |
| Source path | Points to app directory | Points to specific files |
⚠️ Important: When source code is in project root (not src/app), use source_code_path: .. in
the resource file
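DABs also support schemas, models, experiments, clusters, warehouses, and other resource types.
Run `databricks bundle schema` to print the JSON Schema for bundle configuration and check which
fields each resource type accepts.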
Reference: DABs Resource Types
```bash
databricks bundle validate                     # Validate default target
databricks bundle validate -t prod             # Validate specific target

databricks bundle deploy                       # Deploy to default target
databricks bundle deploy -t prod               # Deploy to specific target
databricks bundle deploy --auto-approve        # Skip confirmation prompts
databricks bundle deploy --force               # Force overwrite remote changes

databricks bundle run resource_name            # Run a pipeline or job
databricks bundle run pipeline_name -t prod    # Run in specific environment

# Apps require bundle run to start after deployment
databricks bundle run app_resource_key -t dev  # Start/deploy the app
```
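The commands above are often chained per target. The following is a minimal sketch of one way to do that; the `deploy_target` function name and its argument order are hypothetical, and it uses only the `validate`, `deploy`, and `run` invocations listed above.

```shell
# Validate, deploy, and optionally run a resource for one target.
# Usage: deploy_target <target> [resource_key]
deploy_target() {
  target="$1"
  resource_key="${2:-}"

  # Stop at the first failing step so an invalid config is never deployed.
  databricks bundle validate -t "$target" || return 1
  databricks bundle deploy -t "$target" || return 1

  # Apps only start after an explicit `bundle run`, so run the key when one is given.
  if [ -n "$resource_key" ]; then
    databricks bundle run "$resource_key" -t "$target" || return 1
  fi
}
```

For example, `deploy_target dev my_app` validates and deploys the `dev` target and then starts the app, while `deploy_target prod` stops after the deploy step.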
View application logs (for Apps resources):
```bash
# View logs for deployed apps
databricks apps logs <app-name> --profile <profile-name>

# Examples:
databricks apps logs my-dash-app-dev -p DEFAULT
databricks apps logs my-streamlit-app-prod -p DEFAULT
```
What logs show:

- **[SYSTEM]** - Deployment progress, file updates, dependency installation
- **[APP]** - Application output (print statements, errors)

Key log patterns to look for:

- **Deployment successful** - Confirms deployment completed
- **App started successfully** - App is running
- **Initialized real backend** - Backend connected to Unity Catalog
- **Error:** - Look for error messages and stack traces
- **Requirements installed** - Dependencies loaded correctly

```bash
databricks bundle destroy -t dev
databricks bundle destroy -t prod --auto-approve
```
| Issue | Solution |
| --------------------------------------- | ------------------------------------------------------------------------------------------------------ |
| App deployment fails | Check logs: databricks apps logs <app-name> for error details |
| App not connecting to Unity Catalog | Check logs for backend connection errors; verify warehouse ID and permissions |
| Wrong permission level | Dashboards: CAN_READ/RUN/EDIT/MANAGE; Jobs: CAN_VIEW/MANAGE_RUN/MANAGE |
| Path resolution fails | Use ../src/ in resources/*.yml, ./src/ in databricks.yml |
| Catalog doesn't exist | Create catalog first or update variable |
| "admins" group error on jobs | Cannot modify admins permissions on jobs |
| Volume permissions | Use grants not permissions for volumes |
| Hardcoded catalog in dashboard | Use dataset_catalog parameter (CLI v0.281.0+), create environment-specific files, or parameterize JSON |
| App not starting after deploy | Apps require databricks bundle run <resource_key> to start |
| App env vars not working | Environment variables go in app.yaml (source dir), not databricks.yml |
| Wrong app source path | Use ../ from resources/ dir if source is in project root |
| Debugging any app issue | First step: databricks apps logs <app-name> to see what went wrong |
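The "check logs first" advice in the table above can be narrowed with a small filter. This is a sketch under assumptions: it presumes each log line carries its `[APP]` or `[SYSTEM]` tag as literal text and that application errors contain `Error:`; the exact line layout of `databricks apps logs` output is not confirmed here, and `app_errors` is a hypothetical helper name.

```shell
# Keep only application-side error lines from log text read on stdin.
# Assumes lines are tagged with a literal "[APP]" and errors contain "Error:".
app_errors() {
  grep -F '[APP]' | grep -F 'Error:'
}
```

Typical use would be `databricks apps logs my-dash-app-dev -p DEFAULT | app_errors`. The function exits non-zero when no line matches, which makes it usable in a script as a quick "any app errors?" check.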
- Paths: `../src/` in `resources/*.yml`, `./src/` in `databricks.yml`
- Mode: `development` for dev/staging, `production` for prod
- Groups: `"users"` for all workspace users