skills/data-engineering/dbt/SKILL.md
dbt is a tool for transforming data in warehouses using SQL-based models, enabling testing and documentation.
npx skillsauth add alphaonedev/openclaw-graph dbtInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
dbt (data build tool) is a command-line tool for transforming data in warehouses using SQL-based models. It enables developers to write, test, and document data transformations, ensuring reliable ETL processes.
Use dbt when building SQL data models for warehouses like Snowflake, BigQuery, or Redshift. Apply it for incremental data loading, schema evolution, or automated testing in data pipelines. Avoid it for real-time processing or non-SQL data sources; opt for dbt when you need version-controlled SQL code with built-in validation.
{{ var('date') }} for parameter injection).not_null or unique tests).dbt docs generate, outputting HTML with model dependencies and descriptions.is_incremental() macro to process only new data, reducing warehouse load.Follow this workflow: 1) Initialize a project with dbt init. 2) Write models in the models/ directory as SQL files. 3) Configure connections in profiles.yml. 4) Run and test models iteratively. 5) Use seeds for static data and snapshots for slowly changing dimensions. For CI/CD, integrate dbt into scripts: run dbt run in a Docker container with mounted volumes. Always specify targets like --target dev to switch environments.
dbt is primarily CLI-based; use it directly or via scripts. Key commands:
dbt init --name project_name: Create a new project; specify a directory with --project-dir /path.
Example:
dbt init --name sales_dbt --project-dir ./sales_project
dbt run --models model_name --select tag:nightly: Execute models; use --full-refresh to rebuild all data.
Example:
dbt run --models orders --target prod --threads 8
dbt test --models model_name: Run tests; add flags like --store-failures to log errors.
Example:
dbt test --select source:raw_data
dbt docs generate && dbt docs serve: Build and serve documentation; integrate with CI for auto-deployment.
For API-like usage, wrap dbt in Python scripts using subprocess: subprocess.run(['dbt', 'run', '--models', 'my_model']). Use environment variables for profiles, e.g., set $DBT_PROFILES_DIR to /path/to/profiles.yml.
Integrate dbt with warehouses by configuring profiles.yml. For Snowflake, use:
your_profile:
target: dev
outputs:
dev:
type: snowflake
account: your_account
user: your_user
password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
database: your_db
schema: your_schema
For BigQuery, specify:
bigquery_profile:
target: dev
outputs:
dev:
type: bigquery
method: service-account
project: your-gcp-project
dataset: your_dataset
keyfile: "{{ env_var('GOOGLE_KEYFILE') }}"
Use env vars for secrets (e.g., export SNOWFLAKE_PASSWORD=your_key). Integrate with Git for version control, and tools like Airflow for orchestration: call dbt run as a task. For VS Code, install the dbt extension for syntax highlighting.
Handle errors by first running dbt debug to validate connections and profiles. For SQL compilation errors, check model files for syntax (e.g., missing semicolons) and use dbt compile to preview. If tests fail, inspect output logs for details like "column not found"; fix by updating schemas in .yml files. Use --fail-fast in dbt run to halt on first error. Common patterns: wrap commands in try-catch for scripts (e.g., in Python: try: subprocess.run(['dbt', 'run']) except Exception as e: log_error(e)). For authentication failures, ensure env vars like $SNOWFLAKE_PASSWORD are set; test with dbt run --target dev --log-level debug.
Building a simple incremental model: Create models/orders.sql with:
{{ config(materialized='incremental') }}
select order_id, customer_id from source_orders
where order_date > (select max(order_date) from {{ this }})
Then run: dbt run --models orders --target dev. This processes only new orders.
Running and testing a data model: Define a test in models/schema.yml:
models:
- name: customers
columns:
- name: customer_id
tests:
- unique
- not_null
Execute: dbt run --models customers && dbt test --models customers. This builds the model and verifies no duplicates or nulls.
tools
Root web development: project structure, tooling selection, deployment decisions
development
WebAssembly: Rust/Go/C to WASM, wasm-bindgen, Emscripten, WASM Component Model
development
Vue 3: Composition API script setup, Pinia, Vue Router 4, SFCs, Vite, Nuxt 3
tools
Tailwind CSS 4: utility classes, config, JIT, arbitrary values, darkMode, plugins, shadcn/ui