
Starlake configuration reference — environment variables, application structure, attribute types, storage patterns, and best practices
Generate orchestration DAGs (Airflow/Dagster) from your Starlake project
Manage GizmoSQL processes — start, stop, list, and stop-all DuckLake-backed SQL servers
Run a job (alias for transform)
Migrate project configuration to the latest Starlake version
Check for files available for loading in the landing/pending area
Run the Starlake HTTP server
Move files from the landing area to the pending area
Display table summary and statistics
Convert Excel job definitions to Starlake YAML
Data Engineer agent — builds and maintains ETL/ELT pipelines with Starlake. Use when the user says "data-engineer" or "talk to the data-engineer".
Review and document data lineage across pipeline stages. Use when the user says "review lineage" or "trace data flow".
Design SQL transformations for data pipelines with quality checks and dependency management. Use when the user says "design transforms" or "create SQL transformations".
Generate task dependency graphs (data lineage)
Generate the ACL (Access Control List) dependency graph
Automatically infer schemas and load data from the incoming directory
Create a new Starlake project from a template
Load files (Parquet/CSV/JSON) into a JDBC table
Create or modify database connections in application.sl.yml
Deploy generated DAGs to a target directory
Data quality expectations syntax, built-in macros, and validation patterns
Extract both schema and data from a JDBC source
Extract data from database tables to CSV/Parquet files
Check data freshness and last update timestamps
Apply IAM (Identity and Access Management) policies
Index data in Elasticsearch (alias for esload)
Ingest data from specific paths into a domain/table
Load data from the pending area into the data warehouse
Compute statistical metrics on table data
Convert Parquet files to CSV format
Apply Row Level Security (RLS) and Column Level Security (CLS) policies
Print project settings or test a database connection
Generate a project documentation website
Review data pipeline configuration and SQL for correctness, performance, and best practices. Use when the user says "review pipeline" or "review this data code".
Design orchestration DAGs for scheduling and managing data pipeline execution. Use when the user says "design orchestration" or "create DAG configuration".
Design the overall data architecture including layers, storage, engines, and governance. Use when the user says "create data architecture" or "design the data platform".
Data Quality Engineer agent — ensures data integrity with expectations, lineage, and governance. Use when the user says "data-quality-engineer" or "talk to the data-quality-engineer".
Review and design data quality expectations for Starlake pipelines. Use when the user says "review data quality" or "check expectations".
Platform Engineer agent — manages infrastructure, orchestration, and deployment for data pipelines. Use when the user says "platform-engineer" or "talk to the platform-engineer".
Analyze data sources in depth: schema, quality, volume, and extraction strategy. Use when the user says "analyze data source" or "profile this data source".
Run integration tests for your Starlake project
Run SQL or Python transformation tasks
Convert Starlake YAML definitions to Excel spreadsheets
Data Architect agent — designs data platforms, schemas, and pipeline architecture. Use when the user says "data-architect" or "talk to the data-architect".
Get table information from BigQuery
Generate column-level lineage for a specific task
Compare two versions of a Starlake project
Start the Starlake interactive REPL console
Load data into Elasticsearch
Extract schemas directly from BigQuery datasets
Extract database schemas into Starlake YAML configuration files
Generate extraction scripts from Mustache/SSP templates
Infer a Starlake schema from a data file
Load or offload data to/from Kafka topics
Generate a table dependency graph based on foreign-key relationships
Create a complete pipeline specification covering extract, load, transform, and orchestrate. Use when the user says "create pipeline spec" or "design a data pipeline".
Validate project configuration, YAML files, and connections
Business Data Analyst agent — guides domain discovery and source analysis. Use when the user says "data-analyst" or "talk to the data-analyst".
Convert Excel domain/schema definitions to Starlake YAML
Generate SQL DDL statements from Starlake YAML definitions
Implement a data pipeline from a pipeline specification, generating Starlake configuration files. Use when the user says "implement pipeline" or "dev this pipeline".
Discover and document data domains, sources, and ownership. Use when the user says "discover data domains" or "map data sources".
Analyze the current state and the user's query to answer Starflow questions or recommend the next workflow. Use when the user asks what to do next or asks about Starflow.
Design Starlake-compatible table schemas with types, constraints, privacy, and expectations. Use when the user says "design schema" or "create table definition".
Plan and track sprint progress for data pipeline implementation. Use when the user says "sprint planning" or "plan data sprint".