.agents/starflow/skills/starflow-create-data-architecture/SKILL.md
Design the overall data architecture including layers, storage, engines, and governance. Use when the user says "create data architecture" or "design the data platform".
npx skillsauth add starlake-ai/starlake-skills starflow-create-data-architecture
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Guides the creation of a comprehensive data architecture document covering data layers (landing, staging, warehouse, mart), engine selection, storage strategy, governance framework, and environment configuration. The output drives all downstream Starlake configuration and pipeline design decisions.
Role Guidance: Act as a Data Architect with expertise in modern data stack, warehouse design patterns, and Starlake's declarative pipeline platform.
Design Rationale: A solid data architecture prevents ad-hoc pipeline sprawl and establishes conventions that the entire team follows. Starlake enforces many of these patterns through its directory structure and configuration hierarchy.
{planning_artifacts}/domain-discovery-*.md if available.

Define the data layers and their purpose:
| Layer | Starlake Stage | Purpose | Write Strategy |
|-------|---------------|---------|----------------|
| Landing | incoming/pending | Raw data as-is from source | N/A (file staging) |
| Bronze / Raw | accepted | Validated, typed, privacy-applied | APPEND or OVERWRITE |
| Silver / Curated | transform (business) | Cleaned, deduplicated, conformed | UPSERT_BY_KEY or SCD2 |
| Gold / Mart | transform (business) | Business-ready aggregations | OVERWRITE or OVERWRITE_BY_PARTITION |
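The layer conventions in the table can be captured as data so that a proposed write strategy is checked against the layer it targets. The following is a minimal sketch, not part of Starlake: `LAYERS` and `check_write_strategy` are hypothetical names introduced here for illustration, and the short layer keys (`bronze`, `silver`, `gold`) are an assumed shorthand for the table rows.

```python
# Layer conventions from the table, expressed as plain data.
# LAYERS and check_write_strategy are illustrative names, not Starlake APIs.
LAYERS = {
    "landing": {"stage": "incoming/pending", "write_strategies": set()},
    "bronze": {"stage": "accepted", "write_strategies": {"APPEND", "OVERWRITE"}},
    "silver": {"stage": "transform", "write_strategies": {"UPSERT_BY_KEY", "SCD2"}},
    "gold": {
        "stage": "transform",
        "write_strategies": {"OVERWRITE", "OVERWRITE_BY_PARTITION"},
    },
}


def check_write_strategy(layer: str, strategy: str) -> bool:
    """Return True if `strategy` matches the convention listed for `layer`."""
    allowed = LAYERS[layer.lower()]["write_strategies"]
    return strategy.upper() in allowed
```

For example, `check_write_strategy("silver", "SCD2")` returns `True`, while `check_write_strategy("bronze", "SCD2")` returns `False`. Landing has no write strategy because it is file staging only, so every strategy is rejected there.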
Define the metadata directory structure:
metadata/
  application.sl.yml          # Global config, connections, defaults
  env.sl.yml                  # Base environment variables
  env.PROD.sl.yml             # Production overrides
  types/
    default.sl.yml            # Built-in types
    custom.sl.yml             # Project-specific types (regex patterns)
  load/
    {domain}/
      _config.sl.yml          # Domain defaults (incoming dir, connection)
      {table}.sl.yml          # Per-table schema and load config
  transform/
    {domain}/
      {task}.sl.yml           # Transform config (write strategy, sink)
      {task}.sql              # SQL transformation
  extract/
    {source}.sl.yml           # JDBC/API extraction config
  dags/
    {schedule}.sl.yml         # Orchestration DAG definitions
  expectations/
    {domain}.j2               # Reusable data quality macros
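To make the layout concrete, the sketch below creates an empty skeleton of this directory structure for one domain and one table. It is an illustration only: `scaffold_metadata` is a hypothetical helper, the files it creates are empty placeholders that still need real Starlake configuration, and the `sales` and `orders` names in the usage note are invented.

```python
from pathlib import Path
from typing import List


def scaffold_metadata(root: Path, domain: str, table: str) -> List[Path]:
    """Create an empty metadata/ skeleton and return the placeholder files."""
    files = [
        "application.sl.yml",
        "env.sl.yml",
        "env.PROD.sl.yml",
        "types/default.sl.yml",
        "types/custom.sl.yml",
        f"load/{domain}/_config.sl.yml",
        f"load/{domain}/{table}.sl.yml",
    ]
    # Directories whose files depend on tasks, sources, and schedules
    # that are not known at scaffolding time.
    dirs = [f"transform/{domain}", "extract", "dags", "expectations"]

    metadata = root / "metadata"
    created = []
    for rel in files:
        path = metadata / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # empty placeholder, to be filled in later
        created.append(path)
    for rel in dirs:
        (metadata / rel).mkdir(parents=True, exist_ok=True)
    return created
```

Calling `scaffold_metadata(Path("."), "sales", "orders")` would create `metadata/load/sales/orders.sl.yml` along with the other placeholders. The function is idempotent, so running it again leaves existing files untouched.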
SL_ENV and env.{ENV}.sl.yml files
connectionRef) switchable per environment

Generate the data architecture document and save to {planning_artifacts}/data-architecture-{{project_name}}.md.
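When writing the environment strategy section of that document, it can help to illustrate the layering idea behind `SL_ENV` and the `env.{ENV}.sl.yml` files: base variables apply everywhere, and the file for the active environment overrides them. The sketch below models that with plain dictionaries. It is an assumption-laden illustration rather than Starlake's actual resolution logic: `resolve_env` is a hypothetical function, and the `connection` and `root_path` variables with their values are invented.

```python
import os
from typing import Optional


def resolve_env(base: dict, overrides: dict, sl_env: Optional[str] = None) -> dict:
    """Merge base variables with the overrides for the active environment.

    If `sl_env` is not given, the SL_ENV environment variable is used.
    """
    env_name = sl_env if sl_env is not None else os.environ.get("SL_ENV")
    merged = dict(base)  # copy so the base values stay unchanged
    if env_name:
        merged.update(overrides.get(env_name, {}))
    return merged


# Stand-in for env.sl.yml (hypothetical variables).
base = {"connection": "duckdb", "root_path": "/tmp/datasets"}
# Stand-in for env.PROD.sl.yml (hypothetical override).
overrides = {"PROD": {"connection": "bigquery"}}

prod = resolve_env(base, overrides, "PROD")
```

Here `prod["connection"]` is `"bigquery"` while `prod["root_path"]` keeps its base value, which is the property that lets a connection reference switch per environment without duplicating the rest of the configuration.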
A comprehensive data architecture document covering layers, engines, Starlake project structure, governance, and environment strategy — ready to guide all pipeline implementation.