.agents/starflow/skills/starflow-domain-discovery/SKILL.md
Discover and document data domains, sources, and ownership. Use when the user says "discover data domains" or "map data sources".
npx skillsauth add starlake-ai/starlake-skills starflow-domain-discovery

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Guides the user through identifying and documenting all data domains in their organization, mapping data sources to domains, establishing ownership, and defining the boundaries of the data landscape. This produces a domain map that serves as the foundation for all subsequent pipeline design.
Role Guidance: Act as a Business Data Analyst with expertise in data governance and domain-driven design.
Design Rationale: Domain discovery must happen before any pipeline work. Without clear domain boundaries and ownership, pipelines become tangled and ungovernable. This workflow follows Starlake's domain-based organization where each domain maps to a database schema/namespace.
sales, inventory, customers, finance).

For each identified source within each domain, document:

| Field | Description |
|-------|-------------|
| Source name | Unique identifier |
| Source type | JDBC, file (CSV/JSON/XML/Parquet), API, stream (Kafka) |
| Connection | Database/endpoint details |
| Format | DSV, JSON, XML, POSITION, Parquet, Avro |
| Refresh frequency | Real-time, hourly, daily, weekly, on-demand |
| Volume | Approximate row count and growth rate |
| Schema stability | Stable, evolving, unpredictable |
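The source inventory fields listed above could be captured as a small record type, so that each entry is checked against the allowed values before it goes into the domain map. This is a minimal sketch, not part of the skill itself: the class name `SourceRecord`, its attribute names, and the `problems` helper are hypothetical, and treating the listed values as a closed set is an assumption.

```python
from dataclasses import dataclass

# Allowed values taken from the field descriptions; treating them as
# closed sets is an assumption made for this sketch.
SOURCE_TYPES = {"JDBC", "file", "API", "stream"}
FORMATS = {"DSV", "JSON", "XML", "POSITION", "Parquet", "Avro"}
REFRESH_FREQUENCIES = {"real-time", "hourly", "daily", "weekly", "on-demand"}
SCHEMA_STABILITY = {"stable", "evolving", "unpredictable"}


@dataclass(frozen=True)
class SourceRecord:
    """One documented data source inside a domain (hypothetical structure)."""

    domain: str
    name: str
    source_type: str
    connection: str
    data_format: str
    refresh_frequency: str
    volume: str
    schema_stability: str

    def problems(self) -> list:
        """Return a human-readable message for each field with an unknown value."""
        checks = [
            ("source_type", self.source_type, SOURCE_TYPES),
            ("data_format", self.data_format, FORMATS),
            ("refresh_frequency", self.refresh_frequency, REFRESH_FREQUENCIES),
            ("schema_stability", self.schema_stability, SCHEMA_STABILITY),
        ]
        return [
            f"{field}: unexpected value {value!r}"
            for field, value, allowed in checks
            if value not in allowed
        ]


# Example entry; every value here is illustrative, not real project data.
orders = SourceRecord(
    domain="sales",
    name="orders_export",
    source_type="file",
    connection="sftp://example.invalid/exports/orders",
    data_format="DSV",
    refresh_frequency="daily",
    volume="~1M rows, growing ~5% per month",
    schema_stability="stable",
)
```

Calling `orders.problems()` returns an empty list when every constrained field uses one of the listed values, which makes it easy to flag incomplete or mistyped entries while interviewing source owners.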
Generate the domain discovery document and save to {planning_artifacts}/domain-discovery-{{project_name}}.md using the template structure.
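As an illustration of the naming convention above, the following sketch resolves the two placeholders and writes the document. It assumes `planning_artifacts` is a directory path and `project_name` is a plain string; the function names `discovery_doc_path` and `save_discovery_doc` are hypothetical, and the skill's actual template rendering is not shown.

```python
from pathlib import Path


def discovery_doc_path(planning_artifacts: str, project_name: str) -> Path:
    """Build <planning_artifacts>/domain-discovery-<project_name>.md."""
    return Path(planning_artifacts) / f"domain-discovery-{project_name}.md"


def save_discovery_doc(planning_artifacts: str, project_name: str, body: str) -> Path:
    """Write the rendered document, creating the artifacts directory if needed."""
    path = discovery_doc_path(planning_artifacts, project_name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body, encoding="utf-8")
    return path
```

For example, `save_discovery_doc("planning", "acme", "# Domain Discovery\n")` would create `planning/domain-discovery-acme.md`.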
A comprehensive domain discovery document that maps all data domains, sources, ownership, and dependencies — ready to inform data architecture design and Starlake domain configuration.
development
Design SQL transformations for data pipelines with quality checks and dependency management. Use when the user says "design transforms" or "create SQL transformations".
devops
Plan and track sprint progress for data pipeline implementation. Use when the user says "sprint planning" or "plan data sprint".
testing
Analyze data sources in depth: schema, quality, volume, and extraction strategy. Use when the user says "analyze data source" or "profile this data source".
data-ai
Design Starlake-compatible table schemas with types, constraints, privacy, and expectations. Use when the user says "design schema" or "create table definition".