.agents/starflow/skills/starflow-source-analysis/SKILL.md
Analyze data sources in depth: schema, quality, volume, and extraction strategy. Use when the user says "analyze data source" or "profile this data source".
npx skillsauth add starlake-ai/starlake-skills starflow-source-analysis
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Performs a deep analysis of a specific data source, profiling its schema, data quality characteristics, volume patterns, and extraction requirements. Produces a source analysis document that feeds directly into pipeline specification and Starlake load configuration.
Role Guidance: Act as a Data Analyst with expertise in data profiling and source system analysis.
Design Rationale: Each data source has unique characteristics that determine how it should be extracted, validated, and loaded. Understanding these upfront prevents pipeline failures and data quality issues in production.
If a {planning_artifacts}/domain-discovery-*.md file exists, load it for context.

For each attribute/column, document:

| Field | Description |
|-------|-------------|
| Name | Column/field name |
| Type | Data type (maps to Starlake types: string, integer, long, double, decimal, boolean, date, timestamp, bytes) |
| Nullable | Whether NULLs are allowed |
| Primary Key | Whether the column is part of the unique identifier |
| Foreign Key | References to other tables/sources |
| Pattern | Regex pattern for validation (Starlake custom types) |
| Privacy | Privacy classification (PII, sensitive, public) and recommended transform (HIDE, SHA256, MD5, AES) |
| Sample values | Representative examples |
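The per-column fields above can be captured in a small record type while profiling. The following is a minimal Python sketch, not part of the skill itself: the `ColumnProfile` class, its field names, and the `to_markdown_row` helper are hypothetical, and only the type names are taken from the table.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Type names exactly as listed in the Type row of the table above.
STARLAKE_TYPES = {
    "string", "integer", "long", "double", "decimal",
    "boolean", "date", "timestamp", "bytes",
}


@dataclass
class ColumnProfile:
    """Hypothetical record holding one row of the schema table."""
    name: str
    type: str
    nullable: bool = True
    primary_key: bool = False
    foreign_key: Optional[str] = None  # e.g. "customers.id" (illustrative)
    pattern: Optional[str] = None      # regex used for validation
    privacy: Optional[str] = None      # e.g. "PII (SHA256)"
    sample_values: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject types that are not in the list from the table.
        if self.type not in STARLAKE_TYPES:
            raise ValueError(
                f"unknown type {self.type!r} for column {self.name!r}"
            )

    def to_markdown_row(self) -> str:
        # Render the profile as one markdown table row, in table order.
        cells = [
            self.name,
            self.type,
            "yes" if self.nullable else "no",
            "yes" if self.primary_key else "no",
            self.foreign_key or "",
            self.pattern or "",
            self.privacy or "",
            ", ".join(self.sample_values),
        ]
        return "| " + " | ".join(cells) + " |"
```

Validating the type at construction time surfaces unmapped source types early, before they reach the written analysis document.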
Assess data quality dimensions:
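The specific dimensions are not listed here. As an illustration only, assuming completeness (share of non-null values) and uniqueness (share of distinct values among the non-null ones) are among them, a column sample could be scored as sketched below. The function name `quality_metrics` and both metric definitions are assumptions, not something the skill prescribes.

```python
from typing import Any, Dict, Sequence


def quality_metrics(values: Sequence[Any]) -> Dict[str, float]:
    """Score one column sample; None is treated as a missing value."""
    total = len(values)
    if total == 0:
        # An empty sample carries no information; report zeros.
        return {"completeness": 0.0, "uniqueness": 0.0}

    non_null = [v for v in values if v is not None]
    completeness = len(non_null) / total
    # Uniqueness is measured over non-null values only.
    uniqueness = len(set(non_null)) / len(non_null) if non_null else 0.0
    return {"completeness": completeness, "uniqueness": uniqueness}


metrics = quality_metrics(["a", "b", None, "a"])
# completeness is 3/4, uniqueness is 2/3 for this sample
```

A uniqueness of 1.0 on a fully complete column is a useful hint that the column is a primary key candidate.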
Generate the source analysis document and save to {planning_artifacts}/source-analysis-{{source_name}}.md.
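The output path follows the template `{planning_artifacts}/source-analysis-{{source_name}}.md`. A minimal sketch of resolving that template and writing the file is shown below. The helper names are hypothetical, and the example directory `docs/planning` is an assumption standing in for whatever `{planning_artifacts}` resolves to in a given project.

```python
from pathlib import Path


def source_analysis_path(planning_artifacts: str, source_name: str) -> Path:
    """Fill the path template with the artifacts directory and source name."""
    return Path(planning_artifacts) / f"source-analysis-{source_name}.md"


def save_source_analysis(planning_artifacts: str, source_name: str,
                         body: str) -> Path:
    """Write the analysis document, creating the directory if needed."""
    path = source_analysis_path(planning_artifacts, source_name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body, encoding="utf-8")
    return path


# Example: resolve the path for a source named "crm_orders".
target = source_analysis_path("docs/planning", "crm_orders")
```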
A detailed source analysis document with schema definition, quality profile, volume characteristics, and extraction strategy — ready for pipeline specification and Starlake YAML configuration generation.