Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

starlake-ai/starflow-schema-design

Name: starflow-schema-design
Author: starlake-ai

.agents/starflow/skills/starflow-schema-design/SKILL.md

npx skillsauth add starlake-ai/starlake-skills starflow-schema-design

Schema Design

Overview

Guides the design of Starlake-compatible table schemas including attribute definitions, custom types with regex validation, privacy annotations, and data quality expectations. Outputs ready-to-use .sl.yml configuration files for Starlake load operations.

Role Guidance: Act as a Data Architect with deep knowledge of Starlake's schema definition format and data typing system.

Design Rationale: Schemas are the contract between data producers and consumers. Starlake enforces schemas at load time, rejecting records that don't conform. Well-designed schemas prevent bad data from entering the pipeline.
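
To make the load-time contract concrete, here is a minimal Python sketch. It is not Starlake's implementation: the `REQUIRED` set and the `split_records` helper are hypothetical names introduced for illustration, and the only rule checked is that required fields are present and non-empty.

```python
# Hypothetical illustration of schema enforcement at load time.
# REQUIRED and split_records are assumptions made for this sketch only.
REQUIRED = {"customer_id", "email"}

def split_records(records):
    """Return (accepted, rejected) based on presence of required fields."""
    accepted, rejected = [], []
    for rec in records:
        # A field counts as missing when it is absent, None, or an empty string.
        missing = {f for f in REQUIRED if rec.get(f) in (None, "")}
        (rejected if missing else accepted).append(rec)
    return accepted, rejected

rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": ""},   # empty required field
    {"email": "c@example.com"},        # customer_id absent
]
accepted, rejected = split_records(rows)
```

Only the first row conforms, so it alone ends up in `accepted`; the other two are set aside instead of entering the pipeline.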

Steps

Step 1: Input Gathering

Load source analysis from {planning_artifacts}/source-analysis-*.md if available.
Load data architecture from {planning_artifacts}/data-architecture-*.md if available.
Identify the domain and table(s) to define.

Step 2: Custom Type Definition

Define project-specific types in metadata/types/custom.sl.yml:

version: 1
types:
  - name: "email"
    pattern: "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
    primitiveType: "string"
  - name: "phone"
    pattern: "^\\+?[0-9\\s\\-\\.\\(\\)]{7,20}$"
    primitiveType: "string"
  - name: "sku"
    pattern: "^[A-Z]{2,4}-[0-9]{4,8}$"
    primitiveType: "string"
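
Before committing these patterns, it can help to try them on sample values. The sketch below uses Python's `re` module as a stand-in, on the assumption that these particular patterns behave the same there as in the engine Starlake uses. `CUSTOM_TYPES` and `matches_type` are hypothetical names for this example.

```python
import re

# Patterns copied from the YAML above with the YAML double-quote escaping
# removed (for example, "\\." in the YAML file is the regex \.).
CUSTOM_TYPES = {
    "email": r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
    "phone": r"^\+?[0-9\s\-\.\(\)]{7,20}$",
    "sku": r"^[A-Z]{2,4}-[0-9]{4,8}$",
}

def matches_type(type_name, value):
    """Return True when the whole value matches the named custom type."""
    return re.fullmatch(CUSTOM_TYPES[type_name], value) is not None

checks = {
    "email_ok": matches_type("email", "jane.doe@example.com"),
    "email_bad": matches_type("email", "jane@example"),
    "phone_ok": matches_type("phone", "+33 1 23 45 67 89"),
    "phone_bad": matches_type("phone", "12345"),
    "sku_ok": matches_type("sku", "AB-1234"),
    "sku_bad": matches_type("sku", "abc-12"),
}
```

A small table of accepted and rejected samples like this is a cheap way to catch escaping mistakes before the patterns reach a load job.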

Step 3: Table Schema Definition

For each table, create the .sl.yml file with:

metadata: name, pattern (file matching regex), write strategy, sink config
attributes: name, type, required, privacy, comment, array/struct support
merge keys: for UPSERT and SCD2 strategies
expectations: inline data quality checks

Example structure:

version: 1
table:
  name: "customers"
  pattern: "customers.*\\.csv"
  writeStrategy:
    type: "UPSERT_BY_KEY_AND_TIMESTAMP"
    key: ["customer_id"]
    timestamp: "updated_at"
  attributes:
    - name: "customer_id"
      type: "long"
      required: true
    - name: "email"
      type: "email"
      required: true
      privacy: "SHA256"
    - name: "name"
      type: "string"
      required: true
    - name: "created_at"
      type: "timestamp"
      required: true
    - name: "updated_at"
      type: "timestamp"
      required: true
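
The write strategy in this example can be pictured with a short Python sketch. It assumes, based only on the strategy name, that for each key the row with the latest timestamp wins; the source does not spell out the exact semantics, and `upsert_by_key_and_timestamp` is a hypothetical helper, not a Starlake function.

```python
from datetime import datetime

def upsert_by_key_and_timestamp(existing, incoming, key="customer_id", ts="updated_at"):
    """Merge incoming rows into existing, keeping the newest row per key (assumed semantics)."""
    merged = {row[key]: row for row in existing}
    for row in incoming:
        current = merged.get(row[key])
        # Insert unseen keys; replace only when the incoming row is strictly newer.
        if current is None or row[ts] > current[ts]:
            merged[row[key]] = row
    return sorted(merged.values(), key=lambda r: r[key])

existing = [
    {"customer_id": 1, "name": "Ada", "updated_at": datetime(2024, 1, 1)},
    {"customer_id": 2, "name": "Bob", "updated_at": datetime(2024, 1, 5)},
]
incoming = [
    {"customer_id": 1, "name": "Ada L.", "updated_at": datetime(2024, 2, 1)},  # newer: replaces
    {"customer_id": 2, "name": "Robert", "updated_at": datetime(2024, 1, 2)},  # older: ignored
    {"customer_id": 3, "name": "Cy", "updated_at": datetime(2024, 1, 3)},      # new key: inserted
]
result = upsert_by_key_and_timestamp(existing, incoming)
```

This is why both `key` and `timestamp` appear under `writeStrategy`: the key identifies the row, and the timestamp decides which version survives.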

Step 4: Expectations Design

Define data quality expectations as Jinja2 macros:

{# Reusable macro for non-null check #}
{% macro not_null(column) %}
  SELECT COUNT(*) = 0 FROM SL_THIS WHERE {{ column }} IS NULL
{% endmacro %}

{# Domain-specific check #}
{% macro valid_email(column) %}
  SELECT COUNT(*) = 0 FROM SL_THIS
  WHERE {{ column }} NOT LIKE '%@%.%'
{% endmacro %}
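
To see what these expectations evaluate to, the sketch below runs the same SQL against an in-memory SQLite table named `SL_THIS`. SQLite is only a stand-in engine here, and the two Python functions are plain-string substitutes for the rendered macros (real rendering would go through Jinja2). The sample rows are invented for the example.

```python
import sqlite3

# Plain-string stand-ins for the rendered macros above.
def not_null(column):
    return f"SELECT COUNT(*) = 0 FROM SL_THIS WHERE {column} IS NULL"

def valid_email(column):
    return f"SELECT COUNT(*) = 0 FROM SL_THIS WHERE {column} NOT LIKE '%@%.%'"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SL_THIS (customer_id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO SL_THIS VALUES (?, ?)",
    [(1, "a@example.com"), (2, "not-an-email"), (3, "c@example.org")],
)

def passes(sql):
    # SQLite returns 1 for true and 0 for false.
    return bool(conn.execute(sql).fetchone()[0])

email_not_null = passes(not_null("email"))
email_valid = passes(valid_email("email"))
```

With these rows the non-null check passes and the email check fails, because one value does not contain the `@` and `.` shape that the `LIKE` pattern looks for.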

Step 5: Domain Configuration

Create the domain _config.sl.yml:

version: 1
load:
  metadata:
    directory: "{incoming_dir}/{domain_name}"
    multiline: false
    encoding: "UTF-8"
    withHeader: true
    separator: ","
    quote: "\""
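
The effect of the `separator`, `quote`, and `withHeader` settings can be previewed with Python's standard `csv` module. This is only an approximation of how a loader would read the file: the sample data is invented, `multiline` is not modeled, and the `encoding` setting would apply when opening a real file, not the in-memory string used here.

```python
import csv
import io

# Settings mirrored from the YAML above.
SEPARATOR = ","
QUOTE = '"'
WITH_HEADER = True

# Invented sample: the quoted name contains the separator character.
sample = 'customer_id,name,email\n1,"Doe, Jane",jane@example.com\n'

reader = csv.reader(io.StringIO(sample), delimiter=SEPARATOR, quotechar=QUOTE)
rows = list(reader)

# With a header, the first row names the columns and the rest is data.
header, data = (rows[0], rows[1:]) if WITH_HEADER else (None, rows)
```

The quoted field keeps its embedded comma, which is the case the `quote` setting exists to handle.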

Step 6: Output Generation

Generate:

Schema documentation to {planning_artifacts}/schema-design-{domain_name}.md
Ready-to-use .sl.yml files to {implementation_artifacts}/schemas/

Related Starlake Skills

Use the config skill for the complete attribute types catalog (string, int, long, date, timestamp, etc.)
Use the load skill for write strategy reference (APPEND, OVERWRITE, SCD2, UPSERT, etc.)
Use the infer-schema skill to auto-infer schemas from existing data files
Use the expectations skill for Jinja2 macro syntax when defining quality checks

Outcome

Complete Starlake-compatible schema definitions with custom types, privacy annotations, and expectations — ready to be placed in the metadata/load/ directory.

Design Starlake-compatible table schemas with types, constraints, privacy, and expectations. Use when the user says "design schema" or "create table definition".

1 star

data-ai

Updated Apr 16, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add starlake-ai/starlake-skills starflow-schema-design

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Clean (95%)
Semgrep: Static code analysis for vulnerabilities. Clean (95%)
mcp-scan (Snyk): Model Context Protocol security validation. Clean (95%)
Snyk (dep): Open source security scanning. Skipped (50%)
Socket.dev: Supply chain security analysis. Skipped (50%)
VirusTotal: Multi-engine malware detection. Skipped (50%)
CrowdStrike: Advanced threat intelligence. Skipped (50%)
OSV-Scanner: Open Source Vulnerability database check. Skipped (50%)
OWASP Dep-Check: Skipped (50%)

Last scanned: Apr 16, 2026, 3:37 AM | 8.0s | 1 file scanned

Related Skills

starlake-ai/starflow-transform-design (development)
Design SQL transformations for data pipelines with quality checks and dependency management. Use when the user says "design transforms" or "create SQL transformations".
SKILL.md, updated Apr 16, 2026

starlake-ai/starflow-sprint-planning (devops)
Plan and track sprint progress for data pipeline implementation. Use when the user says "sprint planning" or "plan data sprint".
SKILL.md, updated Apr 16, 2026

starlake-ai/starflow-source-analysis (testing)
Analyze data sources in depth: schema, quality, volume, and extraction strategy. Use when the user says "analyze data source" or "profile this data source".
SKILL.md, updated Apr 16, 2026

starlake-ai/starflow-platform-engineer (devops)
Platform Engineer agent: manages infrastructure, orchestration, and deployment for data pipelines. Use when the user says "platform-engineer" or "talk to the platform-engineer".
SKILL.md, updated Apr 16, 2026

Install manually

# Clone the repo
git clone https://github.com/starlake-ai/starlake-skills.git

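# Copy into Claude Code skills folder (global); create the folder first so
# cp places the skill inside it instead of renaming the skill directory
mkdir -p ~/.claude/skills
cp -r starlake-skills/.agents/starflow/skills/starflow-schema-design ~/.claude/skills/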

Repository: starlake-ai/starlake-skills (1 star)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT