
starlake-ai/starflow-transform-design

Name: starflow-transform-design
Author: starlake-ai

.agents/starflow/skills/starflow-transform-design/SKILL.md

npx skillsauth add starlake-ai/starlake-skills starflow-transform-design


Transform Design

Overview

Guides the design of SQL transformations that turn raw/curated data into business-ready datasets. Produces Starlake-compatible .sql files and their companion .sl.yml configuration files, including write strategies, sink configuration, and data quality expectations.

Role Guidance: Act as a Data Engineer with expertise in SQL analytics, data modeling, and Starlake's transform engine.

Design Rationale: Transformations are where business logic lives. They must be pure SQL (or Python for complex cases), testable locally on DuckDB, and deployable to any target engine via Starlake's SQL transpiler. Dependencies are inferred from SQL table references.
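
To make the idea of dependencies inferred from SQL table references concrete, here is a minimal Python sketch. It is a regex heuristic written for illustration, not Starlake's actual parser, and the helper name `infer_dependencies` is hypothetical. It only recognises `domain.table` names that directly follow `FROM` or `JOIN`.

```python
import re

# Hypothetical heuristic: matches domain.table names that follow FROM or JOIN.
# A real SQL parser handles CTEs, subqueries and quoting; this sketch does not.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*\.[A-Za-z_]\w*)", re.IGNORECASE)


def infer_dependencies(sql: str) -> list[str]:
    """Return the sorted, de-duplicated domain.table names referenced in sql."""
    # Drop single-line comments so table names mentioned there are not counted.
    without_comments = re.sub(r"--[^\n]*", "", sql)
    return sorted(set(TABLE_REF.findall(without_comments)))


sql = """
-- Task: orders_daily_agg
SELECT DATE(o.order_date) AS order_date, p.category, COUNT(*) AS order_count
FROM sales.orders o
JOIN ref.products p ON o.product_id = p.product_id
GROUP BY DATE(o.order_date), p.category
"""
deps = infer_dependencies(sql)
print(deps)
```

For the sample query this lists `ref.products` and `sales.orders`, the two source tables of the task.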

Steps

Step 1: Context Loading

Load available artifacts:
- {planning_artifacts}/data-architecture-*.md
- {implementation_artifacts}/pipeline-spec-*.md
- {planning_artifacts}/schema-design-*.md
Identify the transform domain and target tables.
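
The context-loading step can be sketched as a small file lookup. The three glob patterns come from the list above; the function name `find_artifacts` and the two folder arguments, which stand in for the `{planning_artifacts}` and `{implementation_artifacts}` placeholders, are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def find_artifacts(planning_dir: Path, implementation_dir: Path) -> dict[str, list[Path]]:
    """Collect the artifact files for each of the three patterns."""
    patterns = {
        "data-architecture": (planning_dir, "data-architecture-*.md"),
        "pipeline-spec": (implementation_dir, "pipeline-spec-*.md"),
        "schema-design": (planning_dir, "schema-design-*.md"),
    }
    return {kind: sorted(folder.glob(pattern)) for kind, (folder, pattern) in patterns.items()}


# Demo on a throwaway directory tree with two of the three artifacts present.
with tempfile.TemporaryDirectory() as tmp:
    planning = Path(tmp) / "planning"
    implementation = Path(tmp) / "implementation"
    planning.mkdir()
    implementation.mkdir()
    (planning / "data-architecture-sales.md").write_text("# architecture\n")
    (implementation / "pipeline-spec-sales.md").write_text("# spec\n")

    found = find_artifacts(planning, implementation)
    found_names = {kind: [p.name for p in paths] for kind, paths in found.items()}
    missing = [kind for kind, paths in found.items() if not paths]

print(found_names)
print(missing)
```

Reporting the missing artifact kinds up front makes it clear which inputs the design will have to proceed without.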

Step 2: Transformation Inventory

List all transformations needed:

| Task Name | Source Tables | Target Table | Write Strategy | Schedule |
|-----------|---------------|--------------|----------------|----------|
| e.g., orders_daily_agg | sales.orders, ref.products | analytics.orders_daily | OVERWRITE_BY_PARTITION | Daily |

Step 3: SQL Design

For each transformation, write the SQL:

-- Task: orders_daily_agg
-- Dependencies: sales.orders, ref.products (auto-inferred)
SELECT
  DATE(o.order_date) AS order_date,
  p.category,
  COUNT(*) AS order_count,
  SUM(o.amount) AS total_amount,
  AVG(o.amount) AS avg_amount
FROM sales.orders o
JOIN ref.products p ON o.product_id = p.product_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '1 day'
GROUP BY DATE(o.order_date), p.category

Guidelines:

- Use standard SQL (Starlake transpiles across engines)
- Reference source tables with {domain}.{table} notation
- Use SL_THIS to reference the current task's output table in expectations
- For incremental processing, use partition pruning and date filters
- Avoid engine-specific syntax unless necessary

Step 4: Task Configuration

For each task, create the .sl.yml:

version: 1
transform:
  name: "orders_daily_agg"
  writeStrategy:
    type: "OVERWRITE_BY_PARTITION"
    key: ["order_date"]
  sink:
    partition: ["order_date"]
    clustering: ["category"]
    connectionRef: "warehouse"
  expectations:
    - query: "SELECT COUNT(*) > 0 FROM SL_THIS"
      name: "not_empty"
      failSeverity: "ERROR"
    - query: "SELECT COUNT(*) = 0 FROM SL_THIS WHERE total_amount < 0"
      name: "no_negative_amounts"
      failSeverity: "ERROR"
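
The two expectation queries above can be tried out locally before a task is deployed. The sketch below is an assumption-laden illustration, not how Starlake evaluates expectations: it uses Python's built-in SQLite as a stand-in for DuckDB, substitutes `SL_THIS` with a plain table name by text replacement, and the helper `run_expectations` and the sample rows are hypothetical.

```python
import re
import sqlite3

# Names and queries copied from the task configuration above.
EXPECTATIONS = [
    {"name": "not_empty", "query": "SELECT COUNT(*) > 0 FROM SL_THIS"},
    {"name": "no_negative_amounts",
     "query": "SELECT COUNT(*) = 0 FROM SL_THIS WHERE total_amount < 0"},
]


def run_expectations(conn: sqlite3.Connection, table: str, expectations: list[dict]) -> dict[str, bool]:
    """Replace SL_THIS with the output table and evaluate each boolean query."""
    results = {}
    for expectation in expectations:
        sql = re.sub(r"\bSL_THIS\b", table, expectation["query"])
        (value,) = conn.execute(sql).fetchone()
        results[expectation["name"]] = bool(value)
    return results


# Hypothetical sample output with one deliberately negative amount.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders_daily (order_date TEXT, category TEXT, order_count INTEGER, total_amount REAL)"
)
conn.executemany(
    "INSERT INTO orders_daily VALUES (?, ?, ?, ?)",
    [("2026-04-15", "books", 3, 42.5), ("2026-04-15", "games", 1, -5.0)],
)
results = run_expectations(conn, "orders_daily", EXPECTATIONS)
print(results)
```

With the sample rows, `not_empty` passes and `no_negative_amounts` fails, which is the situation a `failSeverity` of `"ERROR"` is meant to catch.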

Step 5: Dependency Graph

Document the transformation DAG:

- List each task and its dependencies (from SQL table references)
- Identify the execution order
- Flag any circular dependencies
- Determine which tasks support --recursive execution
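
The ordering and cycle checks in this step can be sketched with Python's standard `graphlib` module (Python 3.9+). The function names are hypothetical, and `analytics.orders_weekly` is an invented downstream task added only to give the graph a second level.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(task_deps: dict[str, set[str]]) -> list[str]:
    """Return the nodes ordered so that every dependency precedes its dependents."""
    return list(TopologicalSorter(task_deps).static_order())


def find_cycle(task_deps: dict[str, set[str]]):
    """Return one circular dependency as a list of nodes, or None if the graph is acyclic."""
    try:
        TopologicalSorter(task_deps).prepare()
    except CycleError as err:
        # The second argument holds the cycle; its first and last nodes are the same.
        return err.args[1]
    return None


# Each key maps a target table to the tables its SQL reads from.
task_deps = {
    "analytics.orders_daily": {"sales.orders", "ref.products"},
    "analytics.orders_weekly": {"analytics.orders_daily"},  # hypothetical task
}
order = execution_order(task_deps)
print(order)

cycle = find_cycle({"a": {"b"}, "b": {"a"}})
print(cycle)
```

The relative order of independent source tables is not fixed, so a design document should record only the constraints that matter, for example that `analytics.orders_daily` runs before anything that reads it.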

Step 6: Output Generation

Generate:

- Transform design document to {implementation_artifacts}/transform-design-{{domain}}.md
- SQL files to {implementation_artifacts}/transforms/{domain}/
- Task YAML files alongside the SQL files
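
The output layout above can be previewed before any file is written. This is a sketch under stated assumptions: the helper `planned_outputs` is hypothetical, and naming each file pair after its task (`<task>.sql` and `<task>.sl.yml`) is an assumption, since the steps only say that the YAML files sit alongside the SQL files.

```python
from pathlib import PurePosixPath


def planned_outputs(implementation_artifacts: str, domain: str, tasks: list[str]) -> list[str]:
    """List the design document plus one SQL file and one YAML file per task."""
    root = PurePosixPath(implementation_artifacts)
    paths = [root / f"transform-design-{domain}.md"]
    transforms = root / "transforms" / domain
    for task in tasks:
        # Assumed naming: both files share the task name.
        paths.append(transforms / f"{task}.sql")
        paths.append(transforms / f"{task}.sl.yml")
    return [str(path) for path in paths]


outputs = planned_outputs("artifacts/impl", "analytics", ["orders_daily_agg"])
for path in outputs:
    print(path)
```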

Related Starlake Skills

- Use the transform skill for SQL/Python transformation execution options
- Use the expectations skill for transform-level data quality checks
- Use the config skill for available SQL functions and type reference

Outcome

Complete SQL transformations with Starlake task configurations, quality expectations, and a documented dependency graph — ready for implementation and testing.


Design SQL transformations for data pipelines with quality checks and dependency management. Use when the user says "design transforms" or "create SQL transformations".

1 star

development

Updated Apr 16, 2026


Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---------|-------------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: Apr 16, 2026, 3:37 AM · 5.7s · 1 file scanned


Related Skills

- starlake-ai/starflow-sprint-planning (devops): Plan and track sprint progress for data pipeline implementation. Use when the user says "sprint planning" or "plan data sprint". Updated Apr 16, 2026.
- starlake-ai/starflow-source-analysis (testing): Analyze data sources in depth: schema, quality, volume, and extraction strategy. Use when the user says "analyze data source" or "profile this data source". Updated Apr 16, 2026.
- starlake-ai/starflow-schema-design (data-ai): Design Starlake-compatible table schemas with types, constraints, privacy, and expectations. Use when the user says "design schema" or "create table definition". Updated Apr 16, 2026.
- starlake-ai/starflow-platform-engineer (devops): Platform Engineer agent that manages infrastructure, orchestration, and deployment for data pipelines. Use when the user says "platform-engineer" or "talk to the platform-engineer". Updated Apr 16, 2026.


Install manually


# Clone the repo
git clone https://github.com/starlake-ai/starlake-skills.git

# Copy into Claude Code skills folder (global)
cp -r starlake-skills/.agents/starflow/skills/starflow-transform-design ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

starlake-ai/starlake-skills

1 star

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT