Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

starlake-ai/transform

Name: transform
Author: starlake-ai

.agents/skills/transform/SKILL.md

npx skillsauth add starlake-ai/starlake-skills transform

Transform Skill

Runs a defined transformation task. Tasks are SQL or Python scripts that read from source tables and write results to a target table. Tasks can have dependencies, support multiple write strategies, and can be executed recursively with all their upstream dependencies.

Usage

starlake transform [options]

Options

--name <value>: Task name in the form domain.task (required unless --tags is used)
--compile: Return the final compiled SQL query without executing it
--sync-apply: Update YAML attributes to match the SQL query columns
--sync-preview: Preview YAML attribute changes that would match the SQL query
--query <value>: Run this SQL query instead of the one defined in the task file
--dry-run: Dry run only — compile and validate without executing (BigQuery support)
--tags <value>: Run all tasks matching these tags
--format: Pretty-print the final SQL query and update the .sql file
--interactive <value>: Run query and display results without sinking. Format: csv, json, table, json-array
--reload: Reload YAML files from disk before execution (used in server mode)
--truncate: Force target table truncation before insert
--pageSize <value>: Number of records per page (for interactive mode)
--pageNumber <value>: Page number to display (for interactive mode)
--recursive: Execute all upstream dependencies recursively before this task
--test: Run in test mode without committing changes
--accessToken <value>: Access token for authentication (e.g. GCP)
--options k1=v1,k2=v2: Substitution arguments for the SQL template
--scheduledDate <value>: Scheduled date for the job, format: yyyy-MM-dd'T'HH:mm:ss.SSSZ
--reportFormat <value>: Report output format: console, json, or html
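
When the command is launched from a script, the options above can be assembled into an argument list. The following is a minimal sketch and not part of Starlake: `build_transform_command` and `format_scheduled_date` are hypothetical helper names, and the rendering of `--scheduledDate` assumes the documented pattern follows Java-style date formatting (three-digit milliseconds followed by a numeric zone offset).

```python
from datetime import datetime, timezone


def format_scheduled_date(moment: datetime) -> str:
    """Render a timezone-aware datetime as yyyy-MM-dd'T'HH:mm:ss.SSSZ (assumed Java-style pattern)."""
    millis = moment.microsecond // 1000
    return f"{moment:%Y-%m-%dT%H:%M:%S}.{millis:03d}{moment:%z}"


def build_transform_command(name=None, tags=None, flags=(), options=None, scheduled=None):
    """Build the argv list for `starlake transform` (hypothetical helper)."""
    if not name and not tags:
        # --name is required unless --tags is used
        raise ValueError("either name or tags is required")
    argv = ["starlake", "transform"]
    if name:
        argv += ["--name", name]
    if tags:
        argv += ["--tags", tags]
    # Boolean switches from the list above, e.g. "compile", "recursive", "truncate"
    argv += [f"--{flag}" for flag in flags]
    if options:
        # --options expects k1=v1,k2=v2
        argv += ["--options", ",".join(f"{key}={value}" for key, value in options.items())]
    if scheduled:
        argv += ["--scheduledDate", format_scheduled_date(scheduled)]
    return argv


command = build_transform_command(
    name="kpi.revenue_summary",
    flags=["recursive"],
    options={"start_date": "2024-01-01"},
    scheduled=datetime(2024, 1, 1, 6, 30, tzinfo=timezone.utc),
)
print(" ".join(command))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting issues with option values.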

Configuration Context

Transform tasks are defined in the metadata/transform/ directory.

Transform Domain Config (`metadata/transform/{domain}/_config.sl.yml`)

Sets default properties for all tasks in the domain:

# metadata/transform/kpi/_config.sl.yml
version: 1
transform:
  default:
    writeStrategy:
      type: OVERWRITE
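
One way to picture how domain defaults relate to individual tasks is a recursive overlay. The sketch below assumes that a value set in a task's own configuration takes precedence over the domain default, which this page does not state explicitly. `merge_task_config` is a hypothetical helper, not a Starlake function.

```python
from copy import deepcopy


def merge_task_config(domain_default: dict, task_config: dict) -> dict:
    """Recursively overlay task settings on domain defaults (assumed precedence: task wins)."""
    merged = deepcopy(domain_default)
    for key, value in task_config.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_task_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Mirrors the transform.default block of _config.sl.yml
domain_default = {"writeStrategy": {"type": "OVERWRITE"}}

# A task that sets nothing inherits the domain default
print(merge_task_config(domain_default, {}))

# A task that declares its own strategy keeps it
print(merge_task_config(domain_default, {"writeStrategy": {"type": "OVERWRITE_BY_PARTITION"}}))
```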

SQL Transform File (`metadata/transform/{domain}/{task}.sql`)

Contains the SQL query. Use {{domain}}.table syntax to reference source tables (the example below writes the domain name, starbake, literally):

-- metadata/transform/kpi/revenue_summary.sql
SELECT
    o.order_id,
    o.timestamp AS order_date,
    SUM(ol.quantity * ol.sale_price) AS total_revenue
FROM
    starbake.orders o
    JOIN starbake.order_lines ol ON o.order_id = ol.order_id
GROUP BY
    o.order_id, o.timestamp

Task with Dependencies

Tasks can reference outputs of other tasks to form a DAG:

-- metadata/transform/kpi/order_summary.sql
SELECT
    ps.order_id,
    ps.order_date,
    rs.total_revenue,
    ps.profit,
    ps.total_units_sold
FROM
    kpi.product_summary ps
    JOIN kpi.revenue_summary rs ON ps.order_id = rs.order_id
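
The dependency graph implied by these references can be sketched in a few lines. This is an illustration only: it finds `domain.table` names after FROM and JOIN with a regular expression, which is far cruder than a real SQL parser and is not how Starlake resolves dependencies. The SQL for `kpi.product_summary` is not shown on this page, so the placeholder query for it is an assumption, and `upstream_tasks` and `execution_order` are hypothetical names.

```python
import re
from graphlib import TopologicalSorter

# Task name -> SQL text, shortened from the examples above.
# The kpi.product_summary query is a made-up placeholder.
TASKS = {
    "kpi.revenue_summary": "SELECT ... FROM starbake.orders o JOIN starbake.order_lines ol ON o.order_id = ol.order_id",
    "kpi.product_summary": "SELECT ... FROM starbake.order_lines ol",
    "kpi.order_summary": "SELECT ... FROM kpi.product_summary ps JOIN kpi.revenue_summary rs ON ps.order_id = rs.order_id",
}

TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*\.[A-Za-z_]\w*)", re.IGNORECASE)


def upstream_tasks(sql: str, known_tasks) -> set:
    """Return referenced tables that are themselves outputs of known tasks."""
    return {ref for ref in TABLE_REF.findall(sql) if ref in known_tasks}


def execution_order(tasks: dict, target: str) -> list:
    """List the target and all of its upstream tasks, dependencies first."""
    graph = {}
    pending = [target]
    while pending:
        task = pending.pop()
        if task in graph:
            continue
        graph[task] = upstream_tasks(tasks[task], tasks)
        pending.extend(graph[task])
    # TopologicalSorter takes node -> predecessors and yields predecessors first
    return list(TopologicalSorter(graph).static_order())


print(execution_order(TASKS, "kpi.order_summary"))
```

The target task always comes last in the returned list; the relative order of tasks that do not depend on each other is not significant.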

Task YAML Configuration (`metadata/transform/{domain}/{task}.sl.yml`)

Optional YAML file to configure write strategy, sink, expectations, and more:

# metadata/transform/analytics/daily_sales.sl.yml
version: 1
task:
  domain: "analytics"
  table: "daily_sales"
  writeStrategy:
    type: "OVERWRITE_BY_PARTITION"
  sink:
    partition:
      - "report_date"
    clustering:
      - "region"
  connectionRef: "bigquery"
  expectations:
    - expect: "is_row_count_to_be_between(1, 1000000) => result(0) == 1"
      failOnError: true
  dagRef: "daily_analytics_dag"
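
To make the write strategy in this example concrete, the sketch below illustrates the usual meaning of a partition-level overwrite. It assumes that OVERWRITE_BY_PARTITION replaces only the partitions present in the new result and leaves other partitions untouched, which this page does not spell out. Tables are modelled as lists of dicts, the `sales` column is invented for the illustration, and `overwrite_by_partition` is a hypothetical name; none of this reflects how Starlake or BigQuery implement the strategy.

```python
def overwrite_by_partition(existing_rows, new_rows, partition_column):
    """Replace rows in every partition that appears in new_rows; keep the rest."""
    replaced = {row[partition_column] for row in new_rows}
    kept = [row for row in existing_rows if row[partition_column] not in replaced]
    return kept + list(new_rows)


existing = [
    {"report_date": "2024-01-01", "region": "EU", "sales": 100},
    {"report_date": "2024-01-02", "region": "EU", "sales": 120},
]
new = [{"report_date": "2024-01-02", "region": "EU", "sales": 150}]

# The 2024-01-01 partition survives; the 2024-01-02 partition is replaced
for row in overwrite_by_partition(existing, new, "report_date"):
    print(row)
```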

Python Transform (`metadata/transform/{domain}/{task}.py`)

Python transforms must create a temporary view named SL_THIS:

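from pyspark.sql import SparkSession
from pyspark.sql import functions as F  # namespaced to avoid shadowing Python's built-in sum

# Reuse the active Spark session, or create one if none exists
spark = SparkSession.builder.getOrCreate()

# Read the source table
df = spark.sql("SELECT * FROM sales.orders")

# Aggregate the order count and total spend per customer
result = df.groupBy("customer_id").agg(
    F.count("*").alias("order_count"),
    F.sum("total_amount").alias("total_spent"),
)

# Expose the result under the required view name
result.createOrReplaceTempView("SL_THIS")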

Examples

Run a Single Task

starlake transform --name kpi.revenue_summary

Compile SQL Only (Debug)

View the final compiled SQL without executing:

starlake transform --name kpi.order_summary --compile

Run with Recursive Dependencies

Execute the task and all its upstream dependencies:

starlake transform --name kpi.order_summary --recursive

Interactive Query (Preview Results)

Run and display results as a table without writing to the target:

starlake transform --name kpi.revenue_summary --interactive table

Interactive with Pagination

starlake transform --name kpi.revenue_summary --interactive json --pageSize 50 --pageNumber 1
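
The relationship between `--pageSize` and `--pageNumber` can be sketched as a slice over the result rows. The example assumes page numbers start at 1, as the command above suggests but this page does not confirm. `page_of` is a hypothetical helper.

```python
def page_of(rows, page_size, page_number):
    """Return one page of rows; page_number is assumed to start at 1."""
    if page_size < 1 or page_number < 1:
        raise ValueError("page_size and page_number must be positive")
    start = (page_number - 1) * page_size
    return rows[start:start + page_size]


rows = list(range(1, 121))  # stand-in for 120 result rows

first = page_of(rows, 50, 1)
third = page_of(rows, 50, 3)  # the last page is shorter

print(len(first), first[0], first[-1])  # 50 1 50
print(len(third), third[0], third[-1])  # 20 101 120
```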

Dry Run (BigQuery)

starlake transform --name kpi.order_summary --dry-run

Run All Tasks with a Tag

starlake transform --tags daily

Run with Custom Options

Pass substitution variables to the SQL template:

starlake transform --name kpi.revenue_summary --options start_date=2024-01-01,end_date=2024-03-31
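
As a rough picture of what substitution means here, the sketch below parses the `k1=v1,k2=v2` string and fills placeholders in a template. It assumes placeholders are written `{{ name }}`; the actual templating engine and its syntax rules may differ. The template query is hypothetical (the revenue_summary query shown earlier contains no placeholders), and `parse_options` and `render` are invented names.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def parse_options(raw: str) -> dict:
    """Split 'k1=v1,k2=v2' into a dict; each item must contain '='."""
    pairs = (item.split("=", 1) for item in raw.split(",") if item)
    return {key.strip(): value.strip() for key, value in pairs}


def render(template: str, variables: dict) -> str:
    """Replace {{ name }} placeholders; an unknown name raises KeyError."""
    return PLACEHOLDER.sub(lambda match: variables[match.group(1)], template)


# Hypothetical template, for illustration only
template = (
    "SELECT * FROM starbake.orders "
    "WHERE order_date BETWEEN '{{ start_date }}' AND '{{ end_date }}'"
)
variables = parse_options("start_date=2024-01-01,end_date=2024-03-31")
print(render(template, variables))
```

Raising on an unknown placeholder is a deliberate choice in this sketch: a silently empty date bound would change the query's meaning.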

Sync YAML Attributes from SQL

Automatically update the task YAML attributes to match the SQL query output columns:

starlake transform --name kpi.revenue_summary --sync-apply

Pretty-Print and Format SQL

starlake transform --name kpi.revenue_summary --format

Test Transform

starlake transform --name kpi.revenue_summary --test

Related Skills

load - Load raw data before transforming
lineage - Visualize task dependency graphs
col-lineage - Column-level lineage for a task
dag-generate - Generate orchestration DAGs
test - Run integration tests for transforms
config - Configuration reference (write strategies, connections)

starlake-ai/transform

.agents/skills/transform/SKILL.md

Run SQL or Python transformation tasks

1 star

development

Updated Apr 16, 2026

$ install --global

skillsauth

npx skillsauth add starlake-ai/starlake-skills transform

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy - Container and dependency vulnerability scanner: Clean (95%)
Semgrep - Static code analysis for vulnerabilities: Clean (95%)
mcp-scan (Snyk) - Model Context Protocol security validation: Clean (95%)
Snyk (dep) - Open source security scanning: Skipped (50%)
Socket.dev - Supply chain security analysis: Skipped (50%)
VirusTotal - Multi-engine malware detection: Skipped (50%)
CrowdStrike - Advanced threat intelligence: Skipped (50%)
OSV-Scanner - Open Source Vulnerability database check: Skipped (50%)
OWASP Dep-Check: Skipped (50%)

Last scanned: Apr 16, 2026, 3:37 AM · 10.6s · 1 file scanned

SKILL.md

name: transform
description: Run SQL or Python transformation tasks

Related Skills

starlake-ai/starflow-transform-design

development

Verified, Trusted, Community

Design SQL transformations for data pipelines with quality checks and dependency management. Use when the user says "design transforms" or "create SQL transformations".

1 SKILL.md, updated Apr 16, 2026

starlake-ai/starflow-sprint-planning

devops

Verified, Trusted, Community

Plan and track sprint progress for data pipeline implementation. Use when the user says "sprint planning" or "plan data sprint".

1 SKILL.md, updated Apr 16, 2026

starlake-ai/starflow-source-analysis

testing

Verified, Trusted, Community

Analyze data sources in depth: schema, quality, volume, and extraction strategy. Use when the user says "analyze data source" or "profile this data source".

1 SKILL.md, updated Apr 16, 2026

starlake-ai/starflow-schema-design

data-ai

Verified, Trusted, Community

Design Starlake-compatible table schemas with types, constraints, privacy, and expectations. Use when the user says "design schema" or "create table definition".

1 SKILL.md, updated Apr 16, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

# Clone the repo
git clone https://github.com/starlake-ai/starlake-skills.git

# Copy into Claude Code skills folder (global)
cp -r starlake-skills/.agents/skills/transform ~/.claude/skills/

Repository

starlake-ai/starlake-skills

1 star

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
