dtsong/schema-evaluation

Name: schema-evaluation
Author: dtsong

skills/council/alchemist/schema-evaluation/SKILL.md

npx skillsauth add dtsong/my-claude-setup schema-evaluation

Schema Evaluation

Purpose

Evaluate and design data warehouse schemas for analytical workloads. Covers star schemas, snowflake schemas, data vault, and One Big Table (OBT) patterns. Assesses grain definition, normalization trade-offs, slowly changing dimension strategies, and data contracts between producers and consumers.

Scope Constraints

Reads schema definitions, DDL, ERDs, data dictionaries, and query patterns for analysis. Does not execute queries, modify databases, or manage pipeline orchestration.

Inputs

Business domain and key entities (e.g., e-commerce: orders, products, customers)
Analytical queries the schema must support (e.g., "revenue by product category by month")
Data volume estimates (row counts, growth rate)
Source systems and their update patterns (CDC, full refresh, event stream)
Existing schema (if evaluating rather than designing from scratch)

Input Sanitization

No user-provided values are used in commands or file paths. All inputs are treated as read-only analysis targets.

Procedure

Progress Checklist

[ ] Step 1: Define the grain
[ ] Step 2: Identify facts and dimensions
[ ] Step 3: Choose a modeling approach
[ ] Step 4: Design SCD strategy
[ ] Step 5: Define data contracts
[ ] Step 6: Validate against query patterns
[ ] Step 7: Document the schema

Step 1: Define the Grain

Identify the grain of each fact table — what does one row represent? A single transaction? A daily snapshot? A session event? The grain determines everything downstream. Document the grain as a clear English sentence: "One row = one order line item" or "One row = one daily active user per product."
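
A stated grain can be checked mechanically: if one row is meant to be one order line item, the columns that identify a line item should never repeat. The following is a minimal sketch of that check, not part of the skill itself; the function name `find_grain_violations` and the sample columns (`order_id`, `line_number`, `revenue`) are hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping, Sequence


def find_grain_violations(
    rows: Iterable[Mapping[str, object]],
    grain_columns: Sequence[str],
) -> dict[tuple, int]:
    """Return grain keys that occur more than once, with their counts."""
    counts = Counter(tuple(row[col] for col in grain_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample: the grain is "one row = one order line item".
order_lines = [
    {"order_id": 1001, "line_number": 1, "revenue": 20.0},
    {"order_id": 1001, "line_number": 2, "revenue": 5.5},
    {"order_id": 1002, "line_number": 1, "revenue": 12.0},
    {"order_id": 1002, "line_number": 1, "revenue": 12.0},  # duplicate row
]

violations = find_grain_violations(order_lines, ["order_id", "line_number"])
print(violations)  # {(1002, 1): 2}
```

An empty result means the sample is consistent with the documented grain; it does not prove the grain holds for the full table.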

Step 2: Identify Facts and Dimensions

Separate measurable facts (revenue, quantity, duration, count) from descriptive dimensions (customer, product, date, geography). For each:

Facts: Data type, aggregation method (SUM, AVG, COUNT DISTINCT), nullability
Dimensions: Cardinality, hierarchy levels, whether it changes over time (SCD candidate)
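
Cardinality and nullability can be estimated from a sample of source rows before they are written into the specification. The sketch below assumes rows are available as plain dictionaries; `profile_column` and the sample data are hypothetical and only illustrate the idea.

```python
from typing import Mapping, Sequence


def profile_column(rows: Sequence[Mapping[str, object]], column: str) -> dict:
    """Summarize one column: distinct non-null values and share of nulls."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    return {
        "column": column,
        "cardinality": len(set(non_null)),
        "null_rate": (len(values) - len(non_null)) / len(values) if values else 0.0,
    }


# Hypothetical sample rows from a source system.
sample = [
    {"customer_id": "C1", "country": "DE", "revenue": 20.0},
    {"customer_id": "C2", "country": "DE", "revenue": None},
    {"customer_id": "C3", "country": "FR", "revenue": 12.5},
    {"customer_id": "C1", "country": "DE", "revenue": 7.5},
]

for col in ("customer_id", "country", "revenue"):
    print(profile_column(sample, col))
```

Low-cardinality columns such as `country` are typical dimension attributes, while a numeric column with a non-zero null rate such as `revenue` needs an explicit nullability decision as a fact.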

Step 3: Choose a Modeling Approach

Evaluate which pattern fits the requirements:

Star schema — Simple, fast queries, denormalized dimensions. Best for straightforward BI with a known query pattern.
Snowflake schema — Normalized dimensions for storage efficiency and consistency. Best when dimensions are large or shared across many facts.
Data vault — Hub, link, satellite pattern for auditability and flexibility. Best when source systems change frequently or full history is required.
One Big Table (OBT) — Fully denormalized single table. Best for small teams, simple analytics, or when query simplicity outweighs storage concerns.

Document the chosen approach and the reasoning behind it.
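
To make the star schema option concrete, here is a small sketch that builds one fact table and two dimensions in an in-memory SQLite database and answers the example query "revenue by product category by month". All table names, column names, and rows are hypothetical, and SQLite stands in for whatever warehouse engine is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,   -- surrogate key
        product_name TEXT NOT NULL,
        category TEXT NOT NULL
    );
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,      -- e.g. 20260115
        year_month TEXT NOT NULL           -- e.g. '2026-01'
    );
    -- Grain: one row = one order line item.
    CREATE TABLE fct_order_line (
        order_id INTEGER NOT NULL,
        line_number INTEGER NOT NULL,
        product_key INTEGER NOT NULL REFERENCES dim_product (product_key),
        date_key INTEGER NOT NULL REFERENCES dim_date (date_key),
        revenue REAL NOT NULL,
        PRIMARY KEY (order_id, line_number)
    );
    """
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Espresso beans", "Coffee"), (2, "Green tea", "Tea")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?)",
    [(20260115, "2026-01"), (20260203, "2026-02")],
)
conn.executemany(
    "INSERT INTO fct_order_line VALUES (?, ?, ?, ?, ?)",
    [
        (1001, 1, 1, 20260115, 20.0),
        (1001, 2, 2, 20260115, 5.5),
        (1002, 1, 1, 20260203, 12.0),
    ],
)

# Two joins from the fact table to its dimensions answer the query.
revenue_by_category_month = conn.execute(
    """
    SELECT p.category, d.year_month, SUM(f.revenue) AS revenue
    FROM fct_order_line AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY p.category, d.year_month
    ORDER BY p.category, d.year_month
    """
).fetchall()
print(revenue_by_category_month)
```

In a snowflake variant of the same sketch, `category` would move into its own table referenced from `dim_product`, adding one join to the query.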

Step 4: Design Slowly Changing Dimension Strategy

For each dimension that changes over time, specify the SCD type:

Type 1 — Overwrite. No history. Simple, but you lose the old value.
Type 2 — Add new row with effective dates. Full history, but increases row count and query complexity.
Type 3 — Add column for previous value. Limited history (only one prior value), but simple to query.
Type 6 — Hybrid (1+2+3). Current value column plus history rows. Best of both but most complex.

Document which SCD type applies to each changing attribute and why.
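
As an illustration of Type 2, the sketch below closes the current version of a dimension row and appends a new one when a tracked attribute changes. It is a simplified in-memory model, not a warehouse implementation: the `CustomerVersion` class, the `apply_type2_change` function, and the convention that `valid_to` is exclusive and `None` marks the current row are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CustomerVersion:
    customer_id: str            # natural key from the source system
    city: str                   # the tracked attribute
    valid_from: date
    valid_to: Optional[date]    # None marks the current version


def apply_type2_change(
    history: list[CustomerVersion],
    customer_id: str,
    new_city: str,
    changed_on: date,
) -> list[CustomerVersion]:
    """Close the current version and append a new one if the city changed."""
    result = []
    current = None
    for version in history:
        if version.customer_id == customer_id and version.valid_to is None:
            current = version
        else:
            result.append(version)
    if current is not None and current.city == new_city:
        return list(history)  # nothing changed, keep history as is
    if current is not None:
        result.append(replace(current, valid_to=changed_on))
    result.append(CustomerVersion(customer_id, new_city, changed_on, None))
    return result


history = [CustomerVersion("C1", "Berlin", date(2024, 1, 1), None)]
history = apply_type2_change(history, "C1", "Munich", date(2025, 6, 1))
for version in history:
    print(version)
```

A Type 1 change would instead overwrite `city` on the single existing row, which is why it keeps no history.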

Step 5: Define Data Contracts

For each source-to-warehouse interface, specify the contract:

Schema expectations (required fields, data types, allowed values)
Freshness SLA (how soon after source update must the warehouse reflect it?)
Quality thresholds (max null rate, uniqueness constraints, referential integrity)
Breaking change policy (how are schema changes communicated and handled?)
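
A contract becomes easier to enforce once it is written down as data. The sketch below models three of the items above (required fields, a null-rate threshold, and a freshness SLA) and checks one batch against them. The `DataContract` class, the `check_batch` function, and the threshold values are hypothetical; a real contract would also cover data types, allowed values, and the breaking change policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Mapping, Sequence


@dataclass(frozen=True)
class DataContract:
    required_fields: tuple[str, ...]
    max_null_rate: float          # quality threshold per required field
    freshness_sla: timedelta      # maximum allowed lag behind the source


def check_batch(
    contract: DataContract,
    rows: Sequence[Mapping[str, object]],
    source_updated_at: datetime,
    loaded_at: datetime,
) -> list[str]:
    """Return human-readable contract violations (empty list = pass)."""
    problems = []
    if loaded_at - source_updated_at > contract.freshness_sla:
        problems.append("freshness SLA exceeded")
    for name in contract.required_fields:
        nulls = sum(1 for row in rows if row.get(name) is None)
        rate = nulls / len(rows) if rows else 0.0
        if rate > contract.max_null_rate:
            problems.append(f"{name}: null rate {rate:.0%} above threshold")
    return problems


# Hypothetical contract and batch.
contract = DataContract(
    required_fields=("order_id", "customer_id"),
    max_null_rate=0.10,
    freshness_sla=timedelta(hours=1),
)
batch = [
    {"order_id": 1, "customer_id": "C1"},
    {"order_id": 2, "customer_id": None},
]
problems = check_batch(
    contract,
    batch,
    source_updated_at=datetime(2026, 1, 1, 8, 0),
    loaded_at=datetime(2026, 1, 1, 10, 0),
)
print(problems)
```

Here the batch arrives two hours after the source update and half of its `customer_id` values are null, so both checks report a violation.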

Step 6: Validate Against Query Patterns

Test the proposed schema against the required analytical queries:

Can each query be answered with three joins or fewer?
Are the most common filters (date, category, status) on dimension keys?
Is the grain appropriate — not too fine (wasteful) or too coarse (lossy)?
Are aggregate tables or materialized views needed for high-frequency dashboards?
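
The join question in the checklist above can be applied to a list of query patterns with a few lines of code. The sketch below is a rough illustration that assumes each query joins every table it touches exactly once; `QueryPattern`, `flag_join_heavy`, the default budget of three joins, and the table names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryPattern:
    name: str
    tables: tuple[str, ...]   # fact and dimension tables the query touches

    @property
    def join_count(self) -> int:
        # Connecting n tables, each joined once, takes n - 1 joins.
        return max(len(self.tables) - 1, 0)


def flag_join_heavy(patterns: list[QueryPattern], max_joins: int = 3) -> list[str]:
    """Names of query patterns that need more joins than the budget allows."""
    return [p.name for p in patterns if p.join_count > max_joins]


patterns = [
    QueryPattern(
        "revenue by product category by month",
        ("fct_order_line", "dim_product", "dim_date"),
    ),
    QueryPattern(
        "revenue by region, segment, channel and promotion",
        ("fct_order_line", "dim_geography", "dim_customer",
         "dim_channel", "dim_promotion"),
    ),
]
print(flag_join_heavy(patterns))
```

Flagged patterns are candidates for an aggregate table, a materialized view, or a more denormalized design.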

Step 7: Document the Schema

Produce a complete schema specification with DDL, relationships, and usage notes.

Compaction resilience: If context was lost during a long session, re-read the Inputs section to reconstruct what system is being analyzed, check the Progress Checklist for completed steps, then resume from the earliest incomplete step.

Handoff

Hand off to pipeline-design if the evaluation reveals ETL/ELT orchestration or data flow architecture needs.
Hand off to guardian/compliance-review if schema design surfaces data governance, PII handling, or regulatory compliance concerns.

Output Format

# Schema Evaluation: [Domain/Project Name]

## Grain Definitions

| Fact Table | Grain (one row = ...) | Estimated Rows | Growth Rate |
|------------|----------------------|----------------|-------------|
| ...        | ...                  | ...            | ...         |

## Entity Relationship Summary

[ASCII diagram showing fact and dimension relationships]


## Modeling Approach

**Chosen:** [Star / Snowflake / Data Vault / OBT]
**Rationale:** [1-2 sentences]

## Fact Tables

### fct_[name]
| Column | Type | Description | Aggregation |
|--------|------|-------------|-------------|
| ...    | ...  | ...         | SUM/AVG/... |

## Dimension Tables

### dim_[name]
| Column | Type | Description | SCD Type |
|--------|------|-------------|----------|
| ...    | ...  | ...         | 1/2/3/6  |

## Slowly Changing Dimensions

| Dimension | Attribute | SCD Type | Rationale |
|-----------|-----------|----------|-----------|
| ...       | ...       | ...      | ...       |

## Data Contracts

| Source → Target | Freshness SLA | Quality Checks | Breaking Change Policy |
|-----------------|---------------|----------------|----------------------|
| ...             | ...           | ...            | ...                  |

## Query Validation

| Query Pattern | Tables Involved | Join Count | Performance Notes |
|---------------|----------------|------------|-------------------|
| ...           | ...            | ...        | ...               |

Quality Checks

[ ] Every fact table has a clearly stated grain in plain English
[ ] Facts and dimensions are properly separated — no business logic in fact tables
[ ] SCD strategy is specified for every dimension attribute that changes
[ ] Data contracts define freshness SLA, schema expectations, and quality thresholds
[ ] The schema supports all required analytical queries with minimal joins
[ ] Cardinality estimates are documented for key dimensions
[ ] Surrogate keys vs natural keys decision is documented and consistent
[ ] Indexing strategy is defined for common filter and join columns

Evolution Notes

4 stars

devops

Updated Apr 26, 2026

npx skillsauth add dtsong/my-claude-setup schema-evaluation

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Score |
|---------|-------------|--------|-------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 26, 2026, 4:02 AM (58.7s, 1 file scanned)

SKILL.md

name: schema-evaluation
department: alchemist
description: Use when evaluating or designing data warehouse schemas for analytical workloads. Covers star schemas, snowflake schemas, data vault, OBT patterns, grain definition, SCD strategies, normalization trade-offs, and data contracts between producers and consumers. Do not use for pipeline orchestration or ETL flow design (use pipeline-design).
version: 1

Install manually

# Clone the repo
git clone https://github.com/dtsong/my-claude-setup.git

# Copy into Claude Code skills folder (global)
cp -r my-claude-setup/skills/council/alchemist/schema-evaluation ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

dtsong/my-claude-setup

4 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT