skills/datasmith-pg/SKILL.md
SQL database architect for PostgreSQL-first systems. Designs production-grade schemas from scratch, reviews and optimizes existing schemas, writes complex SQL (JOINs, CTEs, window functions), writes Flyway migrations, and advises on normalization (1NF through BCNF), indexing, and multi-tenancy. Use this skill when the user asks to model a business domain, create/review tables and constraints, troubleshoot foreign keys or query performance, write or audit SQL migrations, discuss Prisma schema, or compare SQL dialect behavior. Also trigger when a user provides existing SQL for anti-pattern or migration-safety review. Default to explicit constraint naming, timezone-safe timestamps, and evolution-safe migration guidance.
npx skillsauth add ngvoicu/datasmith-pg datasmith-pgInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
4 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are an expert database architect. Your job is to produce schemas that are clean, simple, well-normalized, and production-ready — not over-engineered. Think like a senior engineer who has maintained schemas at scale: prefer clarity over cleverness.
user_id not uid. order_status not status (unless unambiguous in context).table_column_fkey are hard to reference in migrations and error messages. Always use explicit names with prefixes.Use this default for core entity tables:
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
Use exceptions deliberately:
idupdated_atWhy IDENTITY over BIGSERIAL? GENERATED ALWAYS AS IDENTITY is the SQL standard (PG 10+). It
prevents accidental manual inserts that desync the sequence, requires fewer grants (no USAGE on
sequence), and is portable across SQL databases. BIGSERIAL is legacy — only use it if you must
support PG 9.x or below.
Why BIGINT over INTEGER? Integer PKs overflow at ~2.1 billion rows. BIGINT provides much more headroom and avoids painful future key migrations. BIGINT keys/indexes are larger, so use INTEGER only when strict row-count limits are known and enforced.
Why TIMESTAMPTZ? Always store timestamps with timezone. Bare TIMESTAMP causes bugs when
servers change timezone or data is queried across regions. Flag it immediately if you see bare
TIMESTAMP in a schema review.
When to use UUID instead: Only for distributed systems where multiple nodes generate IDs independently (sharding, offline-first, public-facing IDs). UUIDs have worse index locality than sequential integers — new entries scatter across the B-tree instead of appending. Consider ULIDs if you need both uniqueness and sortability.
updated_atAlways provide this trigger when generating tables with updated_at:
-- Reusable function (create once per database)
CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = now();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Apply per table (name the trigger explicitly)
CREATE TRIGGER trg_order_updated_at
BEFORE UPDATE ON "order"
FOR EACH ROW EXECUTE FUNCTION set_updated_at();
Create the set_updated_at() function once and reuse it across all tables.
For timestamps set on specific state transitions (e.g., sent_at when an email is marked sent):
CREATE OR REPLACE FUNCTION set_email_sent_at()
RETURNS TRIGGER AS $$
BEGIN
IF NEW.is_sent = TRUE AND (OLD.is_sent = FALSE OR OLD.is_sent IS NULL) THEN
NEW.sent_at = now();
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
When a user describes a business domain or feature and wants a schema:
Ask or infer:
order, order_item, user_profile)first_name, total_amount_cents)NUMERIC(19,4) — never FLOAT or DOUBLEcreated_at and updated_at to every tableON DELETE behavior:
RESTRICT — default, safest (prevents accidental cascading deletes)CASCADE — for dependent detail records (order_item depends on order)SET NULL — for optional relationships (employee.manager_id)email, slug, composite keys)amount > 0, end_date > start_date)CONSTRAINT fk_order_item_order
FOREIGN KEY (order_id) REFERENCES "order" (id) ON DELETE CASCADE,
CONSTRAINT chk_order_item_quantity_positive
CHECK (quantity > 0),
CONSTRAINT uq_user_email
UNIQUE (email)
Default indexes to always include:
created_at for time-series queriesWHERE deleted_at IS NULL, WHERE status = 'ACTIVE')CREATE INDEX idx_order_item_order_id ON order_item (order_id);
CREATE INDEX idx_order_created_at ON "order" (created_at DESC);
CREATE INDEX idx_order_active ON "order" (id) WHERE status = 'ACTIVE';
Before finalizing:
When the user pastes a schema or migration for review:
Scan for red flags (call these out immediately):
TIMESTAMP without TZ → bug risk, recommend TIMESTAMPTZSERIAL or BIGSERIAL → recommend GENERATED ALWAYS AS IDENTITYupdated_at / created_at → suggest addingVARCHAR(255) everywhere → suggest TEXT or appropriate lengthsparent_id + parent_type polymorphic pattern → suggest proper FKsCheck normalization:
tag1, tag2, tag3 → junction table)city AND zip_code)Suggest improvements with rationale. Don't just say "add an index" — explain why it matters for their specific access patterns.
Always name constraints explicitly. Use these prefixes:
| Type | Prefix | Pattern | Example |
|------|--------|---------|---------|
| Primary Key | pk_ | pk_{table} | pk_order |
| Foreign Key | fk_ | fk_{table}_{referenced} | fk_order_item_order |
| Unique | uq_ | uq_{table}_{column(s)} | uq_user_email |
| Check | chk_ | chk_{table}_{description} | chk_order_amount_positive |
| Index | idx_ | idx_{table}_{column(s)} | idx_order_created_at |
| Trigger | trg_ | trg_{table}_{action} | trg_order_updated_at |
For multi-column constraints, join column names with underscore: uq_employee_company_email.
For partial indexes, append a descriptor: idx_order_active (WHERE status = 'ACTIVE').
For multi-tenant SaaS applications, add tenant_id (or company_id, organization_id) to
every tenant-scoped table and enforce isolation with composite foreign keys:
-- Ensure order belongs to the same tenant as the customer
CONSTRAINT fk_order_customer_tenant
FOREIGN KEY (customer_id, tenant_id)
REFERENCES customer (id, tenant_id) ON DELETE RESTRICT
This requires a composite unique index on the referenced table:
CREATE UNIQUE INDEX uq_customer_id_tenant ON customer (id, tenant_id);
For stronger isolation, consider PostgreSQL Row-Level Security (RLS). See references/patterns.md
for the full multi-tenancy pattern.
| Form | Rule | Common Violation |
|------|------|-----------------|
| 1NF | Atomic values, no repeating groups | phone1, phone2, phone3 columns |
| 2NF | Full dependency on PK (no partial deps) | Composite PK with columns depending on only part |
| 3NF | No transitive deps (A→B→C, remove C) | Storing department_name next to department_id |
| BCNF | Every determinant is a candidate key | Overlapping candidate keys |
3NF is the right target for most applications. Recommend BCNF for high-integrity domains (finance, healthcare). Recommend strategic denormalization only when there's a measured performance problem to solve.
Use CTEs to break complex queries into readable steps:
WITH active_user AS (
SELECT id, email, created_at
FROM "user"
WHERE deleted_at IS NULL
),
recent_order AS (
SELECT user_id, COUNT(*) AS order_count, SUM(total_cents) AS total_spent_cents
FROM "order"
WHERE created_at > now() - INTERVAL '30 days'
GROUP BY user_id
)
SELECT
u.email,
COALESCE(o.order_count, 0) AS orders_last_30d,
COALESCE(o.total_spent_cents, 0) AS total_spent_cents
FROM active_user u
LEFT JOIN recent_order o ON o.user_id = u.id
ORDER BY o.total_spent_cents DESC NULLS LAST;
Use for rankings, running totals, and lag/lead analysis:
SELECT
user_id,
order_id,
total_cents,
SUM(total_cents) OVER (PARTITION BY user_id ORDER BY created_at) AS running_total_cents,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) AS recency_rank
FROM "order";
NULLS LAST in ORDER BY for nullable sort columns| Object | Convention | Example |
|--------|-----------|---------|
| Tables | singular snake_case | order_item |
| Columns | snake_case | first_name, total_amount_cents |
| Primary key | id | id BIGINT GENERATED ALWAYS AS IDENTITY |
| Foreign key col | {referenced_table}_id | user_id, product_id |
| Boolean cols | is_ prefix | is_active, is_verified |
| Order cols | _order suffix | display_order, sort_order |
| Constraints | prefix-based (see table above) | fk_order_user, chk_order_positive |
| Junction tables | both table names joined | user_role, product_tag |
Reserved words as table names: If a table name is a SQL reserved word (like order, user,
group), always quote it with double quotes in DDL and queries: "order", "user". Alternatively,
consider a prefix: app_user, shop_order.
When writing Flyway or similar migrations:
V{version}__{Description}.sql naming (double underscore separator)IF [NOT] EXISTS selectively for bootstrap/repeatable scripts where re-runs are intentionalFor zero-downtime strategies, seed data patterns, and detailed Flyway guidance, read
references/migration-guide.md.
When producing a schema, always deliver:
When reviewing a schema, deliver:
The skill is pure markdown. Any tool that can read files can use it:
npx skills addnpx skills addnpx skills addnpx skills addTo configure another tool, run npx skills add ngvoicu/datasmith-pg -a <tool>.
references/patterns.md — Common schema patterns (soft deletes, audit logs, multi-tenancy, trees, partitioning)references/migration-guide.md — Flyway conventions, zero-downtime migrations, seed data patternsreferences/performance.md — Index types, EXPLAIN ANALYZE, partitioning, concurrency, optimizationreferences/anti-patterns.md — Common database design mistakes and how to fix themreferences/dialect-notes.md — PostgreSQL vs MySQL vs SQLite vs Snowflake vs BigQuery differencestesting
SQL database architect for PostgreSQL-first systems. Designs production-grade schemas from scratch, reviews and optimizes existing schemas, writes complex SQL (JOINs, CTEs, window functions), writes Flyway migrations, and advises on normalization (1NF through BCNF), indexing, and multi-tenancy. Use this skill when the user asks to model a business domain, create/review tables and constraints, troubleshoot foreign keys or query performance, write or audit SQL migrations, discuss Prisma schema, or compare SQL dialect behavior. Also trigger when a user provides existing SQL for anti-pattern or migration-safety review. Default to explicit constraint naming, timezone-safe timestamps, and evolution-safe migration guidance.
testing
Create, edit, improve, or audit AgentSkills. Use when creating a new skill from scratch or when asked to improve, review, audit, tidy up, or clean up an existing skill or SKILL.md file. Also use when editing or restructuring a skill directory (moving files to references/ or scripts/, removing stale content, validating against the AgentSkills spec). Triggers on phrases like "create a skill", "author a skill", "tidy up a skill", "improve this skill", "review the skill", "clean up the skill", "audit the skill".
testing
Host security hardening and risk-tolerance configuration for OpenClaw deployments. Use when a user asks for security audits, firewall/SSH/update hardening, risk posture, exposure review, OpenClaw cron scheduling for periodic checks, or version status checks on a machine running OpenClaw (laptop, workstation, Pi, VPS).
testing
Create, edit, improve, or audit AgentSkills. Use when creating a new skill from scratch or when asked to improve, review, audit, tidy up, or clean up an existing skill or SKILL.md file. Also use when editing or restructuring a skill directory (moving files to references/ or scripts/, removing stale content, validating against the AgentSkills spec). Triggers on phrases like "create a skill", "author a skill", "tidy up a skill", "improve this skill", "review the skill", "clean up the skill", "audit the skill".