engineering/database-engineering/skills/migrations/SKILL.md
This skill should be used when the user asks about "database migration", "schema migration", "ALTER TABLE", "adding a column", "dropping a column", "renaming a column", "changing column type", "adding an index", "dropping an index", "zero-downtime migration", "blue-green migration", "expand-contract", "backfill", "lock contention", "table lock", "migration strategy", "forward migration", "rollback migration", "Flyway", "Liquibase", "Alembic", "Prisma migrate", "Rails migration", "Django migration". Also trigger for "how do I safely rename a column", "how to add NOT NULL constraint without downtime", "my migration is locking the table", or "apply migration without downtime".
npx skillsauth add harsh040506/claude-code-unified-skill-plugin-library migrationsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Safely evolve database schemas in production with zero or minimal downtime.
For any migration that would break the running application, use the three-phase expand-contract pattern:
Phase 1: EXPAND — Add new structure without removing old
Phase 2: MIGRATE — Backfill data; update app to use new structure
Phase 3: CONTRACT — Remove old structure after app fully migrated
Each phase is a separate deployment. This eliminates the window where the migration breaks the live application.
| Operation | Lock? | Risk | Strategy |
|-----------|-------|------|----------|
| CREATE TABLE | No | Safe | Apply directly |
| ADD COLUMN ... DEFAULT NULL | Brief | Safe | Apply directly |
| ADD COLUMN ... NOT NULL DEFAULT x | Table lock (small tables only) | Risky | Add nullable, backfill, add constraint |
| ADD COLUMN ... NOT NULL (no default) | Table lock | High | Expand-contract |
| DROP COLUMN | Brief | App risk | Remove code first, then drop |
| RENAME COLUMN | Brief | App breaks | Expand-contract |
| ALTER COLUMN TYPE | Table lock | High | Expand-contract |
| CREATE INDEX (naive) | Table lock | High | CREATE INDEX CONCURRENTLY |
| CREATE UNIQUE INDEX (naive) | Table lock | High | CREATE UNIQUE INDEX CONCURRENTLY |
| DROP INDEX | Brief | Safe | Apply directly |
| ADD FOREIGN KEY | Lock, full scan | Risky | NOT VALID, then VALIDATE separately |
| ADD CHECK CONSTRAINT | Table lock | High | NOT VALID, then VALIDATE |
| DROP TABLE | Brief | Data loss! | Remove code references first |
-- Safe: adds column with no default or a constant default
-- PostgreSQL 11+: instant even with DEFAULT for non-volatile defaults
-- Earlier versions: instant only for NULL default
BEGIN;
ALTER TABLE orders ADD COLUMN notes TEXT;
COMMIT;
-- Rollback:
ALTER TABLE orders DROP COLUMN notes;
-- WRONG: locks table for duration of backfill (dangerous on large tables)
ALTER TABLE orders ADD COLUMN tier TEXT NOT NULL DEFAULT 'standard';
-- RIGHT: Three steps
-- Step 1: Add nullable column (fast, no lock)
ALTER TABLE orders ADD COLUMN tier TEXT;
-- Step 2: Backfill in batches (no locks held, doesn't block reads/writes)
-- Run this from application code or a migration script
DO $$
DECLARE
batch_size INT := 1000;
last_id UUID;
rows_updated INT;
BEGIN
SELECT id INTO last_id FROM orders ORDER BY id LIMIT 1;
LOOP
UPDATE orders
SET tier = 'standard'
WHERE id IN (
SELECT id FROM orders
WHERE tier IS NULL
ORDER BY id
LIMIT batch_size
);
GET DIAGNOSTICS rows_updated = ROW_COUNT;
EXIT WHEN rows_updated = 0;
PERFORM pg_sleep(0.01); -- Small pause between batches
END LOOP;
END $$;
-- Step 3: Add NOT NULL constraint (PostgreSQL validates without full scan if you do this right)
-- First, add a NOT VALID check to avoid table lock during validation
ALTER TABLE orders ADD CONSTRAINT orders_tier_not_null CHECK (tier IS NOT NULL) NOT VALID;
-- Then validate in a separate transaction (uses a lower lock level)
ALTER TABLE orders VALIDATE CONSTRAINT orders_tier_not_null;
-- Finally (optional), drop the check and add the real NOT NULL — this is a quick catalog update
-- in PostgreSQL 12+ after validation proves all rows satisfy it
-- WRONG: locks table for entire index build
CREATE INDEX idx_orders_customer ON orders(customer_id);
-- RIGHT: CONCURRENTLY (no locks, can take longer)
-- Must be run OUTSIDE a transaction
CREATE INDEX CONCURRENTLY idx_orders_customer ON orders(customer_id);
-- If CONCURRENT build fails (e.g., connection dropped), clean up invalid index first:
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_customer;
-- Then retry
-- WRONG: Validates entire table with share lock
ALTER TABLE orders ADD CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id) REFERENCES customers(id);
-- RIGHT: Two steps (PostgreSQL)
-- Step 1: Add constraint, skip validation (extremely fast)
ALTER TABLE orders ADD CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id) REFERENCES customers(id)
NOT VALID;
-- Step 2: Validate in a separate transaction (lower lock level, can proceed with reads)
ALTER TABLE orders VALIDATE CONSTRAINT fk_orders_customer;
This is one of the most dangerous operations because renaming immediately breaks running code.
Full expand-contract for column rename:
-- Step 1 (Migration A): Add new column
ALTER TABLE orders ADD COLUMN customer_reference_id UUID;
CREATE INDEX CONCURRENTLY idx_orders_customer_ref ON orders(customer_reference_id);
-- Step 2 (Code change A): App dual-writes to both columns
-- Write to: order.customer_id AND order.customer_reference_id
-- Read from: order.customer_id (still using old one)
-- Step 3 (Migration B): Backfill old → new
UPDATE orders SET customer_reference_id = customer_id
WHERE customer_reference_id IS NULL;
-- Step 4 (Code change B): Switch reads to new column
-- Read from: order.customer_reference_id
-- Write to: both
-- Step 5 (Code change C): Stop writing to old column
-- Read from: order.customer_reference_id
-- Write to: order.customer_reference_id only
-- Step 6 (Migration C): Drop old column
-- Verify no code references customer_id first!
ALTER TABLE orders DROP COLUMN customer_id;
-- WRONG: locks table, doesn't work if existing data can't cast
ALTER TABLE users ALTER COLUMN age TYPE BIGINT;
-- RIGHT: Expand-contract
-- Step 1: Add new column
ALTER TABLE users ADD COLUMN age_v2 BIGINT;
-- Step 2: Populate via trigger (keep in sync during backfill)
CREATE OR REPLACE FUNCTION sync_age_v2()
RETURNS TRIGGER AS $$
BEGIN
NEW.age_v2 := NEW.age::BIGINT;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_sync_age_v2
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_age_v2();
-- Step 3: Backfill
UPDATE users SET age_v2 = age::BIGINT WHERE age_v2 IS NULL;
-- Step 4: Add NOT NULL constraint after confirming backfill complete
ALTER TABLE users ALTER COLUMN age_v2 SET NOT NULL;
-- Step 5: Switch app to use age_v2, remove trigger, drop age
DROP TRIGGER trg_sync_age_v2 ON users;
ALTER TABLE users DROP COLUMN age;
ALTER TABLE users RENAME COLUMN age_v2 TO age;
20240115_143000_add_customer_tier_to_orders.sql
YYYYMMDD_HHMMSS_short_description.sql
-- Migration: Add customer_tier column to orders
-- Author: eng-team
-- Created: 2024-01-15
-- Risk: LOW — nullable column addition
-- Estimated time: <1s (instant DDL)
-- Rollback: See rollback section below
-- === FORWARD MIGRATION ===
BEGIN;
ALTER TABLE orders ADD COLUMN customer_tier TEXT
CHECK (customer_tier IN ('standard', 'premium', 'enterprise'));
COMMENT ON COLUMN orders.customer_tier IS 'Customer tier at time of order. NULL = pre-tier-feature orders.';
COMMIT;
-- === ROLLBACK ===
-- BEGIN;
-- ALTER TABLE orders DROP COLUMN customer_tier;
-- COMMIT;
Migrations should be safe to apply twice:
-- Idempotent index creation
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_tier ON orders(customer_tier);
-- Idempotent column addition
ALTER TABLE orders ADD COLUMN IF NOT EXISTS customer_tier TEXT;
-- Idempotent constraint
DO $$ BEGIN
IF NOT EXISTS (
SELECT 1 FROM pg_constraint WHERE conname = 'orders_tier_valid'
) THEN
ALTER TABLE orders ADD CONSTRAINT orders_tier_valid
CHECK (customer_tier IN ('standard', 'premium', 'enterprise'));
END IF;
END $$;
Before running a migration in production:
CONCURRENTLY is used for all new indexesNOT VALID is used for new FK constraints-- Monitor for blocking locks during migration
SELECT
blocked_locks.pid AS blocked_pid,
blocked_activity.query AS blocked_query,
blocking_locks.pid AS blocking_pid,
blocking_activity.query AS blocking_query,
blocked_activity.wait_event_type,
blocked_activity.wait_event
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks ON (
blocking_locks.locktype = blocked_locks.locktype
AND blocking_locks.relation = blocked_locks.relation
AND blocking_locks.pid != blocked_locks.pid
)
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;
If the migration is holding locks longer than expected:
SELECT pg_cancel_backend(pid)SELECT pg_terminate_backend(pid)For zero-downtime migration playbooks and lock mitigation SQL, see:
references/lock-mitigation.md — lock type reference table, lock monitoring queries, and step-by-step procedures for common high-lock operations (adding NOT NULL, renaming columns)references/zero-downtime-patterns.md — expand-contract pattern, online index builds, table rewrites with pg_repack, and blue-green schema migration strategytesting
Performs quality control on single-cell RNA-seq data (.h5ad or .h5 files) using scverse best practices with MAD-based filtering and comprehensive visualizations. Use when users request QC analysis, filtering low-quality cells, assessing data quality, or following scverse/scanpy best practices for single-cell analysis.
tools
Deep learning for single-cell analysis using scvi-tools. This skill should be used when users need (1) data integration and batch correction with scVI/scANVI, (2) ATAC-seq analysis with PeakVI, (3) CITE-seq multi-modal analysis with totalVI, (4) multiome RNA+ATAC analysis with MultiVI, (5) spatial transcriptomics deconvolution with DestVI, (6) label transfer and reference mapping with scANVI/scArches, (7) RNA velocity with veloVI, or (8) any deep learning-based single-cell method. Triggers include mentions of scVI, scANVI, totalVI, PeakVI, MultiVI, DestVI, veloVI, sysVI, scArches, variational autoencoder, VAE, batch correction, data integration, multi-modal, CITE-seq, multiome, reference mapping, latent space.
testing
This skill should be used when scientists need help with research problem selection, project ideation, troubleshooting stuck projects, or strategic scientific decisions. Use this skill when users ask to pitch a new research idea, work through a project problem, evaluate project risks, plan research strategy, navigate decision trees, or get help choosing what scientific problem to work on. Typical requests include "I have an idea for a project", "I'm stuck on my research", "help me evaluate this project", "what should I work on", or "I need strategic advice about my research".
development
Run nf-core bioinformatics pipelines (rnaseq, sarek, atacseq) on sequencing data. Use when analyzing RNA-seq, WGS/WES, or ATAC-seq data—either local FASTQs or public datasets from GEO/SRA. Triggers on nf-core, Nextflow, FASTQ analysis, variant calling, gene expression, differential expression, GEO reanalysis, GSE/GSM/SRR accessions, or samplesheet creation.