postgresql-administration/SKILL.md
PostgreSQL DBA operations sourced from "Introduction to PostgreSQL for the Data Professional" (Booz & Fritchey, 2024). Covers backup strategies (pg_dump, pg_basebackup, WAL archiving, PITR), replication (streaming physical, logical, replication slots), high availability, monitoring (pg_stat_activity, pg_stat_user_tables, error logs, cumulative stats system), cloud PaaS options (AWS RDS/Aurora, Azure, GCP), and zero-downtime schema changes. Companion to postgresql-performance and postgresql-fundamentals.
npx skillsauth add peterbamuhigire/skills-web-dev postgresql-administrationInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
postgresql-administration or would be better handled by a more specific companion skill.references only as needed.SKILL.md first, then load only the referenced deep-dive files that are necessary for the task.references/ directory for deep detail after reading the core workflow below.| Mechanism | Tool | Use Case | Restore Time |
|---|---|---|---|
| SQL dump | pg_dump | Single database, schema-level portability | Medium (replay all SQL) |
| File system | pg_basebackup | Full cluster backup, base for PITR | Fast (file copy) |
| WAL archiving + PITR | continuous archiving | Zero data loss, point-in-time recovery | Varies |
| Cloud managed | AWS/Azure/GCP native | Managed HA, automated backups | Per-provider |
# Custom format (recommended — allows parallel restore and selective restore)
pg_dump -U postgres -d mydb -F c -f mydb.dump
# Plain SQL (portable, human-readable)
pg_dump -U postgres -d mydb -F p -f mydb.sql
# Compressed plain SQL
pg_dump -U postgres -d mydb | gzip > mydb.sql.gz
# Schema only (no data)
pg_dump -U postgres -d mydb --schema-only -F c -f mydb_schema.dump
# Specific tables
pg_dump -U postgres -d mydb -t orders -t order_items -F c -f orders.dump
# All databases (cluster dump)
pg_dumpall -U postgres -f cluster.sql
# Backup with connection string
pg_dump "postgresql://user:pass@host:5432/mydb" -F c -f mydb.dump
# Restore custom format
pg_restore -U postgres -d mydb -F c mydb.dump
# Parallel restore (faster for large databases)
pg_restore -U postgres -d mydb -F c -j 4 mydb.dump
# Restore to a new database
createdb -U postgres newdb
pg_restore -U postgres -d newdb -F c mydb.dump
# Restore single table
pg_restore -U postgres -d mydb -F c -t orders mydb.dump
# Plain SQL restore
psql -U postgres -d mydb -f mydb.sql
# Full cluster backup (used as base for streaming replication or PITR)
pg_basebackup -h localhost -U replicator -D /backup/base -P -Xs -R
# With WAL streaming included
pg_basebackup -h localhost -U replicator -D /backup/base \
--checkpoint=fast --wal-method=stream --progress
# Compressed
pg_basebackup -h localhost -U replicator -D /backup/base \
--format=tar --compress=9 --wal-method=fetch
# postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/wal/%f'
# Or with AWS S3:
# archive_command = 'aws s3 cp %p s3://my-bucket/wal/%f'
restore_command = 'cp /archive/wal/%f %p'
# Take a base backup
pg_basebackup -D /var/lib/postgresql/pitr_base -Xs -P -R
# Create recovery target file (postgresql.conf or recovery.conf pre-v12)
# PostgreSQL 12+: add to postgresql.conf
recovery_target_time = '2026-04-07 14:30:00 UTC'
recovery_target_action = 'promote'
# Restore procedure:
# 1. Stop PostgreSQL
systemctl stop postgresql
# 2. Replace data directory with base backup
rm -rf /var/lib/postgresql/data
cp -a /backup/base /var/lib/postgresql/data
# 3. Ensure recovery.signal exists (PostgreSQL 12+)
touch /var/lib/postgresql/data/recovery.signal
# 4. Start — PostgreSQL replays WAL to the target time
systemctl start postgresql
Primary server setup:
# postgresql.conf
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1GB # keep WAL for replicas that fall behind
-- Create replication role
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'repl_pass';
# pg_hba.conf (add entry for replica)
host replication replicator 192.168.1.0/24 scram-sha-256
Standby server setup:
# Create base backup on standby
pg_basebackup -h primary_host -U replicator -D $PGDATA -P -Xs -R
# -R writes postgresql.auto.conf with primary_conninfo and standby.signal
systemctl start postgresql # starts as hot standby (read-only)
Monitoring replication:
-- On primary
SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn,
(sent_lsn - replay_lsn) AS replication_lag_bytes
FROM pg_stat_replication;
-- On standby
SELECT pg_is_in_recovery(); -- returns true on standby
SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();
Replicates specific tables/operations. Works across major versions and between different schemas.
Publisher (source):
# postgresql.conf
wal_level = logical
-- Create publication
CREATE PUBLICATION orders_pub FOR TABLE orders, order_items;
-- Or all tables:
CREATE PUBLICATION all_pub FOR ALL TABLES;
Subscriber (destination — table must exist):
-- Create subscription
CREATE SUBSCRIPTION orders_sub
CONNECTION 'host=primary_host port=5432 dbname=mydb user=replicator password=repl_pass'
PUBLICATION orders_pub;
-- Monitor
SELECT * FROM pg_stat_subscription;
-- Drop subscription
DROP SUBSCRIPTION orders_sub;
Replication slots ensure WAL is retained until a subscriber has consumed it. Risk: if a subscriber goes down, WAL accumulates and disk can fill.
-- Physical slot (for streaming replication)
SELECT pg_create_physical_replication_slot('replica_slot');
-- Logical slot
SELECT pg_create_logical_replication_slot('analytics_slot', 'pgoutput');
-- Monitor slot lag (CRITICAL — alert if > threshold)
SELECT slot_name, active, pg_size_pretty(
pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
) AS lag
FROM pg_replication_slots;
-- Drop an inactive slot to free WAL
SELECT pg_drop_replication_slot('analytics_slot');
-- Long-running queries
SELECT pid, age(clock_timestamp(), query_start) AS duration, usename, state, query
FROM pg_stat_activity
WHERE state != 'idle' AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY duration DESC;
-- Cancel a query (graceful)
SELECT pg_cancel_backend(pid);
-- Terminate a connection (forceful)
SELECT pg_terminate_backend(pid);
-- Blocking locks
SELECT blocked.pid, blocked.query, blocking.pid AS blocking_pid, blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;
-- Table health overview
SELECT relname,
n_live_tup, n_dead_tup,
last_vacuum, last_autovacuum,
last_analyze, last_autoanalyze,
seq_scan, idx_scan
FROM pg_stat_user_tables ORDER BY n_dead_tup DESC;
-- Database size
SELECT pg_size_pretty(pg_database_size(current_database()));
-- Table sizes
SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class WHERE relkind IN ('r', 'm')
ORDER BY pg_total_relation_size(oid) DESC LIMIT 20;
-- Index sizes and usage
SELECT indexrelname, idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes ORDER BY pg_relation_size(indexrelid) DESC;
# postgresql.conf — structured logging
log_destination = 'csvlog'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_size = 100MB
log_min_duration_statement = 1000 # log queries taking > 1s
log_lock_waits = on # log lock waits
log_checkpoints = on # log checkpoint stats
log_connections = on
log_disconnections = on
-- Requires: shared_preload_libraries = 'pg_stat_statements' in postgresql.conf
SELECT query,
calls,
ROUND(total_exec_time::numeric, 2) AS total_ms,
ROUND(mean_exec_time::numeric, 2) AS avg_ms,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC LIMIT 20;
-- Adding nullable column — instant (PostgreSQL 11+: even with DEFAULT)
ALTER TABLE orders ADD COLUMN notes TEXT;
-- Adding index concurrently — no table lock
CREATE INDEX CONCURRENTLY orders_customer_idx ON orders (customer_id);
-- Dropping an index concurrently
DROP INDEX CONCURRENTLY orders_customer_idx;
-- RISKY: adding NOT NULL column on large table (full rewrite pre-PG 11)
-- SAFE pattern (expand-contract):
-- Step 1: Add nullable with default
ALTER TABLE orders ADD COLUMN priority TEXT DEFAULT 'normal';
-- Step 2: Backfill (batched)
DO $$
DECLARE batch_size INT := 1000;
BEGIN
LOOP
UPDATE orders SET priority = 'normal' WHERE priority IS NULL LIMIT batch_size;
EXIT WHEN NOT FOUND;
PERFORM pg_sleep(0.1); -- throttle
END LOOP;
END $$;
-- Step 3: Add constraint (validate without locking — PostgreSQL 12+)
ALTER TABLE orders ADD CONSTRAINT orders_priority_not_null
CHECK (priority IS NOT NULL) NOT VALID;
ALTER TABLE orders VALIDATE CONSTRAINT orders_priority_not_null;
-- RISKY: renaming a column referenced by application code
-- SAFE: add new column, dual-write, migrate, drop old column
| Provider | Service | Notes | |---|---|---| | AWS | RDS PostgreSQL | Managed, automated backups, Multi-AZ HA | | AWS | Aurora PostgreSQL | Higher throughput, up to 15 read replicas, global tables | | Azure | Azure Database for PostgreSQL | Flexible Server preferred over Single Server | | GCP | Cloud SQL for PostgreSQL | HA via regional failover, Cloud Spanner for global scale | | Self-hosted | Supabase, Neon, Timescale Cloud | PostgreSQL with extras (edge functions, branching) |
pg_hba.conf managed via UI/API — use SSL required connectionsmax_connections is limited — always use PgBouncer or built-in poolingpg_upgradeCONCURRENTLY for index operations — always in productionscram-sha-256 in pg_hba.conf — never md5 or trust for network connectionslog_min_duration_statement = 1000 minimumarchive_command is reliablepg_dump as the only backup strategy for production — no PITR capabilityVACUUM FULL during business hours — acquires exclusive lock, blocks all DMLNOT NULL constraint on large tables without NOT VALID + VALIDATE patternpostgres superuser from application — always use restricted rolepg_stat_statements — flying blind on query performancedata-ai
Use when adding AI-powered analytics to a SaaS platform — semantic search over business data, natural language queries, trend detection, anomaly alerts, and AI-generated insights for dashboards. Covers embeddings, NL2SQL, and per-tenant analytics...
data-ai
Design AI-powered analytics dashboards — what metrics to show, how to display AI predictions and confidence, drill-down patterns, KPI cards, trend visualisation, AI Insights panels, export design, and role-based dashboard variants. Invoke when...
development
Use when designing, building, reviewing, or upgrading production software systems that must be secure, performant, maintainable, scalable, and user-centered. Apply before writing specs, code, architecture, APIs, databases, mobile apps, SaaS platforms, or ERP systems.
development
Professional web app UI using commercial templates (Tabler/Bootstrap 5) with strong frontend design direction when needed. Use for CRUD interfaces, dashboards, admin panels with SweetAlert2, DataTables, Flatpickr. Clone seeder-page.php, use...