postgresql-ai-platform/SKILL.md
PostgreSQL as an AI data platform, sourced from "Building a Data and AI Platform with PostgreSQL" (2025). Covers pgvector for embeddings storage and ANN search (IVFFlat/HNSW), Retrieval-Augmented Generation (RAG) pipeline design, AI application design patterns (schema-aware LLM, NL2SQL guardrails, chunking strategies), 16 critical AI build fault lines, pgai extension, and sovereign data platform principles. Companion to ai-rag-patterns, postgresql-fundamentals, and postgresql-advanced-sql.
npx skillsauth add peterbamuhigire/skills-web-dev postgresql-ai-platformInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
postgresql-ai-platform or would be better handled by a more specific companion skill.SKILL.md first, then load only the referenced deep-dive files that are necessary for the task.CREATE EXTENSION IF NOT EXISTS vector;
-- Documents with embeddings (1536 dims = OpenAI text-embedding-3-small)
CREATE TABLE documents (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
content TEXT NOT NULL,
metadata JSONB,
embedding vector(1536),
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- 3072 dims for text-embedding-3-large
-- 768 dims for nomic-embed-text
-- 384 dims for all-MiniLM-L6-v2
-- Insert with embedding (from application layer)
INSERT INTO documents (content, metadata, embedding)
VALUES (
'PostgreSQL supports full-text search natively.',
'{"source": "docs", "section": "fts"}',
'[0.021, -0.034, ...]'::vector -- array from embedding API
);
-- Cosine distance (recommended for text embeddings — angle, not magnitude)
SELECT id, content, metadata,
1 - (embedding <=> '[0.021, -0.034, ...]'::vector) AS similarity
FROM documents
ORDER BY embedding <=> '[0.021, -0.034, ...]'::vector
LIMIT 10;
-- L2 distance (Euclidean — good for image embeddings)
SELECT id, content, embedding <-> query_vec AS distance
FROM documents
ORDER BY embedding <-> '[...]'::vector LIMIT 10;
-- Inner product (for normalized vectors)
SELECT id, content, (embedding <#> query_vec) * -1 AS score
FROM documents ORDER BY embedding <#> '[...]'::vector LIMIT 10;
-- Combine semantic search with structured filter (critical for multi-tenant)
SELECT id, content,
1 - (embedding <=> $1::vector) AS score
FROM documents
WHERE metadata->>'tenant_id' = $2 -- hard filter
AND metadata->>'section' = 'faq' -- structured filter
ORDER BY embedding <=> $1::vector
LIMIT 10;
-- IVFFlat: faster build, lower memory, slightly less accurate
-- lists: sqrt(row_count) is a good starting point
CREATE INDEX documents_embedding_ivf ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- HNSW: faster queries, higher recall, more memory
-- Recommended for production: better accuracy/speed tradeoff
CREATE INDEX documents_embedding_hnsw ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- Adjust search quality at query time (HNSW)
SET hnsw.ef_search = 100; -- higher = more accurate, slower
-- Adjust probe count at query time (IVFFlat)
SET ivfflat.probes = 10;
| | IVFFlat | HNSW | |---|---|---| | Build speed | Fast | Slower | | Memory | Low | Higher | | Query recall | Good | Excellent | | Recommended | Dev / low-memory | Production |
User Query
│
▼
Embed Query (LLM API)
│
▼
Vector Search (pgvector) + Metadata Filters
│
▼
Retrieve Top-K Chunks
│
▼
Build Prompt (system + chunks + user query)
│
▼
LLM Generation
│
▼
Response to User
Chunk quality determines retrieval quality. Rules from the book:
| Strategy | When to Use | |---|---| | Fixed-size with overlap | Homogeneous text (logs, support tickets) | | Sentence / paragraph | Articles, documentation — preserve semantic units | | Section-based | Structured docs (manuals, legal) — chunk at headings | | Semantic chunking | Use embedding similarity to find natural break points |
-- Store chunk metadata for context reconstruction
CREATE TABLE chunks (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
document_id BIGINT REFERENCES documents(id) ON DELETE CASCADE,
chunk_index INT NOT NULL,
content TEXT NOT NULL,
token_count INT,
embedding vector(1536),
metadata JSONB, -- {section, page, heading, source_url}
UNIQUE (document_id, chunk_index)
);
CREATE INDEX chunks_embedding_hnsw ON chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- Track embedding generation status
ALTER TABLE chunks ADD COLUMN embedded_at TIMESTAMPTZ;
ALTER TABLE chunks ADD COLUMN embedding_model TEXT;
-- Queue-based re-embedding on content change
CREATE FUNCTION queue_reembedding() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
IF NEW.content IS DISTINCT FROM OLD.content THEN
NEW.embedding := NULL;
NEW.embedded_at := NULL;
END IF;
RETURN NEW;
END;
$$;
CREATE TRIGGER chunks_reembed
BEFORE UPDATE ON chunks
FOR EACH ROW EXECUTE FUNCTION queue_reembedding();
-- Application worker polls:
SELECT id, content FROM chunks WHERE embedding IS NULL LIMIT 100;
-- Retrieve with full context for prompt building
SELECT
c.content,
c.metadata->>'heading' AS section,
c.metadata->>'source_url' AS source,
d.metadata->>'document_title' AS doc_title,
1 - (c.embedding <=> $1::vector) AS score
FROM chunks c
JOIN documents d ON d.id = c.document_id
WHERE d.metadata->>'tenant_id' = $2
ORDER BY c.embedding <=> $1::vector
LIMIT 5;
Provide schema context to the LLM so it generates valid SQL:
-- Store schema snapshots for LLM context
CREATE TABLE schema_context (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
table_name TEXT NOT NULL,
description TEXT NOT NULL, -- business-level description
columns JSONB NOT NULL, -- [{name, type, description}]
sample_data JSONB, -- 2-3 representative rows
updated_at TIMESTAMPTZ DEFAULT NOW()
);
NL2SQL guardrails (mandatory):
-- Validate generated SQL before execution
-- 1. Only allow SELECT — reject DML
-- 2. Enforce row limit
-- 3. Timeout guard
-- 4. Schema-bound: only allow known tables
SET statement_timeout = '5s';
SET row_security = on; -- RLS as second safety layer
-- Never pass LLM output directly to SQL — always parameterised
-- Application must extract structured fields, not interpolate raw text
-- Structured AI output schema
CREATE TABLE ai_insights (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
source_query TEXT NOT NULL,
model_used TEXT NOT NULL,
insight TEXT NOT NULL,
confidence NUMERIC(3,2),
raw_response JSONB, -- full API response for debugging
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE EXTENSION IF NOT EXISTS ai;
-- Generate embeddings via pgai (calls OpenAI internally)
SELECT ai.openai_embed('text-embedding-3-small', content) AS embedding
FROM documents;
-- Generate text
SELECT ai.openai_chat_complete(
'gpt-4o',
jsonb_build_array(
jsonb_build_object('role', 'user', 'content', 'Summarise: ' || content)
)
)->>'content' AS summary
FROM documents WHERE id = 1;
From "Building a Data and AI Platform with PostgreSQL":
ai_feedback table)CREATE TABLE llm_call_log (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
feature TEXT NOT NULL, -- 'rag_qa', 'nl2sql', 'summarise'
model TEXT NOT NULL,
prompt_tokens INT,
completion_tokens INT,
total_tokens INT,
latency_ms INT,
cost_usd NUMERIC(10,6),
success BOOLEAN NOT NULL,
error_message TEXT,
user_id BIGINT,
tenant_id BIGINT,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX llm_log_tenant_feature ON llm_call_log (tenant_id, feature, created_at DESC);
CREATE EXTENSION vector; -- pgvector: ANN search
CREATE EXTENSION ai; -- pgai: LLM calls from SQL
CREATE EXTENSION pg_trgm; -- fuzzy text search (hybrid retrieval)
CREATE EXTENSION pg_stat_statements; -- query observability
CREATE EXTENSION pg_cron; -- schedule embedding jobs
-- Optional
CREATE EXTENSION postgis; -- geospatial AI (location-aware RAG)
CREATE EXTENSION timescaledb; -- time-series AI (sensor + event data)
statement_timeout on NL2SQL execution — runaway queries possibledata-ai
Use when adding AI-powered analytics to a SaaS platform — semantic search over business data, natural language queries, trend detection, anomaly alerts, and AI-generated insights for dashboards. Covers embeddings, NL2SQL, and per-tenant analytics...
data-ai
Design AI-powered analytics dashboards — what metrics to show, how to display AI predictions and confidence, drill-down patterns, KPI cards, trend visualisation, AI Insights panels, export design, and role-based dashboard variants. Invoke when...
development
Use when designing, building, reviewing, or upgrading production software systems that must be secure, performant, maintainable, scalable, and user-centered. Apply before writing specs, code, architecture, APIs, databases, mobile apps, SaaS platforms, or ERP systems.
development
Professional web app UI using commercial templates (Tabler/Bootstrap 5) with strong frontend design direction when needed. Use for CRUD interfaces, dashboards, admin panels with SweetAlert2, DataTables, Flatpickr. Clone seeder-page.php, use...