plugins/databases-on-aws/skills/dsql/SKILL.md
Build with Aurora DSQL — manage schemas, execute queries, handle migrations, diagnose query plans, load data, and develop applications with a serverless, distributed SQL database. Covers IAM auth, multi-tenant patterns, MySQL-to-DSQL migration, DDL operations, query plan explainability, SQL compatibility validation, and bulk data loading. Triggers on phrases like: DSQL, Aurora DSQL, create DSQL table, DSQL schema, migrate to DSQL, distributed SQL database, serverless PostgreSQL-compatible database, DSQL query plan, DSQL EXPLAIN ANALYZE, why is my DSQL query slow, aurora-dsql-loader, load CSV into DSQL.
npx skillsauth add awslabs/agent-plugins dsqlInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Aurora DSQL is a serverless, PostgreSQL-compatible distributed SQL database. This skill covers direct query execution via MCP tools, schema management, migrations, multi-tenant isolation, IAM auth, and bulk data loading via aurora-dsql-loader.
Load these files as needed for detailed guidance:
When: ALWAYS load before implementing schema changes or database operations Contains: Best Practices, DDL rules, connection patterns, transaction limits, data type serialization patterns, application-layer referential integrity instructions, security best practices
When: Always load for guidance using or updating the DSQL MCP server Contains: Instructions for setting up the DSQL MCP server with 2 configuration options as sampled in .mcp.json
When: Load when you need detailed MCP tool syntax and examples. PREFER MCP tools for ad-hoc queries — execute directly rather than writing scripts. Contains: Tool parameters, detailed examples, usage patterns, input validation
When: MUST load when making language-specific implementation choices. ALWAYS prefer DSQL Connector when available. Contains: Driver selection, framework patterns, connection code for Python/JS/Go/Java/Rust
When: Load when looking for specific implementation examples Contains: Code examples, repository patterns, multi-tenant implementations
When: Load when debugging errors or unexpected behavior. SHOULD always consult for OCC errors, connection failures, or unexpected query results. Contains: Common pitfalls, error messages, solutions
When: Load when planning or running bulk loads with aurora-dsql-loader, or diagnosing slow load times.
Contains: Fresh-vs-warm partition behavior, resume/retry mechanics (--manifest-dir, --resume-job-id), --on-conflict do-nothing semantics, schema inference caveats, index-count throughput impact, diagnostic decision tree
When: User explicitly requests to "Get started with DSQL" or similar phrase Contains: Interactive step-by-step guide for new users
When: MUST load when creating database roles, granting permissions, setting up schemas for applications, or handling sensitive data. ALWAYS use scoped roles for applications — create database roles with dsql:DbConnect.
Contains: Scoped role setup, IAM-to-database role mapping, schema separation for sensitive data, role design patterns
When: MUST load when performing DROP COLUMN, RENAME COLUMN, ALTER COLUMN TYPE, or DROP CONSTRAINT Contains: Table recreation pattern overview, transaction rules, common verify & swap pattern
When: Load for DROP COLUMN, ALTER COLUMN TYPE, SET/DROP NOT NULL, SET/DROP DEFAULT migrations Contains: Step-by-step migration patterns for column-level changes
When: Load for ADD/DROP CONSTRAINT, MODIFY PRIMARY KEY, column split/merge migrations Contains: Step-by-step migration patterns for constraint and structural changes
When: Load when migrating tables exceeding 3,000 rows Contains: OFFSET-based and cursor-based batching patterns, progress tracking, error handling
When: MUST load when migrating MySQL schemas to DSQL Contains: MySQL data type mappings, feature alternatives, DDL operation mapping
When: Load when translating MySQL DDL operations to DSQL equivalents Contains: ALTER COLUMN, DROP COLUMN, AUTO_INCREMENT, ENUM, SET, FOREIGN KEY migration patterns
When: Load when migrating a complete MySQL table to DSQL Contains: End-to-end MySQL CREATE TABLE migration example with decision summary
When: MUST load all four at Workflow 9 Phase 0 — query-plan/plan-interpretation.md, query-plan/catalog-queries.md, query-plan/guc-experiments.md, query-plan/report-format.md Contains: DSQL node types + Node Duration math + estimation-error bands, pg_class/pg_stats/pg_indexes SQL + correlated-predicate verification, GUC experiment procedures + 30-second skip protocol, required report structure + element checklist + support request template
When: MUST load before running dsql_lint, processing externally-sourced SQL (pg_dump, ORM migrations, user-pasted DDL), or resolving fixed_with_warning / unfixable diagnostics
Contains: dsql_lint MCP tool reference, fix statuses, ORM integration, unfixable error resolution
The aurora-dsql MCP server provides these tools:
Database Operations:
SQL Validation:
Documentation & Knowledge:
Note: There is no list_tables tool. Use readonly_query with information_schema.
See mcp-setup.md for detailed setup instructions. See mcp-tools.md for detailed usage and examples.
awsknowledge)Consult for verifying DSQL service limits before advising users. The numeric limits below are defaults that may change — when a user's decision depends on an exact limit, verify it first:
| Limit | Default | Verify query |
| ------------------------------ | ------------- | ---------------------------------- |
| Max rows per transaction | 3,000 | aurora dsql transaction limits |
| Max data size per transaction | 10 MiB | aurora dsql transaction limits |
| Max transaction duration | 5 minutes | aurora dsql transaction limits |
| Max connections per cluster | 10,000 | aurora dsql connection limits |
| Auth token expiry | 15 minutes | aurora dsql authentication token |
| Max connection duration | 60 minutes | aurora dsql connection limits |
| Max indexes per table | 24 | aurora dsql index limits |
| Max columns per index | 8 | aurora dsql index limits |
| IDENTITY/SEQUENCE CACHE values | 1 or >= 65536 | aurora dsql sequence cache |
| Supported column data types | See docs | aurora dsql supported data types |
When to verify: Before recommending batch sizes, connection pool settings, or schema designs where hitting a limit would cause failures; any time the exact number can affect user decision.
Fallback: If awsknowledge is unavailable, use the defaults above and flag that limits should be verified against DSQL documentation.
Bash scripts in scripts/ for cluster management (create, delete, list, cluster info), psql connection, and bulk data loading from local/s3 csv/tsv/parquet files. See scripts/README.md for usage and hook configuration.
readonly_query with information_schema to list tables. Use get_schema for table structure.readonly_query for SELECT queries. MUST include tenant_id in WHERE for multi-tenant apps. MUST build SQL with safe_query.build().transact with one DDL per transaction. MUST batch DML under 3,000 rows. MUST use CREATE INDEX ASYNC in a separate call. Use dsql_lint to validate first.aurora-dsql-loader for CSV/TSV/Parquet. Load data-loading.md for details. Use --dry-run first.CREATE INDEX ASYNC exclusivelytransact(["CREATE TABLE ..."])string_to_array(text, ',') or jsonb_array_elements_text(json::jsonb))Every DDL statement generated in this workflow MUST be validated with dsql_lint(fix=true) before its transact call — applies to step 2 (ADD COLUMN) and step 5 (async index). DML (UPDATE in step 3) does not require linting.
dsql_lint(sql=..., fix=true) — handle diagnostics per dsql-lint.mdtransact(["ALTER TABLE ... ADD COLUMN ..."])dsql_lint(sql=..., fix=true), then create via transactdsql_lint before executingRecovery — batch fails midway: Rows already updated keep their new value (each batch committed independently). Resume by filtering on the unset state (WHERE new_column IS NULL) and continue. Re-running is safe because the filter naturally excludes completed rows.
Use aurora-dsql-loader for CSV, TSV, or Parquet loads. MUST load data-loading.md before advising on throughput or diagnosing slow loads.
--dry-run first--manifest-dir on persistent storage (not /tmp — tmpfs on AL2023, lost on crash) and --header if file has a header row--resume-job-id; for duplicates use --on-conflict do-nothingCREATE INDEX ASYNCINSERT: MUST validate parent exists with readonly_query → throw error if not found → insert child with transact.
DELETE: MUST check dependents with readonly_query COUNT → return error if dependents exist → delete with transact if safe.
safe_query.build() — use allow()/regex() for
values (emits 'v'), ident() for table/column names (emits "v").
See input-validation.mdtenant_id in the WHERE clause; reject cross-tenant access at the application layerMUST load access-control.md for role setup, IAM mapping, and schema permissions.
DSQL does NOT support direct ALTER COLUMN TYPE, DROP COLUMN, DROP CONSTRAINT, or MODIFY PRIMARY KEY. These require the Table Recreation Pattern. This is a destructive workflow that requires user confirmation at each step. Every generated DDL in the pattern (CREATE new, INSERT ... SELECT, DROP old, RENAME) MUST be validated with dsql_lint(sql=..., fix=true) before execution.
MUST load ddl-migrations/overview.md before attempting any of these operations.
MUST load dsql-lint.md before running dsql_lint — it defines diagnostic handling, the three fix_result.status values (fixed, fixed_with_warning, unfixable), and user-confirmation gates.
Run dsql_lint(sql=source_sql, fix=true) to validate and auto-convert PostgreSQL-compatible SQL. dsql_lint uses a PostgreSQL parser, so MySQL dialect syntax that PostgreSQL cannot parse (e.g., PARTITION BY HASH, AUTO_INCREMENT in some positions) surfaces as a parse_error rule rather than individual diagnostics.
ENGINE= clauses and SET(...) column types can pass silently through the PostgreSQL parser.parse_error, fall back to mysql-migrations/type-mapping.md for manual conversion, then re-run dsql_lint on the converted output before executing.Explains why the DSQL optimizer chose a particular plan. Triggered by slow queries, high DPU, unexpected Full Scans, or plans the user doesn't understand. REQUIRES a structured Markdown diagnostic report is the deliverable beyond conversation — run the workflow end-to-end before answering. Use the aurora-dsql MCP when connected; fall back to raw psql with a generated IAM token (see the fallback block below) otherwise.
Phase 0 — Load reference material. Read all four before starting — each has content later phases need verbatim (node-type math, exact catalog SQL, the >30s skip protocol, required report elements):
>30s skip protocolPhase 1 — Capture the plan. ALWAYS run readonly_query("EXPLAIN ANALYZE VERBOSE …") on the user's query verbatim (SELECT form) — ALWAYS capture a fresh plan from the cluster, even when the user describes the plan or reports an anomaly. MAY leverage get_schema or information_schema for schema sanity checks. When EXPLAIN errors (relation does not exist, column does not exist), MUST report the error verbatim — MUST NOT invent DSQL-specific semantics (e.g., case sensitivity, identifier quoting) as the root cause. Extract Query ID, Planning Time, Execution Time, DPU Estimate. SELECT runs as-is. UPDATE/DELETE rewrite to the equivalent SELECT (same join chain + WHERE) — the optimizer picks the same plan shape. INSERT, pl/pgsql, DO blocks, and functions MUST be rejected. MUST NOT use transact --allow-writes for plan capture; it bypasses MCP safety.
Phase 2 — Gather evidence. Using SQL from catalog-queries.md, query pg_class, pg_stats, pg_indexes, COUNT(*), COUNT(DISTINCT). Classify estimation errors per plan-interpretation.md (2x–5x minor, 5x–50x significant, 50x+ severe). Detect correlated predicates and data skew.
Phase 3 — Experiment (conditional). ≤30s: run GUC experiments per guc-experiments.md (default + merge-join-only) plus optional redundant-predicate test. >30s: skip experiments, include the manual GUC testing SQL verbatim in the report, and do not re-run for redundant-predicate testing. Anomalous values (impossible row counts): confirm query results are correct despite the anomalous EXPLAIN, flag as a potential DSQL bug, and produce the Support Request Template from report-format.md.
Phase 4 — Produce the report, invite reassessment. Produce the full diagnostic report per the "Required Elements Checklist" in query-plan/report-format.md — structure is non-negotiable. End with the "Next Steps" block from that reference so the user can ask for a reassessment after applying a recommendation. When the user says "reassess" (or equivalent), re-run Phase 1–2 and append an "Addendum: After-Change Performance" to the original report (before/after table, match against expected impact) rather than producing a new report.
psql fallback (MCP unavailable). Pipe statements into psql via heredoc and check $?; report failures without proceeding on partial evidence:
TOKEN=$(aws dsql generate-db-connect-admin-auth-token --hostname "$HOST" --region "$REGION")
PGPASSWORD="$TOKEN" psql "host=$HOST port=5432 user=admin dbname=postgres sslmode=require" <<<"EXPLAIN ANALYZE VERBOSE <sql>;"
Safety. Plan capture uses readonly_query exclusively — it rejects INSERT/UPDATE/DELETE/DDL at the MCP layer. Rewrite DML to SELECT (Phase 1) rather than asking transact --allow-writes to run it; write-mode transact bypasses all MCP safety checks. MUST NOT run arbitrary DDL/DML or pl/pgsql.
awsknowledge returns no results: Use the default limits in the table above and note that limits should be verified against DSQL documentation.dsql_lint unavailable or timing out: See the Error Handling section of dsql-lint.md. Do not silently skip validation — inform the user and require explicit confirmation before proceeding with manual rules from development-guide.md.development
Deploy to AWS Elastic Beanstalk. Triggers on: elastic beanstalk, EB, managed EC2 platform, web app with managed patching, worker on EC2, Heroku alternative, don't want to manage servers or containers, migrate from Heroku, managed operational lifecycle. Covers Elastic Beanstalk on EC2 for web and worker applications.
testing
Evaluate, configure, and migrate workloads to AWS Lambda Managed Instances (LMI). Triggers on: Lambda Managed Instances, LMI, capacity provider, multi-concurrency Lambda, dedicated instance Lambda, EC2-backed Lambda, cold start elimination, Graviton Lambda, instance type for Lambda, Lambda cost optimization with Reserved Instances or Savings Plans. Also trigger when users describe high-volume predictable workloads seeking cost savings, or compare Lambda vs EC2 for steady-state traffic. For standard Lambda without LMI, use the aws-lambda skill instead.
development
Deploy applications to AWS. Triggers on phrases like: deploy to AWS, host on AWS, run this on AWS, AWS architecture, estimate AWS cost, generate infrastructure. Analyzes any codebase and deploys to optimal AWS services.
tools
Design, build, deploy, test, and debug serverless applications with AWS Lambda. Triggers on phrases like: Lambda function, event source, serverless application, API Gateway, EventBridge, Step Functions, serverless API, event-driven architecture, Lambda trigger. For deploying non-serverless apps to AWS, use deploy-on-aws plugin instead.