skills/migration-risk-analyzer/SKILL.md
Analyzes database migration scripts for lock contention, downtime, rollback strategy, and deployment risk. Triggers on: "analyze this migration", "migration risk", "is this migration safe", "schema change risk", "DDL risk", "rollback strategy", "migration review".
npx skillsauth add mathews-tom/armory migration-risk-analyzerInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Systematic risk assessment for database migrations: parse DDL/DML operations, classify lock types and durations, estimate downtime, design rollback strategies, identify irreversible changes, and produce deployment recommendations with pre/post validation queries.
| File | Contents | Load When |
| ---------------------------------- | ------------------------------------------------------- | --------------------------- |
| references/lock-matrix.md | Operation-to-lock-type mapping for PostgreSQL, MySQL | Always |
| references/safe-patterns.md | Online DDL patterns, zero-downtime migration techniques | Risk mitigation needed |
| references/rollback-templates.md | Rollback scripts for common DDL operations | Rollback strategy requested |
| references/validation-queries.md | Pre/post migration validation SQL templates | Always |
Extract all operations from the migration script:
For each operation, determine the lock type and impact:
| Lock Level | Impact | Examples | | -------------- | --------------------------- | ------------------------------------------------ | | No lock | Zero impact | CREATE TABLE, CREATE INDEX CONCURRENTLY (PG) | | Share lock | Blocks writes, allows reads | CREATE INDEX (non-concurrent) | | Exclusive lock | Blocks all access | ALTER TABLE ADD COLUMN (MySQL < 8.0), DROP TABLE | | Row-level lock | Blocks affected rows only | UPDATE with WHERE clause |
Consider:
Estimate based on operation type and table size:
| Operation | Small Table (<100K) | Medium (100K-10M) | Large (>10M) | | ------------------------- | ------------------- | ----------------- | --------------------------- | | ADD COLUMN (nullable) | < 1s | < 1s | < 1s (PG) / minutes (MySQL) | | ADD COLUMN (with default) | < 1s | seconds | minutes (table rewrite) | | CREATE INDEX | < 1s | seconds | minutes-hours | | ADD NOT NULL | seconds | minutes | hours (full scan) | | Backfill UPDATE | seconds | minutes | hours |
For each operation, determine reversibility:
| Operation | Reversible | Rollback | | ------------- | ---------- | ---------------------------- | | ADD COLUMN | Yes | DROP COLUMN | | DROP COLUMN | No | Data is lost | | ADD INDEX | Yes | DROP INDEX | | DROP TABLE | No | Data is lost | | RENAME COLUMN | Yes | RENAME back | | ALTER TYPE | Sometimes | May lose precision | | UPDATE data | Sometimes | Only if old values preserved |
For irreversible operations, recommend backup strategies.
Produce a risk assessment with deployment recommendation.
## Migration Risk Analysis
### Summary
- **Operations:** {N} DDL, {M} DML
- **Tables affected:** {list with row counts}
- **Overall risk:** {High | Medium | Low}
- **Estimated duration:** {range}
- **Requires downtime:** {Yes | No}
### Operation Risk Table
| # | Operation | Risk | Lock Type | Est. Duration | Reversible |
|---|-----------|------|-----------|---------------|------------|
| 1 | {SQL operation} | {High/Med/Low} | {lock type} | {time} | {Yes/No} |
### Lock Analysis
- **Exclusive locks:** {list of operations that block all access}
- **Maximum lock duration:** {estimated time}
- **Affected queries:** {types of queries that will be blocked}
### Rollback Strategy
#### Reversible Operations
```sql
-- Rollback for operation 1: {description}
{rollback SQL}
-- Backup before migration
{backup SQL}
-- Verify structural changes
{validation queries}
-- Verify data integrity
{integrity checks}
Strategy: {Online | Low-Traffic Window | Maintenance Window} Estimated downtime: {time or "None with proper execution"} Rollback time: {time} Risk mitigation: {specific recommendations}
## Calibration Rules
1. **Assume large tables.** If table size is unknown, assume it's large enough for
locks to matter. Overestimating risk is safer than underestimating.
2. **Engine-specific analysis.** PostgreSQL and MySQL handle DDL very differently.
PostgreSQL can add nullable columns without table rewrite; MySQL often cannot.
Always target the specific engine.
3. **Irreversible means irreversible.** DROP COLUMN destroys data. No amount of
rollback scripting recovers it. Flag every irreversible operation prominently.
4. **Test the rollback.** Rollback scripts must be tested in staging before the
migration runs in production. Untested rollback is no rollback.
5. **Sequence matters.** The order of operations affects lock duration. Adding a
column then backfilling then adding NOT NULL is safer than adding a NOT NULL
column with a default.
## Error Handling
| Problem | Resolution |
|---------|------------|
| Database engine not specified | Ask. Lock behavior differs significantly between engines. |
| Table sizes unknown | Analyze without duration estimates. Flag that estimates require row counts. |
| ORM migration format (not raw SQL) | Parse the ORM migration file. Translate operations to SQL equivalents for analysis. |
| Migration has data-dependent logic | Flag conditional operations. Risk depends on data state at migration time. |
| Multiple migrations in sequence | Analyze each independently and as a group. Cross-migration lock accumulation is a risk. |
## When NOT to Analyze
Push back if:
- The migration is for a development/staging database — risk analysis is for production
- The migration only creates new tables (no ALTER, no existing data) — low risk by definition
- The user wants migration execution, not analysis — this skill assesses risk, it doesn't run migrations
## Rationalizations
| Rationalization | Reality |
|---|---|
| "It's backwards compatible" | Backwards compatible at the schema level doesn't mean backwards compatible at the application level — query plans, ORM mappings, and application code all interact |
| "We can roll back" | Rollback is not free — data written after migration may not survive rollback; DROP COLUMN has no rollback without backup |
| "It's a small table" | Table size is one factor — lock duration, concurrent write rate, and replication lag matter more than row count |
| "We've run this migration type before" | Past success doesn't predict future success — different data distribution, different load, different constraints |
| "Downtime window is long enough" | Estimate based on dev data, not production — migration on 10k rows takes seconds; on 50M rows with indexes, it takes hours |
| "The ORM handles it" | ORMs generate SQL, they don't guarantee safety — `ALTER TABLE` locking behavior is engine-specific and ORM-opaque |
## Red Flags
- No estimate of migration duration based on production data volume
- No rollback plan or rollback plan that doesn't account for data written post-migration
- Analyzing migration SQL without checking the current table size and write rate
- No consideration of replication lag in multi-replica setups
- Assuming zero downtime without verifying lock behavior for the specific DDL operation
- Skipping index analysis — adding an index on a large table can lock writes for minutes to hours
## Verification
- [ ] Production table sizes and row counts documented for all affected tables
- [ ] Lock behavior identified for each DDL statement (exclusive lock, no lock, etc.)
- [ ] Migration duration estimated using production-scale data, not dev fixtures
- [ ] Rollback plan documented with specific steps and data preservation guarantees
- [ ] Concurrent write impact assessed — what happens to in-flight transactions during migration
- [ ] Replication lag impact assessed for multi-replica configurations
testing
Create, review, and restyle data visualizations using Edward Tufte principles: high data-ink ratio, direct labels, range-frame axes, small multiples, accessible color, responsive charts, and honest comparisons. Triggers on: "create a chart", "style this chart", "review this graph", "Tufte chart", "data visualization", "Recharts", "Plotly", "matplotlib", "Chart.js", "ECharts", "D3". Use when generating or critiquing charts, dashboards, sparklines, and data tables.
testing
Manages dependent branch stacks and stacked pull requests using safe Git topology rules. Triggers on: "create stacked PRs", "publish this stack", "sync my PR stack", "rebase this stack", "merge the stack", "retarget child PRs", "split this branch into stacked PRs", "validate this stack", "cleanup stacked branches". Use when local branches or one source branch need to become a dependency-ordered PR stack with correct parent bases, validation, synchronization, merge order, and cleanup.
development
Scaffolds per-repository agent context so coding agents share the same issue tracker rules, triage label vocabulary, domain glossary, ADR layout, and handoff conventions. Triggers on: "set up project context", "configure agent docs", "create CONTEXT.md", "setup agent workflow", "agent issue tracker setup", "triage labels", "domain glossary for agents". Use when a repo needs durable context files before planning, triage, debugging, TDD, architecture review, or multi-agent implementation.
testing
Produces phased task boards from feature requests: dependency-mapped work items, parallelization flags, risk flags, edge cases, test matrices. Triggers on: "decompose this feature", "task breakdown with dependencies", "phased implementation plan", "work breakdown structure". NOT for effort estimates, use estimate-calibrator.