
curiositech/data-migration-specialist

Name: data-migration-specialist
Author: curiositech

skills/data-migration-specialist/SKILL.md

npx skillsauth add curiositech/windags-skills data-migration-specialist


Data Migration Specialist

Plan and execute large-scale data migrations with zero-downtime strategies, comprehensive validation, and reliable rollback plans.

Activation Triggers

Activate on: "data migration", "database migration", "zero-downtime migration", "dual-write", "backfill", "cutover", "data validation", "schema migration", "platform migration", "cloud migration"

NOT for: Streaming schema evolution → schema-evolution-manager | API backward compatibility → api-versioning-backward-compatibility | Warehouse optimization → data-warehouse-optimizer

Quick Start

1. Audit source data: profile row counts, data types, nulls, constraints, and edge cases
2. Choose strategy: big bang (downtime), dual-write (zero-downtime), or CDC-based (continuous sync)
3. Build validation framework: row counts, checksums, spot checks, and business rule assertions
4. Run rehearsal: execute the full migration on staging with production-scale data
5. Plan rollback: define rollback triggers, procedure, and maximum acceptable rollback window
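
The audit step can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the skill itself: the `orders` table, its columns, and the `profile_table` helper are hypothetical, and a real audit would run equivalent queries against the actual source database.

```python
import sqlite3


def profile_table(conn, table):
    """Return row count plus declared type and NULL count for each column.

    `table` is interpolated into SQL, so pass trusted names only.
    """
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    columns = {}
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    info = conn.execute(f"PRAGMA table_info({table})").fetchall()
    for _cid, name, decl_type, *_rest in info:
        nulls = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {name} IS NULL"
        ).fetchone()[0]
        columns[name] = {"type": decl_type, "nulls": nulls}
    return {"rows": total, "columns": columns}


# Hypothetical sample data standing in for a real source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, note TEXT)")
conn.executemany(
    "INSERT INTO orders (amount, note) VALUES (?, ?)",
    [(10.0, "ok"), (None, ""), (5.5, None)],
)
profile = profile_table(conn, "orders")
```

Note that the empty string in `note` is not counted as a NULL, which is exactly the kind of edge case worth recording before choosing a strategy.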

Core Capabilities

| Domain | Technologies |
|--------|-------------|
| Strategies | Dual-write, CDC-based, big bang, strangler fig |
| CDC Tools | Debezium, AWS DMS, GCP Datastream, pglogical |
| Validation | Great Expectations, custom checksums, row-count reconciliation |
| Schema | Flyway, Liquibase, Prisma Migrate, Alembic |
| Orchestration | Airflow, Temporal, custom migration scripts |

Architecture Patterns

Zero-Downtime Migration (Dual-Write)

Phase 1: Dual-Write (days/weeks)
──────────────────────────────────
App writes → Old DB (primary)
         → New DB (shadow, async)
Reads from: Old DB only

Phase 2: Shadow Read Validation
──────────────────────────────────
App writes → Old DB + New DB
Reads from: Old DB (primary)
            New DB (shadow, compare results)

Phase 3: Cutover
──────────────────────────────────
App writes → New DB (primary)
         → Old DB (shadow, for rollback)
Reads from: New DB

Phase 4: Cleanup
──────────────────────────────────
App writes → New DB only
Remove old DB writes
Decommission old DB (after rollback window expires)
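
The four phases above could be modeled in application code roughly as follows. This is a sketch under simplifying assumptions: both databases are plain dicts, the shadow write is synchronous, and the names `Phase` and `DualWriteRouter` are hypothetical rather than taken from any library.

```python
from enum import Enum


class Phase(Enum):
    DUAL_WRITE = 1   # old primary, new shadow, reads from old
    SHADOW_READ = 2  # same writes, reads also compared against new
    CUTOVER = 3      # new primary, old shadow (kept for rollback)
    CLEANUP = 4      # new only


class DualWriteRouter:
    """Route reads and writes between two key-value stores by phase."""

    def __init__(self, old_db, new_db, phase=Phase.DUAL_WRITE):
        self.old_db = old_db
        self.new_db = new_db
        self.phase = phase
        self.mismatches = []  # keys whose shadow read disagreed

    def write(self, key, value):
        if self.phase is Phase.CLEANUP:
            self.new_db[key] = value
            return
        # Phases 1-3 write to both stores; only the primary differs.
        # A real system would likely make the shadow write asynchronous.
        self.old_db[key] = value
        self.new_db[key] = value

    def read(self, key):
        if self.phase is Phase.DUAL_WRITE:
            return self.old_db.get(key)
        if self.phase is Phase.SHADOW_READ:
            primary = self.old_db.get(key)
            if self.new_db.get(key) != primary:
                self.mismatches.append(key)
            return primary
        return self.new_db.get(key)


# Hypothetical usage: "a" predates dual-write and was never backfilled.
old, new = {"a": 1}, {}
router = DualWriteRouter(old, new)
router.write("b", 2)
router.phase = Phase.SHADOW_READ
value_a = router.read("a")
value_b = router.read("b")
```

After these calls `router.mismatches` contains only `"a"`, which is the signal that a backfill is still needed before moving to Phase 3.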

Validation Framework

from itertools import islice


def chunked(ids, size):
    """Yield successive lists of at most `size` items."""
    it = iter(ids)
    while batch := list(islice(it, size)):
        yield batch


class MigrationValidator:
    """Run after each migration phase to verify data integrity."""

    def __init__(self, source_db, target_db, tables):
        # source_db / target_db are database wrappers that provide count(),
        # sample_ids(), hash_rows(), query(), and check_constraints().
        self.source_db = source_db
        self.target_db = target_db
        self.tables = tables

    def validate_row_counts(self, tolerance=0.001):
        """Source and target row counts must match within tolerance."""
        for table in self.tables:
            source = self.source_db.count(table)
            target = self.target_db.count(table)
            # Default 0.1% tolerance for in-flight writes; max(source, 1)
            # avoids dividing by zero when the source table is empty.
            drift = abs(source - target) / max(source, 1)
            assert drift < tolerance, \
                f"{table}: source={source}, target={target}"

    def validate_checksums(self):
        """Hash comparison on sampled rows."""
        for table in self.tables:
            sample_ids = self.source_db.sample_ids(table, n=10000)
            for batch in chunked(sample_ids, 1000):
                source_hash = self.source_db.hash_rows(table, batch)
                target_hash = self.target_db.hash_rows(table, batch)
                assert source_hash == target_hash, \
                    f"{table}: checksum mismatch in batch"

    def validate_business_rules(self):
        """Domain-specific invariants."""
        # Example: total revenue must match. Exact equality assumes both
        # sides return exact numerics (e.g. DECIMAL), not floats.
        source_rev = self.source_db.query("SELECT SUM(amount) FROM orders")
        target_rev = self.target_db.query("SELECT SUM(amount) FROM orders")
        assert source_rev == target_rev, "Revenue mismatch!"

    def validate_constraints(self):
        """All FKs, unique constraints, and NOT NULLs hold on target."""
        violations = self.target_db.check_constraints()
        assert len(violations) == 0, f"Constraint violations: {violations}"
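
The validator calls a `hash_rows` method that the skill does not define. One possible implementation, assuming a SQLite connection and an integer `id` primary key (both assumptions made for this sketch), hashes the selected rows in id order with SHA-256:

```python
import hashlib
import sqlite3


def hash_rows(conn, table, ids):
    """Return one SHA-256 digest over the rows of `table` whose id is in `ids`.

    Rows are read in id order so both databases feed the hash identically.
    `table` is interpolated into SQL, so pass trusted names only.
    """
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT * FROM {table} WHERE id IN ({placeholders}) ORDER BY id",
        list(ids),
    )
    digest = hashlib.sha256()
    for row in rows:
        # repr() keeps None, '' and 0 distinct; "\x1f" separates fields.
        digest.update("\x1f".join(repr(v) for v in row).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def make_db(rows):
    """Build a hypothetical in-memory orders table for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount TEXT, note TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


source = make_db([(1, "10.00", None), (2, "5.50", "gift")])
target_ok = make_db([(2, "5.50", "gift"), (1, "10.00", None)])  # other insert order
target_bad = make_db([(1, "10.00", ""), (2, "5.50", "gift")])   # NULL became ''

same = hash_rows(source, "orders", [1, 2]) == hash_rows(target_ok, "orders", [1, 2])
differs = hash_rows(source, "orders", [1, 2]) != hash_rows(target_bad, "orders", [1, 2])
```

When source and target run on different engines, values usually need normalizing to a common representation before hashing, since the same number or timestamp can come back from each driver as a different Python type.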

Rollback Decision Tree

Migration issue detected?
  │
  ├─ Data loss or corruption? → IMMEDIATE ROLLBACK
  │   Switch reads/writes back to old DB
  │   Replay writes from new DB → old DB (if needed)
  │
  ├─ Performance regression? → EVALUATE
  │   ├─ < 2x slower → optimize, do not roll back
  │   └─ ≥ 2x slower → roll back, investigate
  │
  └─ Minor data discrepancy? → FIX FORWARD
      Run reconciliation job to sync
      Do NOT roll back for fixable issues

Rollback window: keep old DB live for 7-14 days post-cutover
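
The decision tree can also be written down as a small function so that runbooks and alerting code agree on the outcome. This is a sketch only: the `Action` enum and `decide` function are hypothetical names, and treating a slowdown of exactly 2x as a rollback is an assumption of this example.

```python
from enum import Enum


class Action(Enum):
    ROLLBACK = "rollback"
    OPTIMIZE = "optimize"
    FIX_FORWARD = "fix forward"
    NONE = "no action"


def decide(data_loss=False, slowdown_factor=1.0, minor_discrepancy=False):
    """Map observed symptoms to one action, checking the most severe first."""
    if data_loss:
        return Action.ROLLBACK
    if slowdown_factor > 1.0:
        # Assumption: a slowdown of exactly 2x counts as a rollback.
        return Action.ROLLBACK if slowdown_factor >= 2.0 else Action.OPTIMIZE
    if minor_discrepancy:
        return Action.FIX_FORWARD
    return Action.NONE


outcome = decide(slowdown_factor=1.5)
```

Here `outcome` is `Action.OPTIMIZE`. Because the function returns a single action, a run with several symptoms reports only the most severe branch.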

Anti-Patterns

- No rehearsal: always run the full migration on staging with production-scale data before the real cutover
- Validation after cutover only: validate continuously during the dual-write phase, not just at the end
- No rollback plan: every migration needs a documented rollback procedure with defined triggers
- Big bang for large datasets: migrating 1 TB+ in one shot risks extended downtime; use CDC or a phased approach
- Ignoring edge cases: NULLs, empty strings, Unicode, timezone differences, and precision loss cause subtle data corruption
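
Three of the edge cases in the last item can be reproduced with the Python standard library alone. The values below are illustrative and not drawn from any real dataset.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# Precision loss: binary floats cannot represent 0.1 exactly.
float_total = sum([0.1] * 10)
decimal_total = sum([Decimal("0.10")] * 10)

# NULL versus empty string: distinct values that some pipelines conflate.
null_equals_empty = (None == "")

# Timezones: one instant rendered with two offsets yields different text.
utc = datetime(2026, 4, 4, 14, 0, tzinfo=timezone.utc)
local = utc.astimezone(timezone(timedelta(hours=-5)))
same_instant = (utc == local)
same_text = (utc.isoformat() == local.isoformat())
```

The float total is not equal to 1.0 while the Decimal total equals `Decimal("1.00")`, and the two timestamps are the same instant but compare unequal as strings. A checksum built on string or float representations would flag, or silently hide, differences of this kind.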

Quality Checklist

[ ] Source data profiled: row counts, types, nulls, edge cases documented
[ ] Migration strategy chosen with justification (dual-write, CDC, big bang)
[ ] Validation framework covers: row counts, checksums, business rules, constraints
[ ] Full rehearsal completed on staging with production-scale data
[ ] Rollback procedure documented with trigger criteria
[ ] Rollback window defined (keep old system live 7-14 days)
[ ] Dual-write conflict resolution strategy defined
[ ] Performance benchmarked: target system meets SLA before cutover
[ ] Communication plan: stakeholders notified of timeline and risks
[ ] Post-migration monitoring: alerts for data discrepancies for 30 days


Large-scale data migrations with validation, rollback, and zero-downtime strategies. Activate on: data migration, database migration, zero-downtime migration, dual-write, backfill, cutover, data validation, schema migration. NOT for: schema evolution in streams (use schema-evolution-manager), API versioning (use api-versioning-backward-compatibility).

development

Updated Apr 4, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add curiositech/windags-skills data-migration-specialist

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percent |
|--------|-------------|--------|---------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 4, 2026, 2:11 PM | 63.7s | 1 file scanned

SKILL.md

license: Apache-2.0
name: data-migration-specialist
description: Large-scale data migrations with validation, rollback, and zero-downtime strategies. Activate on: data migration, database migration, zero-downtime migration, dual-write, backfill, cutover, data validation, schema migration. NOT for: schema evolution in streams (use schema-evolution-manager), API versioning (use api-versioning-backward-compatibility).
allowed-tools: Read,Write,Edit,Bash(npm:*,npx:*,python:*,psql:*,docker:*)
category: Data & Analytics
- skill: schema-evolution-manager
  reason: Schema changes accompany data migrations


Related Skills

curiositech/revisiting-interview-data-analysing-turn (data-ai)
license: Apache-2.0 NOT for unrelated tasks outside this domain.
Updated Jul 19, 2026

curiositech/redis-patterns-expert (development)
Use when designing caching strategies (cache-aside, write-through, write-behind), implementing distributed locks, building rate limiters, leaderboards, real-time streams (XADD/consumer groups), pub/sub, or tuning eviction policies. Triggers: thundering-herd on cache miss, dogpile on key expiry, Redlock vs SET-NX-PX choice, sliding-window rate limiter, hot-key on a single cluster slot, big-key blowup, MULTI/EXEC across slots, KEYS in production. NOT for Redis Cluster operations/admin (different domain), embedded KV (SQLite, leveldb), in-process LRU caches, or Memcached.
Updated Jul 19, 2026

curiositech/react-server-components-boundary (tools)
Drawing the `'use client'` boundary correctly in React Server Components apps (Next.js App Router, RSC frameworks): leaf-pushing, slot composition, serialization rules, and environment poisoning prevention. Grounded in react.dev and Next.js 16 docs.
Updated Jul 19, 2026

curiositech/rate-limiting-strategy (development)
Use when designing rate limiting for an API, choosing between token bucket / sliding window / leaky bucket / fixed window, implementing it in Redis, deciding edge (Cloudflare/Upstash) vs origin enforcement, sizing per-user vs per-IP vs per-endpoint quotas, returning the right 429 response with Retry-After, or fixing the boundary-burst bug in fixed-window limiters. Triggers: 429 too many requests, INCR + EXPIRE, ZADD + ZREMRANGEBYSCORE + ZCARD, X-RateLimit-Remaining header, Cloudflare WAF rate limiting rules, Upstash @upstash/ratelimit, leaky bucket shaping vs policing, distributed rate limiter consistency. NOT for DDoS mitigation specifically (different scale), CAPTCHA / bot management, full WAF design, or per-user quota billing.
Updated Jul 19, 2026


Install manually


# Clone the repo
git clone https://github.com/curiositech/windags-skills.git

# Copy into Claude Code skills folder (global)
cp -r windags-skills/skills/data-migration-specialist ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

curiositech/windags-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT