# [SKILL_NAME] - Platform Scalability
Role: Platform Scalability Architect
Domain: Performance, Load Handling & Growth Strategy
Created: [CURRENT_DATE]
Design and implement scalability strategies for multi-module platforms. Plan for growth, handle increasing load, optimize performance, and ensure the platform can scale both horizontally and vertically.
Use this skill for:
- Scalability strategy planning
- Load testing and capacity planning
Do NOT use for:
Definition: Horizontal scaling adds more instances of a module to handle increased load.
Architecture:
```
             ┌──────────────┐
Clients ────▶│Load Balancer │
             └──────┬───────┘
                    │
     ┌──────────────┼──────────────┐
     │              │              │
┌────▼────┐    ┌────▼────┐    ┌────▼────┐
│Module A │    │Module A │    │Module A │
│Instance1│    │Instance2│    │Instance3│
└────┬────┘    └────┬────┘    └────┬────┘
     │              │              │
     └──────────────┼──────────────┘
                    │
             ┌──────▼──────┐
             │  Database   │
             │  (shared)   │
             └─────────────┘
```
Implementation (Docker Compose):
```yaml
version: '3'

services:
  nginx-lb:
    image: nginx:latest
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - module-a-1
      - module-a-2
      - module-a-3

  module-a-1:
    build: ./services/module-a
    environment:
      # The host must match the Compose service name ("database")
      - DATABASE_URL=postgres://database/module_a
      - REDIS_URL=redis://cache:6379

  module-a-2:
    build: ./services/module-a
    environment:
      - DATABASE_URL=postgres://database/module_a
      - REDIS_URL=redis://cache:6379

  module-a-3:
    build: ./services/module-a
    environment:
      - DATABASE_URL=postgres://database/module_a
      - REDIS_URL=redis://cache:6379

  database:
    image: postgres:15
    volumes:
      - db-data:/var/lib/postgresql/data

  cache:
    image: redis:7

# Named volumes must be declared at the top level
volumes:
  db-data:
```
Load Balancer Config (nginx.conf):
```nginx
# This file is mounted as the main nginx.conf,
# so the events and http contexts are required.
events {}

http {
    upstream module_a_backend {
        # Round-robin by default
        server module-a-1:3000;
        server module-a-2:3000;
        server module-a-3:3000;
    }

    server {
        listen 80;

        location /api/module-a/ {
            proxy_pass http://module_a_backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
```
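To make the default distribution concrete, here is a minimal TypeScript sketch of round-robin selection over equally weighted backends. It is an illustration only, not nginx's implementation; the `RoundRobinBalancer` class and its `pick` method are hypothetical names introduced here.

```typescript
// Hypothetical helper illustrating round-robin selection; not nginx's actual code.
class RoundRobinBalancer {
  private next = 0;

  constructor(private readonly backends: string[]) {
    if (backends.length === 0) {
      throw new Error('at least one backend is required');
    }
  }

  // Return the next backend in rotation, wrapping around at the end of the list.
  pick(): string {
    const backend = this.backends[this.next];
    this.next = (this.next + 1) % this.backends.length;
    return backend;
  }
}

const balancer = new RoundRobinBalancer([
  'module-a-1:3000',
  'module-a-2:3000',
  'module-a-3:3000',
]);

// Four consecutive picks: instances 1, 2, 3, then instance 1 again.
const picks = [balancer.pick(), balancer.pick(), balancer.pick(), balancer.pick()];
```

Because every request may land on a different instance, any state a request depends on has to live outside the instance.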
Stateless Application Pattern:
```typescript
// ✅ CORRECT: Stateless (can scale horizontally)
class DocumentService {
  constructor(
    private database: Database,
    private cache: RedisCache // Shared cache
  ) {}

  async getDocument(id: string): Promise<Document> {
    // Check shared cache first
    const cached = await this.cache.get(`document:${id}`);
    if (cached) return cached;

    // Fetch from database
    const doc = await this.database.query('SELECT * FROM documents WHERE id = $1', [id]);

    // Store in shared cache (TTL: 1 hour)
    await this.cache.set(`document:${id}`, doc, 3600);
    return doc;
  }
}

// ❌ WRONG: Stateful (cannot scale horizontally)
class DocumentServiceWrong {
  private cache = new Map<string, Document>(); // In-memory cache (instance-specific)

  constructor(private database: Database) {}

  async getDocument(id: string): Promise<Document> {
    // This cache is NOT shared across instances
    const cached = this.cache.get(id);
    if (cached) return cached;

    const doc = await this.database.query('SELECT * FROM documents WHERE id = $1', [id]);
    this.cache.set(id, doc); // Only cached on THIS instance
    return doc;
  }
}
```
When to Use:
Trade-offs:
Definition: Vertical scaling increases the resources (CPU, RAM) of existing instances.
```
Before:
┌─────────────────┐
│    Module A     │
│      2 CPU      │
│    4 GB RAM     │
└─────────────────┘

After:
┌─────────────────┐
│    Module A     │
│      8 CPU      │
│    16 GB RAM    │
└─────────────────┘
```
When to Use:
Implementation (Kubernetes):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: module-a
spec:
  containers:
    - name: module-a
      image: module-a:latest
      resources:
        requests:
          memory: "4Gi"
          cpu: "2"
        limits:
          memory: "16Gi" # Increased
          cpu: "8"       # Increased
```
Trade-offs:
Definition: Read replicas distribute read load across multiple copies of the database, while all writes go to a single primary.
Architecture:
```
      Write Operations
              │
              ▼
       ┌──────────────┐
       │ Primary (RW) │
       │   Database   │
       └──────┬───────┘
              │ Replication
    ┌─────────┼─────────┐
    │         │         │
┌───▼───┐ ┌───▼───┐ ┌───▼───┐
│Replica│ │Replica│ │Replica│
│1 (RO) │ │2 (RO) │ │3 (RO) │
└───────┘ └───────┘ └───────┘
    ▲         ▲         ▲
    │         │         │
 Read Operations (Load Balanced)
```
Implementation (PostgreSQL):
```typescript
import { Pool, QueryResult } from 'pg';

// Database connection pool
class DatabasePool {
  private primaryPool: Pool;    // Write operations
  private replicaPools: Pool[]; // Read operations

  constructor() {
    this.primaryPool = new Pool({
      host: 'primary.db.internal',
      port: 5432,
      max: 20 // Connection limit
    });

    this.replicaPools = [
      new Pool({ host: 'replica1.db.internal', port: 5432, max: 20 }),
      new Pool({ host: 'replica2.db.internal', port: 5432, max: 20 }),
      new Pool({ host: 'replica3.db.internal', port: 5432, max: 20 })
    ];
  }

  // Write operations go to primary
  async write(query: string, params: unknown[]): Promise<QueryResult> {
    return this.primaryPool.query(query, params);
  }

  // Read operations are load-balanced across replicas.
  // Pass usePrimary = true when the caller must see its own latest write.
  async read(query: string, params: unknown[], usePrimary = false): Promise<QueryResult> {
    if (usePrimary) return this.primaryPool.query(query, params);

    const randomReplica = this.replicaPools[
      Math.floor(Math.random() * this.replicaPools.length)
    ];
    return randomReplica.query(query, params);
  }
}

// Repository using read replicas
class DocumentRepository {
  constructor(private db: DatabasePool) {}

  async findById(id: string, options: { usePrimary?: boolean } = {}): Promise<Document> {
    // Read from a replica unless the caller asks for the primary
    const result = await this.db.read(
      'SELECT * FROM documents WHERE id = $1',
      [id],
      options.usePrimary ?? false
    );
    return result.rows[0];
  }

  async save(document: Document): Promise<void> {
    // Write to primary
    await this.db.write(
      'INSERT INTO documents (id, content) VALUES ($1, $2)',
      [document.id, document.content]
    );
  }
}
```
Replication Lag Handling:
```typescript
// Handle eventual consistency
class DocumentService {
  constructor(private repository: DocumentRepository) {}

  async createDocument(content: string): Promise<Document> {
    const doc = new Document(content);

    // Write to primary
    await this.repository.save(doc);

    // PROBLEM: Read replica may not have it yet (replication lag)
    // SOLUTION: Read from primary for fresh data
    // (assumes findById accepts a usePrimary option that routes the read to the primary)
    return this.repository.findById(doc.id, { usePrimary: true });
  }

  async searchDocuments(query: string): Promise<Document[]> {
    // Read from replicas (eventual consistency OK for search)
    return this.repository.search(query);
  }
}
```
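Reading from the primary right after a write can be generalized into a "read-your-writes" rule: for a short window after a key is written, route reads for that key to the primary. The sketch below is one possible way to express that rule, not part of the original design. `ReadYourWritesRouter`, `pinWindowMs`, and the injected clock are hypothetical names, and the window length is an assumed upper bound on replication lag.

```typescript
// Hypothetical sketch: pin a key's reads to the primary for a short window after a write.
type ReadTarget = 'primary' | 'replica';

class ReadYourWritesRouter {
  private lastWriteAt = new Map<string, number>();

  constructor(
    private readonly pinWindowMs: number,         // assumed upper bound on replication lag
    private readonly now: () => number = Date.now // injectable clock, useful in tests
  ) {}

  // Record that `key` was just written on the primary.
  recordWrite(key: string): void {
    this.lastWriteAt.set(key, this.now());
  }

  // Decide where a read for `key` should go.
  routeRead(key: string): ReadTarget {
    const writtenAt = this.lastWriteAt.get(key);
    if (writtenAt === undefined) return 'replica';
    if (this.now() - writtenAt < this.pinWindowMs) return 'primary';
    this.lastWriteAt.delete(key); // window elapsed, stop tracking the key
    return 'replica';
  }
}

// Usage with a fake clock so the behavior is deterministic.
let fakeClockMs = 0;
const readRouter = new ReadYourWritesRouter(2000, () => fakeClockMs);

readRouter.recordWrite('doc-1');
const duringWindow = readRouter.routeRead('doc-1'); // inside the 2 s window
fakeClockMs = 2500;
const afterWindow = readRouter.routeRead('doc-1');  // window has elapsed
const neverWritten = readRouter.routeRead('doc-2'); // no recent write recorded
```

The map in this sketch is local to one instance. With several instances behind a load balancer, the write marker would need to live in the shared cache or travel with the client's session.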
When to Use:
Trade-offs:
Multi-Level Caching:
```
┌──────────────────────────────────────────┐
│                 Request                  │
└───────────────┬──────────────────────────┘
                │
         ┌──────▼──────┐
         │  CDN Cache  │ ← Level 1: Static assets
         │ (CloudFront)│
         └──────┬──────┘
                │ Cache miss
         ┌──────▼──────┐
         │ Application │ ← Level 2: In-memory cache
         │ Cache (Node)│
         └──────┬──────┘
                │ Cache miss
         ┌──────▼──────┐
         │    Redis    │ ← Level 3: Distributed cache
         │    Cache    │
         └──────┬──────┘
                │ Cache miss
         ┌──────▼──────┐
         │  Database   │ ← Level 4: Source of truth
         └─────────────┘
```
Implementation:
```typescript
class CachedDocumentService {
  constructor(
    private inMemoryCache: Map<string, Document>, // L2
    private redisCache: RedisClient,              // L3
    private database: Database                    // L4
  ) {}

  async getDocument(id: string): Promise<Document> {
    // L2: Check in-memory cache (fastest, but local to this instance)
    const local = this.inMemoryCache.get(id);
    if (local) {
      console.log('Cache hit: in-memory');
      return local;
    }

    // L3: Check Redis (fast, shared)
    const cached = await this.redisCache.get(`doc:${id}`);
    if (cached) {
      console.log('Cache hit: Redis');
      const doc = JSON.parse(cached);
      // Populate in-memory cache
      this.inMemoryCache.set(id, doc);
      return doc;
    }

    // L4: Fetch from database (slow)
    console.log('Cache miss: fetching from DB');
    const doc = await this.database.query(
      'SELECT * FROM documents WHERE id = $1',
      [id]
    );

    if (doc) {
      // Populate Redis (TTL: 1 hour)
      await this.redisCache.setex(`doc:${id}`, 3600, JSON.stringify(doc));
      // Populate in-memory
      this.inMemoryCache.set(id, doc);
    }
    return doc;
  }

  async updateDocument(id: string, content: string): Promise<void> {
    // Update database
    await this.database.query(
      'UPDATE documents SET content = $1 WHERE id = $2',
      [content, id]
    );

    // Invalidate caches (Cache-Aside pattern).
    // Note: this clears the in-memory entry on THIS instance only;
    // other instances keep their copy until it is evicted.
    this.inMemoryCache.delete(id);
    await this.redisCache.del(`doc:${id}`);
  }
}
```
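A plain `Map` used as the in-memory level grows without bound and never expires entries, so stale documents can linger on an instance. One way to limit both problems is a small cache with a per-entry TTL and least-recently-used eviction. The sketch below is an assumption-laden illustration: `BoundedTtlCache`, its parameters, and the injected clock are hypothetical names, not part of the service above.

```typescript
// Hypothetical bounded in-process cache with per-entry TTL and LRU eviction.
class BoundedTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(
    private readonly maxEntries: number,
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now // injectable clock, useful in tests
  ) {}

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert so the key becomes the most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // Evict the least recently used key: the first one in insertion order.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  delete(key: string): void {
    this.entries.delete(key);
  }

  get size(): number {
    return this.entries.size;
  }
}

// Usage with a fake clock: capacity 2, TTL 1000 ms.
let fakeNowMs = 0;
const localCache = new BoundedTtlCache<string>(2, 1000, () => fakeNowMs);

localCache.set('a', 'A');
localCache.set('b', 'B');
localCache.get('a');                        // touch 'a', so 'b' is now least recently used
localCache.set('c', 'C');                   // capacity reached, evicts 'b'
const evictedEntry = localCache.get('b');   // 'b' is gone
fakeNowMs = 1500;
const expiredEntry = localCache.get('a');   // TTL has passed for 'a'
```

A short TTL on the in-process level bounds how long an instance can serve a document that another instance has already updated.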
Cache Invalidation Strategies:
Strategy 1: Time-to-Live (TTL)
```typescript
// Set expiration time
await redis.setex('key', 3600, value); // Expires in 1 hour
```
Strategy 2: Cache-Aside (Lazy Loading)
```typescript
// On update, delete cache
await redis.del('key');

// On read, if miss, populate from the database
let value = await redis.get('key');
if (!value) {
  const fresh = await database.query(...);
  value = JSON.stringify(fresh); // Redis stores strings, so serialize first
  await redis.set('key', value);
}
```
Strategy 3: Write-Through
```typescript
// On write, update cache immediately
async function updateDocument(id: string, data: any) {
  await database.update(id, data);
  await redis.set(`doc:${id}`, JSON.stringify(data)); // Keep cache in sync
}
```
When to Use:
Trade-offs:
Definition: Queue-based scaling uses message queues to buffer load spikes so that workers can process jobs at a controlled rate.
Architecture:
```
High Traffic Spike
        │
        ▼
 ┌──────────────┐
 │  API Server  │
 │              │
 └──────┬───────┘
        │ Enqueue job
        ▼
 ┌──────────────┐
 │ Message Queue│ ← Buffer
 │  (RabbitMQ)  │
 └──────┬───────┘
        │ Process at controlled rate
        ▼
 ┌──────────────┐
 │   Workers    │
 │  (scalable)  │
 └──────────────┘
```
Implementation:
```typescript
// API Server: Enqueue jobs instead of processing immediately
class DocumentIndexingAPI {
  constructor(private queue: MessageQueue) {}

  async indexDocument(content: string, metadata: any) {
    // Don't process now (that would block the user);
    // enqueue for async processing instead
    const jobId = await this.queue.publish('document.index', {
      content,
      metadata,
      createdAt: new Date()
    });

    // Return immediately
    return {
      jobId,
      status: 'queued',
      message: 'Document queued for indexing'
    };
  }
}

// Worker: Process jobs at controlled rate
class DocumentIndexingWorker {
  constructor(
    private queue: MessageQueue,
    private indexingService: IndexingService
  ) {}

  async start() {
    // Process up to 10 jobs concurrently
    this.queue.subscribe('document.index', async (job) => {
      try {
        await this.indexingService.index(job.content, job.metadata);
        await this.queue.ack(job); // Mark as completed
      } catch (error) {
        await this.queue.nack(job); // Requeue or dead-letter
      }
    }, { concurrency: 10 });
  }
}
```
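The worker above leaves open when a failed job should be requeued and when it should be dead-lettered. A common approach is to retry with exponential backoff up to a fixed number of attempts. The following sketch shows that decision in isolation; `decideRetry`, `RetryDecision`, and the default limits are hypothetical and not tied to any particular queue library.

```typescript
// Hypothetical retry policy for a failed job; names and defaults are assumptions.
type RetryDecision =
  | { action: 'requeue'; delayMs: number }
  | { action: 'dead-letter' };

function decideRetry(
  attempt: number,      // 1 for the first failed attempt
  maxAttempts = 5,
  baseDelayMs = 1000,
  maxDelayMs = 60_000
): RetryDecision {
  if (attempt >= maxAttempts) {
    return { action: 'dead-letter' };
  }
  // Exponential backoff: base, 2x base, 4x base, ... capped at maxDelayMs.
  const delayMs = Math.min(baseDelayMs * 2 ** (attempt - 1), maxDelayMs);
  return { action: 'requeue', delayMs };
}

const firstFailure = decideRetry(1); // requeue after 1000 ms
const thirdFailure = decideRetry(3); // requeue after 4000 ms
const fifthFailure = decideRetry(5); // attempts exhausted, dead-letter
```

Capping attempts keeps a permanently failing job from occupying a worker slot forever, and the growing delay gives a struggling downstream service time to recover.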
Auto-Scaling Workers (Kubernetes):
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: indexing-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: indexing-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: External
      external:
        metric:
          name: rabbitmq_queue_messages_ready
          selector:
            matchLabels:
              queue: document.index
        target:
          type: AverageValue
          averageValue: "10" # Scale up if >10 messages/worker
```
When to Use:
Trade-offs:
CPU-Based:
```yaml
# Kubernetes HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: module-a-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: module-a
  minReplicas: 3
  maxReplicas: 100
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # Scale at 70% CPU
```
Memory-Based:
```yaml
metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80 # Scale at 80% memory
```
Request-Based:
```yaml
metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000" # Scale if >1000 req/s per pod
```
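To see how such targets translate into replica counts, the sketch below applies the formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The function name and parameters are illustrative, and the 10% tolerance is the documented default, assumed to be unchanged.

```typescript
// Sketch of the replica calculation documented for the Kubernetes HPA.
function desiredReplicas(
  currentReplicas: number,
  currentMetric: number, // e.g. average CPU utilization across pods, in percent
  targetMetric: number,  // e.g. averageUtilization: 70
  minReplicas: number,
  maxReplicas: number,
  tolerance = 0.1        // default HPA tolerance, assumed unchanged here
): number {
  const ratio = currentMetric / targetMetric;
  // Within tolerance of the target: keep the current replica count.
  if (Math.abs(ratio - 1) <= tolerance) return currentReplicas;
  const desired = Math.ceil(currentReplicas * ratio);
  return Math.min(maxReplicas, Math.max(minReplicas, desired));
}

const scaleUp = desiredReplicas(3, 90, 70, 3, 100);    // ceil(3 * 90 / 70) = 4
const steady = desiredReplicas(3, 72, 70, 3, 100);     // within tolerance, stays at 3
const scaleDown = desiredReplicas(10, 20, 70, 3, 100); // ceil(10 * 20 / 70) = 3
```

The same arithmetic applies to the request-based target: with 3 pods averaging 1500 req/s against a target of 1000, the result is ceil(3 × 1.5) = 5 replicas.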
Different modules scale differently:
```yaml
# Knowledge Management: CPU-heavy (text parsing, NLP)
knowledge-management:
  scaling: horizontal
  trigger: cpu > 70%
  min_replicas: 5
  max_replicas: 50

# Security: Low traffic, critical availability
security:
  scaling: vertical
  resources:
    cpu: 4
    memory: 8Gi
  min_replicas: 3 # Redundancy only

# Use Cases: Burst traffic
use-cases:
  scaling: horizontal + queue
  trigger: queue_depth > 100
  min_replicas: 2
  max_replicas: 100
  queue: rabbitmq

# Operations: Scheduled batch jobs
operations:
  scaling: scheduled
  cron: "0 2 * * *" # 2 AM daily
  replicas_during_batch: 20
  replicas_idle: 1
```
Remember: Scalability is not just about handling more load, but doing so cost-effectively and reliably. Measure, optimize, then scale.