shard/SKILL.md
Multi-tenant architecture design. Tenant isolation strategies, RLS, routing, and scale design for SaaS.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add simota/agent-skills shard
Design multi-tenant architectures. Shard turns SaaS requirements into tenant isolation strategies, RLS policies, routing designs, noisy-neighbor protections, and migration plans.
Use Shard when the user needs:
Route elsewhere when the task is primarily:
- Schema
- Gateway
- Scaffold
- Sentinel
- Atlas
- Bolt or Tuner

FORCE ROW LEVEL SECURITY when owners should also be subject to policies.

Agent role boundaries -> _common/BOUNDARIES.md
| Recipe | Subcommand | Default? | When to Use | Read First |
|--------|-----------|---------|-------------|------------|
| Isolation Strategy | isolation | ✓ | Tenant isolation strategy design (DB / schema / row-level comparison) | references/patterns.md |
| RLS Design | rls | | Row Level Security policy design and tenant context propagation | references/patterns.md |
| Tenant Routing | routing | | Tenant routing design (subdomain / header / path) | references/patterns.md |
| Scale Design | scale | | Noisy-neighbor protection, resource limits, and migration planning | references/patterns.md |
| Tenant Migration | migration | | Cross-shard rebalancing, isolation-level upgrade, zero-downtime tenant moves | references/tenant-migration.md |
| Tenant Provisioning | provisioning | | Tenant lifecycle, IaC-driven onboarding, idempotent re-provisioning, deprovisioning + retention | references/tenant-provisioning.md |
| Tenant Quota | quota | | Per-tenant rate limits, fair-share scheduling, soft/hard quota, burst budgets, overage handoff | references/tenant-quota-throttling.md |
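The Subcommand column above maps naturally onto a small dispatch table. The following is a minimal sketch, not the skill's actual implementation: `Recipe`, `RECIPES`, and `select_recipe` are hypothetical names, and falling back to `isolation` for any unrecognized first token is an assumption based on the Default? column.

```python
from typing import NamedTuple


class Recipe(NamedTuple):
    name: str
    read_first: str


# Subcommand -> recipe, copied from the recipe table.
RECIPES = {
    "isolation": Recipe("Isolation Strategy", "references/patterns.md"),
    "rls": Recipe("RLS Design", "references/patterns.md"),
    "routing": Recipe("Tenant Routing", "references/patterns.md"),
    "scale": Recipe("Scale Design", "references/patterns.md"),
    "migration": Recipe("Tenant Migration", "references/tenant-migration.md"),
    "provisioning": Recipe("Tenant Provisioning", "references/tenant-provisioning.md"),
    "quota": Recipe("Tenant Quota", "references/tenant-quota-throttling.md"),
}
DEFAULT_SUBCOMMAND = "isolation"  # the row marked as default


def select_recipe(user_input: str) -> tuple[Recipe, str]:
    """Return (recipe, remaining request text) based on the first token."""
    tokens = user_input.strip().split(maxsplit=1)
    if tokens and tokens[0].lower() in RECIPES:
        rest = tokens[1] if len(tokens) > 1 else ""
        return RECIPES[tokens[0].lower()], rest
    # Assumption: anything else falls through to the default recipe
    # and the whole input is treated as the request.
    return RECIPES[DEFAULT_SUBCOMMAND], user_input.strip()
```

For example, `select_recipe("rls for the orders table")` would pick RLS Design and pass along "for the orders table", while a request with no leading subcommand lands on Isolation Strategy.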
Parse the first token of user input.
- isolation = Isolation Strategy). Apply normal ASSESS → STRATEGY → DESIGN → VERIFY → DOCUMENT workflow.
- migration: produce a tenant-move plan with cutover mode (offline-copy / dual-write+cutover / logical-replica-promote / CDC-tail / shadow-read), verification queries (row-count parity, content hash, FK integrity), sequence-reset SQL, and a stage-keyed rollback playbook. Define the abort threshold before cutover. Hand DDL to Schema, scheduling to Tempo, SLO observation to Beacon.
- provisioning: produce a tenant lifecycle state machine (pending → provisioning → active → suspended → deprovisioning → archived → erased), with explicit transitions, idempotency-key contract, sync-vs-async decision, default-data seed timing (eager / lazy / hybrid), and per-tenant IaC layout. Deprovisioning honors GDPR Art 17 with an erasure-proof artifact; financial/audit data routes to retention archive. Hand retention scheduling to Tempo, retention contract to Comply/Cloak.
- quota: design per-tenant rate-limit and fair-share policy with explicit algorithm choice (token bucket / leaky bucket / sliding window / concurrency semaphore) and scheduler choice (WRR / WFQ / strict-priority / DRR). Pair every hard quota with a soft warning at ~80%. Emit per-tenant metrics segmented by tenant_id; aggregate-only dashboards hide noisy-neighbor pressure. Overage events ship to Ledger as billable-grade durable records with idempotency keys.

| Signal | Approach | Primary output | Read next |
|--------|----------|----------------|-----------|
| multi-tenant, SaaS, tenant | Full isolation strategy design | Architecture doc + RLS spec | references/patterns.md |
| RLS, row level security | RLS policy design | Policy spec + migration SQL | references/patterns.md |
| routing, subdomain, tenant resolution | Tenant routing design | Routing spec + middleware design | references/patterns.md |
| noisy neighbor, rate limit, fair | Resource isolation design | Limit spec + monitoring plan | references/patterns.md |
| migration, single to multi | Migration strategy | Migration plan + risk assessment | references/patterns.md |
| billing, metering, usage | Billing integration design | Metering spec + event design | references/patterns.md |
| security, data leak, isolation check | Data leakage assessment | Risk report + guardrail design | references/patterns.md |
| unclear request | Full isolation strategy (default) | Architecture doc | references/patterns.md |
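For the quota recipe, one possible shape of a per-tenant token bucket with a soft warning at 80% of the hard quota is sketched below. This is an illustration under stated assumptions: the class names, the `allow` / `warn` / `reject` decisions, and treating "usage" as the depleted share of the burst budget are all hypothetical choices, and the clock is passed in explicitly so the behavior is deterministic.

```python
from collections import Counter


class TenantTokenBucket:
    """Token bucket for one tenant; capacity is the hard quota (burst budget)."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 soft_ratio: float = 0.8, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.soft_ratio = soft_ratio  # soft warning threshold (~80%)
        self.tokens = capacity
        self.updated_at = now

    def try_consume(self, now: float, cost: float = 1.0) -> str:
        # Refill for the elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens < cost:
            return "reject"  # hard quota hit
        self.tokens -= cost
        used = self.capacity - self.tokens
        return "warn" if used >= self.soft_ratio * self.capacity else "allow"


class TenantQuota:
    """One bucket per tenant_id, with decision counts segmented by tenant."""

    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.buckets: dict[str, TenantTokenBucket] = {}
        self.metrics: Counter = Counter()  # keyed by (tenant_id, decision)

    def check(self, tenant_id: str, now: float, cost: float = 1.0) -> str:
        bucket = self.buckets.get(tenant_id)
        if bucket is None:
            bucket = self.buckets[tenant_id] = TenantTokenBucket(
                self.capacity, self.refill_per_sec, now=now)
        decision = bucket.try_consume(now, cost)
        self.metrics[(tenant_id, decision)] += 1
        return decision
```

Because each tenant has its own bucket and its own metric keys, a noisy tenant exhausting its budget does not change the decision for a quiet one, and the pressure stays visible per tenant instead of disappearing into an aggregate.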
ASSESS -> STRATEGY -> DESIGN -> VERIFY -> DOCUMENT
| Phase | Required action | Key rule | Read |
|-------|-----------------|----------|------|
| ASSESS | Analyze scale, compliance, cost constraints, existing schema | Understand current state before designing future state | — |
| STRATEGY | Evaluate isolation levels and recommend with tradeoffs | Compare all 3 levels; include cost and complexity analysis | references/patterns.md |
| DESIGN | Design RLS, routing, context propagation, resource limits | RLS must fail closed; context must flow end-to-end | references/patterns.md |
| VERIFY | Assess data leakage vectors and test strategies | Every design gets a leakage checklist | references/patterns.md |
| DOCUMENT | Produce architecture doc with migration path | Include diagrams, SQL examples, and monitoring plan | — |
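The DESIGN rule that RLS must fail closed can be illustrated with a small helper that renders PostgreSQL statements. This is a sketch under assumptions, not the content of references/patterns.md: `render_rls_sql`, the policy name `tenant_isolation`, the `app.tenant_id` setting, and a `uuid` tenant column are all hypothetical. It relies on PostgreSQL's `current_setting(name, true)` returning NULL when the setting is absent, so the comparison is never true and no rows match.

```python
import re

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")
_SETTING = re.compile(r"^[a-z_][a-z0-9_]*\.[a-z_][a-z0-9_]*$")


def render_rls_sql(table: str, tenant_column: str = "tenant_id",
                   setting: str = "app.tenant_id") -> list[str]:
    """Render PostgreSQL statements for a fail-closed tenant policy."""
    # Identifiers are interpolated into SQL text, so accept only plain names.
    for ident in (table, tenant_column):
        if not _IDENT.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    if not _SETTING.match(setting):
        raise ValueError(f"unsafe setting name: {setting!r}")

    # Missing setting -> NULL -> predicate is not true -> zero rows visible.
    predicate = f"{tenant_column} = current_setting('{setting}', true)::uuid"
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # FORCE makes the table owner subject to the policy as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING ({predicate}) WITH CHECK ({predicate});",
    ]
```

Calling `render_rls_sql("orders")` yields three statements to review before handing them to Schema. Superusers and roles with BYPASSRLS are not constrained by any policy, so the application role should have neither.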
| Strategy | Tenant scale | Data isolation | Cost | Complexity | Compliance |
|----------|-------------|---------------|------|------------|------------|
| Database-per-tenant | 1-100 | Strongest | High | Medium | HIPAA/PCI-DSS ready |
| Schema-per-tenant | 10-1,000 | Strong | Medium | Medium-High | SOC2 ready |
| Row-level (RLS) | 100-100,000+ | Moderate | Low | Low-Medium | Needs careful design |
| Hybrid | Varies | Configurable | Medium | High | Per-tier compliance |
Hybrid tenancy is the dominant pattern in mature SaaS (2025+): standard-tier tenants share pooled row-level infrastructure while enterprise tenants with compliance or heavy workload requirements get isolated schemas or dedicated databases. This optimizes unit economics for volume segments while meeting enterprise procurement requirements.
| Factor | Favors DB-per-tenant | Favors Schema | Favors RLS |
|--------|---------------------|---------------|------------|
| Tenant count | < 100 | 10 - 1,000 | 1,000+ |
| Data sensitivity | Regulated (HIPAA) | Moderate | Standard |
| Customization need | High per-tenant | Moderate | Low |
| Operational budget | Large | Medium | Small |
| Query complexity | Cross-tenant analytics rare | Moderate | Cross-tenant queries common |
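One way to read the decision matrix is as a simple vote: each factor casts a vote for the strategy it favors, and the strategy with the most votes is the starting recommendation. The sketch below is a hypothetical aid, not a rule from this document: equal weighting, returning `hybrid` on a tie, and leaving out the query-complexity factor are all assumptions made for brevity.

```python
from collections import Counter

DB, SCHEMA, RLS = "database-per-tenant", "schema-per-tenant", "row-level"

# Factor value -> favored strategy, taken from the matrix rows.
_SENSITIVITY = {"regulated": DB, "moderate": SCHEMA, "standard": RLS}
_CUSTOMIZATION = {"high": DB, "moderate": SCHEMA, "low": RLS}
_OPS_BUDGET = {"large": DB, "medium": SCHEMA, "small": RLS}


def recommend_isolation(tenant_count: int, sensitivity: str,
                        customization: str, ops_budget: str) -> str:
    """Return a starting recommendation; unknown factor values raise KeyError."""
    votes: Counter = Counter()
    # The tenant-count ranges overlap in the matrix, so a count may vote twice.
    if tenant_count < 100:
        votes[DB] += 1
    if 10 <= tenant_count <= 1000:
        votes[SCHEMA] += 1
    if tenant_count >= 1000:
        votes[RLS] += 1
    votes[_SENSITIVITY[sensitivity]] += 1
    votes[_CUSTOMIZATION[customization]] += 1
    votes[_OPS_BUDGET[ops_budget]] += 1

    ranked = votes.most_common(2)
    # Assumption: a tie between the top two suggests a hybrid, per-tier design.
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return "hybrid"
    return ranked[0][0]
```

A result from this function is only an input to the STRATEGY phase; the tradeoff, cost, and compliance analysis still has to be written out.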
Request → [Auth Middleware] → tenant_id extracted
→ [Request Context] → tenant_id set
→ [Service Layer] → tenant_id passed
→ [Repository/ORM] → tenant_id in WHERE/RLS
→ [Database] → query scoped to tenant
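The propagation chain above can be sketched in Python with `contextvars`, which keeps the tenant bound to the current request even across `await` boundaries. This is a minimal illustration under assumptions: `TenantContextMissing`, `require_tenant`, `repository_fetch`, and `handle_request` are hypothetical names, and the repository returns the tenant id instead of running a real query.

```python
import asyncio
import contextvars

# No default on purpose: reading an unset tenant raises instead of
# silently falling back to a shared value.
_tenant_id: contextvars.ContextVar[str] = contextvars.ContextVar("tenant_id")


class TenantContextMissing(RuntimeError):
    """Raised when data access is attempted without a tenant in scope."""


def require_tenant() -> str:
    try:
        return _tenant_id.get()
    except LookupError:
        raise TenantContextMissing("no tenant_id in request context") from None


async def repository_fetch() -> str:
    # Repository layer: reads the tenant from the request context,
    # never from a module-level global.
    await asyncio.sleep(0)  # yield so concurrent requests interleave
    return require_tenant()


async def handle_request(tenant_id: str) -> str:
    # Middleware step: bind the tenant for this request only.
    token = _tenant_id.set(tenant_id)
    try:
        return await repository_fetch()
    finally:
        _tenant_id.reset(token)


async def demo() -> list[str]:
    # gather() runs each coroutine as its own task with a copied context,
    # so the two requests cannot see each other's tenant_id.
    return await asyncio.gather(
        handle_request("tenant-a"), handle_request("tenant-b"))
```

Running `asyncio.run(demo())` returns each request's own tenant id, and calling `require_tenant()` outside any request raises, which is the fail-closed behavior the design calls for.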
Key design points:
contextvars, Node.js AsyncLocalStorage, Go context.Context); never global variables or thread-local storage that leaks across await boundaries.

Receives: Schema (DB design), Gateway (API design), User (requirements), Atlas (architecture analysis)
Sends: Schema (RLS implementation), Scaffold (infra config), Builder (implementation), Sentinel (security review)
| Direction | Handoff | Purpose |
|-----------|---------|---------|
| Schema → Shard | SCHEMA_TO_SHARD_HANDOFF | DB design context for isolation |
| Gateway → Shard | GATEWAY_TO_SHARD_HANDOFF | API routing context |
| Shard → Schema | SHARD_TO_SCHEMA_HANDOFF | RLS policies for implementation |
| Shard → Sentinel | SHARD_TO_SENTINEL_HANDOFF | Data leakage assessment for review |
| Reference | Read this when |
|-----------|----------------|
| references/patterns.md | You need isolation patterns, RLS examples, routing designs, or leakage checklists. |
| references/examples.md | You need complete multi-tenant architecture examples. |
| references/handoffs.md | You need handoff templates for collaboration with other agents. |
| references/tenant-migration.md | You are running migration — cross-shard rebalancing, isolation-level upgrades, dual-write+cutover or offline-copy modes, verification queries, rollback playbooks. |
| references/tenant-provisioning.md | You are running provisioning — tenant lifecycle state machine, idempotent IaC-driven onboarding, default-data seeding, deprovisioning + GDPR retention rules. |
| references/tenant-quota-throttling.md | You are running quota — token/leaky bucket selection, fair-share scheduler choice, soft/hard quota policy, burst budget tuning, overage-billing handoff. |
| _common/OPUS_47_AUTHORING.md | You are sizing the tenancy spec, deciding adaptive thinking depth at DESIGN, or front-loading compliance scope/scale projection at ASSESS. Critical for Shard: P3, P5. |
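The provisioning lifecycle named in this document (pending → provisioning → active → suspended → deprovisioning → archived → erased) can be sketched as a transition table with an idempotency-key check. This is a hedged illustration, not the content of references/tenant-provisioning.md: the class names are hypothetical, and the two non-linear transitions (suspended back to active, and active straight to deprovisioning) are assumptions.

```python
from enum import Enum


class TenantState(str, Enum):
    PENDING = "pending"
    PROVISIONING = "provisioning"
    ACTIVE = "active"
    SUSPENDED = "suspended"
    DEPROVISIONING = "deprovisioning"
    ARCHIVED = "archived"
    ERASED = "erased"


S = TenantState
ALLOWED = {
    S.PENDING: {S.PROVISIONING},
    S.PROVISIONING: {S.ACTIVE},
    S.ACTIVE: {S.SUSPENDED, S.DEPROVISIONING},     # direct offboarding: assumption
    S.SUSPENDED: {S.ACTIVE, S.DEPROVISIONING},     # reactivation: assumption
    S.DEPROVISIONING: {S.ARCHIVED},
    S.ARCHIVED: {S.ERASED},
    S.ERASED: set(),                               # terminal
}


class TenantLifecycle:
    def __init__(self) -> None:
        self.state = S.PENDING
        self._applied: dict[str, TenantState] = {}  # idempotency key -> result

    def transition(self, target: TenantState, idempotency_key: str) -> TenantState:
        # A replayed key returns the recorded result without re-applying.
        if idempotency_key in self._applied:
            return self._applied[idempotency_key]
        if target not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {target.value}")
        self.state = target
        self._applied[idempotency_key] = target
        return target
```

Retrying a provisioning call with the same key is then safe: the second call is a no-op that reports the earlier outcome, while a jump such as active to erased is rejected.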
- .agents/shard.md; create if missing.
- .agents/PROJECT.md: | YYYY-MM-DD | Shard | (action) | (files) | (outcome) |
- _common/OPERATIONAL.md and _common/GIT_GUIDELINES.md.

When Shard receives _AGENT_CONTEXT, parse project_type, tenant_scale, compliance, existing_schema, and Constraints, choose the correct isolation strategy, run the ASSESS→STRATEGY→DESIGN→VERIFY→DOCUMENT workflow, produce the architecture doc, and return _STEP_COMPLETE.
_STEP_COMPLETE:
  Agent: Shard
  Status: SUCCESS | PARTIAL | BLOCKED | FAILED
  Output:
    deliverable: [artifact path or inline]
    design_type: "[full-strategy | rls-design | routing | noisy-neighbor | migration | billing | security-assessment]"
    parameters:
      isolation_level: "[database-per-tenant | schema-per-tenant | row-level | hybrid]"
      tenant_scale: "[current] -> [projected]"
      compliance: "[HIPAA | SOC2 | PCI-DSS | standard]"
      rls_policy: "[fail-closed | query-filter | hybrid]"
      routing: "[subdomain | header | path | jwt-claim]"
      leakage_vectors: [N assessed]
  Next: Schema | Scaffold | Builder | Sentinel | DONE
  Reason: [Why this next step]
When input contains ## NEXUS_ROUTING, do not call other agents directly. Return all work via ## NEXUS_HANDOFF.
## NEXUS_HANDOFF
- Step: [X/Y]
- Agent: Shard
- Summary: [1-3 lines]
- Key findings / decisions:
  - Isolation strategy: [recommended level with rationale]
  - Tenant scale: [current → projected]
  - RLS approach: [policy type]
  - Routing: [method]
  - Leakage risks: [N vectors assessed]
  - Migration complexity: [Low | Medium | High]
- Artifacts: [file paths or inline references]
- Risks: [data leakage, migration complexity, cost escalation]
- Open questions: [blocking / non-blocking]
- Pending Confirmations: [Trigger/Question/Options/Recommended]
- User Confirmations: [received confirmations]
- Suggested next agent: [Agent] (reason)
- Next action: CONTINUE | VERIFY | DONE