cubetiq/system-design

Name: system-design
Author: cubetiq

workflows/workflows/agent-environment-setup/platforms/claude/skills/system-design/SKILL.md

npx skillsauth add cubetiq/cubis-foundry system-design

System Design

Purpose

Provide rigorous, production-tested guidance for designing distributed systems that are scalable, reliable, and maintainable. Covers horizontal scaling strategies, data partitioning, caching layers, load balancing topologies, messaging infrastructure, consistency models, and failure handling. Enables engineers to make informed trade-off decisions backed by established distributed systems theory and industry practice.

When to Use

Designing a new system or service from scratch with scale requirements
Evaluating trade-offs between consistency, availability, and partition tolerance
Selecting caching strategies (write-through, write-behind, cache-aside)
Planning load balancing topologies and health check configurations
Designing data partitioning and sharding schemes
Architecting messaging and event-driven communication between services
Preparing for system design interviews or architecture reviews
Reviewing an existing system for reliability and scalability gaps

Instructions

Start with requirements, not solutions — Capture functional requirements (what the system does), non-functional requirements (latency, throughput, availability targets), and constraints (budget, team size, existing infrastructure) before proposing any architecture so that the design is driven by needs rather than preferences.
Estimate scale quantitatively — Calculate expected QPS, storage growth rate, peak-to-average traffic ratios, and read/write ratios so that capacity planning is grounded in numbers rather than intuition. See references/scalability.md.
Apply the CAP theorem to partition decisions — For every data store, explicitly choose between CP (consistent under partition) and AP (available under partition) and document the reasoning so that the team understands what happens when the network splits.
Design for failure as the default state — Assume every network call can fail, every server can crash, and every disk can corrupt. Build in retries with exponential backoff, circuit breakers, bulkheads, and graceful degradation so that partial failures do not cascade into total outages. See references/reliability.md.
Select caching strategy by access pattern — Use cache-aside for read-heavy workloads with tolerance for stale data, write-through for strong consistency needs, and write-behind for write-heavy workloads with eventual consistency tolerance so that the cache layer matches the actual data flow. See references/caching-strategies.md.
Layer load balancing at DNS, L4, and L7 — Use DNS-based balancing for geographic distribution, L4 (TCP) balancing for raw throughput, and L7 (HTTP) balancing for content-aware routing so that each layer handles what it is optimized for.
Partition data by access pattern, not by table — Choose hash-based partitioning for uniform distribution, range-based partitioning for scan-heavy workloads, and composite keys for multi-tenant isolation so that hot spots are prevented by design. See references/data-partitioning.md.
Choose messaging semantics explicitly — Decide between at-most-once, at-least-once, and exactly-once delivery guarantees and design idempotent consumers for at-least-once systems so that message processing is predictable under failure conditions. See references/messaging.md.
Define SLOs before building observability — Set concrete Service Level Objectives (latency p50/p95/p99, availability percentage, error budget) so that monitoring, alerting, and on-call priorities are driven by user-visible impact rather than arbitrary thresholds.
Design APIs as contracts with versioning — Every inter-service API must have a schema, versioning strategy, and backward compatibility guarantee so that services can be deployed independently without coordinated releases.
Use read replicas to scale read-heavy paths — Separate read traffic from write traffic using leader-follower replication and route queries to the appropriate tier so that write latency is not degraded by read load.
Plan for data migration from day one — Design schemas with evolution in mind: additive changes only, no column renames in-place, dual-write during migration windows so that schema changes do not require downtime.
Implement distributed tracing across service boundaries — Propagate trace IDs through all inter-service calls so that end-to-end request flows are visible and latency bottlenecks can be localized to specific services.
Document every architectural decision — Record decisions in Architecture Decision Records (ADRs) with context, options considered, decision, and consequences so that future engineers understand why the system is shaped the way it is.
Validate the design with back-of-envelope calculations — Before finalizing, verify that storage estimates, bandwidth requirements, and compute costs align with the budget and scaling timeline so that the design is economically viable.
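
The quantitative estimates called for in the instructions above can be sketched as a small calculation. This is a minimal illustration, not part of the skill itself: the `Workload` fields, the `estimate` function, and all example numbers are hypothetical assumptions to be replaced with real requirements.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365


@dataclass(frozen=True)
class Workload:
    """Hypothetical planning inputs; every value is an assumption to be replaced."""
    daily_active_users: int
    requests_per_user_per_day: int
    peak_to_average_ratio: float
    write_fraction: float  # share of requests that are writes, between 0 and 1
    bytes_per_write: int


def estimate(w: Workload) -> dict:
    """Derive average/peak QPS, read/write split, and yearly storage growth."""
    daily_requests = w.daily_active_users * w.requests_per_user_per_day
    avg_qps = daily_requests / SECONDS_PER_DAY
    write_qps = avg_qps * w.write_fraction
    storage_bytes_per_year = (
        daily_requests * w.write_fraction * w.bytes_per_write * DAYS_PER_YEAR
    )
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * w.peak_to_average_ratio,
        "read_qps": avg_qps - write_qps,
        "write_qps": write_qps,
        "storage_gb_per_year": storage_bytes_per_year / 1e9,
    }


if __name__ == "__main__":
    # Assumed example: 1M daily users, 20 requests each, 3x peak, 10% writes of 1 KB.
    example = Workload(
        daily_active_users=1_000_000,
        requests_per_user_per_day=20,
        peak_to_average_ratio=3.0,
        write_fraction=0.1,
        bytes_per_write=1_000,
    )
    for name, value in estimate(example).items():
        print(f"{name}: {value:,.1f}")
```

With these assumed inputs the average load is roughly 231 QPS, the peak is roughly 694 QPS, and raw storage grows by about 730 GB per year before replication and indexes. Keeping the inputs explicit makes it easy to rerun the numbers when an assumption changes.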

Output Format

Deliver:

Architecture diagram description — Component topology with data flow directions, protocols, and failure domains
Scale estimates — QPS, storage, bandwidth calculations with growth projections
Trade-off analysis — Explicit CAP/PACELC choices with justification for each data store
Failure mode catalog — What fails, what degrades, what stays available, and how recovery works
Technology recommendations — Specific tools and services with rationale tied to requirements
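
For the failure mode catalog, "how recovery works" often comes down to the retry policy named in the instructions. The following is a minimal sketch of retries with exponential backoff and full jitter. The function name `retry_with_backoff`, its parameters, and the default values are assumptions chosen for illustration, not something the skill prescribes.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.1,   # seconds; assumed default
    max_delay: float = 5.0,    # upper bound on any single wait; assumed default
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> T:
    """Call `call` until it succeeds or `max_attempts` is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            # A real service should catch only errors known to be retryable.
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # The ceiling doubles each attempt and is capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time up to the ceiling so that many
            # clients recovering together do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

Retrying is only safe when the operation is idempotent, which is why the instructions pair at-least-once delivery with idempotent consumers.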

References

Load only what the current task requires.

| File | Load when |
| --- | --- |
| references/scalability.md | Task involves capacity planning, horizontal scaling, or traffic estimation. |
| references/reliability.md | Task involves fault tolerance, circuit breakers, retry strategies, or chaos engineering. |
| references/caching-strategies.md | Task involves cache selection, invalidation policies, or cache consistency patterns. |
| references/messaging.md | Task involves message queues, event buses, pub/sub, or delivery guarantees. |
| references/data-partitioning.md | Task involves sharding, replication, consistent hashing, or multi-tenant data isolation. |
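
The hash-based partitioning from the instructions, and the consistent hashing listed for references/data-partitioning.md, can be illustrated with a small hash ring. This sketch is not taken from that reference file. The `HashRing` class, the virtual-node count, and the node names are hypothetical, and it ignores replication and the (very unlikely) case of two ring points hashing to the same value.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes   # ring points per physical node
        self._points = []       # sorted ring positions
        self._owners = {}       # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owners[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring point clockwise from the key."""
        if not self._points:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"user:{i}" for i in range(1000)]
    before = {k: ring.lookup(k) for k in keys}
    ring.add("node-d")
    moved = [k for k in keys if ring.lookup(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The property worth checking is that adding a node moves only the keys the new node takes over, whereas a plain `hash(key) % node_count` scheme remaps most keys whenever the node count changes.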

cubetiq/system-design

workflows/workflows/agent-environment-setup/platforms/claude/skills/system-design/SKILL.md

System design and architecture guidance covering distributed systems, scalability, reliability, CAP theorem, load balancing, and caching strategies for production infrastructure.

1 star

testing

Updated Apr 4, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add cubetiq/cubis-foundry system-design

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
| --- | --- | --- | --- |
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 4, 2026, 10:00 AM; 4.5s; 13 files scanned

SKILL.md

name: system-design
description: System design and architecture guidance covering distributed
whenToUse: When designing a new system or service from scratch with scale requirements.
priority: primary

Install manually

# Clone the repo
git clone https://github.com/cubetiq/cubis-foundry.git

# Copy into Claude Code skills folder (global); create the folder first if it does not exist
mkdir -p ~/.claude/skills
cp -r cubis-foundry/workflows/workflows/agent-environment-setup/platforms/claude/skills/system-design ~/.claude/skills/

Repository

cubetiq/cubis-foundry

1 star

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT