skills/container-resource-tuning/SKILL.md
Size container memory and CPU limits, diagnose OOM kills and CPU throttling, and recommend resource adjustments by ecosystem. Use when containers are being OOM-killed, running slowly, or when setting initial resource limits for a deployment.
Starting points by ecosystem. Adjust based on actual usage.
| Ecosystem | Memory limit | CPU limit (cores) | Notes |
|---|---|---|---|
| Node.js | 512MB | 0.5 | V8 GC is memory-hungry; Next.js SSR needs more |
| Node.js (Next.js SSR) | 1024MB | 1.0 | Server-side rendering is CPU and memory intensive |
| Python (Django/Flask) | 512MB | 0.5 | Per-worker; multiply by worker count |
| Python (FastAPI) | 256MB | 0.5 | Async, lower per-process memory |
| Go | 256MB | 0.5 | Static binary, efficient memory use |
| Rust | 128MB | 0.25 | Minimal runtime overhead |
| Java (Spring Boot) | 1024MB | 1.0 | JVM needs headroom; set -Xmx to 75% of limit |
| PHP (FrankenPHP) | 512MB | 0.5 | Per-request memory; depends on payload |
| Ruby (Rails) | 512MB | 0.5 | Per-worker; Puma workers multiply this |
| Elixir (Phoenix) | 256MB | 0.5 | BEAM VM is efficient; handles concurrency well |
| .NET (ASP.NET) | 512MB | 0.5 | Similar to Node.js profile |
| Static (Caddy/nginx) | 64MB | 0.25 | Minimal; just serving files |
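
The "per-worker" notes in the table mean the container limit has to scale with the worker count. A minimal sketch of that arithmetic follows; the `STARTING_MEMORY_MB` lookup, the ecosystem keys, and the `container_memory_mb` helper are hypothetical names introduced for illustration, with values copied from a few rows above.

```python
# Starting memory values in MB, copied from a subset of the table above.
# The dictionary keys are illustrative labels, not identifiers from any tool.
STARTING_MEMORY_MB = {
    "python-django": 512,  # per worker
    "ruby-rails": 512,     # per worker
    "go": 256,
    "rust": 128,
}

# Ecosystems whose table note says the figure is per worker.
PER_WORKER = {"python-django", "ruby-rails"}


def container_memory_mb(ecosystem: str, workers: int = 1) -> int:
    """Return a starting memory limit in MB, multiplying per-worker figures."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    base = STARTING_MEMORY_MB[ecosystem]
    return base * workers if ecosystem in PER_WORKER else base


print(container_memory_mb("python-django", workers=3))  # 1536
print(container_memory_mb("go", workers=3))             # 256
```

These are starting points only; the measured usage from container_stats should drive the final value.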
When container_inspect shows oom_killed: true:
1. container_inspect → memory limit
2. container_stats → memory usage and limit
3. container_exec ["ps", "aux", "--sort=-%mem"] → top processes
4. container_exec ["node", "-e", "console.log(process.memoryUsage())"]

| Ecosystem | Cause | Fix |
|---|---|---|
| Node.js | V8 heap exceeds limit | Set NODE_OPTIONS=--max-old-space-size=<MB> to 75% of container limit |
| Node.js | Memory leak (heap grows unbounded) | Profile with --inspect; check for event listener leaks, unbounded caches |
| Java | JVM default heap exceeds container limit | Set -Xmx to 75% of container memory limit |
| Python | Large dataset loaded into memory | Use streaming/chunked processing; increase limit if data size is fixed |
| Any | Too many worker processes | Reduce worker count: Gunicorn --workers, Puma workers, PM2 instances |

container_stats for 10 minutes

When the app is slow but not OOM-killed:

1. container_stats → CPU percentage
2. get_machine_stats → system load average
3. container_exec ["ps", "aux", "--sort=-%cpu"] → top CPU consumers

| Symptom | Cause | Fix |
|---|---|---|
| CPU at 100% of limit | App is compute-bound | Increase CPU shares or optimize hot paths |
| CPU at 100%, response times spike | Not enough CPU for request volume | Scale horizontally (more instances) or increase CPU |
| Low CPU but slow responses | Waiting on I/O (database, external API) | Not a CPU issue; check database latency |
| Host load > 2x cores | Server overloaded | Multiple containers competing; reduce total load or upgrade server |
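
The symptom table can be read as a small decision procedure. The sketch below is one possible encoding, not part of any Nixopus tool: the function name `diagnose_cpu`, the return labels, and the 95% saturation threshold are assumptions made for illustration (the table itself only says "100%" and "low").

```python
def diagnose_cpu(
    cpu_pct_of_limit: float,
    load_avg: float,
    cores: int,
    slow_responses: bool,
    saturated_pct: float = 95.0,  # assumed cutoff for "CPU at 100% of limit"
) -> str:
    """Map observed metrics to one of the rows in the symptom table."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Host-level overload: load average above twice the core count.
    if load_avg > 2 * cores:
        return "host-overloaded"
    saturated = cpu_pct_of_limit >= saturated_pct
    if saturated and slow_responses:
        return "insufficient-cpu"  # scale horizontally or increase CPU
    if saturated:
        return "compute-bound"     # increase CPU or optimize hot paths
    if slow_responses:
        return "io-wait"           # check database / external API latency
    return "ok"


print(diagnose_cpu(100.0, 1.2, 4, slow_responses=True))   # insufficient-cpu
print(diagnose_cpu(20.0, 1.0, 4, slow_responses=True))    # io-wait
print(diagnose_cpu(50.0, 9.5, 4, slow_responses=False))   # host-overloaded
```

The host-level check runs first because an overloaded server can make every container look slow regardless of its own limits.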
Java apps need explicit JVM flags to respect container limits:
JAVA_TOOL_OPTIONS=-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0

- UseContainerSupport (default since Java 10): JVM reads cgroup memory limits
- MaxRAMPercentage=75.0: heap uses 75% of container memory, leaving room for native memory and GC

NODE_OPTIONS=--max-old-space-size=384
For a 512MB container, set old space to ~75% (384MB). V8 needs headroom for GC, native code, and buffers.
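
The 75% rule is simple arithmetic, and deriving the flag from the container limit helps the two values stay in step when the limit changes. A minimal sketch, assuming hypothetical helper names `heap_budget_mb` and `node_options`:

```python
def heap_budget_mb(container_limit_mb: int, fraction: float = 0.75) -> int:
    """Return the heap budget in MB as a fraction of the container limit."""
    if container_limit_mb <= 0:
        raise ValueError("container_limit_mb must be positive")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(container_limit_mb * fraction)


def node_options(container_limit_mb: int) -> str:
    """Build the NODE_OPTIONS value for a given container memory limit."""
    return f"--max-old-space-size={heap_budget_mb(container_limit_mb)}"


print(node_options(512))   # --max-old-space-size=384
print(node_options(1024))  # --max-old-space-size=768
```

The same `heap_budget_mb` value could be used for a JVM `-Xmx` setting, since the document applies the 75% figure there as well.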
For production, also set:

- UV_THREADPOOL_SIZE=4 (default): increase for I/O-heavy apps
- NODE_CLUSTER_WORKERS: if using cluster mode, each worker needs its own memory budget

Gunicorn workers multiply memory usage:
gunicorn app:app --workers 2 --worker-class uvicorn.workers.UvicornWorker
Rule of thumb: workers = (2 * CPU cores) + 1, but in containers with limited CPU, use 2-4 workers max.
Each worker uses roughly the same memory as a single process. 4 workers × 256MB = 1GB total.
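
The rule of thumb and the memory multiplication can be combined into one calculation. The sketch below assumes hypothetical helpers `gunicorn_workers` and `total_memory_mb`, and treats the upper end of the "2-4 workers" guidance as a cap of 4; both are illustrative choices rather than Gunicorn behavior.

```python
def gunicorn_workers(cpu_limit: float, cap: int = 4) -> int:
    """Apply workers = (2 * cores) + 1, capped for CPU-limited containers."""
    if cpu_limit <= 0:
        raise ValueError("cpu_limit must be positive")
    suggested = int(2 * cpu_limit + 1)
    # Never go below one worker, never above the cap.
    return max(1, min(cap, suggested))


def total_memory_mb(workers: int, per_worker_mb: int) -> int:
    """Each worker uses roughly the memory of a single process."""
    return workers * per_worker_mb


workers = gunicorn_workers(cpu_limit=2.0)  # (2 * 2) + 1 = 5, capped to 4
print(workers)                             # 4
print(total_memory_mb(workers, 256))       # 1024
print(gunicorn_workers(cpu_limit=0.5))     # 2
```

The result feeds directly into the `--workers` flag and into the container memory limit, which should cover the total rather than a single worker.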
services:
  app:
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
        reservations:
          memory: 256M
          cpus: '0.25'

- limits: hard ceiling; the container is OOM-killed if it exceeds the memory limit and is throttled at the CPU limit
- reservations: guaranteed minimum; Docker ensures this is available

After adjusting resources:

1. container_stats: check memory and CPU usage over time
2. get_container_logs: scan for OOM warnings or performance errors
3. http_probe: verify response times are acceptable
4. restart_count drops to 0 and memory stays below 80%: tuning is correct

- post-deploy-verification: Check container stability after resource changes
- failure-diagnosis: Exit code 137 (OOM kill) diagnosis
- compose-setup: Resource limits in docker-compose.yml