skills/docker-containerization/SKILL.md
Write production-grade Dockerfiles, docker-compose configurations, multi-stage builds, and container optimization. Activate on "Dockerfile", "docker", "docker-compose", "container", "multi-stage build", "docker image", "container optimization", "docker security". NOT for Kubernetes orchestration, cloud-specific container services (ECS, Cloud Run), or CI/CD pipelines (use github-actions-pipeline-builder).
Write production-grade Dockerfiles with multi-stage builds, security hardening, and size optimization. Covers docker-compose for local development, image layer caching, health checks, and the patterns that separate a 2GB image from a 50MB one.
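Use for:
- Dockerfiles, multi-stage builds, and docker-compose configurations
- Image size optimization, layer caching, and health checks
- Container security hardening
NOT for:
- Kubernetes orchestration
- Cloud-specific container services (ECS, Cloud Run)
- CI/CD pipelines (use github-actions-pipeline-builder)
flowchart TD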
Start[What are you building?] --> Lang{Language/runtime?}
Lang -->|Node.js/TypeScript| Node[Node pattern]
Lang -->|Python| Python[Python pattern]
Lang -->|Go| Go[Go pattern]
Lang -->|Rust| Rust[Rust pattern]
Lang -->|Static site| Static[Static pattern]
Node --> NQ{Need build step?}
NQ -->|Yes, TypeScript/bundler| MultiNode[Multi-stage: build + runtime]
NQ -->|No, plain JS| SingleNode[Single stage with slim base]
Python --> PQ{Package manager?}
PQ -->|pip| PipPattern[pip + venv pattern]
PQ -->|uv| UvPattern[uv pattern — fastest]
PQ -->|poetry| PoetryPattern[poetry export pattern]
Go --> GoMulti[Multi-stage: build + scratch/distroless]
Rust --> RustMulti[Multi-stage: build + debian-slim]
Static --> StaticMulti[Multi-stage: build + nginx/caddy]
# Stage 1: Dependencies
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Stage 2: Build (TypeScript/bundler)
FROM node:22-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 3: Production
FROM node:22-alpine AS production
WORKDIR /app
ENV NODE_ENV=production
# Security: non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001 -G nodejs
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json ./
USER nextjs
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s \
    CMD wget -qO- http://localhost:3000/health || exit 1
CMD ["node", "dist/index.js"]
FROM python:3.12-slim AS base
# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
WORKDIR /app
# Install dependencies (cached layer)
COPY pyproject.toml uv.lock ./
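# --no-install-project: the application source is not copied yet, so install dependencies only
RUN uv sync --frozen --no-dev --no-install-project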
# Copy application code
COPY . .
# Non-root user
RUN useradd -r -s /bin/false appuser
USER appuser
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=3s \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"
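# Run from the virtualenv directly: "uv run" may try to re-sync .venv, which the non-root user cannot write to
CMD ["/app/.venv/bin/uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]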
# Build stage
FROM golang:1.22-alpine AS build
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /server ./cmd/server
# Production: distroless (no shell, no package manager, minimal attack surface)
FROM gcr.io/distroless/static-debian12
COPY --from=build /server /server
EXPOSE 8080
USER nonroot:nonroot
ENTRYPOINT ["/server"]
flowchart TD
subgraph "Slow to change (cache hit)"
A[Base image] --> B[System packages]
B --> C[Language runtime deps]
end
subgraph "Medium change frequency"
C --> D[Application dependencies]
end
subgraph "Fast changing (cache miss OK)"
D --> E[Application code]
E --> F[Build step]
end
Rule: Order Dockerfile instructions from least-frequently-changed to most-frequently-changed. Filesystem-changing instructions (RUN, COPY, ADD) each create a layer, and once one instruction misses the cache, every instruction after it is rebuilt.
Novice:
COPY . . # ← Busts cache on ANY file change
RUN npm install # ← Reinstalls everything every build
Expert:
COPY package.json package-lock.json ./ # ← Only busts on dependency changes
RUN npm ci # ← Cached when deps unchanged
COPY . . # ← Only app code changes trigger rebuild
Timeline: This has been best practice since Docker layer caching was introduced, but LLMs trained on older tutorials still generate the wrong order.
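The ordering rule can be illustrated with a toy model. The sketch below is a simplification, not Docker's actual cache algorithm: each layer lists the (hypothetical) input files it reads, and the first layer touched by a change invalidates itself and every layer after it.

```javascript
// Simplified model of Docker layer caching (an illustration, not Docker's real algorithm).
// Each layer declares the input files it reads; the file names are hypothetical.
function rebuiltLayers(layers, changedFiles) {
  const changed = new Set(changedFiles);
  // Find the first layer whose inputs include a changed file.
  const firstMiss = layers.findIndex((layer) =>
    layer.inputs.some((file) => changed.has(file))
  );
  // No miss: everything comes from cache. Otherwise rebuild from the miss onward.
  return firstMiss === -1 ? [] : layers.slice(firstMiss).map((layer) => layer.step);
}

const novice = [
  { step: 'COPY . .', inputs: ['package.json', 'package-lock.json', 'src/index.js'] },
  { step: 'RUN npm install', inputs: [] },
];

const expert = [
  { step: 'COPY package.json package-lock.json ./', inputs: ['package.json', 'package-lock.json'] },
  { step: 'RUN npm ci', inputs: [] },
  { step: 'COPY . .', inputs: ['package.json', 'package-lock.json', 'src/index.js'] },
];

// A source-only change reinstalls dependencies in the novice order but not in the expert order.
console.log(rebuiltLayers(novice, ['src/index.js'])); // [ 'COPY . .', 'RUN npm install' ]
console.log(rebuiltLayers(expert, ['src/index.js'])); // [ 'COPY . .' ]
```

In the expert ordering only the final COPY is redone after a source edit, while a change to package.json would still rebuild all three steps.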
# docker-compose.yml
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
      target: development  # Requires a dev-specific stage named "development"
    ports:
      - "${PORT:-3000}:3000"
    volumes:
      - .:/app              # Hot reload via bind mount
      - /app/node_modules   # Anonymous volume: don't override node_modules
    environment:
      - NODE_ENV=development
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/myapp
    depends_on:
      db:
        condition: service_healthy
    develop:
      watch:  # Docker Compose Watch (Compose v2.22+)
        - action: sync
          path: ./src
          target: /app/src
        - action: rebuild
          path: package.json
  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: myapp
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
    ports:
      - "5432:5432"
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
volumes:
  pgdata:
Novice: Relies on depends_on alone, which only waits for the container to START, not for the service to be READY.
Expert: Always add a healthcheck to database/cache services and use condition: service_healthy in depends_on. A Postgres container that has started but is still initializing or replaying WAL refuses connections, and an app that connects once at boot without retrying will crash.
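The novice pattern can be caught mechanically. The following is a minimal sketch, assuming the compose file has already been parsed into a plain object by some YAML parser (parsing is not shown). The function name and the example object are hypothetical.

```javascript
// Flags dependencies that are not gated on a health check.
// `compose` is a docker-compose file already parsed into a plain object.
function findUnguardedDependencies(compose) {
  const services = compose.services ?? {};
  const problems = [];
  for (const [name, service] of Object.entries(services)) {
    const dependsOn = service.depends_on ?? {};
    // The short form is a list of names, which only waits for container start.
    const entries = Array.isArray(dependsOn)
      ? dependsOn.map((dep) => [dep, {}])
      : Object.entries(dependsOn);
    for (const [dep, options] of entries) {
      const waitsForHealth = options?.condition === 'service_healthy';
      const hasHealthcheck = Boolean(services[dep]?.healthcheck);
      if (!waitsForHealth || !hasHealthcheck) {
        problems.push(`${name} -> ${dep}`);
      }
    }
  }
  return problems;
}

const example = {
  services: {
    app: { depends_on: { db: { condition: 'service_healthy' }, redis: {} } },
    worker: { depends_on: ['db'] },
    db: { healthcheck: { test: ['CMD-SHELL', 'pg_isready -U postgres'] } },
    redis: { healthcheck: { test: ['CMD', 'redis-cli', 'ping'] } },
  },
};

console.log(findUnguardedDependencies(example)); // [ 'app -> redis', 'worker -> db' ]
```

Both flagged edges start their dependent as soon as the other container exists, regardless of whether the service inside it accepts connections yet.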
| Base Image | Size | Use When |
|-----------|------|----------|
| node:22 | ~1.1 GB | Never in production |
| node:22-slim | ~200 MB | Need apt packages |
| node:22-alpine | ~130 MB | Default choice |
| distroless | ~20 MB | Go/Rust compiled binaries |
| scratch | 0 MB | Fully static binaries |
| chainguard/* | ~10-30 MB | Security-hardened alternatives |
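# 1. Use --no-cache with apk so no package index is stored in the layer
RUN apk add --no-cache curl
# 2. For apt, combine update, install, and cleanup in one RUN so the package lists never persist in a layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*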
# 3. Use .dockerignore aggressively
# .dockerignore:
node_modules
.git
*.md
.env*
dist
coverage
.next
# 1. Non-root user (MANDATORY)
RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup
USER appuser
# 2. Read-only filesystem (in compose)
# docker-compose.yml:
# read_only: true
# tmpfs:
# - /tmp
# 3. No new privileges
# docker run --security-opt no-new-privileges ...
# 4. Pin image digests for reproducibility
FROM node:22-alpine@sha256:abc123...
# 5. Scan for vulnerabilities
# docker scout quickview myimage:latest
# trivy image myimage:latest
Novice: Skips the USER instruction. Everything runs as root.
Expert: Running as root inside a container means that, unless user namespaces are remapped, a container escape gives the attacker root on the host. Always create and switch to a non-root user. Only use root for package installation in build stages.
Detection: docker inspect --format='{{.Config.User}}' image:tag — if empty, it's root.
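The same check can be scripted against the full JSON that `docker inspect` prints. This is a sketch under one assumption beyond the text above: it also treats an explicit "root", "0", or "0:<group>" value as root, not only an empty string.

```javascript
// Decides whether an image would run as root, given the JSON printed by `docker inspect <image>`.
// Treating "root", "0", and "0:<group>" as root is this sketch's assumption,
// extending the empty-string check described above.
function runsAsRoot(inspectJson) {
  const [image] = JSON.parse(inspectJson);
  const user = (image?.Config?.User ?? '').trim();
  const name = user.split(':')[0]; // drop an optional ":group" suffix
  return name === '' || name === 'root' || name === '0';
}

console.log(runsAsRoot('[{"Config":{"User":""}}]'));                // true
console.log(runsAsRoot('[{"Config":{"User":"nonroot:nonroot"}}]')); // false
console.log(runsAsRoot('[{"Config":{"User":"0:0"}}]'));             // true
```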
Docker HEALTHCHECK answers one question: "Is this process alive and minimally functional?" It does NOT answer "Are all dependencies reachable?" — that's readiness (a Kubernetes concept). Conflating them causes cascading restarts: DB goes down → every API container "fails" health check → orchestrator restarts them all → thundering herd on DB recovery.
# PORT is expanded by the shell at check time; fall back to 3000 if it is unset
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
    CMD wget -qO- "http://localhost:${PORT:-3000}/health" || exit 1
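The /health endpoint should:
- Return 200 as soon as the process can serve requests
- Avoid calls to databases, caches, or other external dependencies
- Respond well within the configured timeout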
// Minimal /health endpoint
app.get('/health', (req, res) => res.status(200).json({ status: 'ok' }));
If you need a richer health check for monitoring dashboards (DB status, queue depth, cache hit rate), expose it on /health/detailed and do NOT wire it to Docker HEALTHCHECK.
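One way to keep the two endpoints separate is sketched below. It uses Node's built-in http module rather than the Express-style `app` shown above, purely to stay dependency-free. `getDetails` is a hypothetical synchronous callback standing in for whatever gathers dashboard data.

```javascript
const http = require('node:http');

// Pure routing function, so the liveness path can be shown to do no dependency work.
// `getDetails` is a hypothetical callback that gathers dashboard data (DB status, queue depth).
function route(url, getDetails) {
  if (url === '/health') {
    // Liveness: constant-time, no I/O. This is the only path Docker should probe.
    return { status: 200, body: { status: 'ok' } };
  }
  if (url === '/health/detailed') {
    // Monitoring only: may be slow or report degraded dependencies.
    return { status: 200, body: getDetails() };
  }
  return { status: 404, body: { error: 'not found' } };
}

// Wraps the router in a server; call createServer(fn).listen(3000) to run it.
function createServer(getDetails) {
  return http.createServer((req, res) => {
    const { status, body } = route(req.url, getDetails);
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  });
}
```

Because `/health` never calls `getDetails`, a database outage surfaces on the dashboard endpoint without flipping the container to unhealthy.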
Compose equivalent:
healthcheck:
  test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
  interval: 30s
  timeout: 3s
  start_period: 10s
  retries: 3
Workers don't serve HTTP. Use a heartbeat file pattern:
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
    CMD test $(find /tmp/worker-heartbeat -mmin -1 2>/dev/null | wc -l) -gt 0 || exit 1
The worker writes a timestamp file on each successful job loop iteration:
// Inside your worker loop
await processJob();
fs.writeFileSync('/tmp/worker-heartbeat', Date.now().toString());
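If the heartbeat file is older than 1 minute, the worker is stuck. A fresh file shows that the process is alive, the event loop is not blocked, and the job loop is still iterating. Write the heartbeat on idle polls too, or a worker waiting on an empty queue will be reported unhealthy.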
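A slightly fuller sketch of the heartbeat pattern follows. The `takeJob` and `processJob` hooks are hypothetical stand-ins for a real queue client, and `isFresh` mirrors in JavaScript what the `find -mmin -1` check does in the shell.

```javascript
const fs = require('node:fs');

// Records that the loop made progress. Called on every iteration, including idle polls,
// so an empty queue is not mistaken for a stuck worker.
function beat(file) {
  fs.writeFileSync(file, Date.now().toString());
}

// Mirrors the `find -mmin -1` check: true when the file was modified within maxAgeMs.
function isFresh(file, maxAgeMs = 60_000, now = Date.now()) {
  try {
    return now - fs.statSync(file).mtimeMs < maxAgeMs;
  } catch {
    return false; // a missing file means the worker never completed an iteration
  }
}

// One loop iteration. `takeJob` and `processJob` are hypothetical queue hooks;
// `takeJob` is assumed to resolve to undefined when the queue is empty.
async function tick(file, takeJob, processJob) {
  const job = await takeJob();
  if (job !== undefined) await processJob(job);
  beat(file);
}
```

If `processJob` hangs or throws, `beat` is never reached, the file goes stale, and the health check starts failing after about a minute.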
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget -qO- http://localhost:80/ || exit 1
Short start period — static servers boot fast. Just check it serves a page. No /health endpoint needed.
Use the database's native client for health checks, not HTTP:
# PostgreSQL
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 10s
  timeout: 5s
  start_period: 30s  # DBs are slow to start, so allow a generous grace period
  retries: 5
# Redis
healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 10s
  timeout: 3s
  retries: 5
# MySQL
healthcheck:
  test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
  interval: 10s
  timeout: 5s
  start_period: 30s
  retries: 5
| Parameter | Guidance |
|-----------|----------|
| interval | 30s for apps, 10s for databases. Lower = more CPU overhead and log noise. |
| timeout | 3-5s. If your health check takes longer, it's too expensive. |
| start_period | Grace period after container start during which failed checks do not count toward retries (checks still run). 5s for static, 10s for APIs, 30s for databases, 60s+ for JVM apps. |
| retries | 3 for apps, 5 for databases. Too low = restarts on transient blips. |
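These parameters together bound how long a hung container can go unnoticed. The estimate below is a rough approximation, not a guarantee from Docker. It assumes the start period has already elapsed and that every failing check runs to its full timeout.

```javascript
// Rough worst-case time (seconds) for a running container to be marked unhealthy
// after it stops responding. Assumes every failing check hangs until its timeout
// and that the start period is over; an approximation only.
function worstCaseDetectionSeconds({ interval, timeout, retries }) {
  // Each failed attempt costs one wait (interval) plus one hung check (timeout).
  return retries * (interval + timeout);
}

console.log(worstCaseDetectionSeconds({ interval: 30, timeout: 3, retries: 3 })); // 99
console.log(worstCaseDetectionSeconds({ interval: 10, timeout: 5, retries: 5 })); // 75
```

With the app defaults from the table, a stuck process can take roughly a minute and a half to be flagged, which is the price of tolerating transient blips.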
This skill produces:
- references/multi-stage-patterns.md: Consult for complex multi-stage builds: build caching with BuildKit, cross-compilation, monorepo Dockerfiles, Bun/Deno patterns
- references/compose-patterns.md: Consult for advanced docker-compose: profiles, extends, override files, networking, secrets management, GPU passthrough