skills/k8s/SKILL.md
Kubernetes and container orchestration skill. Helm charts, deployment strategies (rolling, canary, blue-green), pod health, resource limits, scaling. Triggers on: /godmode:k8s, "deploy to kubernetes", "helm chart", "pod crashing", "OOMKilled".
npx skillsauth add arbazkhan971/godmode k8s
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
/godmode:k8s
# Gather cluster info
kubectl cluster-info
kubectl get deployments,services,ingresses \
-n <namespace>
helm list -n <namespace>
# Check resource usage
kubectl top pods -n <namespace>
kubectl top nodes
KUBERNETES CONTEXT:
Cluster: <name>, Context: <kubectl context>
Namespace: <target>, Registry: <URL>
Workloads: <N> Deployments, <N> StatefulSets
Services: <N>, Ingresses: <N>
Helm releases: <list>
IF no cluster: generate manifests for local (minikube)
IF no namespace: create with resource quotas
IF no Helm: use raw manifests for simple apps
# Dry-run validation
kubectl apply --dry-run=server -f manifests/
# Lint with kubeval
kubeval manifests/*.yaml --strict
# Security scan
kubesec scan manifests/deployment.yaml
CHART STRUCTURE:
<chart>/
Chart.yaml, values.yaml, values-{env}.yaml
templates/
deployment.yaml, service.yaml, ingress.yaml,
hpa.yaml, pdb.yaml, configmap.yaml, secret.yaml
helm lint <chart-dir>
helm template <release> <chart> -f values-prod.yaml
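| Strategy   | When to Use                           | Rollback                                         |
|------------|---------------------------------------|--------------------------------------------------|
| Rolling    | Standard, backward-compatible changes | `kubectl rollout undo` (not automatic by default) |
| Canary     | High-risk changes                     | Automatic when a gate threshold is breached      |
| Blue-Green | Need instant rollback                 | Instant (switch traffic back)                    |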
ROLLING UPDATE CONFIG:
maxSurge: 25%
maxUnavailable: 0 (zero downtime)
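The two settings above live under a Deployment's `spec.strategy`. A minimal sketch that prints them as a patch; the function name `rolling_update_patch` is illustrative and not part of the skill:

```shell
# Print a strategic-merge patch carrying the rolling-update settings above.
# The function name is an illustrative assumption.
rolling_update_patch() {
  cat <<'EOF'
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # up to 25% extra pods may exist during the rollout
      maxUnavailable: 0    # never drop below the desired replica count
EOF
}

# Possible use against a live cluster (placeholders as elsewhere in this skill):
#   rolling_update_patch > rolling-update-patch.yaml
#   kubectl patch deployment <name> -n <ns> --patch-file rolling-update-patch.yaml
```

With `maxUnavailable: 0`, old pods are only removed once their replacements report Ready, so a working readiness probe is what makes the rollout zero-downtime.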
CANARY RAMP:
5% → 20% → 50% → 80% → 100%
Gate: error rate < baseline + 0.5%
Gate: p95 latency < baseline + 10%
THRESHOLDS:
IF error rate > 5% at any stage: auto-rollback
IF p95 latency > 2x baseline: auto-rollback
IF high-risk change: always use canary
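The gates and thresholds above can be sketched as a single decision function. This is an illustration under stated assumptions: error rates are passed as percentages, "baseline + 0.5%" is read as percentage points, "baseline + 10%" is read as 1.10 times the baseline latency, and the `hold` outcome (neither promoted nor rolled back) is an assumption the skill does not spell out. The name `canary_gate` is hypothetical.

```shell
# canary_gate BASE_ERR CUR_ERR BASE_P95 CUR_P95
# Error rates are percentages (1.2 means 1.2%); both latencies use the same unit.
# Prints one of: rollback, promote, hold.
canary_gate() {
  awk -v base_err="$1" -v cur_err="$2" -v base_p95="$3" -v cur_p95="$4" 'BEGIN {
    # Hard thresholds: abort the rollout at any stage.
    if (cur_err > 5 || cur_p95 > 2 * base_p95) { print "rollback"; exit }
    # Promotion gates: both must pass to move to the next traffic step.
    if (cur_err < base_err + 0.5 && cur_p95 < base_p95 * 1.10) { print "promote"; exit }
    # Between the gates and the hard thresholds: stay at the current step (assumption).
    print "hold"
  }'
}
```

For example, `canary_gate 1.0 1.2 200 210` passes both gates, while `canary_gate 1.0 6.0 200 210` trips the 5% error threshold.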
RESOURCE SIZING:
| Metric | Recommended |
|-----------|--------------------------|
| CPU req | P95 usage + 20% buffer |
| CPU limit | 2x request (allow burst) |
| Mem req | P95 usage + 20% buffer |
| Mem limit | Peak + GC overhead |
| Pod count | min 2 for HA |
RULES:
Never set CPU limit == request (causes throttling)
Memory limit must accommodate GC overhead
Requests = P95 + 20%; CPU limit = 2x request
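As a worked example of the sizing rules above, the following sketch turns observed usage into request and limit values. The function names and the choice of millicores and MiB as units are assumptions for illustration; the GC overhead figure must be supplied by the caller because the skill does not give one.

```shell
# cpu_sizing P95_MILLICORES
# Request = P95 + 20% (rounded up), limit = 2x request.
cpu_sizing() {
  local p95_m=$1
  local request_m=$(( (p95_m * 120 + 99) / 100 ))  # integer ceiling of p95 * 1.2
  local limit_m=$(( request_m * 2 ))
  echo "cpu request=${request_m}m limit=${limit_m}m"
}

# mem_sizing P95_MIB PEAK_MIB GC_OVERHEAD_MIB
# Request = P95 + 20% (rounded up), limit = peak + GC overhead.
mem_sizing() {
  local p95_mi=$1 peak_mi=$2 gc_mi=$3
  local request_mi=$(( (p95_mi * 120 + 99) / 100 ))
  local limit_mi=$(( peak_mi + gc_mi ))
  echo "memory request=${request_mi}Mi limit=${limit_mi}Mi"
}
```

A pod with a P95 of 250m CPU would get a 300m request and a 600m limit; the resulting values could then be set with `kubectl set resources` or in the chart's values file.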
PROBE CONFIG:
Liveness: detect deadlocked processes
path: /healthz, period: 10s, threshold: 3
Readiness: gate traffic to healthy pods
path: /ready, period: 5s, threshold: 1
Startup: slow-starting containers
period: 5s, failureThreshold: 30 (= 150s max)
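The probe settings above could be written as a container-level fragment like the one below. This is a sketch with several assumptions: port 8080 and the startup probe's path are not given in the skill, and each "threshold" is read as `failureThreshold`. The helper `probe_budget_seconds` shows where the 150s figure comes from.

```shell
# Print a container-level probe fragment matching the values above.
# Port 8080 and the startup probe path are assumptions; the skill does not specify them.
probe_fragment() {
  cat <<'EOF'
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
  failureThreshold: 1
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 5
  failureThreshold: 30
EOF
}

# Worst-case time a probe tolerates before acting: period x failureThreshold.
probe_budget_seconds() {
  echo $(( $1 * $2 ))
}
```

`probe_budget_seconds 5 30` gives the 150 seconds a slow container has to start, and `probe_budget_seconds 10 3` gives the 30 seconds a deadlocked process can persist before the liveness probe restarts it.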
HPA:
Min replicas: 2 (HA), Max: based on budget
CPU target: 70%, scale up if exceeded
Scale-down stabilization: 300s (prevent flapping)
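One way to express the HPA settings above is an `autoscaling/v2` manifest. The sketch below prints one; the function name, the Deployment name, and the maximum replica count are caller-supplied assumptions, since the skill leaves the maximum to budget.

```shell
# hpa_manifest NAME MAX_REPLICAS
# Prints an autoscaling/v2 HorizontalPodAutoscaler targeting a Deployment of the same name.
hpa_manifest() {
  local name=$1 max=$2
  cat <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ${name}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${name}
  minReplicas: 2
  maxReplicas: ${max}
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
EOF
}

# Possible use (placeholders as elsewhere in this skill):
#   hpa_manifest <name> 10 | kubectl apply --dry-run=server -n <ns> -f -
```

CPU utilization targets are measured against the container's CPU request, so the HPA only behaves sensibly once requests are set.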
# Quick diagnostics
kubectl describe pod <pod> -n <ns>
kubectl logs <pod> -n <ns> --previous
kubectl top pods -n <ns>
kubectl get events -n <ns> --sort-by='.lastTimestamp'
| Symptom | First Check |
|-------------------|--------------------------|
| CrashLoopBackOff | logs --previous, probes |
| OOMKilled | increase memory limit |
| ImagePullBackOff | image name, credentials |
| Pending | resources, affinity |
| Evicted | disk pressure, quotas |
| 502/503 | readiness probe, backend |
helm upgrade --install <release> <chart> \
-f values-<env>.yaml -n <ns> \
--wait --timeout 5m
# Verify
kubectl rollout status deployment/<name> -n <ns>
kubectl get pods -n <ns>
DEPLOYMENT RESULT:
<service> in <namespace>: 3/3 Ready
Health: liveness OK, readiness OK
No error logs in last 60 seconds
Commit: "k8s: <service> — <strategy> (<N> replicas)"
Never ask to continue. Loop autonomously until done.
Never use the latest tag. Pin a SHA digest or semver.
1. kubectl context, cluster-info
2. Manifests: k8s/, manifests/, deploy/
3. Helm: charts/, Chart.yaml, values*.yaml
4. App: Dockerfile, docker-compose.yml
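The rule about the latest tag can be checked mechanically while scanning the manifests and charts listed above. A rough sketch, assuming a hypothetical helper named `image_is_pinned`; the semver test is a loose glob match, not a full semver parser:

```shell
# image_is_pinned IMAGE_REF
# Returns 0 when the reference carries a sha256 digest or a semver-style tag,
# 1 for "latest", a missing tag (which resolves to latest), or any other tag.
image_is_pinned() {
  local image=$1
  case "$image" in
    *@sha256:*) return 0 ;;            # digest-pinned
  esac
  local name=${image##*/}              # drop registry/repo so a registry port is not read as a tag
  case "$name" in
    *:*) ;;                            # has a tag, checked below
    *) return 1 ;;                     # no tag means latest
  esac
  local tag=${name##*:}
  case "$tag" in
    [0-9]*.[0-9]*.[0-9]*|v[0-9]*.[0-9]*.[0-9]*) return 0 ;;  # loose semver-style match
    *) return 1 ;;
  esac
}
```

For instance, `image_is_pinned nginx:1.25.3` succeeds, while `image_is_pinned nginx:latest` and `image_is_pinned nginx` both fail.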
Print: K8s: {resources} resources. Health: {status}. Scaling: {min}-{max}. Verdict: {verdict}.
iteration namespace resources health security status
KEEP if: validation passes AND pods Ready
AND no error logs in 60s
DISCARD if: validation fails OR pods crash
OR readiness probe fails
Rollback: helm rollback or kubectl rollout undo
STOP when ANY of:
- All pods Ready, passing probes
- Deployment strategy configured and tested
- User requests stop
- Rollback triggered (investigate first)
<!-- tier-3 -->