skills/community/kubernetes-specialist/SKILL.md
Use when deploying or managing Kubernetes workloads. Invoke to create deployment manifests, configure pod security policies, set up service accounts, define network isolation rules, debug pod crashes, analyze resource limits, inspect container logs, or right-size workloads. Use for Helm charts, RBAC policies, NetworkPolicies, storage configuration, performance optimization, GitOps pipelines, and multi-cluster management.
npx skillsauth add pedronauck/skills kubernetes-specialistInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
kubectl rollout status, kubectl get pods -w, and kubectl describe pod <name> to confirm health; roll back with kubectl rollout undo if neededLoad detailed guidance based on context:
| Topic | Reference | Load When |
| ----------------- | --------------------------------- | ---------------------------------------------------------------- |
| Workloads | references/workloads.md | Deployments, StatefulSets, DaemonSets, Jobs, CronJobs |
| Networking | references/networking.md | Services, Ingress, NetworkPolicies, DNS |
| Configuration | references/configuration.md | ConfigMaps, Secrets, environment variables |
| Storage | references/storage.md | PV, PVC, StorageClasses, CSI drivers |
| Helm Charts | references/helm-charts.md | Chart structure, values, templates, hooks, testing, repositories |
| Troubleshooting | references/troubleshooting.md | kubectl debug, logs, events, common issues |
| Custom Operators | references/custom-operators.md | CRD, Operator SDK, controller-runtime, reconciliation |
| Service Mesh | references/service-mesh.md | Istio, Linkerd, traffic management, mTLS, canary |
| GitOps | references/gitops.md | ArgoCD, Flux, progressive delivery, sealed secrets |
| Cost Optimization | references/cost-optimization.md | VPA, HPA tuning, spot instances, quotas, right-sizing |
| Multi-Cluster | references/multi-cluster.md | Cluster API, federation, cross-cluster networking, DR |
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
namespace: my-namespace
labels:
app: my-app
version: "1.2.3"
spec:
replicas: 3
selector:
matchLabels:
app: my-app
template:
metadata:
labels:
app: my-app
version: "1.2.3"
spec:
serviceAccountName: my-app-sa # never use default SA
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 2000
containers:
- name: my-app
image: my-registry/my-app:1.2.3 # never use latest
ports:
- containerPort: 8080
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "512Mi"
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 15
periodSeconds: 20
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 10
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: ["ALL"]
envFrom:
- secretRef:
name: my-app-secret # pull credentials from Secret, not ConfigMap
apiVersion: v1
kind: ServiceAccount
metadata:
name: my-app-sa
namespace: my-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: my-app-role
namespace: my-namespace
rules:
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "list"] # grant only what is needed
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: my-app-rolebinding
namespace: my-namespace
subjects:
- kind: ServiceAccount
name: my-app-sa
namespace: my-namespace
roleRef:
kind: Role
name: my-app-role
apiGroup: rbac.authorization.k8s.io
# Deny all ingress and egress by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: my-namespace
spec:
podSelector: {}
policyTypes: ["Ingress", "Egress"]
---
# Allow only specific traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-my-app
namespace: my-namespace
spec:
podSelector:
matchLabels:
app: my-app
policyTypes: ["Ingress"]
ingress:
- from:
- podSelector:
matchLabels:
app: frontend
ports:
- protocol: TCP
port: 8080
After deploying, verify health and security posture:
# Watch rollout complete
kubectl rollout status deployment/my-app -n my-namespace
# Stream pod events to catch crash loops or image pull errors
kubectl get pods -n my-namespace -w
# Inspect a specific pod for failures
kubectl describe pod <pod-name> -n my-namespace
# Check container logs
kubectl logs <pod-name> -n my-namespace --previous # use --previous for crashed containers
# Verify resource usage vs. limits
kubectl top pods -n my-namespace
# Audit RBAC permissions for a service account
kubectl auth can-i --list --as=system:serviceaccount:my-namespace:my-app-sa
# Roll back a failed deployment
kubectl rollout undo deployment/my-app -n my-namespace
When implementing Kubernetes resources, provide:
tools
Plans real-user QA deliverables: personas, journey maps, exploratory charters, persona/journey/tour/CFR test cases, regression suites, Figma validation checks, automation intent, and user-impact bug reports. Writes artifacts under <qa-output-path>/qa/ for qa-execution to consume. Use when planning QA before execution, documenting journey-driven test strategy, marking flows that need E2E follow-up, or filing structured bug reports. Do not use for live execution, AI implementation audits, CI gate ownership, or technical integration/security/performance suites; use qa-execution or agent-output-audit instead.
development
Executes real-user QA sessions through public interfaces using personas, journeys, exploratory charters, test tours, edge-case probes, CFR checks, and browser evidence. Reads qa-report artifacts from <qa-output-path>/qa/ when present, captures issues/screenshots/reports under the same output tree, and classifies bugs by user impact. Use when validating a release candidate, migration, refactor, or user-facing change against production-like behavior. Do not use for AI implementation audits, task-status reconciliation, CI gate runs, integration/security/performance templates, or flaky-test triage; use agent-output-audit for those.
development
Transform outside-of-diff review files into properly formatted issue files for a given PR. Use when converting review files from ai-docs/reviews-pr-<PR>/outside/ into issue format in ai-docs/reviews-pr-<PR>/issues/. Automatically determines starting issue number and preserves all metadata (file path, date, status) from original review files. Don't use for inline-diff review files, non-PR review artifacts, or creating GitHub issues directly.
development
Enforce root-cause fixes over workarounds, hacks, and symptom patches in all software engineering tasks. Use when debugging issues, fixing bugs, resolving test failures, planning solutions, making architectural decisions, or reviewing code changes. Activates gate functions that detect and reject common workaround patterns such as type assertions, lint suppressions, error swallowing, timing hacks, and monkey patches. Don't use for trivial formatting changes or documentation-only edits.