skills/k8s-debug-pods/SKILL.md
Debug Kurtosis pods on Kubernetes. Diagnose why pods are Pending, CrashLoopBackOff, ImagePullBackOff, or Evicted. Check node taints, tolerations, resource pressure, and pod events. Use when kurtosis engine start fails or pods aren't coming online.
Diagnose and fix issues with Kurtosis pods on Kubernetes.
# See all kurtosis-related pods across namespaces
kubectl get pods -A | grep kurtosis
# Check for problem pods (not Running)
kubectl get pods -A | grep kurtosis | grep -v Running
# Get events for a specific pod
kubectl describe pod <POD_NAME> -n <NAMESPACE> | tail -30
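The `grep -v Running` filter above still shows `Completed` pods and hides pods that are `Running` but not fully ready. A minimal sketch of a stricter filter, assuming the default `kubectl get pods -A` column order (NAMESPACE, NAME, READY, STATUS); the function name `kurtosis_problem_pods` is made up for this example:

```shell
# Hypothetical helper: reads `kubectl get pods -A --no-headers` output on stdin
# and prints Kurtosis pods that look unhealthy.
# Assumes the default columns: NAMESPACE NAME READY STATUS RESTARTS AGE.
kurtosis_problem_pods() {
  awk '
    # Keep rows whose namespace or pod name mentions kurtosis.
    ($1 ~ /kurtosis/ || $2 ~ /kurtosis/) {
      if ($4 == "Completed") next      # finished pods are not a problem
      split($3, ready, "/")            # READY looks like "1/2"
      if ($4 != "Running" || ready[1] != ready[2])
        print $1, $2, $3, $4
    }
  '
}
```

Usage: `kubectl get pods -A --no-headers | kurtosis_problem_pods`. Each output line carries the namespace and pod name needed for the `kubectl describe pod` step.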
Pending: the pod can't be scheduled because of node taints, resource pressure, or affinity rules.
# Check node taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
# Check node conditions (DiskPressure, MemoryPressure, etc.) and whether each one is True
kubectl get nodes -o custom-columns='NAME:.metadata.name,CONDITIONS:.status.conditions[*].type,STATUS:.status.conditions[*].status'
Fix: Add tolerations to the Kurtosis config (on macOS, ~/Library/Application Support/kurtosis/kurtosis-config.yml; run kurtosis config path to locate it on other systems) or fix the node condition.
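Before changing tolerations, it helps to read the scheduler's own explanation, which is recorded as a `FailedScheduling` event on the pod. A minimal sketch, assuming the events have not yet expired from the cluster; the function name `why_pending` is made up for this example:

```shell
# Hypothetical helper: print the FailedScheduling messages for one pod.
# Usage: why_pending <POD_NAME> <NAMESPACE>
why_pending() {
  kubectl get events -n "$2" \
    --field-selector "involvedObject.name=$1,reason=FailedScheduling" \
    -o jsonpath='{range .items[*]}{.message}{"\n"}{end}'
}
```

The message usually names the blocking taint or the exhausted resource, which indicates whether a toleration or a node fix is the right remedy.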
ImagePullBackOff: the image can't be pulled, typically because the tag doesn't exist on the registry.
# Check which image is failing
kubectl describe pod <POD_NAME> -n <NAMESPACE> | grep -A5 "Image:"
# Verify image exists on Docker Hub
docker manifest inspect <IMAGE>:<TAG>
Fix: Push the correct image tag, or fix the image reference in the code.
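When a pod has several containers, or an init container such as the files artifact expander, each image can be checked in one pass. A minimal sketch, assuming `docker` is logged in to any private registry involved; the function name `check_pod_images` and the `ok`/`unresolved` labels are made up for this example:

```shell
# Hypothetical helper: list every image a pod references and report which
# ones the registry cannot resolve.
# Usage: check_pod_images <POD_NAME> <NAMESPACE>
check_pod_images() {
  # initContainers may be absent; kubectl ignores missing keys by default.
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .spec.initContainers[*]}{.image}{"\n"}{end}{range .spec.containers[*]}{.image}{"\n"}{end}' |
  while read -r image; do
    [ -n "$image" ] || continue
    # </dev/null keeps docker from consuming the loop's input.
    if docker manifest inspect "$image" >/dev/null 2>&1 </dev/null; then
      echo "ok $image"
    else
      echo "unresolved $image"
    fi
  done
}
```

An `unresolved` line can also mean an authentication failure rather than a missing tag, so confirm with the pod events before pushing a new image.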
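CrashLoopBackOff: the container starts but crashes immediately, and Kubernetes keeps restarting it.
# Check container logs (--previous shows the last crashed container instance)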
kubectl logs <POD_NAME> -n <NAMESPACE>
kubectl logs <POD_NAME> -n <NAMESPACE> --previous
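If the logs are empty, the exit code of the last crash narrows things down. It can be read with `kubectl get pod <POD_NAME> -n <NAMESPACE> -o jsonpath='{.status.containerStatuses[*].lastState.terminated.exitCode}'`. A minimal sketch of interpreting that number, relying on the convention that codes above 128 mean "killed by signal (code minus 128)"; the function name `explain_exit_code` is made up for this example:

```shell
# Hypothetical helper: translate a container exit code into a likely cause.
# Usage: explain_exit_code <EXIT_CODE>
explain_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    case "$sig" in
      9)  echo "killed by SIGKILL (often the OOM killer; check memory limits)" ;;
      11) echo "killed by SIGSEGV (segmentation fault)" ;;
      15) echo "killed by SIGTERM (asked to shut down)" ;;
      *)  echo "killed by signal $sig" ;;
    esac
  elif [ "$code" -eq 0 ]; then
    echo "exited cleanly (a long-running service should not exit)"
  else
    echo "application error (exit code $code); read the --previous logs"
  fi
}
```

For example, `explain_exit_code 137` points at SIGKILL, which on a node under memory pressure usually means the container exceeded its memory limit.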
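Evicted: the node evicted the pod due to resource pressure.
# Check which nodes have pressure (prints only nodes with an active *Pressure condition)
kubectl get nodes -o jsonpath='{range .items[*]}{@.metadata.name}:{range @.status.conditions[*]}{@.type}={@.status};{end}{"\n"}{end}' | grep 'Pressure=True'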
# Clean up evicted pods
kubectl get pods -A | grep Evicted | awk '{print $2 " -n " $1}' | xargs -L1 kubectl delete pod
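The cleanup one-liner deletes immediately, across every namespace. A minimal sketch of a review-first variant that only prints the delete commands, assuming the default `kubectl get pods -A` column order; the function name `evicted_delete_commands` is made up for this example:

```shell
# Hypothetical helper: reads `kubectl get pods -A --no-headers` output on stdin
# and prints one delete command per evicted pod, without running anything.
evicted_delete_commands() {
  # Column 4 is STATUS, column 2 the pod name, column 1 the namespace.
  awk '$4 == "Evicted" { print "kubectl delete pod " $2 " -n " $1 }'
}
```

Usage: run `kubectl get pods -A --no-headers | evicted_delete_commands`, check the list, then pipe the same command into `sh` to execute it.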
| Pod pattern | Component | Image source |
|-------------|-----------|-------------|
| kurtosis-engine-* | Engine server | engine/server/Dockerfile |
| kurtosis-api (in kt-* namespaces) | API Container (APIC) | core/server/Dockerfile |
| kurtosis-logs-collector-* | Fluentbit DaemonSet | Pulled from registry |
| kurtosis-logs-aggregator-* | Vector deployment | Pulled from registry |
| remove-dir-pod-* | Fluentbit cleanup pods | busybox |
| files-artifact-expander (init container) | Files artifacts | core/files_artifacts_expander/Dockerfile |
If kurtosis engine start fails:
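# List leftover Kurtosis namespaces
kubectl get ns | grep kurtosis
# Delete them (xargs -r skips the delete when nothing matches)
kubectl get ns | grep kurtosis | awk '{print $1}' | xargs -r kubectl delete ns
The logs collector is a DaemonSet that runs on every node. If some nodes are unhealthy: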
# Check DaemonSet status
kubectl get ds -A | grep kurtosis
# See which pods are not running
kubectl get pods -A | grep logs-collector | grep -v Running
Collector pods may not be scheduled on nodes with DiskPressure or other taints. This is expected: the engine should still start, with a warning about partially degraded log collection.
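To quantify how degraded collection is, the DESIRED and READY counts of the DaemonSet can be compared. A minimal sketch, assuming the default `kubectl get ds -A` column order (NAMESPACE, NAME, DESIRED, CURRENT, READY); the function name `degraded_daemonsets` is made up for this example:

```shell
# Hypothetical helper: reads `kubectl get ds -A --no-headers` output on stdin
# and prints Kurtosis DaemonSets with fewer ready pods than desired.
degraded_daemonsets() {
  awk '($1 ~ /kurtosis/ || $2 ~ /kurtosis/) && ($5 + 0 < $3 + 0) {
    print $1 "/" $2 ": " $5 " of " $3 " ready"
  }'
}
```

Usage: `kubectl get ds -A --no-headers | degraded_daemonsets`. No output means every desired collector pod is ready.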