
Full cluster state audit. Use when asked for cluster status, a health check, or an infrastructure audit. Produces a comprehensive report of all nodes, pods, services, PVCs, GPUs, and issues.
Deploy Kaizen components with safety checks and validation
Check GPU allocation and utilization across all nodes. Use when asked about GPUs, VRAM, or model capacity.
Development workflow for Kaizen - branching, commits, testing
Run the E2E memory pipeline test: embed text, store it in Qdrant, then search for and retrieve it. Use to validate that the memory system is working.
Resume from the last checkpoint: show progress, cluster state, and next actions
Check Kaizen system health across all nodes and services
Run the full integration test suite
Emergency rollback for Kaizen deployments
Deploy or redeploy an inference model on the cluster. Use when asked to deploy, update, or restart a model.
---
name: kaizen-infrastructure
description: Kaizen infrastructure management. Use when working on Kubernetes manifests, Talos configuration, Flux GitOps, or cluster operations. Triggers on: k8s, kubernetes, flux, talos, cluster, node, deployment, helm, kustomize.
allowed-tools: Read, Write, Bash, Grep, Glob
---

# Kaizen Infrastructure Skill

## When to Use

- Creating or modifying Kubernetes manifests
- Configuring Talos Linux nodes
- Setting up Flux GitOps resources
- Deploying applications to
Launch autonomous overnight build session with Ralph Wiggum
Pre-commit validation suite for manifests, scripts, and configs
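
The E2E memory pipeline test listed above (embed text, store it, search, retrieve) can be sketched as a round-trip check. This is a minimal illustration and not the Kaizen test itself: `embed`, `InMemoryStore`, and `run_roundtrip` are hypothetical names, the hashed bag-of-words embedding stands in for a real embedding model, and the in-memory store stands in for Qdrant so the sketch runs with the standard library alone.

```python
import hashlib
import math

DIM = 64  # toy vector size, chosen only for this sketch


def embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class InMemoryStore:
    """Hypothetical stand-in for a vector store such as Qdrant."""

    def __init__(self) -> None:
        self._points: dict[str, tuple[list[float], str]] = {}

    def upsert(self, point_id: str, vector: list[float], text: str) -> None:
        # Insert or overwrite the point with this id.
        self._points[point_id] = (vector, text)

    def search(self, query: list[float], limit: int = 1) -> list[tuple[str, float, str]]:
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [
            (pid, sum(a * b for a, b in zip(query, vec)), text)
            for pid, (vec, text) in self._points.items()
        ]
        scored.sort(key=lambda hit: hit[1], reverse=True)
        return scored[:limit]


def run_roundtrip(store: InMemoryStore, probe_id: str, probe_text: str) -> bool:
    """Embed the probe, store it, search with the same text, and check the top hit."""
    store.upsert(probe_id, embed(probe_text), probe_text)
    hits = store.search(embed(probe_text), limit=1)
    return bool(hits) and hits[0][0] == probe_id and hits[0][2] == probe_text


if __name__ == "__main__":
    ok = run_roundtrip(InMemoryStore(), "probe-1", "kaizen memory probe sentence")
    print("memory round trip passed" if ok else "memory round trip FAILED")
```

Against a real deployment, the same three steps would call the actual embedding service and the Qdrant collection instead of these stand-ins. The pass condition (the stored probe comes back as the top hit with its original text) carries over unchanged.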