.claude/skills/deploy-model/SKILL.md
Deploy or redeploy an inference model on the cluster. Use when asked to deploy, update, or restart a model.
npx skillsauth add Dirty13itch/kaizen deploy-model
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Deploy or manage inference models on Kaizen. Argument should be one of: reasoning, embedding, heavy, or a manifest path.
kubectl get nodes
kubectl describe node <node> | grep -A3 nvidia.com/gpu
kubectl get pods -n inference
k8s/apps/inference/: All inference deployments
k8s/apps/inference/sglang-core.yaml: Heavy model (Qwen2.5-72B, TP=4, CORE)
kubectl apply --dry-run=client -f <manifest>
kubectl apply -f <manifest>
kubectl rollout status -n inference deploy/<name> --timeout=300s
curl http://10.10.10.10:<port>/v1/models
kubectl rollout restart -n inference deploy/<name>
kubectl rollout status -n inference deploy/<name>
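The deploy commands above can be chained into one function. The following is a minimal sketch, not the skill's actual implementation: the function name `deploy_model`, the `RUN` switch, and the idea of passing the deployment name and port as arguments are all assumptions made for illustration. By default `RUN` is `echo`, so the script only prints the commands it would run and never touches the cluster.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the validate -> apply -> wait -> verify sequence.
set -euo pipefail

NAMESPACE="inference"
# Assumption: RUN is a hypothetical switch. It defaults to "echo" (print only).
# Export RUN="" to execute the commands for real.
RUN="${RUN-echo}"

deploy_model() {
  local manifest="$1"   # path to the manifest, e.g. a file under k8s/apps/inference/
  local name="$2"       # deployment name (hypothetical, taken from the manifest)
  local port="$3"       # port serving the /v1/models endpoint (hypothetical)

  # Validate the manifest client-side before changing anything.
  $RUN kubectl apply --dry-run=client -f "$manifest"
  # Apply the manifest.
  $RUN kubectl apply -f "$manifest"
  # Block until the rollout finishes or the 300s timeout expires.
  $RUN kubectl rollout status -n "$NAMESPACE" "deploy/$name" --timeout=300s
  # Confirm the model endpoint responds; -f makes curl fail on HTTP errors.
  $RUN curl -fsS "http://10.10.10.10:${port}/v1/models"
}
```

Calling `deploy_model <manifest> <name> <port>` with the default `RUN` prints the four commands in order, which is a cheap way to review them before running with `RUN=""`. Because of `set -e`, a failed dry run stops the sequence before the real `kubectl apply`.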
testing: Pre-commit validation suite for manifests, scripts, and configs
testing: Run the full integration test suite
testing: Check Kaizen system health across all nodes and services
testing: Resume from last checkpoint (show progress, cluster state, next actions)