kubernetes-plugin/skills/helm-release-recovery/SKILL.md
Recover from failed Helm deployments — rollback, fix stuck states (pending-install/upgrade), atomic deployments. Use when the user mentions rollback, failed Helm upgrade, or stuck releases.
npx skillsauth add laurigates/claude-plugins helm-release-recovery
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Comprehensive guidance for recovering from failed Helm deployments, rolling back releases, and managing stuck or corrupted release states.
| Use this skill when... | Use <sibling> instead when... |
|---|---|
| Rolling back a failed helm upgrade or restoring a previous revision | Use helm-release-management for normal install/upgrade/uninstall flows that aren't broken |
| Releasing a stuck pending-install or pending-upgrade state | Use helm-debugging when the underlying template, values, or rendered manifest is the actual root cause |
| Cleaning up corrupted release secrets or partial deployments | Use kubectl-debugging when recovery requires inspecting individual pods or nodes |
Use this skill automatically when the user mentions a rollback, a failed Helm upgrade, or a stuck release.
# Rollback to the previous revision (current revision minus one, which is not necessarily the last successful one)
helm rollback <release> --namespace <namespace>
# Rollback to specific revision number
helm rollback <release> 3 --namespace <namespace>
# Rollback, wait for readiness, and clean up new resources if the rollback fails
helm rollback <release> \
--namespace <namespace> \
--wait \
--timeout 5m \
--cleanup-on-fail
# Rollback without running hooks (faster but less safe)
helm rollback <release> \
--namespace <namespace> \
--no-hooks
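The flag combinations above can be wrapped in a small shell function so that every rollback uses the same safety flags. This is a minimal sketch, not part of the skill itself: the function name `safe_rollback` and the `ROLLBACK_TIMEOUT` variable are hypothetical, and it assumes `helm` is on the PATH.

```shell
# safe_rollback RELEASE NAMESPACE [REVISION]
# Hypothetical helper: rolls back with --wait, --timeout and --cleanup-on-fail.
# With no REVISION, Helm rolls back to the previous revision.
safe_rollback() {
  if [ "$#" -lt 2 ]; then
    echo "usage: safe_rollback RELEASE NAMESPACE [REVISION]" >&2
    return 2
  fi
  local release="$1" namespace="$2" revision="${3:-}"
  # ${revision:+...} expands to nothing when no revision was given
  helm rollback "$release" ${revision:+"$revision"} \
    --namespace "$namespace" \
    --wait \
    --timeout "${ROLLBACK_TIMEOUT:-5m}" \
    --cleanup-on-fail
}
```

For example, `safe_rollback myapp production 2` rolls back to revision 2, while `ROLLBACK_TIMEOUT=10m safe_rollback myapp production` rolls back one revision with a longer timeout.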
Key Flags:
- `--wait` - Wait for resources to be ready
- `--timeout` - Maximum time to wait (default 5m)
- `--cleanup-on-fail` - Delete new resources on failed rollback
- `--no-hooks` - Skip running rollback hooks
- `--force` - Force resource updates through deletion/recreation
- `--recreate-pods` - Perform pods restart for the resource if applicable

# View all revisions
helm history <release> --namespace <namespace>
# View detailed history (YAML format)
helm history <release> \
--namespace <namespace> \
--output yaml
# Limit number of revisions shown
helm history <release> \
--namespace <namespace> \
--max 10
History Output Fields: REVISION, UPDATED, STATUS, CHART, APP VERSION, DESCRIPTION
# Check current release status
helm status <release> --namespace <namespace>
# Show deployed resources
helm status <release> \
--namespace <namespace> \
--show-resources
# Get status of specific revision
helm status <release> \
--namespace <namespace> \
--revision 5
Symptoms:
Recovery Steps:
# 1. Check release status
helm status myapp --namespace production
# 2. View history to identify good revision
helm history myapp --namespace production
# Output example (columns abbreviated):
# REVISION  STATUS      CHART        DESCRIPTION
# 1         superseded  myapp-1.0.0  Install complete
# 2         deployed    myapp-1.1.0  Upgrade complete
# 3         failed      myapp-1.2.0  Upgrade "myapp" failed
# 3. Rollback to previous working revision (2)
helm rollback myapp 2 \
--namespace production \
--wait \
--timeout 5m
# 4. Verify rollback
helm history myapp --namespace production
# 5. Verify application health
kubectl get pods -n production -l app.kubernetes.io/instance=myapp
helm status myapp --namespace production
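Steps 2 to 4 above can be sketched as a pair of shell functions that pick the rollback target from the history instead of reading it by eye. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and the parsing assumes the default `helm history` table is tab-separated with REVISION in the first column and STATUS in the third. If that does not hold for your Helm version, parsing `--output json` is the safer route.

```shell
# last_good_revision RELEASE NAMESPACE
# Prints the last listed revision whose STATUS is "deployed" or "superseded"
# (helm history lists the oldest revision first).
# Assumption: tab-separated table, REVISION in column 1, STATUS in column 3.
last_good_revision() {
  helm history "$1" --namespace "$2" |
    awk -F '\t' '
      NR > 1 {
        rev = $1; status = $3
        gsub(/[ ]+/, "", rev)      # strip column padding
        gsub(/[ ]+/, "", status)
        if (status == "deployed" || status == "superseded") good = rev
      }
      END { if (good == "") exit 1; print good }
    '
}

# rollback_to_last_good RELEASE NAMESPACE
# Hypothetical wrapper: pick the revision, then roll back to it.
rollback_to_last_good() {
  local revision
  revision="$(last_good_revision "$1" "$2")" || {
    echo "no deployed or superseded revision found for $1" >&2
    return 1
  }
  helm rollback "$1" "$revision" --namespace "$2" --wait --timeout 5m
}
```

This is meant for the case where the newest revision has failed. If the newest revision is already `deployed`, the function selects that same revision, so check `helm status` first.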
Symptoms:
helm list -n production
# NAME STATUS CHART
# myapp pending-upgrade myapp-1.0.0
Recovery Steps:
# 1. Check what's actually deployed
kubectl get all -n production -l app.kubernetes.io/instance=myapp
# 2. Check release history
helm history myapp --namespace production
# Option A: Rollback to previous working revision
helm rollback myapp <previous-working-revision> \
--namespace production \
--wait
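# Option B: Force a new upgrade to unstick the release
# (Helm may reject this with "another operation (install/upgrade/rollback) is in progress"; if so, use Option A or C)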
helm upgrade myapp ./chart \
--namespace production \
--force \
--wait \
--atomic
# Option C: If rollback fails, uninstall and reinstall
# WARNING: This will cause downtime
helm uninstall myapp --namespace production --keep-history
# --replace lets install reuse a name that is still present in the kept history
helm install myapp ./chart --namespace production --atomic --replace
# 3. Verify recovery
helm status myapp --namespace production
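Before applying Option A from a script, it can help to confirm that the release really is stuck. The sketch below uses `helm list --pending` with a name filter for that check. The function names `is_release_pending` and `unstick_release` are hypothetical, and the target revision is still something you choose from `helm history`.

```shell
# is_release_pending RELEASE NAMESPACE
# Succeeds when the release is in a pending-install, pending-upgrade, or
# pending-rollback state, using `helm list --pending` with a name filter.
is_release_pending() {
  local match
  match="$(helm list --namespace "$2" --pending --filter "^$1\$" --short)"
  [ -n "$match" ]
}

# unstick_release RELEASE NAMESPACE REVISION
# Hypothetical wrapper for Option A: roll back only if the release is stuck.
unstick_release() {
  if is_release_pending "$1" "$2"; then
    helm rollback "$1" "$3" --namespace "$2" --wait
  else
    echo "$1 is not in a pending state; nothing to do"
  fi
}
```

For example, `unstick_release myapp production 2` rolls back to revision 2 only when `myapp` is currently pending, which avoids an unnecessary rollback of a healthy release.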
For additional recovery scenarios (partial deployments, corrupted history, failed rollbacks, cascading failures), history management, atomic deployment patterns, recovery best practices, troubleshooting, and CI/CD integration, see REFERENCE.md.
| Context | Command |
|---------|---------|
| Release history (JSON) | helm history <release> -n <ns> --output json |
| Release status (JSON) | helm status <release> -n <ns> -o json |
| Revision values (JSON) | helm get values <release> -n <ns> --revision <N> -o json |
| Pod status (compact) | kubectl get pods -n <ns> -l app.kubernetes.io/instance=<release> -o wide |
| Helm secrets (list) | kubectl get secrets -n <ns> -l owner=helm,name=<release> -o json |
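When a release secret from the last row looks corrupted, inspecting the stored record can confirm it. The sketch below assumes Helm 3's default Secret storage driver, where each revision lives in a secret named `sh.helm.release.v1.<release>.v<revision>` and the `release` field holds base64-encoded, gzipped JSON (which Kubernetes base64-encodes once more). The function names are hypothetical, and `base64 -d` assumes GNU coreutils or a compatible implementation.

```shell
# release_secret_name RELEASE REVISION
# Prints the secret name Helm 3 uses for one revision of a release.
release_secret_name() {
  printf 'sh.helm.release.v1.%s.v%s\n' "$1" "$2"
}

# decode_release_payload
# Reads a secret's .data.release value on stdin and prints the release JSON.
# First base64 layer: Kubernetes secret encoding. Second: Helm's own encoding.
decode_release_payload() {
  base64 -d | base64 -d | gunzip -c
}

# show_release_record RELEASE NAMESPACE REVISION
# Hypothetical helper: fetch one revision's secret and decode it.
show_release_record() {
  kubectl get secret "$(release_secret_name "$1" "$3")" \
    --namespace "$2" \
    -o jsonpath='{.data.release}' | decode_release_payload
}
```

For example, `show_release_record myapp production 3` prints the stored record for revision 3. A decode failure at this point is a strong hint that the secret itself is damaged.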