plugin/skills/infra-change/SKILL.md
Use this skill when changing infrastructure config (Terraform, Helm, K8s) and a destructive or production-touching operation is involved — to run the infrastructure change workflow (Terraform plan/apply, Helm diff/upgrade, Kubernetes manifest changes) with mandatory approval gates. Applies DevOps role. Safe-by-default with plan review before any mutation.
Install this skill globally with one command: `npx skillsauth add avav25/ai-assets infra-change`. Works with Claude Code, Cursor, and Windsurf.
Safe infrastructure change workflow for Terraform, Helm, and Kubernetes. Every mutation requires explicit user approval after reviewing the plan/diff. Applies Agent(devops-engineer) for all steps.
⚠️ SAFETY: No apply, upgrade, delete, or scale command runs without explicit user APPROVE.
Read CLAUDE.md (or AGENTS.md) at the project root to identify:
- cloud-platforms skill for platform-specific commands

Before running terraform apply or helm upgrade directly, check whether the repo's infra changes flow through a controller. If so, the change is a git PR, not an imperative command.
GitOps platform detection (Argo / Flux / Atlantis / HCP / Spacelift / env0) — see @gitops-detection.
OpenTofu fork detection + command substitution rules — see @terraform-procedures.
If .conftest/ (Conftest), .opa/ (raw OPA Rego), .tflint.hcl (TFLint), or tfsec.yml (tfsec) / .checkov.yml (Checkov) configs are present, run them as a pre-plan gate. A plan that violates policy never advances to apply.
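The pre-plan gate above starts with discovering which policy configs exist. A minimal sketch of that discovery step follows; the function name `detect_policy_tools` is hypothetical, and the mapping from config path to tool is taken only from the paragraph above.

```shell
# Sketch: print one tool name per line for each policy config found under
# a repo root. detect_policy_tools is a hypothetical helper, not part of
# the skill; the config paths are the ones listed in the text above.
detect_policy_tools() {
  root="${1:-.}"
  if [ -d "$root/.conftest" ];    then echo "conftest"; fi
  if [ -d "$root/.opa" ];         then echo "opa";      fi
  if [ -f "$root/.tflint.hcl" ];  then echo "tflint";   fi
  if [ -f "$root/tfsec.yml" ];    then echo "tfsec";    fi
  if [ -f "$root/.checkov.yml" ]; then echo "checkov";  fi
  return 0
}
```

A caller would run each reported tool before planning and stop at the first non-zero exit, so a policy violation never reaches the apply step.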
Ask the user (or extract from parent workflow context):
If Risk = HIGH or Environment = production:
⚠️ HIGH-RISK INFRASTRUCTURE CHANGE
Destructive actions require explicit "APPROVE" before execution.
Affected: [list resources/services]
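The banner condition above (Risk = HIGH or Environment = production) can be sketched as a small predicate. Both function names are hypothetical, and the case-insensitive matching is an assumption, since the skill does not say how the values are normalised.

```shell
# Sketch: succeed (exit 0) when the high-risk banner must be shown.
# Assumption: risk and environment are compared case-insensitively.
needs_high_risk_banner() {
  risk=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  environment=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  [ "$risk" = "HIGH" ] || [ "$environment" = "production" ]
}

# Print the banner text from the skill; $1 is the affected-resources list.
print_high_risk_banner() {
  printf '%s\n' "⚠️ HIGH-RISK INFRASTRUCTURE CHANGE"
  printf '%s\n' 'Destructive actions require explicit "APPROVE" before execution.'
  printf 'Affected: %s\n' "$1"
}
```

For example, `needs_high_risk_banner low production && print_high_risk_banner "payments namespace"` would print the banner, because the environment alone is enough to trigger it.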
Apply Agent(devops-engineer). For production changes, also apply Agent(sre-engineer) for SLO impact assessment. For cloud infrastructure design (landing zones, networking topology, IAM), consult Agent(cloud-architect). For CI/CD pipeline architecture changes, consult Agent(devops-architect).
Before making changes, capture the current state:
Terraform state operations + plan/apply lifecycle + OpenTofu compatibility — see @terraform-procedures.
Helm diff + atomic upgrade + rollback procedures — see @helm-procedures.
// turbo
kubectl get deployments,services,ingress -n <namespace>
kubectl get pods -n <namespace> -o wide
Record: Current resource counts, versions, replicas, config values as baseline.
Make the infrastructure code changes following Agent(devops-engineer) standards:
For Terraform:
.tf files as needed
// turbo
terraform fmt -recursive
terraform validate
For Helm:
values.yaml or chart templates
// turbo
helm lint <chart-path>
For raw Kubernetes manifests:
kubectl apply --dry-run=client -f <manifest>
Generate the plan/diff for the tool in use and present a summary table to the user.
- @terraform-procedures for terraform plan -out=tfplan and the plan-summary format (Add / Change / Destroy table, destroy/replace highlight, data-loss flag).
- @helm-procedures for helm diff upgrade and the diff-summary format (per-object Action / Key Changes table, critical-changes highlight, downtime flag).
- kubectl diff -f <manifest>
WARNING — STOP. Present plan/diff summary and request APPROVE before proceeding to Step 6.
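One way to make this stop enforceable in a script is a gate that succeeds only on the literal token. The sketch below assumes the approval is typed on standard input and that the match is exact and case-sensitive; `require_approve` is a hypothetical name, and the skill itself only requires an explicit APPROVE.

```shell
# Sketch: read one line from stdin and succeed only if it is exactly
# "APPROVE". Anything else, including empty input or EOF, is a refusal.
require_approve() {
  printf 'Type APPROVE to continue: ' >&2
  answer=""
  IFS= read -r answer || :
  [ "$answer" = "APPROVE" ]
}

# Intended use (not executed here): chain the mutating command behind the gate.
#   require_approve && terraform apply tfplan
```

Defaulting to refusal keeps the workflow safe when the script runs non-interactively, because a closed or empty stdin never counts as approval.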
Only after the user explicitly approves:
- terraform apply tfplan (see @terraform-procedures)
- helm upgrade ... --atomic --timeout 5m --wait (see @helm-procedures)
- kubectl apply -f <manifest>

Rules:
After applying, verify the changes took effect:
// turbo
kubectl get pods -n <namespace> -o wide
kubectl get events -n <namespace> --sort-by='.lastTimestamp' --field-selector type!=Normal
Check:
If production — monitor SLIs for 5–10 minutes after apply.
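The pod listing from the verification step can be reduced to a list of pods that need attention. This is a rough sketch, assuming the default `kubectl get pods` column order (NAME, READY, STATUS first); `unhealthy_pods` is a hypothetical helper, and treating Completed pods as healthy is an assumption.

```shell
# Sketch: read `kubectl get pods` output on stdin and print the names of
# pods that are not Running or whose READY count is incomplete (e.g. 1/2).
# Assumption: finished Job pods (STATUS Completed) are not failures.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")
    if ($3 == "Completed") next
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Intended use (not executed here):
#   kubectl get pods -n <namespace> -o wide | unhealthy_pods
```

An empty result suggests the rollout settled; any name printed is a candidate for `kubectl describe` before the change is considered verified.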
Document the rollback before considering the change complete:
- .tf files and re-apply (see @terraform-procedures)
- helm rollback <release> <previous-revision> -n <namespace> (see @helm-procedures)
- kubectl rollout undo deployment/<name> -n <namespace>

## Infrastructure Change Summary
- **Change**: [what was changed]
- **Tool**: [Terraform / Helm / kubectl]
- **Environment**: [dev / staging / production]
- **Risk**: [low / medium / high]
- **Plan reviewed**: [yes — summary of add/change/destroy]
- **Applied**: [yes/no — with APPROVE]
- **Verification**: [pass/fail]
- **Rollback plan**: [documented above]
- **Next steps**: [monitoring, follow-up changes]
For multi-step infra changes that may need iterative reconciliation (e.g., Terraform plans that depend on prior apply outputs), this skill MAY run inside /ralph per ralph-budget.md rule. Default per-workflow caps: 4 iter / 200K tokens / 45 min. Mandatory --kill-on oracle-pass (oracle: terraform plan -detailed-exitcode returns 0 = no diff).
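The oracle and the iteration cap described above can be sketched as a bounded loop. This models only the 4-iteration cap, not the token or time budgets, and it is not how /ralph is implemented; `classify_plan_exit` and `reconcile` are hypothetical names. The exit-code meanings are those of `terraform plan -detailed-exitcode`: 0 for an empty diff, 1 for an error, 2 for a non-empty diff.

```shell
# Map a `terraform plan -detailed-exitcode` status to a label.
classify_plan_exit() {
  case "$1" in
    0) echo "no-diff" ;;
    2) echo "diff" ;;
    *) echo "error" ;;
  esac
}

# Sketch: run the oracle, and while it still reports a diff, run one more
# step, up to a cap. $1 = max iterations, $2 = oracle command or function,
# $3 = step command or function (each invoked with no arguments).
reconcile() {
  max_iter="$1"; oracle="$2"; step="$3"
  i=0
  while :; do
    status=0
    "$oracle" || status=$?
    case "$(classify_plan_exit "$status")" in
      no-diff) echo "oracle passed after $i iteration(s)"; return 0 ;;
      error)   echo "oracle errored (exit $status)" >&2; return 1 ;;
    esac
    if [ "$i" -ge "$max_iter" ]; then
      echo "iteration cap ($max_iter) reached" >&2
      return 1
    fi
    "$step"
    i=$((i + 1))
  done
}
```

In practice the oracle would be a small wrapper function around `terraform plan -detailed-exitcode`, and the step would be one approved plan and apply cycle, so each pass still goes through the approval gate.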
- Agent(devops-engineer) (primary), Agent(sre-engineer) (review), Agent(cloud-architect) (cloud design review), Agent(devops-architect) (CI/CD pipeline architecture)
- @terraform-procedures, @helm-procedures, @gitops-detection, @cloud-platforms
- /plan (infra work stream), /architecture (cloud architecture design)
- /deploy-staging, /deploy-production
- ralph-budget (caps for iterative reconciliation per "RALF Loop" section)