skills/terraform-skill/SKILL.md
Use when writing, reviewing, or debugging Terraform/OpenTofu modules, tests, CI, scans, or state ops — diagnoses failure mode (identity churn, secrets, blast radius, CI drift, state corruption) with version-aware guards.
Diagnose-first guidance for Terraform and OpenTofu. Core file is a workflow; depth lives in references loaded on demand.
Every Terraform/OpenTofu response must include:
- `terraform` or `tofu`), exact version, providers, state backend, execution path (local/CI/Cloud/Atlantis), environment criticality. State assumptions explicitly if the user did not provide them.
- `fmt -check`, `validate`, `plan -out`, policy check) tailored to runtime and risk tier.
- `moved`, `import`), CI changes, policy rules.

Never recommend direct production apply without a reviewed plan artifact and approval.

| Failure category | Symptoms | Primary references |
|------------------|----------|--------------------|
| Identity churn | Resource addresses shift after refactor, count index churn, missing moved blocks | Code Patterns: count vs for_each, Code Patterns: moved blocks, Code Patterns: LLM mistakes |
| Secret exposure | Secrets in defaults, state, logs, CI artifacts | Security & Compliance, Code Patterns: write-only, State Management |
| Blast radius | Oversized stacks, shared prod/non-prod state, unsafe applies | State Management, Module Patterns |
| CI drift | Local plan ≠ CI plan, apply without reviewed artifact, unpinned versions | CI/CD Workflows, Code Patterns: versions |
| Compliance gaps | Missing policy stage, no approval model, no evidence retention | Security & Compliance, CI/CD Workflows |
| Testing blind spots | Plan-only validation of computed values, set-type indexing, mock/real confusion | Testing Frameworks |
| State corruption / recovery | Stuck lock, backend migration, drift reconciliation | State Management |
| Provider upgrade risk | Breaking-change provider bump, unpinned modules | Code Patterns: versions, Module Patterns |
| Provider lifecycle | Removing a provider with resources still in state, orphaned resources, removed block usage | State Management: Provider Removal |
Activate when: creating or reviewing Terraform/OpenTofu configurations or modules, setting up or debugging tests, structuring multi-environment deployments, implementing IaC CI/CD, choosing module patterns or state organization, configuring or migrating remote state backends.
Don't use for: basic HCL syntax questions Claude already knows, provider API reference (link to docs), cloud-platform questions unrelated to Terraform/OpenTofu.
| Type | When to Use | Scope |
|------|-------------|-------|
| Resource module | Single logical group of connected resources | VPC + subnets, SG + rules |
| Infrastructure module | Collection of resource modules for a purpose | Multiple resource modules in one region/account |
| Composition | Complete infrastructure | Spans multiple regions/accounts |
Flow: resource → resource module → infrastructure module → composition.
environments/ # prod/ staging/ dev/ — per-env configurations
modules/ # networking/ compute/ data/ — reusable modules
examples/ # minimal/ complete/ — docs + integration fixtures
Separate environments from modules. Use examples/ as both documentation and test fixtures. Keep modules small and single-responsibility.
See Module Patterns for architecture principles, naming conventions, variable/output contracts.
- `aws_instance.web_server`, not `aws_instance.main`)
- `this` for genuine singleton resources only
- `vpc_cidr_block`, not `cidr`)
- `main.tf`, `variables.tf`, `outputs.tf`, `versions.tf`

See Module Patterns: Variable Naming and Code Patterns: Block Ordering for examples.
Resource blocks: count/for_each first → arguments → tags → depends_on → lifecycle.
Variable blocks: description → type → default → validation → nullable → sensitive.
See Code Patterns: Block Ordering & Structure for the full rules and examples.
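The ordering rules above can be sketched as follows. This is a minimal illustration, not the reference's own example: the `web_servers` variable, its attributes, and the resource name are hypothetical, and `depends_on` is shown only as a comment because this sketch has no dependency that needs it.

```hcl
# Hypothetical input used only to illustrate block ordering.
variable "web_servers" {
  description = "Map of server name to AMI and instance type."
  type = map(object({
    ami           = string
    instance_type = string
  }))
  default = {}

  validation {
    condition     = alltrue([for s in values(var.web_servers) : startswith(s.ami, "ami-")])
    error_message = "Every ami value must start with \"ami-\"."
  }

  nullable  = false
  sensitive = false
}

resource "aws_instance" "web_server" {
  # 1. count / for_each first
  for_each = var.web_servers

  # 2. Arguments
  ami           = each.value.ami
  instance_type = each.value.instance_type

  # 3. Tags
  tags = {
    Name = each.key
  }

  # 4. depends_on would go here, only when no attribute reference
  #    already expresses the dependency.

  # 5. lifecycle last
  lifecycle {
    create_before_destroy = true
  }
}
```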
| Situation | Approach | Tools | Cost |
|-----------|----------|-------|------|
| Quick syntax check | Static analysis | validate, fmt | Free |
| Pre-commit validation | Static + lint | validate, tflint, trivy, checkov | Free |
| Terraform 1.6+, simple logic | Native test framework | terraform test | Free-Low |
| Pre-1.6, or Go expertise | Integration testing | Terratest | Low-Med |
| Security/compliance focus | Policy as code | OPA, Sentinel | Free |
| Cost-sensitive workflow | Mock providers (1.7+) | Native tests + mocks | Free |
| Multi-cloud, complex | Full integration | Terratest + real infra | Med-High |
Before writing test code: validate resource schemas via Terraform MCP so assertions target real attributes.
- `command = plan`: fast, for input-derived values only
- `command = apply`: required for computed values (ARNs, generated names) and set-type nested blocks
- `[0]`: use `for` expressions or materialize via `command = apply`

See Testing Frameworks for static-analysis pipelines, native-test patterns, Terratest integration, mock providers, and the full LLM-mistake checklist.
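A native test file that separates plan-time and apply-time assertions might look like the sketch below. It assumes a hypothetical module under test that declares `variable "bucket_name"`, `aws_s3_bucket.this`, and `aws_s3_bucket_server_side_encryption_configuration.this`; none of these names come from the skill itself.

```hcl
# tests/module_test.tftest.hcl (hypothetical module under test)

variables {
  bucket_name = "example-test-bucket"
}

# Input-derived value: known at plan time, so command = plan is enough.
run "bucket_name_is_passed_through" {
  command = plan

  assert {
    condition     = aws_s3_bucket.this.bucket == var.bucket_name
    error_message = "Bucket name should match the bucket_name input."
  }
}

# Computed value: the ARN only exists after the provider creates the bucket.
run "arn_is_populated" {
  command = apply

  assert {
    condition     = can(regex("^arn:", aws_s3_bucket.this.arn))
    error_message = "Bucket ARN should be populated after apply."
  }
}

# Set-type nested block: iterate with a for expression instead of indexing the set.
run "encryption_uses_kms" {
  command = apply

  assert {
    condition = alltrue([
      for r in aws_s3_bucket_server_side_encryption_configuration.this.rule :
      r.apply_server_side_encryption_by_default[0].sse_algorithm == "aws:kms"
    ])
    error_message = "Every encryption rule should use aws:kms."
  }
}
```

The `command = apply` runs create real resources unless a mock provider is configured, so they belong in the cost-controlled part of the pipeline.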
| Scenario | Use | Why |
|----------|-----|-----|
| Boolean condition (create / don't) | count = condition ? 1 : 0 | Optional singleton toggle |
| Items may be reordered or removed | for_each = toset(list) | Stable resource addresses |
| Reference by key | for_each = map | Named access |
| Multiple named resources | for_each | Better identity stability |
Never use list index as long-lived identity — removing a middle element reshuffles every address after it. For the decision matrix, safe migration playbook, moved block patterns, and known-at-plan failure cases, see Code Patterns: count vs for_each.
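As a rough sketch of the stable-address approach, the example below keys subnets by name and uses `moved` blocks to carry existing `count`-indexed instances over to the new keys. The variable names, keys, and CIDR ranges are illustrative assumptions.

```hcl
variable "vpc_id" {
  description = "ID of the VPC that hosts the subnets."
  type        = string
}

variable "private_subnets" {
  description = "Map of stable subnet key to CIDR block."
  type        = map(string)
  default = {
    a = "10.0.1.0/24"
    b = "10.0.2.0/24"
  }
}

resource "aws_subnet" "private" {
  # Keys ("a", "b") become part of the resource address, so removing
  # one entry does not shift the addresses of the others.
  for_each = var.private_subnets

  vpc_id     = var.vpc_id
  cidr_block = each.value

  tags = {
    Name = "private-${each.key}"
  }
}

# Only needed when this resource previously used count (1.1+):
# map each old index to its new key so nothing is destroyed and recreated.
moved {
  from = aws_subnet.private[0]
  to   = aws_subnet.private["a"]
}

moved {
  from = aws_subnet.private[1]
  to   = aws_subnet.private["b"]
}
```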
Using try() in a local to prefer a conditional resource's attribute over its parent is a specialized but high-value pattern — it forces correct deletion order without explicit depends_on. Common use: VPC + secondary CIDR associations + subnets.
See Code Patterns: Locals for Dependency Management for the full pattern and worked example.
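A minimal sketch of that pattern, assuming a hypothetical module with an optional secondary CIDR (resource and variable names are illustrative, not taken from the reference):

```hcl
variable "secondary_cidr" {
  description = "Optional secondary CIDR block for the VPC; null disables it."
  type        = string
  default     = null
}

resource "aws_vpc" "this" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_vpc_ipv4_cidr_block_association" "secondary" {
  count = var.secondary_cidr != null ? 1 : 0

  vpc_id     = aws_vpc.this.id
  cidr_block = var.secondary_cidr
}

locals {
  # Prefer the association's vpc_id when the association exists and fall
  # back to the VPC itself. Both values are the same ID, but the reference
  # makes anything using local.vpc_id depend on the association.
  vpc_id = try(
    aws_vpc_ipv4_cidr_block_association.secondary[0].vpc_id,
    aws_vpc.this.id,
  )
}

resource "aws_subnet" "secondary" {
  count = var.secondary_cidr != null ? 1 : 0

  # Because of the reference above, Terraform destroys this subnet before
  # it removes the CIDR association, with no explicit depends_on.
  vpc_id     = local.vpc_id
  cidr_block = var.secondary_cidr
}
```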
Standard layout:
```
my-module/
├── README.md       # Usage documentation
├── main.tf         # Primary resources
├── variables.tf    # Typed inputs with descriptions
├── outputs.tf      # Output values
├── versions.tf     # required_version + required_providers
├── examples/
│   ├── minimal/
│   └── complete/
└── tests/
    └── module_test.tftest.hcl   # or Go for Terratest
```
Variable contracts: always description, always explicit type, use validation for complex constraints, use sensitive = true for secrets, prefer optional() with typed defaults (1.3+) over untyped map(any).
Output contracts: always description, mark sensitive outputs, expose stable subsets (not whole provider objects).
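The contracts above could be expressed roughly as follows. The `database` object, its attributes, and the hostname are hypothetical and exist only to show the shape of a typed, validated input and a narrow output.

```hcl
variable "database" {
  description = "Settings for the database managed by this (hypothetical) module."
  type = object({
    name           = string
    instance_class = optional(string, "db.t3.micro") # typed default, 1.3+
    multi_az       = optional(bool, false)
  })

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{2,30}$", var.database.name))
    error_message = "database.name must be 3-31 lowercase letters, digits, or hyphens, starting with a letter."
  }

  nullable = false
}

variable "master_password" {
  description = "Master password for the database. Masked in CLI output, but still written to state."
  type        = string
  sensitive   = true
}

output "database_name" {
  description = "Name of the database, exposed as a stable scalar rather than a whole resource object."
  value       = var.database.name
}

output "connection_string" {
  description = "Connection string including credentials (db.example.internal is a placeholder host)."
  value       = "postgres://app:${var.master_password}@db.example.internal/${var.database.name}"
  sensitive   = true
}
```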
See Module Patterns for the full contract patterns, module release checklist, and LLM-mistake checklist.
Pipeline stages: validate → test → plan → apply (with environment protection).
Cost control: mock providers on PR validation, real-cloud integration only on main or scheduled, tag test resources, auto-cleanup.
Drift prevention: pin runtime and providers, commit .terraform.lock.hcl, apply the reviewed plan artifact from the plan stage (do not re-run plan inside the apply job), run policy/security stage on every path to apply.
See CI/CD Workflows for GitHub Actions, GitLab CI, and Atlantis templates plus the LLM-mistake checklist.
Essential checks:
trivy config .
checkov -d .
Don't: store secrets in variables or .tfvars, use default VPC, skip encryption, open security groups to 0.0.0.0/0, use inline ingress/egress blocks in aws_security_group.
Do: source secrets from AWS Secrets Manager / Parameter Store or use write_only arguments on 1.11+, create dedicated VPCs, enforce encryption at rest and TLS, least-privilege SGs, use separate aws_vpc_security_group_{ingress,egress}_rule resources (AWS provider v5+).
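For the security group guidance, a sketch using standalone rule resources instead of inline `ingress`/`egress` blocks might look like this. The variable names and the port choice are assumptions made for illustration.

```hcl
variable "vpc_id" {
  description = "ID of the dedicated (non-default) VPC."
  type        = string
}

variable "allowed_cidr" {
  description = "CIDR range allowed to reach the service, for example an internal network range."
  type        = string
}

resource "aws_security_group" "web" {
  name_prefix = "web-"
  description = "Web tier security group; rules are managed as separate resources."
  vpc_id      = var.vpc_id
}

# One resource per rule, so adding or removing a rule never rewrites the others.
resource "aws_vpc_security_group_ingress_rule" "https_in" {
  security_group_id = aws_security_group.web.id
  description       = "HTTPS from the allowed range only"
  cidr_ipv4         = var.allowed_cidr
  ip_protocol       = "tcp"
  from_port         = 443
  to_port           = 443
}

resource "aws_vpc_security_group_egress_rule" "https_out" {
  security_group_id = aws_security_group.web.id
  description       = "HTTPS to the allowed range only"
  cidr_ipv4         = var.allowed_cidr
  ip_protocol       = "tcp"
  from_port         = 443
  to_port           = 443
}
```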
Marking a variable sensitive = true masks display only — the value still lives in state. Use write_only / *_wo on 1.11+, or keep secret material out of Terraform entirely via runtime lookups.
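A hedged sketch of the write-only approach, assuming Terraform 1.11+ and an AWS provider release that exposes `password_wo` and `password_wo_version` on `aws_db_instance` (check the provider changelog for the exact floor). The identifiers and sizes are placeholders.

```hcl
variable "db_password" {
  description = "Database master password, supplied at run time."
  type        = string
  sensitive   = true
  ephemeral   = true # 1.10+: the value is not persisted to plan or state
}

resource "aws_db_instance" "example" {
  identifier        = "example-db"
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  username          = "app_admin"

  # Write-only argument: sent to the provider but never stored in state.
  password_wo = var.db_password

  # Terraform cannot diff a value it does not store, so bump this
  # number whenever the password should be rotated.
  password_wo_version = 1

  skip_final_snapshot = true # acceptable for a throwaway example only
}
```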
See Security & Compliance for trivy/checkov pipelines, state-file hardening, compliance mappings, and the LLM-mistake checklist.
Never use local state in teams or production. A remote backend gives the team shared state with locking; encryption, versioning, and audit logging depend on how the backend and its storage are configured.
```hcl
terraform {
  backend "s3" {
    bucket       = "my-terraform-state"
    key          = "prod/vpc/terraform.tfstate"
    region       = "us-east-1"
    encrypt      = true
    use_lockfile = true # Native S3 locking, 1.10+
  }
}
```
On Terraform < 1.10, use dynamodb_table = "terraform-state-lock" instead of use_lockfile. Azure Storage, GCS, and Terraform Cloud all offer built-in locking — see the State Management reference for syntax.
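For runtimes below 1.10, the same backend with DynamoDB locking would look roughly like this. The bucket and table names are the placeholder values already used in this section, and the DynamoDB table is assumed to exist with a `LockID` string partition key.

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/vpc/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock" # pre-1.10 locking via DynamoDB
  }
}
```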
| Pattern | Use When | Example Path |
|---------|----------|--------------|
| Per environment | Different teams per env | prod/terraform.tfstate, staging/... |
| Per component | Independent lifecycles | prod/vpc/, prod/eks/, prod/rds/ |
| Hybrid (recommended) | Both benefits | prod/networking/, prod/compute/, staging/networking/ |
Split state when: different teams, different update cadences, or >500 resources. Combine when: tightly coupled resources, <100 resources, same lifecycle.
See State Management for locking, migration, multi-team isolation, disaster recovery, and the LLM-mistake checklist.
| Component | Strategy | Example |
|-----------|----------|---------|
| Terraform runtime | Pin minor | required_version = "~> 1.9.0" |
| Providers | Pin major | version = "~> 5.0" |
| Modules (prod) | Pin exact | version = "5.1.2" |
| Modules (dev) | Allow patch | version = "~> 5.1.0" |
Commit .terraform.lock.hcl intentionally. Keep provider/runtime upgrades in a separate PR from functional changes. See Code Patterns: Version Management for constraint syntax and upgrade workflow.
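Put together, a `versions.tf` and a pinned module call might look like the sketch below. The comments spell out what each `~>` constraint permits, since only the rightmost version component is allowed to increase; the module source, version, and inputs are illustrative.

```hcl
terraform {
  # Pin minor: >= 1.9.0, < 1.10.0
  required_version = "~> 1.9.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Pin major: >= 5.0, < 6.0
      version = "~> 5.0"
    }
  }
}

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
  # Pin exact for production. In dev, "~> 5.1.0" would allow patch releases only.
  version = "5.1.2"

  name = "example"
  cidr = "10.0.0.0/16"
}
```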
| Feature | Min version | Common use |
|---------|-------------|------------|
| try() | 0.12.20+ | Safe fallbacks, replaces element(concat()) |
| nullable = false | 1.1+ | Prevent null silently overriding defaults |
| moved blocks | 1.1+ | Refactor without destroy/recreate |
| optional() with defaults | 1.3+ | Typed object attributes |
| import blocks | 1.5+ | Declarative imports, reviewable in VCS |
| check blocks | 1.5+ | Runtime assertions |
| Native terraform test | 1.6+ | Built-in test framework |
| Mock providers | 1.7+ | Cost-free unit testing |
| removed blocks | 1.7+ | Declarative resource removal |
| Provider-defined functions | 1.8+ | Provider-specific transformations (requires provider to declare functions) |
| Cross-variable validation | 1.9+ | Reference other var.* in validation blocks |
| write_only arguments | 1.11+ | Secrets never stored in state |
| S3 native lock-file | 1.10+ | State locking without DynamoDB |
Before emitting a feature, verify the runtime floor. See Code Patterns: Feature Guard Table for the full table with common LLM error patterns per feature.
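A few of the gated features from the table, sketched with their version floors as comments. Resource names, bucket names, and variables are hypothetical; on a runtime below the stated floor each block is a configuration error.

```hcl
# 1.5+: declarative import, reviewable in a PR like any other change.
import {
  to = aws_s3_bucket.logs
  id = "example-logs-bucket"
}

resource "aws_s3_bucket" "logs" {
  bucket = "example-logs-bucket"
}

# 1.7+: stop managing a resource without destroying it.
# The resource block for aws_s3_bucket.legacy must already be deleted.
removed {
  from = aws_s3_bucket.legacy

  lifecycle {
    destroy = false
  }
}

variable "min_size" {
  description = "Minimum number of instances."
  type        = number
  default     = 1
}

variable "max_size" {
  description = "Maximum number of instances."
  type        = number
  default     = 3

  # 1.9+: a validation condition may reference other variables.
  validation {
    condition     = var.max_size >= var.min_size
    error_message = "max_size must be greater than or equal to min_size."
  }
}

# 1.5+: check blocks report a warning when the assertion fails
# and do not block the apply.
check "logs_bucket_naming" {
  assert {
    condition     = startswith(aws_s3_bucket.logs.bucket, "example-")
    error_message = "Logs bucket name should start with \"example-\"."
  }
}
```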
- `terraform test` / `tofu test` available: migrate simple unit tests, keep Terratest for complex integration.
- `use_lockfile`) is the correct default for new configurations; DynamoDB locking is no longer required.
- `write_only` arguments for secret handling keep credentials out of state.

Progressive disclosure: essentials here, depth on demand:

- `terraform_remote_state` rules, release checklist
- `count`/`for_each` deep dive, modern features, version management, locals

Apache License 2.0. See LICENSE for full terms.
Copyright © 2026 Anton Babenko