skills/openshift/SKILL.md
Senior-level OpenShift and OKD guidance for installing, operating, debugging, upgrading, and securing clusters. Use when working with OpenShift Container Platform, OKD, MicroShift, oc, openshift-install, CVO, MCO, OLM, SCCs, Routes, OVN-Kubernetes, RHCOS/SCOS, disconnected or air-gapped installs, OperatorHub, ODF, OpenShift Virtualization, monitoring, logging, GitOps, Pipelines, Service Mesh, etcd, certificates, or day-2 cluster incidents.
Production-first guidance for OpenShift Container Platform, OKD, and related OpenShift-family distributions. Optimize for mechanism-level diagnosis, operator ownership, version-aware procedures, safe day-2 operations, and public-safe examples. Skip generic Kubernetes tutorials unless the user asks.
This skill distills a senior OpenShift operator corpus into progressive references. Load only the nearest reference for the task.
Classify the request first, then read the smallest useful reference.
- oc get co, oc get clusterversion, oc get mcp, events, and relevant operator status before workload-level guessing.
- example.com, registry.internal.example.com, idp.example.com, and placeholder names.
- ClusterOperator, find the CR it watches, change the source of truth.

Ask focused questions before final guidance when any are true:

- Upgradeable=False, EUS, paused MCPs, quorum, restore, or node replacement

High-value questions:

- oc get co, oc get clusterversion, oc get mcp, relevant events/logs?

Choose one primary mode and at most one secondary mode.

| Mode | Use when | Load |
|---|---|---|
| model | learning, architecture, Kubernetes-to-OpenShift translation, mental-model checks | references/mental-models.md |
| architecture | control/data plane, request flow, operator ownership, API groups, state boundaries | references/architecture-lifecycle.md |
| distribution | OCP vs OKD vs managed OpenShift vs MicroShift, support/lifecycle decisions | references/distributions-versions.md |
| install | IPI, UPI, ABI, Assisted, SNO, compact, HCP, install-config risk | references/install-airgap.md |
| airgap | oc-mirror, IDMS/ITMS, signatures, CatalogSource, OSUS, mirror registry | references/install-airgap.md |
| upgrade | CVO, channels, EUS, Upgradeable=False, conditional updates, MCO rollout | references/cvo-mco-rhcos.md + references/day2-runbooks.md for runbook detail |
| nodes | RHCOS/SCOS, MachineConfig, MCP, KubeletConfig, ContainerRuntimeConfig, node drift | references/cvo-mco-rhcos.md |
| operators | OLM, OperatorHub, Subscription, InstallPlan, CSV, catalog issues | references/olm-operators.md |
| network | OVN-K, DNS, NetworkPolicy, ANP/BANP, EgressIP, MetalLB | references/networking-routes.md |
| ingress | Routes, Ingress, router, wildcard certs, route 503s, Gateway API | references/networking-routes.md |
| security | SCC, PSA, OAuth, IdP, kubeadmin, service-account tokens, RBAC | references/security-identity-scc.md |
| storage | StorageClasses, LSO, LVMS, ODF, registry storage, snapshots | references/storage-virtualization.md |
| virtualization | OpenShift Virtualization, KubeVirt, VM storage/networking/live migration | references/storage-virtualization.md |
| addons | monitoring, logging, audit forwarding, Pipelines, GitOps, Service Mesh | references/platform-addons.md |
| incident | degraded CO, stuck MCP, image pull, etcd, certs, must-gather, node debug | references/day2-runbooks.md |
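The routing table above can be read as a plain lookup from mode to reference files. The following is a minimal sketch of that idea, not part of the skill itself: the names `MODE_REFERENCES` and `references_for` are hypothetical, and the only rules it encodes are the ones stated here (one primary mode, at most one secondary mode, load the smallest set of references).

```python
from typing import Dict, List, Optional

# Mode -> reference files, copied from the routing table.
# "upgrade" lists day2-runbooks.md second because the table marks it
# as optional runbook detail.
MODE_REFERENCES: Dict[str, List[str]] = {
    "model": ["references/mental-models.md"],
    "architecture": ["references/architecture-lifecycle.md"],
    "distribution": ["references/distributions-versions.md"],
    "install": ["references/install-airgap.md"],
    "airgap": ["references/install-airgap.md"],
    "upgrade": ["references/cvo-mco-rhcos.md", "references/day2-runbooks.md"],
    "nodes": ["references/cvo-mco-rhcos.md"],
    "operators": ["references/olm-operators.md"],
    "network": ["references/networking-routes.md"],
    "ingress": ["references/networking-routes.md"],
    "security": ["references/security-identity-scc.md"],
    "storage": ["references/storage-virtualization.md"],
    "virtualization": ["references/storage-virtualization.md"],
    "addons": ["references/platform-addons.md"],
    "incident": ["references/day2-runbooks.md"],
}


def references_for(primary: str, secondary: Optional[str] = None) -> List[str]:
    """Return de-duplicated references for one primary and at most one secondary mode."""
    modes = [primary] if secondary is None else [primary, secondary]
    refs: List[str] = []
    for mode in modes:
        if mode not in MODE_REFERENCES:
            raise KeyError(f"unknown mode: {mode}")
        for ref in MODE_REFERENCES[mode]:
            if ref not in refs:  # modes that share a file load it once
                refs.append(ref)
    return refs


print(references_for("install", "airgap"))
print(references_for("incident", "network"))
```

Because several modes share a reference file, de-duplicating keeps a pairing such as install + airgap down to a single file to load.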
Common combinations:
- incident + suspected subsystem (security, network, operators, nodes, storage)
- install + airgap
- upgrade + airgap
- security + operators for operator pod admission failures
- network + ingress for route/LB/VIP/DNS failures
- storage + virtualization for VM live migration and PV design

Default response shape:
- Verdict - likely layer or recommended path
- Why - mechanism-level OpenShift reason
- Smallest safe path - probes first, then minimal change if warranted
- Risks / edge cases - version, support, security, data, network, reboot, and operator ownership caveats
- Validation - exact observations that prove convergence
- Rollback / next step - how to revert or the next probe

Mode-specific additions:
- incident: add Likely owner, Read-only probes, Do not do yet, Stop condition
- upgrade: add Preflight, Go/no-go gates, Operator compatibility, Etcd backup, Worker/MCP plan
- airgap: add Mirror artifacts, Signatures, Catalogs, IDMS/ITMS, OSUS/update graph
- security: add Subject, SA/SCC/RBAC path, Least-privilege alternative, Audit concern
- review: use Verdict, Blockers, Risks, Evidence, Suggested fixes, Smallest next step

- privileged, granting SCCs to broad groups, or binding anyuid to a namespace default ServiceAccount as casual fixes.
- rendered-* MachineConfigs, operator-owned Deployments, static-pod manifests, or node files when a source CR exists.
- oc adm upgrade --force, --to-image, --allow-not-recommended, etcd restore, quorum restore, or manual static-pod surgery without explicit break-glass framing and approval.
- emptyDir, automatic InstallPlan approval, or an unbounded catalog mirror is production-safe.
- podman system prune guidance on OpenShift nodes; prefer documented CRI-O cleanup workflows and support docs.
- kubeadmin without a verified alternate cluster-admin and break-glass path.

Pass when all are true:
Fail when any are true:
| Scenario | Detection | Fallback |
|---|---|---|
| Version unclear | No OCP/OKD minor/z-stream provided | Ask for oc version, oc get clusterversion, and relevant operator CSV versions |
| Vague incident | Only symptom provided | Start with oc get co, events sorted by timestamp, oc get mcp, and scoped oc adm inspect |
| Owner unclear | Multiple operators could own the surface | Find API group, namespace, ownerReferences, ClusterOperator status, and source CR before changing anything |
| Risky mutation requested | Upgrade, network, SCC, storage, node OS, pull-secret, cert, etcd, or identity change | Require preflight, approval, rollback/restore, and post-checks |
| Public artifact requested | Skill/doc/runbook content could expose context | Use generic examples and run anti-leak checks from day2-runbooks |
| External docs needed | Local source/version evidence is insufficient for high-impact behavior | Prefer versioned Red Hat docs, local oc explain, installed CLI help, or upstream source; label uncertainty |
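For the "Vague incident" row, the first read-only probe is ClusterOperator status. The sketch below shows one way to reduce `oc get co -o json` output to the operators worth looking at first. It assumes the standard ClusterOperator condition types (`Available`, `Degraded`); the function name `unhealthy_operators` and the sample data are hypothetical and only illustrate the shape of the input.

```python
import json
from typing import Any, Dict, List


def unhealthy_operators(co_list: Dict[str, Any]) -> List[str]:
    """Return names of ClusterOperators that are not Available or are Degraded."""
    flagged: List[str] = []
    for item in co_list.get("items", []):
        # Collapse the conditions array into {type: status}, e.g. {"Degraded": "False"}.
        conditions = {
            c["type"]: c["status"]
            for c in item.get("status", {}).get("conditions", [])
        }
        if conditions.get("Available") != "True" or conditions.get("Degraded") == "True":
            flagged.append(item["metadata"]["name"])
    return flagged


# Hypothetical sample shaped like `oc get co -o json`; not real cluster output.
SAMPLE = json.loads("""
{
  "kind": "List",
  "items": [
    {"metadata": {"name": "authentication"},
     "status": {"conditions": [
       {"type": "Available", "status": "True"},
       {"type": "Degraded", "status": "False"}]}},
    {"metadata": {"name": "ingress"},
     "status": {"conditions": [
       {"type": "Available", "status": "True"},
       {"type": "Degraded", "status": "True"}]}},
    {"metadata": {"name": "monitoring"},
     "status": {"conditions": [
       {"type": "Available", "status": "False"},
       {"type": "Degraded", "status": "False"}]}}
  ]
}
""")

print(unhealthy_operators(SAMPLE))  # ['ingress', 'monitoring']
```

The check is read-only by design: it only narrows down which operator to inspect next and does not change anything on the cluster.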