sandbox-cli/skills/cluster-rotate/SKILL.md
This skill should be used when the user asks to "rotate a cluster", "replace a cluster", "swap clusters", "offboard old and onboard new cluster", "cluster rotation", "replace an old cluster with a new one", or "refresh cluster".
Name: Sandbox Cluster Rotation
Description: Offboard an older OCP shared cluster and onboard a new replacement cluster in a single workflow.
Full end-to-end workflow for rotating a shared cluster: safely offboard an existing cluster and onboard its replacement. This is the most common operational workflow for maintaining shared cluster infrastructure on RHDP.
Check that sandbox-cli and oc are installed:
which sandbox-cli
which oc
If sandbox-cli is not installed, tell the user to run /sandbox-cli:sandbox-setup first and stop.
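The prerequisite check above can be sketched as a single helper. This is a minimal sketch, not part of the skill itself: `require_tools` is a hypothetical function name, and it uses the shell builtin `command -v` in place of `which`.

```shell
# require_tools (hypothetical helper): succeed only if every named
# command is found on PATH; report each missing one on stderr.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   require_tools sandbox-cli oc || echo "Run /sandbox-cli:sandbox-setup first."
```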
CRITICAL: Always verify VPN connectivity before any sandbox-cli operation.
host squid.redhat.com
If the name resolves (the lookup returns an IP address like 10.x.x.x), the user is on VPN. Proceed.
If it fails with NXDOMAIN, not found, or connection timed out, STOP and tell the user:
You are NOT connected to the Red Hat VPN. The sandbox API is IP-restricted and all commands will fail with EOF errors. Please connect to the Red Hat VPN before proceeding.
Do NOT proceed until VPN is confirmed.
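The VPN gate could be scripted roughly as follows. This is a sketch under one assumption: a successful `host` lookup prints a line containing "has address", which is the usual output of the BIND `host` utility. The function name `on_vpn` is hypothetical.

```shell
# on_vpn (hypothetical helper): succeed if the given name resolves.
# Assumption: a successful `host` lookup prints "... has address <IP>".
on_vpn() {
    host "${1:-squid.redhat.com}" 2>/dev/null | grep -q 'has address'
}

# Usage:
#   if ! on_vpn; then
#       echo "You are NOT connected to the Red Hat VPN." >&2
#       exit 1
#   fi
```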
sandbox-cli status
If the user is not authenticated or the token has expired, tell them to log in again:
sandbox-cli login --server <SERVER_URL> --token <TOKEN>
Ask the user for:
Old cluster to offboard:
  the cluster name (as shown by sandbox-cli cluster list)
New cluster to onboard:
  the API URL (e.g. https://api.cluster-xxxxx.dynamic.redhatworkshops.io:6443)
Cluster config for the new cluster:
  the config file (e.g. cluster-config.json, cluster-config-cnv.json)
sandbox-cli cluster list
Help the user identify the old cluster to offboard. Note:
oc login --token=<OLD_CLUSTER_TOKEN> --server=<OLD_CLUSTER_API_URL>
sandbox-cli cluster offboard <OLD_CLUSTER_NAME>
If the old cluster is unreachable (VALID = NO):
sandbox-cli cluster offboard <OLD_CLUSTER_NAME> --force
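The choice between the two offboard commands above can be expressed as a small helper. This is an illustrative sketch: `offboard_command` is a hypothetical name, and it assumes the VALID value is passed in by hand after reading it from `sandbox-cli cluster list`. It only prints the command so the user can review it before running.

```shell
# offboard_command NAME VALID (hypothetical helper)
# Prints the offboard command for a cluster, adding --force only when the
# VALID value is "no" (compared case-insensitively).
offboard_command() {
    name=$1
    valid=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    if [ "$valid" = "no" ]; then
        printf 'sandbox-cli cluster offboard %s --force\n' "$name"
    else
        printf 'sandbox-cli cluster offboard %s\n' "$name"
    fi
}

# Usage:
#   offboard_command "<OLD_CLUSTER_NAME>" NO
```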
Wait for completion. Expected output:
==> Offboarding cluster '<OLD_CLUSTER_NAME>'...
Offboard started for cluster <OLD_CLUSTER_NAME>. N placement(s) to process.
==> Waiting for offboard to complete...
Offboard completed successfully.
sandbox-cli cluster list
Confirm the old cluster is no longer in the list.
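Confirming that the old cluster is gone can also be done mechanically. The sketch below assumes the cluster name appears as a whitespace-delimited field somewhere in the `sandbox-cli cluster list` output; the exact column layout is not documented here, and `cluster_absent` is a hypothetical name.

```shell
# cluster_absent NAME (hypothetical helper): read cluster-list output on
# stdin and succeed only if NAME does not appear as a whole field.
# Assumption: fields in the listing are separated by whitespace.
cluster_absent() {
    ! awk -v name="$1" '
        { for (i = 1; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    '
}

# Usage:
#   sandbox-cli cluster list | cluster_absent "<OLD_CLUSTER_NAME>" \
#       || echo "Old cluster is still registered." >&2
```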
oc login --token=<NEW_CLUSTER_TOKEN> --server=<NEW_CLUSTER_API_URL>
Accept insecure TLS if prompted.
Verify:
oc whoami
If the user doesn't have a config file, create one. Use the same annotations as the old cluster for a like-for-like replacement.
Example for CNV dedicated (lab-specific):
{
  "annotations": {
    "cloud": "cnv-dedicated-shared",
    "purpose": "dev",
    "lab": "<lab-annotation>"
  },
  "deployer_admin_sa_token_ttl": "48h",
  "deployer_admin_sa_token_refresh_interval": "24h",
  "deployer_admin_sa_token_target_var": "cluster_admin_agnosticd_sa_token",
  "skip_quota": true
}
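If the config file has to be created, the lab-specific example above could be generated from two values. This is a sketch only: `write_lab_config` is a hypothetical helper, the field values are copied from the example, and the arguments are inserted verbatim, so they must not contain quotes or backslashes.

```shell
# write_lab_config LAB [PURPOSE] (hypothetical helper)
# Prints a lab-specific cluster config mirroring the CNV dedicated example.
# Arguments are inserted as-is; they must not contain quotes or backslashes.
write_lab_config() {
    lab=$1
    purpose=${2:-dev}
    cat <<EOF
{
  "annotations": {
    "cloud": "cnv-dedicated-shared",
    "purpose": "$purpose",
    "lab": "$lab"
  },
  "deployer_admin_sa_token_ttl": "48h",
  "deployer_admin_sa_token_refresh_interval": "24h",
  "deployer_admin_sa_token_target_var": "cluster_admin_agnosticd_sa_token",
  "skip_quota": true
}
EOF
}

# Usage:
#   write_lab_config "<lab-annotation>" dev > cluster-config-cnv.json
```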
Example for general shared cluster:
{
  "annotations": {
    "cloud": "cnv-dedicated-shared",
    "purpose": "events",
    "virt": "yes"
  },
  "deployer_admin_sa_token_ttl": "1h",
  "deployer_admin_sa_token_refresh_interval": "30m",
  "deployer_admin_sa_token_target_var": "cluster_admin_agnosticd_sa_token",
  "max_placements": 30,
  "settings": {
    "provision_rate_limit": 50,
    "provision_rate_window": "10m"
  }
}
sandbox-cli cluster onboard <NEW_CLUSTER_NAME> --config <CONFIG_FILE>
The cluster name is optional; if omitted, it is extracted from the API URL.
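To preview which name an API URL is likely to yield, something like the following may help. The extraction rule here is an assumption (the first DNS label after the leading `api.`), chosen because it is consistent with the example URL and name used in this document; the CLI's actual rule may differ. `cluster_name_from_url` is a hypothetical helper.

```shell
# cluster_name_from_url URL (hypothetical helper)
# Assumption: the name is the first DNS label after the "api." prefix.
cluster_name_from_url() {
    host_part=${1#*://}             # drop the scheme
    host_part=${host_part%%[:/]*}   # drop the port and any path
    host_part=${host_part#api.}     # drop the leading "api." label
    printf '%s\n' "${host_part%%.*}"  # keep the first remaining label
}

# Usage:
#   cluster_name_from_url "<NEW_CLUSTER_API_URL>"
```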
Expected output:
==> Checking cluster access...
API URL: https://api.cluster-xxxxx:6443
Ingress: apps.cluster-xxxxx.example.com
Name: cluster-xxxxx
==> Creating service account...
Creating namespace 'rhdp-serviceaccounts'...
Creating service account 'sandbox-api-manager'...
Granting cluster-admin to 'sandbox-api-manager'...
Creating long-lived token (~10 years)...
Token created successfully.
==> Registering cluster with sandbox API...
OCP shared cluster configuration created
==> Cluster registered successfully.
AgnosticV cloud_selector:
lab: <lab-annotation>
purpose: dev
sandbox-cli cluster get <NEW_CLUSTER_NAME>
Verify:
valid is true
annotations are correct
api_url and ingress_domain look right
sandbox-cli placement dry-run --selector 'lab=<LAB>,purpose=<PURPOSE>'
Or with an AgnosticV catalog file:
sandbox-cli placement dry-run -f <path-to-common.yaml>
Expected:
Result: MATCH
Matching clusters: 1
- cluster-xxxxx
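The dry-run result can be checked against the expected output shown above. This sketch assumes the output keeps that shape (a `Result: MATCH` line and one `- <name>` line per matching cluster); `dry_run_matches` is a hypothetical helper name.

```shell
# dry_run_matches NAME (hypothetical helper): read dry-run output on stdin
# and succeed only if it reports "Result: MATCH" and lists "- NAME".
# Assumption: the output follows the expected format shown in this document.
dry_run_matches() {
    awk -v name="$1" '
        $1 == "Result:" && $2 == "MATCH" { matched = 1 }
        $1 == "-" && $2 == name          { listed = 1 }
        END { exit !(matched && listed) }
    '
}

# Usage:
#   sandbox-cli placement dry-run --selector 'lab=<LAB>,purpose=<PURPOSE>' \
#       | dry_run_matches "<NEW_CLUSTER_NAME>"
```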
sandbox-cli cluster list
The new cluster should appear with VALID = yes and 0 placements.
Report to the user:
Offboarded: <OLD_CLUSTER_NAME> (N placements cleaned up)
Onboarded: <NEW_CLUSTER_NAME>
The --force flag for offboard should only be used when the old cluster is permanently unreachable.
The deployer_admin_sa_token_* fields are required if workloads need cluster-admin access.