skills/kubeblocks-switchover/SKILL.md
Perform planned primary-secondary switchover for KubeBlocks database clusters via OpsRequest. Promotes a replica to primary with minimal downtime. Use when the user wants to promote a replica, switch primary, change leader, perform a planned failover, or do maintenance on the current primary node. NOT for unplanned failover recovery (handled automatically by HA middleware like Patroni, Orchestrator, or Sentinel) or restarting all pods (see kubeblocks-cluster-lifecycle).
A switchover is a planned operation that promotes a secondary replica to primary and demotes the current primary to secondary. This is useful for maintenance, load rebalancing, or pre-failover testing.
Switchover is only available for addons with primary/secondary replication roles:
Official docs: https://kubeblocks.io/docs/preview/user_docs/handle-an-exception/switchover
Before proceeding, verify the cluster is healthy and no other operation is running:
# Cluster must be Running
kubectl get cluster <cluster-name> -n <namespace> -o jsonpath='{.status.phase}'
# No pending OpsRequests
kubectl get opsrequest -n <namespace> -l app.kubernetes.io/instance=<cluster-name> --field-selector=status.phase!=Succeed
If the cluster is not Running or has a pending OpsRequest, wait for it to complete before proceeding.
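The OpsRequest check above can also be done with a client-side filter. This is a minimal sketch, not part of the skill itself: the function name `non_succeeded_ops` is hypothetical, and it assumes the listing is produced with a jsonpath template that prints name and phase separated by a tab. Filtering on the client avoids depending on field-selector support for custom resources, which may be limited on some clusters.

```shell
# Hypothetical helper: print OpsRequests that have not reached "Succeed".
# Reads "<name><TAB><phase>" lines on stdin and prints the matching lines.
non_succeeded_ops() {
  awk -F '\t' 'NF && $2 != "Succeed" { print $1 "\t" $2 }'
}

# Usage against a live cluster (placeholders as in the commands above):
#   kubectl get opsrequest -n <namespace> -l app.kubernetes.io/instance=<cluster-name> \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' \
#     | non_succeeded_ops
```

Empty output means every listed OpsRequest has phase Succeed.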
Verify the cluster has 2+ replicas and check current roles:
kubectl get pods -n <namespace> -l app.kubernetes.io/instance=<cluster-name> \
-o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubeblocks\.io/role}{"\n"}{end}'
Switchover requires at least 2 replicas. If only 1 replica exists, scale out first (see horizontal-scaling).
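The replica requirement can be checked mechanically from the same pod listing. The sketch below assumes one line per pod in the tab-separated name/role format printed by the command above; the function name `has_enough_replicas` is hypothetical.

```shell
# Hypothetical helper: succeed only if at least 2 pods are listed.
# Reads one "<pod><TAB><role>" line per pod on stdin.
has_enough_replicas() {
  her_count=$(awk 'NF { n++ } END { print n + 0 }')
  echo "replicas: $her_count"
  [ "$her_count" -ge 2 ]
}

# Usage (placeholders as in the command above):
#   kubectl get pods -n <namespace> -l app.kubernetes.io/instance=<cluster-name> \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubeblocks\.io/role}{"\n"}{end}' \
#     | has_enough_replicas || echo "scale out before switchover"
```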
- [ ] Step 1: Check current roles
- [ ] Step 2: Perform switchover via OpsRequest
- [ ] Step 3: Verify new roles
Identify which pod is the current primary:
kubectl get pods -n <ns> -l app.kubernetes.io/instance=<cluster> \
-o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubeblocks\.io/role}{"\n"}{end}'
Example output:
mycluster-mysql-0 primary
mycluster-mysql-1 secondary
mycluster-mysql-2 secondary
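To capture the primary's pod name for the next step, the listing can be filtered on the role column. A minimal sketch, assuming the role label value is exactly `primary` as in the example output; the function name `find_primary` is hypothetical.

```shell
# Hypothetical helper: print the name of each pod whose role is "primary".
# Reads "<pod><TAB><role>" lines on stdin.
find_primary() {
  awk -F '\t' '$2 == "primary" { print $1 }'
}

# Usage (placeholders as in the command above):
#   PRIMARY=$(kubectl get pods -n <ns> -l app.kubernetes.io/instance=<cluster> \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubeblocks\.io/role}{"\n"}{end}' \
#     | find_primary)
```

For the example output above, this prints mycluster-mysql-0.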
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: <cluster>-switchover
  namespace: <ns>
spec:
  clusterName: <cluster>
  type: Switchover
  switchover:
  - componentName: <component>
    instanceName: <current-primary-pod>
- componentName: the component name from the Cluster spec (e.g. mysql, postgresql)
- instanceName: the pod name of the current primary that should be demoted

Before applying, validate with dry-run:
kubectl apply -f switchover-ops.yaml --dry-run=server
If dry-run reports errors, fix the YAML before proceeding.
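When the manifest is generated from variables rather than edited by hand, it can be rendered on stdout and piped straight into the dry-run. This is a sketch under stated assumptions: the function name `render_switchover_ops` and the example argument values are hypothetical, and the fields are the ones shown in the manifest above.

```shell
# Hypothetical helper: print a Switchover OpsRequest manifest on stdout.
# $1: cluster name, $2: namespace, $3: component name, $4: current primary pod.
render_switchover_ops() {
  if [ "$#" -ne 4 ]; then
    echo "usage: render_switchover_ops <cluster> <ns> <component> <primary-pod>" >&2
    return 1
  fi
  cat <<EOF
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: $1-switchover
  namespace: $2
spec:
  clusterName: $1
  type: Switchover
  switchover:
  - componentName: $3
    instanceName: $4
EOF
}

# Usage (example values are placeholders):
#   render_switchover_ops mycluster default mysql mycluster-mysql-0 \
#     | kubectl apply -f - --dry-run=server
```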
Apply it:
kubectl apply -f switchover-ops.yaml
kubectl get ops <cluster>-switchover -n <ns> -w
Success condition:
.status.phase=Succeed | Typical: 1-3 min | If stuck >5 min: kubectl describe ops <cluster>-switchover -n <ns>
Status will progress: Pending → Running → Succeed
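For unattended runs, watching with -w can be replaced by a bounded polling loop. The sketch below is one possible shape, not the skill's own method: `wait_for_phase` is a hypothetical name, the phase-reading command is passed in as arguments, and treating `Failed` as a terminal phase is an assumption, since only Pending, Running, and Succeed are named above.

```shell
# Hypothetical helper: poll until the phase is "Succeed" or time runs out.
# $1: timeout in seconds, $2: poll interval in seconds,
# remaining args: a command that prints the current phase.
# Returns 0 on Succeed, 1 on Failed, 2 on timeout.
wait_for_phase() {
  wfp_timeout=$1; wfp_interval=$2; shift 2
  wfp_elapsed=0
  while :; do
    wfp_phase=$("$@" 2>/dev/null) || wfp_phase=""
    echo "phase: ${wfp_phase:-unknown} (${wfp_elapsed}s)"
    case "$wfp_phase" in
      Succeed) return 0 ;;
      Failed)  return 1 ;;  # assumption: "Failed" is a terminal phase name
    esac
    if [ "$wfp_elapsed" -ge "$wfp_timeout" ]; then
      return 2
    fi
    sleep "$wfp_interval"
    wfp_elapsed=$((wfp_elapsed + wfp_interval))
  done
}

# Usage, with the 5-minute limit mentioned above:
#   wait_for_phase 300 5 kubectl get ops <cluster>-switchover -n <ns> \
#     -o jsonpath='{.status.phase}'
```

On a non-zero return, kubectl describe ops <cluster>-switchover -n <ns> is the next place to look.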
After the OpsRequest succeeds, confirm the roles have changed:
kubectl get pods -n <ns> -l app.kubernetes.io/instance=<cluster> \
-o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubeblocks\.io/role}{"\n"}{end}'
Expected: a different pod should now have the primary role.
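The expectation above can be turned into a pass/fail check. A minimal sketch, assuming the role label value is `primary` and that the previous primary's pod name was recorded before the switchover; the function name `verify_new_primary` is hypothetical.

```shell
# Hypothetical helper: check that exactly one pod is primary and that it is
# not the pod that was primary before the switchover.
# $1: name of the previous primary pod; stdin: "<pod><TAB><role>" lines.
verify_new_primary() {
  awk -F '\t' -v old="$1" '
    $2 == "primary" { name = $1; n++ }
    END {
      if (n != 1)      { print "expected 1 primary, found " (n + 0); exit 1 }
      if (name == old) { print "primary unchanged: " name; exit 1 }
      print "new primary: " name
    }'
}

# Usage (placeholders as in the command above):
#   kubectl get pods -n <ns> -l app.kubernetes.io/instance=<cluster> \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubeblocks\.io/role}{"\n"}{end}' \
#     | verify_new_primary <current-primary-pod>
```

Role labels can take a moment to update after the OpsRequest succeeds, so a failed check may simply need to be repeated.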
Also verify replication health:
# MySQL: check replication status
kubectl exec -it <new-secondary-pod> -n <ns> -- mysql -u root -p -e "SHOW REPLICA STATUS\G"
# PostgreSQL: check replication
kubectl exec -it <primary-pod> -n <ns> -- psql -U postgres -c "SELECT * FROM pg_stat_replication;"
A switchover is initiated by the operator (you) when both primary and secondaries are healthy — for example, before node maintenance, to rebalance load, or to test your HA setup. Because the current primary is still running, it can gracefully finish in-flight transactions and hand off leadership cleanly, resulting in minimal downtime (seconds).
A failover happens automatically when the primary crashes or becomes unreachable. HA middleware (Patroni for PostgreSQL, Orchestrator for MySQL, Sentinel for Redis, or the built-in MongoDB election) detects the failure and promotes a secondary without human intervention. You typically don't need to trigger a failover manually — the HA stack handles it.
While switchover is designed to be safe, creating a backup beforehand provides an extra safety net — especially if your application is sensitive to replication lag or you're performing switchover as part of a larger maintenance window.
Switchover OpsRequest fails:
Roles not updated after switchover:
kubectl describe ops <name> -n <ns>
Replication lag after switchover:
kubectl exec -it <pod> -n <ns> -- <db-specific-replication-check>
For per-engine switchover behaviors, HA middleware details (Orchestrator, Patroni, Sentinel), and complete replication health check commands for MySQL/PostgreSQL/Redis/MongoDB, see reference.md.
For general agent safety conventions (dry-run, status confirmation, production protection), see safety-patterns.md.